HDGCvar
Granger causality testing in High Dimensional Vector Autoregressive Models
HDGCvar allows for testing Granger causality in High Dimensional
Vector Autoregressive Models (VARs). Granger causality can be tested
between time series that are stationary (HDGC_VAR_I0), or that are
non-stationary (unit root), cointegrated, or any mix of the above
(HDGC_VAR). Bivariate as well as multivariate (i.e. block) causality can
be considered by specifying the name(s) of the variable(s) of interest in
GCpair (or GCpairs), and networks can be plotted to visualize the causal
structure among several variables using Plot_GC_all.
A specific part of HDGCvar is dedicated to Realized Volatilities (RV),
modelled through Heterogeneous VARs (HVARs). It gives the possibility of
building RV spillover networks (HDGC_HVAR_all) as well as conditioning
RV on Realized Correlations (HDGC_HVAR_RV_RCoV_all).
Installation
You can install the released version of HDGCvar from CRAN with:
install.packages("HDGCvar")
And the development version from GitHub with:
# install.packages("devtools")
devtools::install_github("Marga8/HDGCvar")
All the functions in HDGCvar are based on the following two papers:
- A. Hecq, L. Margaritella, S. Smeekes, “Granger Causality Testing in High Dimensional VARs: a Post Double Selection Procedure” (2019, https://arxiv.org/abs/1902.10991)
- A. Hecq, L. Margaritella, S. Smeekes, “Inference in Non Stationary High Dimensional VARs” (2020, check the latest version at https://sites.google.com/view/luca-margaritella)
Details
If the addition of variable X to the given information set Ω alters the conditional distribution of another variable Y, and both X and Ω are observed prior to Y, then X improves predictability of Y, and is said to Granger cause Y with respect to Ω.
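As a toy illustration of this definition (a minimal sketch in base R, not HDGCvar's high-dimensional procedure), one can simulate a pair of series in which lagged X enters the equation for Y, take Ω to be just the own lag of Y, and compare the regressions with and without lagged X through a standard F test. All object names and the simulated coefficients are assumptions made for the example.

```r
set.seed(1)
n <- 300

# X is an AR(1); Y depends on its own lag and on lagged X (so X Granger causes Y)
x <- as.numeric(arima.sim(list(ar = 0.5), n = n))
y <- numeric(n)
e <- rnorm(n)
for (i in 2:n) {
  y[i] <- 0.3 * y[i - 1] + 0.5 * x[i - 1] + e[i]
}

# Align current Y with the lagged regressors
d <- data.frame(y = y[-1], y_lag = y[-n], x_lag = x[-n])

# Restricted model: information set Omega only (own lag of Y)
restricted <- lm(y ~ y_lag, data = d)
# Unrestricted model: Omega plus lagged X
unrestricted <- lm(y ~ y_lag + x_lag, data = d)

# F test of the null "lagged X does not help predict Y given Omega"
gc_test <- anova(restricted, unrestricted)
p_value <- gc_test[["Pr(>F)"]][2]
print(gc_test)
```

A small p-value rejects the null of no Granger causality from X to Y. With many series in Ω this plain OLS comparison breaks down, which is the problem the package addresses.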
Statistically assessing predictability between two time series (or two blocks of them) is a fundamental task in modern time series analysis. Its applications range from macroeconomics and finance to network theory and even the neurosciences.
With the increased availability of larger datasets, these causality concepts have been extended to the high-dimensional setting, where they can benefit from the inclusion of many more series within the available information set Ω. Including as much information as possible is close in spirit to the original idea of Granger (1969), who envisioned Ω to be “all the information in the universe”, in order to avoid spurious causality.
Conditioning on so many variables comes at a cost: one quickly runs into the problems of high dimensionality, namely sky-rocketing variance, overfitting and, in general, invalidity of standard statistical techniques.
A. Hecq, L. Margaritella, S. Smeekes (2019) designed a Granger causality LM test for high-dimensional stationary vector autoregressive models (VARs). It combines dimensionality reduction based on penalized regressions, such as the lasso of Tibshirani (1996), with the post-double-selection procedure of Belloni et al. (2011) to select the set of relevant covariates in the system. The double selection step substantially reduces the omitted variable bias and thereby allows for valid post-selection inference on the parameters.
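In formulas, the test can be sketched as follows. This is a schematic summary assuming the usual LM construction, with notation introduced here for illustration rather than copied from the paper: y is the Granger caused series, X_GC collects the p lags of the Granger causing series, X_{-GC} collects all the other lagged regressors, and T is the sample size. For the exact algorithm, see Hecq et al. (2019).

```latex
\begin{align*}
&\text{1. Lasso of } y \text{ and of each column of } X_{GC} \text{ on } X_{-GC}
  \;\Rightarrow\; \text{selected sets } \hat{S}_0, \hat{S}_1, \dots, \hat{S}_p \\
&\text{2. } \hat{S} = \hat{S}_0 \cup \hat{S}_1 \cup \dots \cup \hat{S}_p \\
&\text{3. OLS of } y \text{ on } X_{\hat{S}}
  \;\Rightarrow\; \text{residuals } \hat{\xi} \\
&\text{4. OLS of } \hat{\xi} \text{ on } (X_{\hat{S}}, X_{GC})
  \;\Rightarrow\; \text{residuals } \hat{\nu} \\
&\text{5. } LM = T \left( 1 - \frac{\hat{\nu}'\hat{\nu}}{\hat{\xi}'\hat{\xi}} \right),
  \quad \text{compared with } \chi^2_p \text{ under the null of no Granger causality}
\end{align*}
```

Selecting on both the outcome and the regressors of interest is what distinguishes double selection from a single lasso on y: a variable that is strongly related to X_GC but only weakly related to y would otherwise be dropped, and its omission would bias the test.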
If you are sure that your time series are stationary, you can go ahead
and use one of the following functions from HDGCvar:
- HDGC_VAR_I0, which tests for Granger causality in High Dimensional Stationary VARs
- HDGC_VAR_multiple_I0 or HDGC_VAR_multiple_pairs_I0, which test multiple combinations of Granger causality in High Dimensional Stationary VARs
- HDGC_VAR_all_I0, which allows you to test all the bivariate combinations in your dataset and hence to build a Granger causality network.
All these functions take the following inputs:
- GCpair or GCpairs, a list object explicitly containing the Granger caused variables “GCto” and the Granger causing variables “GCfrom”.
- data, the dataset containing as columns all the time series you are modeling.
- p = 1, the lag-length of the VAR. Unless you want to explicitly impose it, you can use HDGCvar::lags_upbound_BIC(), which estimates an empirical upper bound for the lag-length using the Bayesian Information Criterion (for details see Hecq et al. 2019).
- bound = 0.5 * nrow(data), which applies a lower bound on the penalty parameter of the lasso. In simple words, you are telling the lasso, in each equation it estimates, to stop once the number of selected variables (i.e. those with non-zero coefficients) reaches 50% of the sample size. This is innocuous in small systems but quite important for systems where the number of variables per equation (i.e. lags included) is larger than the sample size (for details see Hecq et al. 2019).
- parallel = FALSE, set to TRUE for parallel computing.
- n_cores = NULL, the number of cores to use in parallel computing; the default is all but one.
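A minimal sketch of how these inputs fit together is given below. The simulated data, the column names V1 to V10 and the chosen pair are assumptions for illustration; the argument names are the ones listed above. The package call is guarded so that the snippet still runs when HDGCvar is not installed.

```r
set.seed(123)
n_obs <- 200
n_var <- 10

# Simulated stationary system (illustrative only): each series is an AR(1),
# and the first lag of V2 additionally enters the equation of V1
sim_data <- matrix(rnorm(n_obs * n_var), nrow = n_obs, ncol = n_var)
for (i in 2:n_obs) {
  sim_data[i, ] <- 0.4 * sim_data[i - 1, ] + sim_data[i, ]
  sim_data[i, 1] <- sim_data[i, 1] + 0.5 * sim_data[i - 1, 2]
}
colnames(sim_data) <- paste0("V", seq_len(n_var))

# Test whether V2 Granger causes V1, conditional on all the other series
GCpair <- list("GCto" = "V1", "GCfrom" = "V2")

if (requireNamespace("HDGCvar", quietly = TRUE)) {
  result <- HDGCvar::HDGC_VAR_I0(
    GCpair   = GCpair,
    data     = sim_data,
    p        = 1,                      # or an upper bound from lags_upbound_BIC()
    bound    = 0.5 * nrow(sim_data),
    parallel = FALSE,
    n_cores  = NULL
  )
  print(result)
}
```

The names in GCpair must match column names of the data, since that is how the variables of interest are identified.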
A. Hecq, L. Margaritella, S. Smeekes (2020) extended the post-double-selection Granger causality LM test of Hecq et al. (2019) to the case in which your system might contain time series integrated of arbitrary orders and possibly even cointegrated. To accomplish this, it is sufficient to augment the lags of the variables of interest by the maximum order of integration suspected for the series. This completely avoids any biased pre-test for unit roots or cointegration and lets you directly perform high-dimensional inference on the desired parameters.
Therefore: if you are NOT sure whether your time series are stationary,
non-stationary, cointegrated, or a mix of the above, or you do not trust
the biased unit root and cointegration tests out there, we got you
covered! You can go ahead and use one of the following functions from
HDGCvar:
- HDGC_VAR, which tests for Granger causality in High Dimensional VARs that are stationary, non-stationary, cointegrated, or a mix of the above
- HDGC_VAR_multiple or HDGC_VAR_multiple_pairs, which test multiple combinations of Granger causality in High Dimensional VARs that are stationary, non-stationary, cointegrated, or a mix of the above
- HDGC_VAR_all, which allows you to test all the bivariate combinations in your dataset and hence to build a Granger causality network.
All these functions ask you the same inputs as reported above for
HDGC_VAR_I0 with the addition of the following:
- d=2, the augmentation needed to handle the potentially non-stationary time series. It should correspond to the maximum suspected order of integration of your time series. This might sound vague, but think about it: in economics you will hardly ever see a variable integrated of order two or three. Therefore, unless you are absolutely sure that your time series are at most integrated of order one and that those not integrated have eigenvalues far from unity, or unless for whatever unlikely reason you have at least one variable integrated of order three, I would recommend setting d=2. Be careful though: if the lag-length p that you are using is smaller than or equal to d, HDGC_VAR will give you a warning, because to avoid spurious regression issues in the post-double-selection step you need to ensure p>=d+1. This ultimately means that if you want to be safeguarded against possible I(2) variables you will have to set d=2 and at least p=3. In other words, even though HDGCvar::lags_upbound_BIC() might have estimated p=2, you still want to make it larger to avoid spurious results in the lasso selection.
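The rule p>=d+1 can be enforced mechanically before calling HDGC_VAR. The helper safe_lag() below is hypothetical (it is not part of HDGCvar) and simply raises the lag-length when it is too small for the chosen augmentation. The simulated random walks and their column names are likewise assumptions, and the package call is guarded so the snippet runs without HDGCvar installed.

```r
# Hypothetical helper (not an HDGCvar function): smallest admissible lag-length
# given an estimated lag p_hat and an augmentation d, using the rule p >= d + 1
safe_lag <- function(p_hat, d) {
  stopifnot(p_hat >= 1, d >= 0)
  max(p_hat, d + 1)
}

d <- 2
p_bic <- 2                    # e.g. a value suggested by lags_upbound_BIC()
p_use <- safe_lag(p_bic, d)   # 3

if (requireNamespace("HDGCvar", quietly = TRUE)) {
  set.seed(1)
  # Five independent random walks, i.e. I(1) series (illustrative data)
  rw <- apply(matrix(rnorm(200 * 5), nrow = 200, ncol = 5), 2, cumsum)
  colnames(rw) <- paste0("V", 1:5)
  res <- HDGCvar::HDGC_VAR(
    GCpair   = list("GCto" = "V1", "GCfrom" = "V2"),
    data     = rw,
    p        = p_use,
    d        = d,
    bound    = 0.5 * nrow(rw),
    parallel = FALSE,
    n_cores  = NULL
  )
  print(res)
}
```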
Back to A. Hecq, L. Margaritella, S. Smeekes (2019): in their empirical application they considered a specific case of stationary time series, namely they investigated the volatility transmission in stock returns using daily realized variances. These are very attractive measures among practitioners and academics for modelling time-varying volatilities and monitoring financial risk. Given the time series of realized volatilities, a multivariate version of the heterogeneous autoregressive model (HVAR) of Corsi (2009) is employed to model their joint behavior.
Therefore: if you are interested in testing Granger causality among
realized volatilities and potentially building volatility spillover
networks, you should use one of the following functions from HDGCvar:
- HDGC_HVAR, which tests for Granger causality in High Dimensional stationary HVARs.
- HDGC_HVAR_multiple or HDGC_VAR_multiple_pairs, which test multiple combinations of Granger causality in High Dimensional stationary HVARs.
- HDGC_HVAR_all, which allows you to test all the bivariate combinations in your dataset and hence to build a Granger causality network.
All these functions ask you the same inputs as reported above for
HDGC_VAR_I0, with the inclusion of log=TRUE to log-transform the
series (recommended) and the exclusion of the lag-length parameter p:
in a HVAR this is already pre-specified to be equal to 3, namely the
daily, weekly and monthly aggregates of realized volatilities (for
details see Corsi (2009) and Hecq et al. (2019)).
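To make the three "lags" concrete, the sketch below builds the daily, weekly and monthly aggregates for a single simulated series in base R and fits the univariate heterogeneous autoregression by OLS. The 5-day and 22-day windows follow the usual convention of Corsi (2009); the simulated series and all object names are assumptions. Since p is not an input of the HDGC_HVAR functions, they presumably construct such regressors internally, so this is only meant to show what the aggregates represent.

```r
set.seed(42)
n <- 300
rv <- exp(rnorm(n, mean = -9, sd = 0.5))  # simulated positive "realized variances"

# One-sided moving averages: the value at day t averages days t, t-1, ..., t-k+1
rv_d <- rv
rv_w <- as.numeric(stats::filter(rv, rep(1 / 5, 5), sides = 1))
rv_m <- as.numeric(stats::filter(rv, rep(1 / 22, 22), sides = 1))

# Days with a full monthly window and an observed next-day value
idx <- 22:(n - 1)
har <- data.frame(y = rv[idx + 1], d = rv_d[idx], w = rv_w[idx], m = rv_m[idx])

# Next-day realized variance regressed on the three aggregates
har_fit <- lm(y ~ d + w + m, data = har)
print(coef(har_fit))
```

The regression has only three slope coefficients per series regardless of the 22-day memory, which is why the lag-length does not need to be chosen by the user.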
Note: to account for the potential heteroskedasticity in the data, all these functions return the asymptotic χ² version of the LM test (“Asymp”), the small sample correction F tests (“FS_cor”) and the
