Projects

My research is organized into several projects or themes that group my papers together, with some overlap between them. Each is described in detail below.

Inference for High-Dimensional Econometric Time Series

Description

In this project, funded by an NWO Vidi grant, we develop econometric methods for inference on high-dimensional time series data. Nowadays, large and complex datasets are available in economics that allow for deep analysis of large amounts of information relevant to understanding economic developments. Worldwide panels of macroeconomic data, large disaggregated datasets at the country level, high-frequency financial data and series derived from electronic sources such as Google Trends or social media all provide opportunities for improving economic forecasting and policy analysis. Such datasets typically contain large numbers of potential predictors, requiring statistical methods specifically designed for high-dimensional models with many parameters to estimate. While the development of such techniques has been flourishing in the field of statistical learning, most existing methods are not geared towards econometric time series and are therefore unsuitable for datasets such as those described above.

We develop penalized regression methods for estimation, prediction and, in particular, uncertainty quantification for high-dimensional time series, allowing for complex dependencies, persistence and trends as typically observed in economic time series. We also develop honest methods of inference, which explicitly take uncertainty arising from not knowing the true model into account when conducting inference. This avoids the underestimation of the true estimation (and model) uncertainty that occurs when model or variable selection is ignored. In addition, we develop bootstrap methods for uncertainty quantification in high-dimensional time series analysis, which are not only justified theoretically, but also provide accurate and reliable inference in practice. The practical suitability of these methods is demonstrated by applying them to high-dimensional economic datasets. By developing methods that allow for accurate and reliable statistical analysis of complex high-dimensional econometric time series, this project contributes to the tools at the disposal of the modern empirical economist, resulting in improved economic forecasting, policy analysis and general understanding of economic dynamics.

People involved

The PhD projects of Luca Margaritella (joint supervision with Alain Hecq) and Robert Adamek (joint supervision with Ines Wilms) deal with these problems. I also collaborate with Etienne Wijler on this topic.

Core Papers

Friedrich, M., L. Margaritella and S. Smeekes (2023). High-dimensional causality for climatic attribution. arXiv e-print 2302.03996.
Hecq, A., L. Margaritella and S. Smeekes (2023). Inference in non-stationary high-dimensional VARs. arXiv e-print 2302.01433.
Adamek, R., S. Smeekes and I. Wilms (2023). Sparse high-dimensional vector autoregressive bootstrap. arXiv e-print 2302.01233.
Adamek, R., S. Smeekes and I. Wilms (2022). Local projection inference in high dimensions. arXiv e-print 2209.03218.
Adamek, R., S. Smeekes and I. Wilms (2022). Lasso inference for high-dimensional time series. Journal of Econometrics, forthcoming.
Hecq, A., L. Margaritella and S. Smeekes (2021). Granger causality testing in high-dimensional VARs: a post-double-selection procedure. Journal of Financial Econometrics, forthcoming.
Smeekes, S. and E. Wijler (2021). An automated approach towards sparse single-equation cointegration modelling. Journal of Econometrics 221 (1), 247-276.
Smeekes, S. and E. Wijler (2020). Unit Roots and Cointegration. In P. Fuleky (Ed.), Macroeconomic Forecasting in the Era of Big Data, Chapter 17, pp. 541-584. Advanced Studies in Theoretical and Applied Econometrics, vol. 52. Springer.
Smeekes, S. and I. Wilms (2020). bootUR: An R Package for Bootstrap Unit Root Tests. arXiv e-print 2007.12249.
Smeekes, S. and E. Wijler (2018). Macroeconomic forecasting using penalized regression methods. International Journal of Forecasting 34 (3), 408-430.

Software

The R package bootUR implements several bootstrap unit root tests for (potentially) high-dimensional systems of time series - joint work with Ines Wilms.
The R package specs implements the Single-Equation Penalized Error Correction Selector (SPECS) - joint work with Etienne Wijler.
The R package desla allows for inference in high-dimensional time series models via the desparsified lasso - joint work with Robert Adamek and Ines Wilms.
The R package HDGCvar allows for testing Granger causality in high-dimensional vector autoregressive models - joint work with Alain Hecq and Luca Margaritella.

Inference on Trends in Atmospheric Time Series

Description

In this research area we investigate how to perform inference on trends, and breaks in trends, in atmospheric time series, such as the number of ethane particles observed in the atmosphere. Tools developed in the other, more theoretical, projects are used to investigate these issues.

People involved

The PhD project of Marina Friedrich deals with many of these issues, and we also collaborate with researchers from the GIRPAS group at the University of Liege led by Emmanuel Mahieu.

Related papers

Friedrich, M., S. Smeekes and J.-P. Urbain (2020). Autoregressive Wild Bootstrap Inference for Nonparametric Trends. Journal of Econometrics 214 (1), 81-109.
Friedrich, M., E. Beutner, H. Reuvers, S. Smeekes, J.-P. Urbain, W. Bader, B. Franco, B. Lejeune and E. Mahieu (2020). A statistical analysis of time trends in atmospheric ethane. Climatic Change 162 (1), 105-125.

Big Data as a Data Source for Official Statistics

Description

In this project, we investigate how Big Data can be used as a novel data source for Official Statistics. National statistical institutes traditionally draw samples from the target population to produce official statistics on that population, for example unemployment figures for the Dutch population. While this widely accepted approach is based on sound mathematical theory, it faces several problems. The surveys needed to collect the data are expensive and response rates are declining. Moreover, when people refuse to participate in these surveys, selection bias can occur, especially as people with certain characteristics are less likely to participate, or even be reached, than others. Finally, people may not always tell the truth (even unintentionally) in such surveys, causing measurement errors. This drives statistical institutes to look for alternative sources of data, often generated as a by-product of processes not directly related to statistical production purposes, to assist in the production of official statistics. Examples include the time and location of network activity available from mobile phone companies, social media messages from Twitter and Facebook, and internet search behaviour from Google Trends. Using such data raises several methodological issues. First, such data are typically not representative of the full target population and therefore contain selection bias. Second, combining and analysing large unstructured datasets requires statistical learning methods that can filter relevant from irrelevant information. Third, it is unclear how sensible uncertainty measures can be obtained to quantify estimation and model uncertainty as well as measurement error and selection bias. All three issues are investigated in this research area.

People involved

The PhD project of Caterina Schiavoni, whom I jointly supervise with Jan van den Brakel and Franz Palm, is devoted to this research and funded by Statistics Netherlands.

Related papers

Schiavoni, C., F.C. Palm, S. Smeekes and J. van den Brakel (2021). A dynamic factor model approach to incorporate Big Data in state space models for official statistics. Journal of the Royal Statistical Society - Series A 184 (1), 324-353.
Schiavoni, C., S.J. Koopman, F.C. Palm, S. Smeekes and J. van den Brakel (2021). Time-varying state correlations in state space models and their estimation via indirect inference. Tinbergen Institute Discussion Paper 2021-020/III.

Bootstrap Inference for Risk Measures

Description

In this project we develop and theoretically validate prediction intervals for risk measures such as Value-at-Risk and Expected Shortfall. These risk measures play a key role in recent financial legislation when it comes to determining capital requirements. Predictions for risk measures are subject to data and estimation uncertainty. To incorporate this uncertainty into the statistical analysis, we propose to construct prediction intervals by means of the bootstrap. Published simulation results are promising, yet no theoretical result in the literature underpins the validity of this method. We aim to fill this gap.

People involved

This research formed the PhD project of Alexander Heinemann, whom I jointly supervised with Eric Beutner and Franz Palm, and was funded by an NWO Research Talent grant.

Related papers

Beutner, E., A. Heinemann and S. Smeekes (2021). A justification of conditional confidence intervals. Electronic Journal of Statistics 15 (1), 2517-2565.
Beutner, E., A. Heinemann and S. Smeekes (2019). A general framework for prediction in time series models. arXiv e-print 1902.01622.
Beutner, E., A. Heinemann and S. Smeekes (2018). A residual bootstrap for conditional Value-at-Risk. arXiv e-print 1808.09125.

Bootstrap Methods for Time-Varying Processes

Description

The properties of many economic time series vary over time, as our economy is continually evolving. Reliable inference on time-varying processes is therefore of paramount importance. The objective of this project, funded by an NWO Veni grant, was to develop bootstrap methods that allow for reliable inference on such time-varying processes. The bootstrap has advantages over standard asymptotic inference: it is often more accurate in properly specified models, and more robust in misspecified models. In non-standard settings with time-varying processes, asymptotic inference is often not accurate, and may not even be available, so the bootstrap is a natural alternative. However, traditional bootstrap methods are also not valid in such non-standard settings. We developed new bootstrap approaches for inference on common stochastic trends (cointegration), such as those found in macroeconomics; deterministic trends, important in the economics of growth and in climatology; and financial time series displaying time-varying volatility.

Related papers

Friedrich, M., S. Smeekes and J.-P. Urbain (2020). Autoregressive Wild Bootstrap Inference for Nonparametric Trends. Journal of Econometrics 214 (1), 81-109.
Smeekes, S. and J. Westerlund (2019). Robust Block Bootstrap Panel Predictability Tests. Econometric Reviews 38 (9), 1089-1107.
Hurlin, C., S. Laurent, R. Quaedvlieg and S. Smeekes (2017). Risk Measure Inference. Journal of Business and Economic Statistics 35 (4), 499-512.
Götz, T.B., A. Hecq and S. Smeekes (2016). Testing for Granger Causality in Large Mixed-Frequency VARs. Journal of Econometrics 193 (2), 418-432.
Smeekes, S. (2015). Bootstrap Sequential Tests to Determine the Order of Integration of Individual Units in a Time Series Panel. Journal of Time Series Analysis 36 (3), 398-415.
Cavaliere, G., P.C.B. Phillips, S. Smeekes and A.M.R. Taylor (2015). Lag Length Selection for Unit Root Tests in the Presence of Nonstationary Volatility. Econometric Reviews 34 (4), 512-536.
Smeekes, S. and J.-P. Urbain (2014). A Multivariate Invariance Principle for Modified Wild Bootstrap Methods with an Application to Unit Root Testing. GSBE Research Memorandum RM/14/008.

Related Software

The R package bootUR implements several bootstrap unit root tests for (potentially) high-dimensional systems of time series - joint work with Ines Wilms.

Bootstrap Methods for Nonstationary Time Series and Panel Data

Description

The objective of my post-doc project, funded by an NWO Open Competition grant, was to develop and analyze bootstrap methods for nonstationary time series and panel data that improve on existing techniques. The bootstrap is applied because it offers better performance in small samples than the asymptotic methods commonly used to analyze economic and other time series. Moreover, the bootstrap offers robustness against nuisance parameters, which is especially required for panel data with cross-sectional dependence. The emphasis was on the development of new valid bootstrap methods and on studying their theoretical and practical properties in relation to existing techniques.

People involved

Franz Palm and Jean-Pierre Urbain were my post-doc supervisors.

Related papers

Smeekes, S. (2015). Bootstrap Sequential Tests to Determine the Order of Integration of Individual Units in a Time Series Panel. Journal of Time Series Analysis 36 (3), 398-415.
Cavaliere, G., P.C.B. Phillips, S. Smeekes and A.M.R. Taylor (2015). Lag Length Selection for Unit Root Tests in the Presence of Nonstationary Volatility. Econometric Reviews 34 (4), 512-536.
Smeekes, S. and J.-P. Urbain (2014). On the Applicability of the Sieve Bootstrap in Time Series Panels. Oxford Bulletin of Economics and Statistics 76 (1), 139-151.
Smeekes, S. (2013). Detrending Bootstrap Unit Root Tests. Econometric Reviews 32 (8), 869-891.
Smeekes, S. and A.M.R. Taylor (2012). Bootstrap Union Tests for Unit Roots in the Presence of Nonstationary Volatility. Econometric Theory 28 (2), 422-456.

Related Software

The R package bootUR implements several bootstrap unit root tests for (potentially) high-dimensional systems of time series - joint work with Ines Wilms.

Bootstrapping Nonstationary Time Series

Description

The objective of my PhD thesis was to develop and analyze bootstrap methods for the analysis of nonstationary time series. The analysis of nonstationary time series is one of the major research topics in time series econometrics. The properties of many economic variables such as real GDP, inflation, exchange rates and stock markets change over time, making these variables nonstationary. The bootstrap is a statistical method that often performs better in small samples and is more robust than the asymptotic techniques that are used to analyze time series. However, the bootstrap was originally not designed for the analysis of nonstationary time series. Therefore, applying the bootstrap in this setting is far from trivial and its properties have to be studied carefully. In this thesis the theoretical validity of the bootstrap for the analysis of unit roots and cointegration is studied. Simulations are also used to investigate how the bootstrap performs in finite samples. Apart from the comparison with asymptotic methods, bootstrap methods are also compared with each other.

My PhD thesis won the Christiaan Huygens Wetenschapsprijs in 2013 and was partly funded through an NWO Open Competition grant.

People involved

Franz Palm and Jean-Pierre Urbain were my PhD supervisors.

Related papers

Smeekes, S. (2013). Detrending Bootstrap Unit Root Tests. Econometric Reviews 32 (8), 869-891.
Palm, F.C., S. Smeekes and J.-P. Urbain (2011). Cross-Sectional Dependence Robust Block Bootstrap Panel Unit Root Tests. Journal of Econometrics 163 (1), 85-104.
Palm, F.C., S. Smeekes and J.-P. Urbain (2010). A Sieve Bootstrap Test for Cointegration in a Conditional Error Correction Model. Econometric Theory 26 (3), 647-681.
Palm, F.C., S. Smeekes and J.-P. Urbain (2008). Bootstrap Unit Root Tests: Comparison and Extensions. Journal of Time Series Analysis 29 (2), 371-401.

Related Software

The R package bootUR implements several bootstrap unit root tests for (potentially) high-dimensional systems of time series - joint work with Ines Wilms.