Journal of Econometrics
Toward optimal model averaging in regression models with time series errors
Tzu-Chang F. Cheng (University of Illinois at Urbana–Champaign, United States), Ching-Kang Ing (Academia Sinica and National Taiwan University, Taiwan), Shu-Hui Yu∗ (National University of Kaohsiung, Taiwan)
Keywords: Autocovariance-corrected Mallows model averaging; Banded Cholesky factorization; Feasible generalized least squares estimator; High-dimensional covariance matrix; Time series errors

Abstract
Consider a regression model with infinitely many parameters and time series errors. We are interested in choosing weights for averaging across generalized least squares (GLS) estimators obtained from a set of approximating models. However, GLS estimators, depending on the unknown inverse covariance matrix of the errors, are usually infeasible. We therefore construct feasible generalized least squares (FGLS) estimators using a consistent estimator of the unknown inverse matrix. Based on this inverse covariance matrix estimator and FGLS estimators, we develop a feasible autocovariance-corrected Mallows model averaging criterion to select weights, thereby providing an FGLS model averaging estimator of the true regression function. We show that the generalized squared error loss of our averaging estimator is asymptotically equivalent to the minimum one among those of GLS model averaging estimators with the weight vectors belonging to a continuous set, which includes the discrete weight set used in Hansen (2007) as its proper subset.
© 2015 Elsevier B.V. All rights reserved.

1. Introduction
This article is concerned with the implementation of model averaging methods in regression models with time series errors.
We are interested in choosing weights for averaging across generalized least squares (GLS) estimators obtained from a set of approximating models for the true regression function. However,
GLS estimators, depending on the unknown inverse covariance matrix Σ_n^{-1} of the errors, are usually infeasible, where n is the sample size. We therefore construct feasible generalized least squares (FGLS) estimators using a consistent estimator of Σ_n^{-1}. Based on this inverse covariance matrix estimator and FGLS estimators, we develop a feasible autocovariance-corrected Mallows model averaging (FAMMA) criterion to select weights, thereby providing an FGLS model averaging estimator of the regression function.
We show that the generalized squared error loss of our averaging estimator is asymptotically equivalent to the minimum one among those of GLS model averaging estimators with the weight vectors belonging to a continuous set, which includes the discrete weight set used in Hansen (2007) as its proper subset.
∗ Corresponding author. E-mail address: email@example.com (S.-H. Yu).
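The FGLS construction described above can be illustrated in a toy setting. The sketch below is not the authors' estimator: it assumes AR(1) errors as a simple special case of time series errors, estimates the AR coefficient from OLS residuals, and plugs the resulting tridiagonal inverse covariance (up to a variance scale) into the GLS formula. All names and numerical values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 200, 3

# Design matrix and true coefficients (purely illustrative values).
X = np.column_stack([np.ones(n)] + [rng.normal(size=n) for _ in range(p - 1)])
beta = np.array([1.0, 2.0, -0.5])

# AR(1) errors: a simple stand-in for general time series errors.
rho = 0.6
e = np.zeros(n)
eps = rng.normal(size=n)
for t in range(1, n):
    e[t] = rho * e[t - 1] + eps[t]
y = X @ beta + e

# Step 1: OLS fit to obtain residuals.
b_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
r = y - X @ b_ols

# Step 2: estimate the AR(1) coefficient from the residuals.
rho_hat = (r[1:] @ r[:-1]) / (r[:-1] @ r[:-1])

# Step 3: the inverse covariance of AR(1) errors is tridiagonal
# (up to the innovation variance, which cancels in the estimator);
# plug it in to get the FGLS estimate.
Si = np.eye(n) * (1 + rho_hat**2)
Si[0, 0] = Si[-1, -1] = 1.0
idx = np.arange(n - 1)
Si[idx, idx + 1] = Si[idx + 1, idx] = -rho_hat
b_fgls = np.linalg.solve(X.T @ Si @ X, X.T @ Si @ y)
```

In the paper's setting the inverse covariance estimator is far more general (a banded Cholesky factorization of a high-dimensional matrix); the AR(1) plug-in here only conveys the two-stage structure of FGLS.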
Let M be the number of approximating models. If the weight set only contains standard unit vectors in R^M, then selection of weights for model averaging is equivalent to selection of models. Therefore, model selection can be viewed as a special case of model averaging. It is shown in Hansen (2007, p. 1179) that when the weight set is rich enough, the optimal model averaging estimator usually outperforms the one obtained from the optimal single model, providing ample reason to conduct model averaging. Another vivid example demonstrating the advantage of model averaging over model selection is given by Yang (2007, Section 6.2.1, Figure 5).
In the case of independent errors, asymptotic efficiency results for model selection have been reported extensively, even when the errors are heteroscedastic or regression functions are serially correlated. For the regression model with i.i.d. Gaussian errors,
Shibata (1981) showed that Mallows’ Cp (Mallows, 1973) and
Akaike information criterion (AIC; Akaike, 1974) lead to asymptotically efficient estimators of the regression function. By making use of Whittle’s (1960) moment bounds for quadratic forms in independent variables, Li (1987) established the asymptotic efficiency of Mallows’ Cp under much weaker assumptions on homogeneous errors. Li’s (1987) result was subsequently extended by
Andrews (1991) to heteroscedastic errors. There are also asymptotic efficiency results established in situations where regression functions are serially correlated. Assuming that the data are generated from an infinite order autoregressive (AR(∞)) process driven by i.i.d. Gaussian noise, Shibata (1980) showed that AIC is asymptotically efficient for independent-realization prediction. This result was extended to non-Gaussian AR(∞) processes by Lee and
Karagrigoriou (2001). Ing and Wei (2005) showed that AIC is also asymptotically efficient for same-realization prediction. Ing (2007) further pointed out that the same property holds for a modification of Rissanen's accumulated prediction error (APE, Rissanen, 1986) criterion.
Asymptotic efficiency results for model averaging have also attracted much recent attention from econometricians and statisticians. Hansen (2007) proposed the Mallows model averaging (MMA) criterion, which selects weights for averaging across LS estimators. Under regression models with i.i.d. explanatory vectors and errors, he proved that the averaging estimator obtained from the MMA criterion asymptotically attains the minimum squared error loss among those of the LS model averaging estimators with the weight vectors contained in a discrete set Hn(N) (see (2.8)), in which N is a positive integer related to the moment restrictions on the errors. Using the same weight set, Hansen and
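The MMA idea can be conveyed with a small numerical sketch. The code below is an illustrative reconstruction, not Hansen's implementation: it fits nested polynomial candidate models by LS, then grid-searches a discrete weight set in the spirit of Hn(N) (weights that are multiples of 1/N summing to one) to minimize the Mallows criterion C(w) = ||y − Σ_m w_m ŷ_m||² + 2σ² Σ_m w_m k_m, treating σ² as known for simplicity.

```python
import itertools
import numpy as np

rng = np.random.default_rng(1)
n = 100
x = rng.normal(size=n)
# True regression function; candidate models are nested polynomial
# approximations (an illustrative setup, not the paper's).
f = x - 0.5 * x**2 + 0.2 * x**3
sigma = 1.0
y = f + sigma * rng.normal(size=n)

# Nested candidate models: polynomials of degree 1..M fitted by LS.
M, N = 3, 10
fits, ks = [], []
for m in range(1, M + 1):
    Xm = np.column_stack([x**j for j in range(m + 1)])
    bm, *_ = np.linalg.lstsq(Xm, y, rcond=None)
    fits.append(Xm @ bm)
    ks.append(Xm.shape[1])
fits = np.array(fits)              # M x n matrix of fitted values
ks = np.array(ks, dtype=float)     # number of parameters per model

# Discrete weight set: nonnegative weights that are multiples of 1/N
# and sum to one. Minimize the Mallows criterion over this grid.
best_w, best_c = None, np.inf
for combo in itertools.product(range(N + 1), repeat=M):
    if sum(combo) != N:
        continue
    w = np.array(combo) / N
    resid = y - w @ fits
    c = resid @ resid + 2 * sigma**2 * (w @ ks)
    if c < best_c:
        best_w, best_c = w, c
```

The grid search is exponential in M and only feasible for a handful of candidate models; the continuous weight set studied in this paper replaces the grid by a simplex, over which the criterion can be minimized by quadratic programming.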
Racine (2012) and Liu and Okui (2013), respectively, showed that the Jackknife model averaging (JMA) criterion and feasible HRCp criterion yield asymptotically efficient LS model averaging estimators in regression models with independent explanatory vectors and heteroscedastic errors. Since Hn(N) is quite restrictive when