Sparse Models for Dependent Data
The goal of this mini-course is to introduce sparse models and their applications to dependent data. Sparse modeling is a research area that links statistics, machine learning and signal processing, motivated by the old, and important, statistical problem of variable selection in high-dimensional datasets. Selection of a small set of highly predictive variables is central to many applications where the ultimate objective is to enhance our understanding of the underlying data-generating process. More recently, sparse modeling has also become popular in econometrics, where, apart from simple prediction, the identification of causal relations is of paramount importance.
In recent years a vast number of models and algorithms have been proposed, mainly focused on l1-regularized optimization. Examples include sparse regression, such as LASSO and its various extensions (Elastic Net, fused LASSO, group LASSO, simultaneous/multi-task LASSO, adaptive LASSO, etc.), sparse graphical model selection, sparse dimensionality reduction (sparse PCA, CCA, NMF, etc.) and learning dictionaries that allow sparse representations. Applications of these methods are wide-ranging, including economics, finance, marketing, computational biology, neuroscience, image processing, etc.
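As a rough illustration of what l1-regularized optimization does, the sketch below fits a LASSO regression by cyclic coordinate descent in plain Python. It is not taken from the course material: the objective scaling (1/(2n) times the residual sum of squares plus lam times the l1 norm of the coefficients), the function names soft_threshold and lasso_cd, and the fixed number of sweeps are all assumptions made for this example.

```python
def soft_threshold(z, t):
    """Soft-thresholding operator: shrink z toward zero by t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_cd(X, y, lam, n_iter=200):
    """Illustrative LASSO fit by cyclic coordinate descent.

    Minimizes (1/(2n)) * sum_i (y_i - x_i'beta)^2 + lam * sum_j |beta_j|.
    X is a list of rows, y a list of responses (no intercept is fitted).
    """
    n, p = len(y), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # current residual y - X beta (beta starts at zero)
    # Mean squared value of each column, used to scale the update.
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # an all-zero column carries no information
            # Correlation of column j with the residual that excludes x_j.
            rho = sum(X[i][j] * (resid[i] + X[i][j] * beta[j])
                      for i in range(n)) / n
            new_bj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_bj - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new_bj
    return beta
```

Calling lasso_cd(X, y, lam) with a larger lam sets more coefficients exactly to zero, which is the variable-selection behaviour described above; with lam = 0 the update reduces to coordinate-wise least squares.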
The course will be organized as follows:
Lecture 1: May 14
Introduction to learning methods and econometrics for dependent data: parametric versus nonparametric. References: [10], [21].
Sparse dimensionality reduction (factor models, PCA and sparse PCA). References: [10], [21].
Sparse regression models and algorithms (Elastic Net, fused LASSO, group LASSO, simultaneous/multi-task LASSO, adaptive LASSO, etc.). References: [1], [2], [7], [8], [10], [11], [13], [14], [15], [16], [18], [21], [22], [23], [24], [25].
Lecture 2: May 15
Applications of sparse modeling to time-series data. References: [9], [12], [17], [19], [20].
Sparse algorithms for instrumental variables and generalized method of moments estimation. References: [3], [4].
References:
[1] Belloni, A. and V. Chernozhukov (forthcoming). Least Squares After Model Selection in High-dimensional Sparse Models. Bernoulli.
[2] Belloni, A. and V. Chernozhukov (2011). L1-Penalized Quantile Regression in High-Dimensional Sparse Models. The Annals of Statistics, 39:82–130.
[3] Belloni, A., V. Chernozhukov, and C. Hansen (2011). Lasso Methods for Gaussian Instrumental Variables Models. Working paper.
[4] Caner, M. (2009). Lasso-type GMM estimator. Econometric Theory, 25(01):270–290.
[5] Caner, M. and K. Knight (2008). No country for old unit root tests: bridge estimators differentiate between nonstationary versus stationary models and select optimal lag. Working Paper, University of Toronto.
[6] Fan, J. and R. Li (2001). Variable selection via nonconcave penalized likelihood and its oracle properties. Journal of the American Statistical Association, 96:1348–1360.
[7] Efron, B., T. Hastie, I. Johnstone and R. Tibshirani (2004). Least Angle Regression. The Annals of Statistics, 32:407–499.
[8] Fan, J. and H. Peng (2004). Nonconcave penalized likelihood with a diverging number of parameters. The Annals of Statistics, 32(3):928–961.
[9] Gelper, S. and C. Croux (2009). Time series least angle regression for selecting predictive economic sentiment series. Working Paper, University of Rotterdam (www.econ.kuleuven.be/sarah.gelper/public).
[10] Hastie, T., R. Tibshirani and J. Friedman (2009). The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer.
[11] Zou, H. and T. Hastie (2005). Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 67:301–320.
[12] Hsu, N., H. Hung, and Y. Chang (2008). Subset selection for vector autoregressive processes using lasso. Computational Statistics & Data Analysis, 52(7):3645–3657.
[13] Huang, J., S. Ma, and C.-H. Zhang (2008). Adaptive LASSO for sparse high-dimensional regression models. Statistica Sinica, 18:1603–1618.
[14] Huang, J., J. Horowitz, and S. Ma (2008). Asymptotic properties of bridge estimators in sparse high-dimensional regression models. The Annals of Statistics, 36(2):587–613.
[15] Tibshirani, R. (1996). Regression Shrinkage and Selection Via the LASSO. Journal of the Royal Statistical Society, Series B, 58:267–288.
[16] Knight, K. and W. Fu (2000). Asymptotics for lasso-type estimators. The Annals of Statistics, 28(5):1356–1378.
[17] Liao, Z. and P. Phillips (2010). Automated estimation of vector error correction models. Work in progress.
[18] Meinshausen, N. and B. Yu (2009). Lasso-type recovery of sparse representations for high-dimensional data. The Annals of Statistics, 37:246–270.
[19] Nardi, Y. and A. Rinaldo (2011). Autoregressive process modeling via the lasso procedure. Journal of Multivariate Analysis, 102:528–549.
[20] Song, S. and P. J. Bickel (2011). Large vector autoregressions. ArXiv e-prints.
[21] Bühlmann, P. and S. van de Geer (2011). Statistics for High-Dimensional Data: Methods, Theory and Applications. Springer Series in Statistics. Springer.
[22] Wang, H., G. Li, and C. Tsai (2007). Regression coefficient and autoregressive order shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 69(1):63–78.
[23] Yuan, M. and Y. Lin (2006). Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 68:49–67.
[24] Zhao, P. and B. Yu (2006). On model selection consistency of lasso. Journal of Machine Learning Research, 7:2541–2563.
[25] Zou, H. (2006). The adaptive lasso and its oracle properties. Journal of the American Statistical Association, 101:1418–1429.