Topics on Big Data Econometrics

Lectures Given at WISE/SOE Xiamen University, April 2019

Kuan-Pin Lin
Professor Emeritus of Economics
Portland State University

Introduction

Economic data come in different forms and structures. Cross sections, time series, and panel data are the familiar data structures in economics. Based on economic theory and statistical methods, econometrics addresses issues of parameter estimation and causal inference. With advances in information technology and the rapid growth in the size and scale of data collection, econometric analysis now faces the challenge of using massive datasets, or big data. These lectures consider data analytics based on machine learning from the perspective of applied econometric analysis.

There are two directions of research in Big Data Econometrics. First, as the size of data grows without bound (N -> ∞), methods of data exploration, visualization, and analysis are called upon to meet the demand for policy evaluation and predictive applications. Second, with a broader scope of information and more granular data collection, modern econometric analysis involves high-dimensional controls or covariates (that is, p > N). In the first direction, econometricians are open to methodologies from machine learning and data mining: techniques such as bagging, boosting, random forests, and neural networks are available in addition to traditional regression and classification methods. For a high-dimensional econometric model, regularization methods such as LASSO and its variants are used, and model selection and evaluation based on cross validation is recommended. With the increasing size of economic data, the growing problem dimensions, and the resulting demand for parallel processing, it makes sense to explore least-cost options for parallel computing in the cloud.
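As a rough illustration of the high-dimensional case (p > N), the sketch below fits a LASSO regression by coordinate descent and picks the penalty by K-fold cross validation. It is not taken from the lecture materials: the simulated data, the penalty grid `lambdas`, and the helper names `soft_threshold`, `lasso_cd`, and `cv_choose_lambda` are all assumptions made for this example. It uses only the Python standard library and omits the intercept and covariate standardization that an applied analysis would normally include.

```python
import random

def soft_threshold(z, lam):
    """Soft-thresholding operator: shrink z toward zero by lam."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

def lasso_cd(X, y, lam, n_iter=50):
    """Coordinate descent for (1/2n)||y - Xb||^2 + lam * ||b||_1 (no intercept)."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X beta, starting from beta = 0
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Covariance of x_j with the partial residual that excludes x_j
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new_bj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_bj - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new_bj
    return beta

def mse(X, y, beta):
    """Mean squared prediction error of the linear fit X beta."""
    return sum((yi - sum(x * b for x, b in zip(xi, beta))) ** 2
               for xi, yi in zip(X, y)) / len(y)

def cv_choose_lambda(X, y, lambdas, k=3):
    """Return the penalty in `lambdas` with the lowest K-fold CV error."""
    n = len(y)
    folds = [set(range(f, n, k)) for f in range(k)]  # interleaved folds
    best_lam, best_err = None, float("inf")
    for lam in lambdas:
        err = 0.0
        for held in folds:
            train = [i for i in range(n) if i not in held]
            test = sorted(held)
            beta = lasso_cd([X[i] for i in train], [y[i] for i in train], lam)
            err += mse([X[i] for i in test], [y[i] for i in test], beta) / k
        if err < best_err:
            best_lam, best_err = lam, err
    return best_lam

# Simulated design with more covariates than observations (P > N);
# only the first three covariates enter the true model.
random.seed(0)
N, P = 30, 40
X = [[random.gauss(0, 1) for _ in range(P)] for _ in range(N)]
y = [2.0 * row[0] - 1.5 * row[1] + row[2] + random.gauss(0, 0.1) for row in X]

lambdas = [0.05, 0.1, 0.2, 0.5]  # hypothetical penalty grid
best_lam = cv_choose_lambda(X, y, lambdas)
beta_hat = lasso_cd(X, y, best_lam)
n_selected = sum(1 for b in beta_hat if b != 0.0)
print("chosen lambda:", best_lam)
print("nonzero coefficients:", n_selected, "of", P)
```

Ordinary least squares has no unique solution when P exceeds N, whereas the L1 penalty sets many coefficients exactly to zero, which is why the fit can be computed at all here. In practice a tested library implementation with a finer penalty grid would be preferable to this hand-rolled loop.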

Topics

Case Studies (subject to change)

Suggested Readings

Expectation


Copyright © Kuan-Pin Lin
(Last updated: 4/28/2019)