Collaborative Research: Statistical Inference for High-Frequency Data

$204,436 | FY2017 | MPS | NSF

University of Chicago, Chicago, IL

Abstract

This project pursues the promise of the big data revolution by studying a particular form of such data, high frequency data (HFD), in which a series of observations can be updated within fractions of a millisecond. With technological advances in data collection, HFD occurs in medicine (from neuroscience to patient care), finance and economics, geosciences (such as earthquake data), marine science (fishing and shipping), and other areas. The research focuses on how to extract information from complex big data and how to turn data into knowledge. In particular, the project aims to develop cutting-edge mathematics and statistical methodology to uncover the dependence structure governing an HFD system. The new dependence structure will permit the "borrowing" of information from adjacent time periods, and also from other series in a panel of data. It is expected that the results will lead to more efficient estimators and better prediction, and that this approach will form a new paradigm for HFD. In addition to developing a general theory, the project is concerned with applications to financial data, including risk management, forecasting, and portfolio management. More precise estimators, with improved margins of error, will be useful in all these areas of finance. The results are expected to be of interest to investors, regulators, and policymakers, and they are entirely in the public domain.

The goal of this project is to create a unified framework for inference in high frequency data, based on dividing the observations and the parameter process into blocks. The work pursues two paths, both involving the fundamental structure of the data architecture. A "within block" approach uses contiguity to make the structure of the observations more accessible in local neighborhoods. A "between block" approach sets up a tool for using stochastic calculus to study the relationship between parameters in blocks that are adjacent (in time and space). It also permits the integration of high and low frequency models, and it does so without altering current models. A final part of the project is devoted to further study of the observed asymptotic variance, in particular work on tuning parameters and inferential interpretation. Both the "within block" and "between block" approaches are formulated to cover the general time varying "parameters" that are usually estimated from high frequency data series: not only volatility, but also skewness (leverage effect), regression coefficients, and parameter dynamics (such as volatility of volatility). In both cases, the observed data and the parameter processes may have large dimension (large panel size) in addition to high frequency observation. The within block approach permits contiguity to be stated jointly for the latent underlying processes and the microstructure/observation noise. For the between block approach, the investigators will further develop a new way to look at the dependence relationships between the parameters.
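As a rough illustration of what dividing high frequency observations into blocks can look like, the Python sketch below computes a realized variance (sum of squared returns) within each block of a simulated log-price series. It is not the investigators' method: the simulated data, the function names `simulate_log_prices` and `blockwise_realized_variance`, the block size, and the piecewise-constant volatility are all assumptions made for illustration, and microstructure noise is ignored.

```python
import random


def simulate_log_prices(n, sigmas, seed=0):
    """Simulate a log-price path with n increments (hypothetical toy data).

    `sigmas` gives one per-observation volatility level for each
    equal-length segment of the path, so volatility is piecewise constant.
    """
    rng = random.Random(seed)
    seg_len = n // len(sigmas)
    prices = [0.0]
    for i in range(n):
        sigma = sigmas[min(i // seg_len, len(sigmas) - 1)]
        prices.append(prices[-1] + sigma * rng.gauss(0.0, 1.0))
    return prices


def blockwise_realized_variance(prices, block_size):
    """Return the sum of squared returns within each consecutive block."""
    returns = [b - a for a, b in zip(prices, prices[1:])]
    return [
        sum(r * r for r in returns[start:start + block_size])
        for start in range(0, len(returns), block_size)
    ]


# Volatility doubles halfway through the simulated sample.
prices = simulate_log_prices(n=4000, sigmas=[0.01, 0.02], seed=1)
block_rv = blockwise_realized_variance(prices, block_size=1000)
total_rv = sum(block_rv)

for k, rv in enumerate(block_rv):
    print(f"block {k}: realized variance = {rv:.4f}")
print(f"whole sample: realized variance = {total_rv:.4f}")
```

In this toy setting the per-block estimates track the change in volatility that a single whole-sample estimate would average over. Relating estimates in adjacent blocks to one another, which is the role the abstract assigns to the "between block" approach, is not attempted here.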
