
High-Dimensional Machine Learning Models for Forecasting Equity Returns

Event Details

Tuesday, April 27, 2021
11:30am-12:30pm Eastern Time (ET)


Zhenming Liu
Assistant Professor in Computer Science, College of William & Mary

Bio: Zhenming Liu is an Assistant Professor in the Computer Science Department at the College of William & Mary. He received his PhD from Harvard in 2012, worked as a postdoc at Princeton University, and was a quant researcher at Two Sigma Investments before returning to academia. His primary interests are developing machine learning algorithms for high-dimensional, low signal-to-noise datasets and building scalable (learning) systems. He is a recipient of a Rutherford Visiting Fellowship from the Alan Turing Institute in London. His work has received a best paper award from FAST 2019, a best data science paper award from SDM 2019, and a best paper runner-up award from INFOCOM 2015.

Abstract: This talk will describe building machine learning models for equity returns. The goal is to forecast the future returns of all stocks in a specific universe, such as the S&P 500 or the Russell 3000 Index, using so-called “cross-asset” models, in which one stock’s features are used to predict another stock’s future return (e.g., using Google’s trading activity to forecast Facebook’s return). A tension exists between avoiding overfitting and needing non-linear models: the model should be simple, or it will overfit, yet it should also be complex (non-linear) enough to capture complex feature interactions.
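To make the cross-asset setup concrete, here is a minimal sketch on simulated data. It is not the speaker's model: it assumes one feature per stock, a plain linear map from all stocks' features to all stocks' next-period returns, and ridge regularization as a stand-in. The function name `fit_cross_asset_ridge` and all sizes are hypothetical.

```python
import numpy as np

def fit_cross_asset_ridge(X, R, lam=1.0):
    """Fit R ≈ X @ B with ridge regularization.

    X: (T, n) array, one feature per stock per period (assumption).
    R: (T, n) array of next-period returns.
    Returns B of shape (n, n); B[i, j] links stock i's feature to stock j's return.
    """
    n = X.shape[1]
    # Closed-form ridge solution: (X'X + lam * I)^{-1} X'R
    return np.linalg.solve(X.T @ X + lam * np.eye(n), X.T @ R)

# Simulated data only; no real market data is used here.
rng = np.random.default_rng(0)
T, n = 60, 100                      # fewer periods than stocks
X = rng.standard_normal((T, n))
B_true = 0.05 * rng.standard_normal((n, n))
R = X @ B_true + rng.standard_normal((T, n))   # low signal-to-noise

B_hat = fit_cross_asset_ridge(X, R, lam=10.0)

coeffs_per_stock = n   # unknowns needed to forecast a single stock
obs_per_stock = T      # observations available for that stock
print(B_hat.shape, coeffs_per_stock, obs_per_stock)
```

With more coefficients per stock than observations, an unregularized least-squares fit would not be unique, which is one way to see why overfitting is a central concern in this setting.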

Liu will describe two models to address this tension. He starts by examining a linear low-rank model, in which the number of features can be significantly larger than the number of observations. In this setting, standard low-rank methods such as reduced rank regression or nuclear-norm-based regularization fail to work, so he designs a simple and provably optimal algorithm under a mild average-case assumption over the features, and tests it on an equity dataset. Next, he develops a more advanced, semi-parametric model that decouples the overfitting problem from the non-linear learning problem, so that he can combine high-dimensional techniques for learning stock interactions with modern machine learning algorithms for learning non-linear feature interactions.
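As a rough illustration of what a low-rank assumption buys, the sketch below compares parameter counts for a full coefficient matrix and a rank-k one, and projects a noisy matrix onto rank k with a truncated SVD. This is a generic textbook operation, not the algorithm from the talk; the helper `truncate_rank` and all sizes are hypothetical.

```python
import numpy as np

def truncate_rank(B, k):
    """Return the best rank-k approximation of B in Frobenius norm (truncated SVD)."""
    U, s, Vt = np.linalg.svd(B, full_matrices=False)
    # Keep only the k largest singular values and their singular vectors.
    return (U[:, :k] * s[:k]) @ Vt[:k, :]

# Simulated coefficient matrix: exactly rank k, plus small noise.
rng = np.random.default_rng(1)
n, k = 50, 3
B_low = rng.standard_normal((n, k)) @ rng.standard_normal((k, n))
B_noisy = B_low + 0.01 * rng.standard_normal((n, n))

B_k = truncate_rank(B_noisy, k)

params_full = n * n          # one coefficient per (predictor stock, target stock) pair
params_low_rank = 2 * n * k  # two n-by-k factors
print(params_full, params_low_rank)
```

The smaller parameter count is what makes the low-rank model "simple" in the sense discussed above. The abstract's point is that estimating such a model when features outnumber observations needs more care than standard methods provide.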

Liu will also briefly explain the challenges he faces in developing equity return models in an academic setting, and why these models may or may not work in live trading.