# Three Essays in Factor Analysis of Asset Pricing

## Abstract

My dissertation comprises three chapters. The first chapter is motivated by the many low-frequency sources of systemic risk in the economy. We propose a two-stage learning procedure to construct a high-frequency (i.e., daily) systemic risk factor from a cross-section of low-frequency (i.e., monthly) risk sources. In the first stage, we use a Kalman filter approach to synthesize the information contained in 19 different proxies for systemic risk into a single Bayesian factor. This low-frequency (i.e., monthly) Bayesian factor predicts the cross-section of stock returns out of sample. In particular, a strategy that goes long the quintile portfolio with the highest exposure to the Bayesian factor and short the quintile portfolio with the lowest exposure yields a Fama–French–Carhart alpha of 1.7% per month (20.4% annualized). In the second stage, we convert this low-frequency Bayesian factor into a high-frequency factor. We apply a textual analysis method, Word2Vec, to the headlines and abstracts of all daily articles from the business section of the New York Times from 1980 to 2016, collecting distributional information on a per-word basis and storing it in high-dimensional vectors. These vectors are then used in a LASSO model to predict the Bayesian factor. The resulting coefficients can then be used to produce a high-frequency estimate of the Bayesian factor of systemic risk. We validate this high-frequency indicator in several ways, including by showing how well it captures the 2008 crisis. We also find that the high-frequency factor is priced in the cross-section of stock returns and, using a quantile regression approach, is able to predict large swings in the VIX, which sheds some light on the puzzling relation between the macroeconomy and stock market volatility.

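
The long-short quintile strategy described in the first chapter can be illustrated with a minimal sketch. It is not the chapter's implementation: the function names, the equal weighting, and the toy numbers are assumptions made for illustration, and the sketch reports a raw return spread rather than a Fama–French–Carhart alpha, which would additionally require regressing the spread on the four factors.

```python
def long_short_spread(exposures, next_returns, n_groups=5):
    """Equal-weighted return of the highest-exposure group minus the lowest.

    exposures    -- one factor exposure (beta) per stock (hypothetical inputs)
    next_returns -- the same stocks' returns over the following period
    """
    if len(exposures) != len(next_returns):
        raise ValueError("exposures and next_returns must have equal length")
    size = len(exposures) // n_groups
    if size == 0:
        raise ValueError("need at least n_groups stocks")

    # Rank stocks from lowest to highest exposure.
    order = sorted(range(len(exposures)), key=lambda i: exposures[i])
    low, high = order[:size], order[-size:]

    def mean_return(idx):
        return sum(next_returns[i] for i in idx) / len(idx)

    return mean_return(high) - mean_return(low)


def annualize_simple(monthly_pct):
    """Simple (non-compounded) annualization, e.g. 1.7% per month -> 20.4%."""
    return monthly_pct * 12


# Toy data: ten stocks whose returns rise with their exposure.
toy_exposures = [0.1 * i for i in range(10)]
toy_returns = [0.01 * i for i in range(10)]
print(round(long_short_spread(toy_exposures, toy_returns), 4))  # 0.08
print(round(annualize_simple(1.7), 1))                          # 20.4
```

The second helper reproduces the arithmetic behind the annualized figure quoted above: 1.7% per month times 12 months gives 20.4%, without compounding.
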
The second chapter of my dissertation provides a basic quantitative description of a compendium of macroeconomic variables based on their ability to predict bond returns and stock returns. We use three methods (asymptotic PCA, LASSO, and Support Vector Machine) to construct factors from 133 monthly time series of economic activity spanning 1996:1 to 2015:12, and we classify these factors into two groups: bond demand factors and bond supply factors. In PCA regressions, we find that both demand factors and supply factors are unspanned by bond yields and have stronger predictive power for future bond excess returns than CP factors. This predictability finding is confirmed and strengthened by the machine learning techniques LASSO and Support Vector Machine. More interestingly, LASSO can be used to identify the 15 most important economic variables and to give direct economic interpretations of the predictors of bond returns. Regarding stock return predictability, we find that both demand and supply PC factors are priced in the cross-section of stock returns. In particular, portfolios with the highest exposure to the aggregate supply factor outperform portfolios with the lowest exposure by 1.8% per month, while portfolios with the lowest exposure to the aggregate demand factor outperform portfolios with the highest exposure by 2.1% per month. This finding is consistent with a "flight to safety" explanation. Furthermore, a variance decomposition from a VAR shows that demand factors are much more important than supply factors in explaining asset returns. Finally, we incorporate demand factors and supply factors into macro-finance affine term structure models (MTSMs) to estimate the market prices of risk of the factors, and we find that demand factors affect level risk and supply factors affect slope risk. Moreover, MTSMs enable us to decompose bond yields into an expectation component and a yield risk premium component, and we find that MTSMs without macro factors underestimate the yield risk premium.

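
The yield decomposition mentioned at the end of the second chapter is commonly written as follows. The notation is an assumption introduced here for illustration, not taken from the chapter: $y_t^{(n)}$ denotes the $n$-period yield at time $t$, $y_t^{(1)}$ the one-period yield, and $\mathrm{YRP}_t^{(n)}$ the yield risk premium.

$$
y_t^{(n)} \;=\; \underbrace{\frac{1}{n}\sum_{i=0}^{n-1} E_t\!\left[y_{t+i}^{(1)}\right]}_{\text{expectation component}} \;+\; \underbrace{\mathrm{YRP}_t^{(n)}}_{\text{yield risk premium component}}
$$

Because the yield on the left-hand side is observed, any difference between two models' implied expectation components maps one-for-one, with opposite sign, into their implied yield risk premia.
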
The third chapter, coauthored with Dmitriy Muravyev and Aurelio Vasquez, is motivated by the fact that a typical stock has hundreds of listed options. We use principal component analysis (PCA) to preserve their rich information content while reducing dimensionality. Applying PCA to implied volatility surfaces across all US stocks, we find that the first five components capture most of the variation. The aggregate PC factor, which combines only the first three components, predicts future stock returns up to six months ahead with a monthly alpha of about 1%; results are similar out of sample. In joint regressions, the aggregate PC factor drives out all of the popular option-based predictors of stock returns, perhaps because it better summarizes the information in option prices. However, shorting costs in the underlying drive out the aggregate factor's predictive ability. This result is consistent with the hypothesis that option prices predict future stock returns primarily because they reflect short-sale constraints.
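
The idea of measuring how much variation in implied volatility surfaces a principal component captures can be sketched as follows. This is a toy illustration under stated assumptions, not the chapter's procedure: each row is taken to be one observation of a surface sampled at a few fixed grid points (a hypothetical layout), only the first component is extracted, and power iteration stands in for a full eigendecomposition.

```python
def first_component_share(rows, iters=200):
    """Share of total variance explained by the first principal component.

    rows -- list of observations; each is a list of implied volatilities
            at the same fixed grid points (hypothetical layout).
    Returns (share, loadings) using power iteration on the covariance matrix.
    """
    n, k = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(k)]
    centered = [[r[j] - means[j] for j in range(k)] for r in rows]

    # Sample covariance matrix of the grid points.
    cov = [[sum(x[a] * x[b] for x in centered) / (n - 1) for b in range(k)]
           for a in range(k)]

    def apply_cov(v):
        return [sum(cov[a][b] * v[b] for b in range(k)) for a in range(k)]

    # Power iteration: repeatedly apply the covariance matrix and renormalize.
    v = [1.0] * k
    for _ in range(iters):
        w = apply_cov(v)
        norm = sum(c * c for c in w) ** 0.5
        if norm == 0.0:
            raise ValueError("data have no variance along the start vector")
        v = [c / norm for c in w]

    top_variance = sum(a * b for a, b in zip(v, apply_cov(v)))
    total_variance = sum(cov[j][j] for j in range(k))
    return top_variance / total_variance, v


# Toy surfaces that differ only by a parallel shift in level.
toy_rows = [[0.20 + s, 0.25 + s, 0.30 + s] for s in (0.00, 0.01, 0.02, 0.03)]
share, loadings = first_component_share(toy_rows)
print(round(share, 6))  # 1.0
```

In this toy data every surface differs only by a level shift, so a single component with equal loadings explains all of the variation; real surfaces would need several components, as the chapter reports.
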