Which Anomalies Matter for Portfolio Construction

A Machine Learning Perspective

Main Algorithm Used for Checking Variable Importance

Our analysis is based on Random Forest, and we use the following permutation check algorithm to assess variable importance.

Specifically, the adopted machine learning method is denoted by $f$; in our application, $f$ is a Random Forest. $\boldsymbol{X}$ denotes the normalized anomaly variables, and $\boldsymbol{y}$ denotes hedge fund portfolio ownership.
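The source does not reproduce the algorithm itself, so the following is only a minimal sketch of a generic permutation check, not necessarily the exact procedure used here. It assumes mean squared error as the loss, and it replaces the fitted Random Forest $f$ with a hypothetical stand-in predictor (`toy_model`) on synthetic data so that the example has no dependencies. The names `permutation_importance`, `n_repeats`, and `seed` are illustrative.

```python
import random


def mean_squared_error(y_true, y_pred):
    """Average squared difference between targets and predictions."""
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)


def permutation_importance(predict, X, y, n_repeats=20, seed=0):
    """Mean increase in error when each column of X is shuffled in turn.

    predict: callable mapping one feature row to a prediction (the model f).
    X: list of feature rows (lists of floats); y: list of targets.
    """
    rng = random.Random(seed)
    baseline = mean_squared_error(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            # Shuffle only column j, breaking its link to y.
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            permuted = mean_squared_error(y, [predict(row) for row in X_perm])
            increases.append(permuted - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


# Synthetic stand-ins (assumptions, not data from the study):
# three "anomaly" columns, of which the third is ignored by the model.
data_rng = random.Random(1)
X = [[data_rng.gauss(0, 1) for _ in range(3)] for _ in range(200)]


def toy_model(row):
    """Hypothetical stand-in for the fitted model f."""
    return 2.0 * row[0] + 0.5 * row[1]


y = [toy_model(row) for row in X]
importances = permutation_importance(toy_model, X, y)
```

A variable whose shuffling raises the error substantially is one the model relies on; a variable the model ignores receives an importance of zero. In practice, `predict` would wrap the fitted Random Forest, and `X` and `y` would be the normalized anomaly variables and the ownership variable.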

Local Effects of Anomalies

The following graph shows the local effect of the most influential anomalies on hedge fund ownership.

