As the last few lines of the preface to this article say, it is "an excellent and comprehensive survey of the challenges one meets in using quantitative methods for portfolio construction and forecasting." It is written for a non-technical audience but is extremely relevant to technical people too, as an appetizer. It is the right 'putting a pin on the board' article to read before you indulge in your own personal niche research.
Ch 1: Forecasting financial markets
A price/return process is predictable if its distribution depends on the present information set and is unpredictable if its distribution is time-invariant. Partial predictability of markets is theoretically inevitable. Markets are not made up of rational agents but of agents practicing bounded rationality.
Ch 2: General Equilibrium Theories - concepts and applicability
Market equilibrium is not unique. The same asset can have different equilibrium values in different states.
Ch 3: Extended framework for Applying Modern Portfolio Theory
Mean-variance optimization under robustness is computationally efficient and is a second-order cone problem. Approaches are - empirical, factor-based, clustering, Bayesian, stochastic volatility. Departure from normality is addressed using Monte Carlo. Concentrates on dispersion and downside risk.
Ch 4: Equity Tactical strategies (2d-2m)
CAPM (2 fund separation theorem) is an unconditional, static model. The world is conditional, dynamic. Bounded rationality and asymmetry of information create repeated patterns.
1) Delayed response: Leader companies are affected by news first, which then diffuses to lagged companies (Kanas and Kouretas 2005). Bhargava and Malhotra (2006) use co-integration to give a definitive answer on the distribution of agent response to the same information.
2) Momentum: 3 to 12 months. Lo and MacKinlay (1990) analyze a tool to detect whether momentum depends on the auto-correlation of individual assets or on cross-auto-correlation effects.
3) Reversal: less used, with more potential. Timing the end of momentum is a necessary ingredient. Jegadeesh and Titman (1993, 2001) and Lewellen (2005) document it.
4) Co-integration and Mean Reversion: Returns are stationary, so co-integration does not apply to them; it applies to prices, which are integrated. Identifying pairs is the common approach but not the most successful. Common trends and co-integration relationships should be explored across the whole universe.
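To make item 4 concrete, here is a minimal sketch on simulated data, not real prices. It builds two integrated price series that share one common trend, estimates the hedge ratio by OLS on levels (the first step of an Engle and Granger style procedure), and compares the lag-1 auto-correlation of the spread with that of a raw price. The helper names, the seed and the true hedge ratio of 2.0 are illustrative assumptions. A proper test would apply a unit-root test with the right critical values to the spread instead of eyeballing its auto-correlation.

```python
import random

def ols_intercept_slope(x, y):
    """Ordinary least squares fit y = a + b*x; returns (a, b)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b

def lag1_autocorr(z):
    """Sample lag-1 auto-correlation of a series."""
    m = sum(z) / len(z)
    num = sum((z[t] - m) * (z[t - 1] - m) for t in range(1, len(z)))
    den = sum((v - m) ** 2 for v in z)
    return num / den

rng = random.Random(0)  # fixed seed, purely illustrative
n = 2000

# One common stochastic trend (a random walk) drives both prices.
trend = [0.0]
for _ in range(n - 1):
    trend.append(trend[-1] + rng.gauss(0.0, 1.0))

price_b = trend
# price_a = 1 + 2 * price_b + stationary noise (assumed relationship)
price_a = [1.0 + 2.0 * p + rng.gauss(0.0, 1.0) for p in trend]

a, b = ols_intercept_slope(price_b, price_a)
spread = [pa - a - b * pb for pa, pb in zip(price_a, price_b)]

print("estimated hedge ratio:", round(b, 3))
print("lag-1 autocorr of price_a:", round(lag1_autocorr(price_a), 3))
print("lag-1 autocorr of spread:", round(lag1_autocorr(spread), 3))
```

The price itself is highly persistent while the spread is not, which is the signature of a co-integrating relationship in this toy setup.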
Static models are not predictive; dynamic models are. Only simple dynamic models can be statistically estimated: complex models are restricted by limited data and by risk and transaction cost considerations.
Ch 5: Equity Strategic Ideas (mths - yrs)
Aggregation over time: fractal idea is not completely right. Regressive and auto-regressive models are defined by time horizon of correlation and auto-correlation decay. GARCH does not remain invariant after time aggregation. Regime-shift models also exhibit time scale. Using high frequency correlation to estimate long term correlations.
Market behavior at different time horizons: Days and weeks – depends on trading practices and how traders react to news; long run – depends on quantity of money and global economic performance.
Recognizing regime shifts: discrete shift (e.g. Hamilton model) vs continuous shift (e.g. GARCH).
Estimating Approximate models on moving windows: Once regime break is detected, the moving window should become agile to account for that. Static model causes exponentially diverging price, which empirically is a Pareto distribution. Dynamic model can do right long term modeling. Linear models can't capture long term regime shifts, but only periodic movements with a fixed period.
Nonlinear coupling of two dynamic models has been successful – GARCH, Hamilton, Markov.
Mean Reversion of log of prices: compounding – If the auto-correlation is less than 1, the process oscillates around a trend.
Variance ratio test: If the variance of returns grows less than linearly with the horizon, there is mean reversion (Lo and MacKinlay 1988). The test helps differentiate a trend-stationary process from a random walk with linear trend: the variance of a random walk keeps on growing with time, but the variance of a trend-stationary process remains constant.
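The variance ratio idea can be sketched in a few lines. This is a simplified version, assuming plain population variances and overlapping q-period sums. It leaves out the bias corrections and the test statistic of the Lo and MacKinlay paper, and the function names and sample data are made up for illustration.

```python
import random

def variance(xs):
    """Population variance of a list of numbers."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def variance_ratio(returns, q):
    """Variance of overlapping q-period returns divided by q times the
    variance of one-period returns. Values near 1 are consistent with a
    random walk; values below 1 suggest mean reversion."""
    q_sums = [sum(returns[t:t + q]) for t in range(len(returns) - q + 1)]
    return variance(q_sums) / (q * variance(returns))

# Extreme mean reversion: every move is undone in the next period.
reverting = [0.01, -0.01] * 500
print(variance_ratio(reverting, 2))  # 0.0

# Independent returns (random walk in log prices), fixed seed for repeatability.
rng = random.Random(42)
iid_returns = [rng.gauss(0.0, 0.01) for _ in range(5000)]
print(variance_ratio(iid_returns, 5))  # should be close to 1
```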
Central tendency - Stock prices can only have linear or stochastic trend.
Time diversification – Less risky in long run vs short run.
Ch 6: Machine Learning
Machine learning is progressive learning. AI is useful, but its merits and limits are now more clearly understood. The link between algorithmic process and creative problem solving is the concept of searching.
Neural Network – To generalize, the number of layers and nodes has to be restricted. Hertz, Krogh and Palmer (1991) give a mathematical introduction. Used widely with mild success.
SVM – Performance generally superior to NN.
Classification and regression trees – satisfactory results using CART.
Genetic Algorithms – satisfactory to remarkable prediction. Used in asset allocation (Armano, Marchesi and Murru 2005). Used to select predictors for NN (Thawornwong and Enke 2004).
Text mining - Automatic text handling has shown promise.
The possibility of replacing intuition and judgment is remote. ML offers more extensive, nonlinear handling of a broader set of data.
Ch 7: Model Selection, Data snooping, over-fitting, and model risk
Alpha ideas need creativity, but testing and analysis of models is a well-defined method with a scientific foundation. Simpler models are always better than complex models if they explain equally well. There is a trade-off between model complexity and forecasting ability. To avoid looking for chance patterns, one must stick rigorously to the paradigm of statistical tests. Exceptional patterns (over a smaller subset) are generally spurious. Good practice calls for testing any model against a surrogate random sample generated with the same statistical characteristics as the empirical sample. Out-of-sample testing should be the norm. Trying a new model on already used data will 'always involve some data snooping'. Re-sampling and cross validation should be used for parameter selection.
If the market is evolutionary (changing slowly), the model can be calibrated on moving windows. If it has regime shifts, Markov switching models can be used, or alternatively random coefficient models (Longford 1993), which average the results of different models.
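One simple way to build a surrogate sample is to shuffle the empirical series. This is a sketch under that assumption: shuffling keeps the marginal distribution but destroys all time dependence, which is only one of several possible meanings of "same statistical characteristics". The function names, the choice of lag-1 auto-correlation as the pattern under test, and the toy series are all illustrative.

```python
import math
import random

def lag1_autocorr(z):
    """Sample lag-1 auto-correlation of a series."""
    m = sum(z) / len(z)
    num = sum((z[t] - m) * (z[t - 1] - m) for t in range(1, len(z)))
    den = sum((v - m) ** 2 for v in z)
    return num / den

def surrogate_p_value(series, statistic, n_surrogates=500, seed=0):
    """Share of shuffled surrogates whose statistic is at least as large
    as the observed one (with the usual +1 correction)."""
    rng = random.Random(seed)
    observed = statistic(series)
    count = 0
    for _ in range(n_surrogates):
        shuffled = series[:]       # copy, so the input is left untouched
        rng.shuffle(shuffled)
        if statistic(shuffled) >= observed:
            count += 1
    return (count + 1) / (n_surrogates + 1)

# A smooth cycle has genuine time structure ...
cycle = [math.sin(2 * math.pi * t / 50) for t in range(200)]
# ... while independent draws have none by construction.
noise_rng = random.Random(1)
noise = [noise_rng.gauss(0.0, 1.0) for _ in range(200)]

p_cycle = surrogate_p_value(cycle, lag1_autocorr)
p_noise = surrogate_p_value(noise, lag1_autocorr)
print("p-value, cycle:", p_cycle)
print("p-value, noise:", p_noise)
```

A small p-value says the pattern is unlikely to appear in surrogates by chance. It says nothing about out-of-sample performance, which still has to be checked separately.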
Ch 8: Predictive Models of Return
Risk and returns are partially predictable. The more difficult question is how return predictability can be turned into a profit - to keep risk return trade-off positive.
1) Regressive models - regress return on predictive factors
Two types of dependence of Y on X - first, the distribution of Y depends on X and the expected value of Y depends on X; second, the distribution of Y depends on X, but the expected value of Y does not depend on X. The concept of regression does not imply any notion of time - time dependence must be checked, e.g. correlation between noise terms. R-squared measures the goodness of fit.
a) Static regressive models - not predictive, like CAPM, but uncover unconditional dependence.
b) Predictive regressive models - lagged predictive regression. Distributed lag models to uncover rate of change.
2) Linear Auto-regressive models - a variable is regressed on its own lagged past values. The one-variable case is called AR and the multi-variable case is called VAR. Vector auto-regressive models can capture cross-auto-correlations and should be used with few factors to reduce the number of variables. Two types:
a) Stable VAR models - generate stationary processes. The response of a stable VAR model to each shock is a sum of exponential decays (decay factor of modulus <1) - e.g. EWMA, with the most recent shock having the most effect. Some solutions may be oscillatory and some damped, but all remain stationary and exhibit auto-correlation.
b) Unstable VAR models - explosive (factor >1) or integrated (factor =1). In an integrated process, shocks accumulate and never decay (e.g. log price), and error terms are auto-correlated. Generally, the first difference gives a stationary process. A VAR process with individually integrated series may have a linear combination that is stationary - this is called co-integration (Engle and Granger 1987). Regressions between non-stationary (e.g. trending) time series can't rely on R-squared or on detrending. If both sides are integrated processes, test for co-integration to ascertain that the regression is meaningful. For n integrated processes with k co-integrating relationships (k between 1 and n-1), there will be n-k common trends such that every other solution can be expressed as a linear regression on these common trends.
3) Dynamic Factor Models - predictive regressive models where the predictive factors follow a VAR model. These are a compact formulation of state-space models with observable and hidden state variables. Motivation is generally to reduce dimensionality. Both integrated and stable processes can be modeled. The solutions are sums of exponentials. The ability to mix levels and differences adds to forecasting possibilities.
4) Hidden-variable models - the best known linear state-space models are ARCH/GARCH. Nonlinear state-space models include the MS-VAR (Markov switching vector auto-regressive) family, e.g. the Hamilton model (Hamilton 1989) - one random walk with drift for periods of economic expansion and another with a smaller drift for periods of economic recession - a switching regime model.
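The stable / integrated / explosive distinction for VAR models in item 2 comes down to the eigenvalues of the coefficient matrix. Below is a minimal sketch for a two-variable VAR(1), x_t = A x_{t-1} + e_t, assuming a 2x2 matrix so the eigenvalues follow from the trace and determinant. The function names and example matrices are made up, and any eigenvalue of modulus 1 is labelled "integrated" here, which is a simplification.

```python
import cmath

def eigenvalues_2x2(a):
    """Eigenvalues of a 2x2 matrix given as [[a11, a12], [a21, a22]]."""
    trace = a[0][0] + a[1][1]
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    root = cmath.sqrt(trace * trace - 4.0 * det)  # complex-safe square root
    return (trace + root) / 2.0, (trace - root) / 2.0

def classify_var1(a, tol=1e-9):
    """Classify x_t = A x_{t-1} + e_t by the largest eigenvalue modulus."""
    largest = max(abs(lam) for lam in eigenvalues_2x2(a))
    if largest < 1.0 - tol:
        return "stable"
    if largest > 1.0 + tol:
        return "explosive"
    return "integrated"

print(classify_var1([[0.5, 0.1], [0.2, 0.3]]))   # stable
print(classify_var1([[1.0, 0.0], [0.0, 0.5]]))   # integrated
print(classify_var1([[1.1, 0.0], [0.0, 0.5]]))   # explosive
print(classify_var1([[0.0, -0.8], [0.8, 0.0]]))  # stable
```

The last matrix has complex eigenvalues of modulus 0.8, so shocks die out through damped oscillation, matching the remark that some stable solutions are oscillatory.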
Ch 9: Model Estimation
Determining the parameters of the model. Robust estimation is becoming more important to discount noisy inputs. Being a function of data, an estimator is a random variable and hence has a sampling distribution. Finite data size is a challenge. Estimation always has a probabilistic interpretation. Three methods:
1) Least-square estimation - OLS method of best projection on subspace.
2) Maximum-Likelihood - maximizing the likelihood of sample given assumption of the underlying distribution.
3) Bayesian estimation - the posterior distribution is proportional to the prior distribution multiplied by the likelihood. One needs to know the form of the distribution.
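As a small worked case of the third method, assume a normal prior on an unknown mean return and a normal likelihood with known noise variance. With these assumed forms the posterior is again normal and has a closed form. The function name and all numbers below are invented for illustration.

```python
def normal_posterior(prior_mean, prior_var, data, noise_var):
    """Posterior mean and variance of an unknown mean under a normal prior
    and a normal likelihood with known noise variance (conjugate update)."""
    n = len(data)
    sample_mean = sum(data) / n
    # Precisions (inverse variances) of prior and data simply add up.
    precision = 1.0 / prior_var + n / noise_var
    post_var = 1.0 / precision
    post_mean = post_var * (prior_mean / prior_var + n * sample_mean / noise_var)
    return post_mean, post_var

monthly_returns = [0.02, -0.01, 0.03, 0.00, 0.01]  # made-up observations
prior_mean, prior_var = 0.0, 0.01 ** 2             # sceptical prior around zero
noise_var = 0.04 ** 2                              # assumed known

post_mean, post_var = normal_posterior(prior_mean, prior_var,
                                       monthly_returns, noise_var)
print("sample mean:   ", sum(monthly_returns) / len(monthly_returns))
print("posterior mean:", post_mean)
print("posterior var: ", post_var)
```

The posterior mean lands between the prior mean and the sample mean, pulled toward whichever carries more precision.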
Matrices - PCA can be used to reduce dimension, select smaller set of factors and reduce noise.
Regression models - the optimal linear estimator is OLS; if residuals are normal it coincides with ML. Auto-correlation of residuals does not invalidate OLS, but makes it less efficient. It can be made more robust by making it less sensitive to outliers. Variable selection and shrinkage can be done by penalized objective functions (Ridge, Lasso, elastic net). Clustering makes estimates more robust by averaging - shrinkage, random coefficient model, contextual regressions (Sorensen, Hua, Qian 2005).
Vector auto-regressive models - multivariate OLS. For specific assumptions and limits ML can be used. Unstable VAR is generally concerned with co-integration detection and uses state-of-the-art ML methods (Johansen 1991, Banerjee and Hendry 1992). Bayesian VAR (BVAR) can be used, e.g. the Litterman (1986) method.
Linear hidden-variable models - Kalman filter. Two main methods - ML based and subspace based.
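The effect of a penalized objective function is easiest to see with one predictor. This is a sketch assuming centered data and a ridge penalty, where the slope that minimizes sum((y - b*x)^2) + penalty * b^2 has a closed form. The function name and the toy data are illustrative.

```python
def ridge_slope(x, y, penalty):
    """Slope of a regression of centered y on centered x that minimizes
    sum((y - b*x)^2) + penalty * b^2. A penalty of 0 gives the OLS slope."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / (sxx + penalty)

x = [-1.0, 0.0, 1.0]
y = [-2.0, 0.0, 2.0]
print(ridge_slope(x, y, 0.0))  # 2.0, the OLS slope
print(ridge_slope(x, y, 2.0))  # 1.0, shrunk toward zero
```

Ridge only shrinks coefficients; setting some of them exactly to zero, which is what selects variables, takes a Lasso-type penalty.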
Nonlinear hidden-variable models - MS-VAR can be estimated using EM algorithm.
Ch 10: Practical consideration with optimizer
Software is available, but understanding of the procedure is important for robustness and accuracy. Fortunately in finance, most problems have one unique optimal solution. Standard forms are - linear, quadratic, convex, conic, nonlinear. Markowitz's optimization is quadratic. Quadratic, convex and conic have unique solutions.
Solving the optimization problem - formulate problem, choose optimizer, solve problem. Re-sampled optimization can be used.
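For a feel of why the quadratic case has one unique solution, take the smallest possible instance: two assets, a fully invested budget constraint, no return target and no short-sale limit. This is only a special case of Markowitz's problem, chosen because the minimum-variance weights have a closed form. The function names and inputs are assumptions for illustration.

```python
def min_variance_weights(vol_1, vol_2, corr):
    """Closed-form fully invested minimum-variance weights for two assets.
    Undefined when both volatilities are equal and corr is exactly 1."""
    cov = corr * vol_1 * vol_2
    denom = vol_1 ** 2 + vol_2 ** 2 - 2.0 * cov
    w1 = (vol_2 ** 2 - cov) / denom
    return w1, 1.0 - w1

def portfolio_variance(w1, w2, vol_1, vol_2, corr):
    """Variance of a two-asset portfolio with weights w1 and w2."""
    return ((w1 * vol_1) ** 2 + (w2 * vol_2) ** 2
            + 2.0 * w1 * w2 * corr * vol_1 * vol_2)

vol_1, vol_2, corr = 0.20, 0.10, 0.0   # made-up inputs
w1, w2 = min_variance_weights(vol_1, vol_2, corr)
print("weights:", w1, w2)              # roughly 0.2 and 0.8
print("variance at optimum:", portfolio_variance(w1, w2, vol_1, vol_2, corr))
print("variance nearby:    ", portfolio_variance(0.25, 0.75, vol_1, vol_2, corr))
```

Portfolio variance is a convex quadratic in the weights, so any move away from the optimum raises it. Realistic problems with more assets and constraints need a numerical quadratic programming solver.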
Ch 11: Industry survey
US $4 trillion in assets under management. 2005. US and European managers.
Equity return forecasting techniques - Simple methods with economic intuition preferred. Momentum and reversal are most widely used. Regression using predictors like financial ratios is the bread and butter. Desire to combine company fundamentals with sentiment. Auto-regressive, co-integration, state-space, regime-switching, nonlinear methods (NN, DT) are central in some companies. Growing interest in high-frequency data.
Models based on exogenous predictors - operating efficiency, financial strength, earnings quality, capital expenditure etc. as the core bottom-up equity model. Fundamentals combined with momentum/reversal models.
Momentum and reversal models - use of multiple time horizon, most widely used, turnover is a concern - weighting and penalty functions used to mitigate it.
Co-integration models - performance sensitive to liquidity and volatility. Many firms use them as they model short-term dynamics and long-term equilibrium.
Markov-switching/regime-switching models - not used widely as market timing is difficult to predict.
Auto-regressive models - Not widely evaluated! A step ahead of momentum models. Over-fitting is cautioned.
State-space models - not widely used.
Nonlinear models - potential in NN, DT. Some already use them.
Models of higher-moment dynamics - e.g. GARCH, used by very few.
Model risk mitigation techniques - mostly co-variance matrix estimation.
Bayesian estimation - hard to implement explicitly but used implicitly.
Shrinkage/Averaging - widely used.
Random coefficient models - hardly used. Randomization of data is used to guard against over-fitting.
Optimization techniques - half the firms use it, other half distrust it.
Robust optimization - re-sampling methods
multistage stochastic optimization - hardly used.