Summary of the financial papers<br />My notes on the recent papers and book chapters I read.<br />M (http://www.blogger.com/profile/04296475349435748706)<br /><br />Independence and Exchangeability (posted 2016-12-17)<br />Bayesian statistics differs from frequentist statistics in its treatment of unknown values. Bayesian statistics regards probability as an epistemic concept. Under this approach, unknown parameters are given a prior probability distribution. This contrasts with the frequentist approach, where parameters are regarded as unknown constants. Indeed, under the epistemic interpretation, the notion of an unknown constant is a contradiction in terms.<br /><br />In classical frequentist statistics, the samples are often assumed to be independent and identically distributed (iid) random variables, while in Bayesian statistics they can only be treated as such conditional on the parameter value, which rests on the notion of exchangeability. For example, coin tosses are independent given the numerical value of the probability of heads, $p$. Without knowledge of the numerical value of $p$, the trials are only exchangeable; they become independent only once we condition on $p$. This is the essence of Bruno de Finetti's celebrated representation theorem from 1937.<br /><br />This theorem asserts that if an infinite sequence $\mathbf{x}$ is exchangeable, then it can be represented as a naive Bayes model in which the latent parent variable is some meta-parameter, i.e. the $x_i$s are independent given the value of the parameter. In other words, the elements of $\mathbf{x}$ are iid, conditional on the meta-parameter indexing the distribution of $\mathbf{x}$. 
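<br /><br />As a worked illustration for the coin-toss case (a sketch; the symbol $F$ for the prior distribution of the heads probability, written $\Theta$ here, and the count $s_n$ are notation introduced for this example, not part of the original note), the representation for an infinite exchangeable 0/1 sequence reads $$P(X_1=x_1,\dots,X_n=x_n)=\int_0^1 \theta^{s_n}(1-\theta)^{n-s_n}\,dF(\theta), \qquad s_n=\sum_{i=1}^n x_i.$$ If, purely as an assumed example, $F$ is uniform on $[0,1]$, then $$P(X_1=1)=\int_0^1\theta\,d\theta=\frac{1}{2}, \qquad P(X_1=1,X_2=1)=\int_0^1\theta^2\,d\theta=\frac{1}{3}\neq\frac{1}{4},$$ so the tosses are exchangeable but not independent until we condition on $\Theta=\theta$, in which case $P(X_1=1,X_2=1\mid\theta)=\theta^2$ factorizes.<br /><br />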
Hence, the representation theorem shows how statistical models emerge in a Bayesian context: under the hypothesis of exchangeability of the observables $\{X_i\}^{\infty}_{i=1}$, there is a parameter $\Theta$ such that, given the value of $\Theta$, the observables are conditionally independent and identically distributed. Moreover, de Finetti's strong law shows that our opinion about the unobservable $\Theta$ is our opinion about the limit of $\bar{X}_n$ as $n$ tends to $\infty$.<br /><br />Momentum signals in the term structure of commodity futures - Boons, Prado 2015 (posted 2015-10-03)<br /><script type="text/x-mathjax-config"> MathJax.Hub.Config({tex2jax: {inlineMath: [['$','$'], ['\\(','\\)']]}}); </script> <script src="http://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML" type="text/javascript"> </script><span style="background-color: white;">Basis-momentum (the difference between the momentum of the nearby and next-nearby contracts) strongly predicts spot returns. It also predicts spread returns. These returns go beyond the classical momentum and carry returns for commodity futures. The result does not depend on the presence of institutional investors in commodity markets. </span><br /><span style="background-color: white;"></span><br /><h4><span style="background-color: white;"> Introduction</span></h4><span style="background-color: white;">The literature states that cross-sectional variation in commodity futures returns is largely driven by the characteristics basis (carry) and momentum. A portfolio sorted on basis-momentum predicts both outright and spread returns, with an information ratio (IR) of around 1. This is a 12-1 type of cross-sectional momentum. Basis-momentum effectively captures the interaction effect between basis and momentum. 
The motivation for looking at basis-momentum is that there should be additional information in the decisions of producers, consumers, and speculators as to where on the futures curve they take their positions, due to seasonality in production and demand. </span><br /><span style="background-color: white;"></span><br /><h4><span style="background-color: white;"> Methodology</span></h4><span style="background-color: white;">Continuous contracts are rolled on the last day of the month before expiry. The basis is defined as $B(t)=\frac{F_{T_1}(t)}{F_{T_2}(t)}-1$. The momentum is defined as $M(t)=\prod_{s=t-11}^{t-1}(1+r_{T_1}(s))-1$. Finally, the basis-momentum is $BM(t)=\prod_{s=t-11}^{t-1}(1+r_{T_1}(s))-\prod_{s=t-11}^{t-1}(1+r_{T_2}(s))$ and the spread return momentum is $SM(t)=\prod_{s=t-11}^{t-1}(1+r_{T_1-T_2}(s))-1$. Spread returns are defined as $r_{T_1-T_2}(t)=\frac{(F_{T_1}(t)-F_{T_2}(t))-(F_{T_1}(t-1)-F_{T_2}(t-1))}{F_{T_1}(t-1)}$.</span><br /><span style="background-color: white;"></span><br /><span style="background-color: white;">We see that $$r_{T_1-T_2}(t) = r_{T_1}(t)-r_{T_2}(t) + r_{T_2}(t)\frac{B(t-1)}{1+B(t-1)},$$ which translates, to first order in the returns, to $$ SM(t) \approx BM(t) + \sum_{s=t-11}^{t-1}\left(r_{T_2}(s)\frac{B(s-1)}{1+B(s-1)}\right).$$<span style="font-size: small;"> The second term is the interaction effect, which consists of next-nearby momentum and carry momentum.</span></span><br /><span style="background-color: white; font-size: small;"></span><br /><span style="font-size: small;"><span style="background-color: white;">A large literature shows that sorting commodities on the basis (carry) leads to large spot returns. Szymanowska et al. (2014) show that the basis also predicts spread returns. Similarly, a large literature shows that sorting commodities on momentum leads to large spot returns as well. Szymanowska et al. (2014) show that momentum does not predict spread returns. This paper shows that sorting commodities on basis-momentum outperforms the previous two. 
Persistence in the tilting of the term structure is what basis-momentum tries to capture. </span></span><br /><span style="background-color: white; font-size: small;"></span><br /><h4><span style="background-color: white;">Tests and results</span></h4><ol><span style="background-color: white;"></span><li><span style="background-color: white;">Does basis-momentum predict returns in the cross-section? We regress the spot and spread returns on the three signals (basis, momentum and basis-momentum) in two regressions. All three signals have predictive power, but basis-momentum is the strongest. Basis-momentum is the only signal that predicts spread returns in the cross-section.</span></li><li><span style="background-color: white;">Is basis-momentum a priced risk factor? We run time-series regressions to determine whether the basis-momentum factors are spanned by the basis and momentum factors. Then we conduct Fama-MacBeth cross-sectional regressions for commodity factor pricing models containing basis, momentum and basis-momentum. Basis-momentum provides the best </span>Sharpe ratio: 0.93 for spot and 0.99 for spread returns. <span style="background-color: white;"></span></li></ol>Currency Momentum Strategies (posted 2015-10-03, updated 2015-10-07)<br /><script type="text/x-mathjax-config"> MathJax.Hub.Config({tex2jax: {inlineMath: [['$','$'], ['\\(','\\)']]}}); </script> <script src="http://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML" type="text/javascript"> </script>Menkhoff, Sarno, Schmeling, Schrimpf 2011<br /><br /><h4>Abstract</h4><div>A significant cross-sectional spread in excess returns of 10% p.a. between past winner and loser currencies is not explained by traditional risk factors but is consistent with under- and over-reaction of investors. 
The effect is distinct from the carry trade.<br /><br /></div><h4>Introduction</h4><div>Momentum in stocks poses a challenge to standard finance theory. Apart from conventional risk factors, explanations such as credit risk/bankruptcy risk, limits to arbitrage, under-reaction, or high transaction costs have been proposed. </div><div><br /></div><div>FX time-series momentum strategies such as moving average cross-overs, filter rules and channel breakouts deteriorate over time. FX cross-sectional strategies are less examined. We study 1976 - 2010 with 48 currencies. We decompose these momentum returns into systematic and unsystematic risk components, compare momentum strategies to carry and trading rules, quantify the importance of transaction costs, and investigate non-standard sources of momentum returns such as under- and over-reaction and limits to arbitrage.</div><div><br /></div><div>We find evidence of return continuation and subsequent reversal over 36 months. These returns are different from carry returns and technical trading rules. Momentum profits are skewed towards currencies with high transaction costs. But these returns are not systematically related to standard proxies for business cycle risk, liquidity risk, the carry trade risk factor, volatility risk, the three Fama-French factors, or the Carhart four-factor model. The profits vary significantly over time, suggesting limits to arbitrage. Momentum in countries with a higher risk rating tends to yield significantly positive excess returns. A similar effect is found for a measure of exchange rate stability risk.<br /><br /></div><h4>Related Literature</h4><div><i>Stock market momentum </i>- Well established empirically; proposed explanations:<br /><br /><ol><li>risk-based and characteristic-based explanations: not linked to macroeconomic risk, but to firm-specific risks, e.g. 
stronger in smaller firms, firms with lower credit ratings, firms with higher revenue growth volatility, and firms with a higher likelihood of going bankrupt.</li><li>behavioral biases: investors' under-reaction to news; weak analyst coverage causes stronger momentum.</li><li>Transaction costs or limits to arbitrage: reasonably high transaction costs may wipe out momentum profits.</li></ol><div><i>Bonds and commodities momentum</i> - Momentum strategies don't work for investment grade bonds or bonds at the country level, but yield positive returns for non-investment grade corporate bonds. Momentum returns are not related to liquidity but seem to reflect default risk in the winner and loser portfolios. In commodities, high momentum returns are related to low levels of inventories.</div><div><br /></div><div><i>Currency momentum</i> - Mostly time-series momentum has been analyzed. </div><div><ol><li>Technical trading in FX markets: highly correlated with trend following. Filter rules (e.g. go long if moving returns are >1%) and moving average cross-over rules seem to work. Their profitability has declined recently.</li><li>Contribution of this paper: cross-sectional momentum in FX and its analysis.</li></ol><h4>Data and currency portfolio</h4></div><div>Spot and 1-month forward rates from 1976-2010, end-of-month data, 48 countries. The interest rate differential (forward discount) contributes a significant share of the excess return of currency investments. We track pure spot returns as well to identify the source of momentum. The long-short portfolio is dollar neutral.<br /><br /><h4>Characterizing Currency Momentum Returns</h4></div></div><div><br /><ol><li>Returns to momentum strategies in currency markets - Returns are driven mostly by spot rate momentum and not by interest rate changes (unlike carry trades), especially for 1-year momentum with a 1-month holding period. (1,1) is the best of all. 
Though the cross-section of currencies is small relative to equities, the performance is still good because correlations among currencies are much lower than among equities. </li><li>Out-of-sample perspective - do specific momentum strategies identified as attractive in-sample continue to do well? Out of the universe of 144 strategies, we look for momentum in the lagged momentum returns! We find that the portfolio that was best one month earlier is equally good (0.94), which can be seen as an out-of-sample test. These strategies have been stable over time.</li><li>Comparing momentum and technical trading rules - moving average cross-overs of 1-20, 1-50 and 1-200 are used as a proxy for technical trading strategies (IR from 0.88 to 0.77). These are correlated with momentum but there is significant economic alpha. Similarly, the cross-sectional momentum strategy has alpha over time-series momentum strategies as well. </li><li>Comparing currency momentum and the carry trade - Interest rate differentials are strongly auto-correlated and spot rate changes do not seem to adjust to compensate for this interest rate differential (forward rate puzzle). Hence, it may be the case that lagged high returns simply proxy for lagged high interest rate differentials and that cross-sectional momentum is simply carry. We show that this is not the case. The carry trade has negative skewness while momentum has slightly positive skewness. The high-low momentum strategies are uncorrelated with high-low carry strategies. Double sorting (divide currencies into two portfolios based on the median lagged forward discount, then divide each into three portfolios based on lagged returns) shows no material difference in long-short momentum returns between high and low interest rate currencies. 
Cross-sectional Fama-MacBeth regressions of currency excess returns on lagged excess returns over the last $l$ months, lagged forward discounts and lagged spot rate changes for each month show that lagged spot returns carry the explanatory power.</li><li>Post-formation momentum returns - Initial under-reaction is followed by over-reaction, which gets corrected over the long run. This causes reversal over longer periods. There is a clear pattern of increasing returns, which peak after 8-12 months across strategies, and a subsequent period of declining excess returns, more pronounced for momentum strategies with longer formation periods, suggesting that equity and currency momentum have similar origins.</li></ol><div>Currency momentum seems similar to equity momentum. But the highly liquid FX markets are dominated by professional traders, so irrationality should be quickly arbitraged away. Hence we examine possible limits to arbitrage activity that could explain the persistence of momentum profits in FX markets. </div></div><div><br /></div><h4>Understanding the results</h4><div><ol><li>Transaction costs - the full bid-ask spread is used. The (1,1) momentum return falls from 10 to 4 percent. FX momentum strategies are much more profitable in the later part of the sample, but they do not always deliver high returns. There is much variation in profitability. Transaction costs can be decomposed into turnover across portfolios and bid-ask spreads across portfolios. Turnover can be extremely high for the (1,1) momentum strategy, up to 70% per month. Winner and loser currencies do have higher transaction costs than the average exchange rate, and the markup ranges from about 2.5 to 7 basis points per month. Transaction costs have declined over time due to more efficient trading technologies. This could imply (i) higher momentum returns due to lower trading costs, or (ii) lower momentum returns, since lower costs facilitate more capital being deployed for arbitrage activity. 
Looking at the (1,1) strategy for 1992 to 2010, we find it profitable. Thus, lower bid-ask spreads do not necessarily lead to lower excess returns, which further indicates that trading costs are not the sole driving force behind momentum returns. It also suggests that momentum returns are a phenomenon that is still exploitable.</li><li>Momentum returns and business cycle risk - Various univariate regressions on business cycle state variables - real growth in non-durables and service consumption expenditures, nonfarm employment growth, ISM manufacturing index, real industrial production, inflation rate, real money balances, growth in real disposable personal income, TED spread (3m libor - t-bill rate), term spread (20y - 3m tbill rate), carry trade long-short portfolio, global FX volatility - yield no explanatory power. A regression on the Fama-French three factors is also not explanatory. </li><li>Limit to arbitrage: time variation in momentum profitability - A plot of 36-month moving-window returns shows that performance varies over time. Hence, an investor seeking to profit from momentum returns has to have a long enough investment horizon. This is a limit to arbitrage, since the bulk of currency speculation is accounted for by professional market participants with rather short horizons. </li><li>Limit to arbitrage: idiosyncratic volatility - We investigate whether momentum returns differ between currencies with high and low idiosyncratic volatility (relative to an FX asset pricing model). When we double sort on lagged idiosyncratic volatility and returns, we find that high idiosyncratic volatility is associated with higher momentum returns.</li><li>Limit to arbitrage: country risk - we sort on a measure of country risk and a measure of exchange rate stability risk. The data come from the International Country Risk Guide (ICRG) database of the Political Risk Services group. We use values relative to the US. Momentum returns are significantly positive and always larger in high-risk countries than in low-risk countries. 
Hence country risk should be an important limit to arbitrage activity in FX markets. These risk ratings are not simple proxies for interest rate differentials, because country risk and exchange rate risk are high for both winner and loser momentum currencies. Sorting on the forward discount shows that country risk is highest for carry trade target countries and lowest for carry trade funding currencies. For the top 15 developed countries, momentum returns are non-existent after transaction costs. </li></ol><h4>Robustness and additional tests</h4></div><div><br /><ol><li>Capital account restrictions and tradability - </li></ol></div>Diversified Statistical Arbitrage: Dynamically combining mean reversion and momentum investment strategies - James Velissaris 2010 (posted 2015-09-17)<br /><h4> Abstract</h4>A dynamically adjusted allocation between mean-reversion and momentum strategies (2008, 2009). Stocks are grouped together using PCA. The idiosyncratic return is calculated by comparing the return of the stock to the return of the entire group. This residual return often oscillates around a long-term mean. The mean-reversion strategy is dollar neutral and has high turnover. The medium-term momentum strategy trades the 9 sector ETFs, based on technical trading rules. Dynamic allocation was done between the 11 strategies, with rebalancing at the end of each month. The out-of-sample IR is 2.27, with a beta of 35%.<br /><br /><h4>Equity mean reversion model</h4>The decomposition of the stock returns is given by $$r_t = \alpha + \sum_{j=1}^n \beta_j F_{jt} + \epsilon_t.$$ PCA of the normalized returns (after centering and normalization in a 252-day moving window) is used and the first 12 factors are retained. The eigenportfolio returns $F_{jt}$ are given by $\sum_i \frac{v^{(j)}_i}{\sigma_i}R_{it}$. 
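<br /><br />A minimal Python sketch of the eigenportfolio construction just described, under loudly stated assumptions: the toy return matrix is made up, only the leading factor is extracted (the note retains 12), a single full-sample window replaces the 252-day moving window, and power iteration stands in for a full PCA routine. All helper names are hypothetical.<br /><br />

```python
import math

def standardize(returns):
    """Z-score each stock's returns; also return the per-stock sample std."""
    n_days, n_stocks = len(returns), len(returns[0])
    means = [sum(row[i] for row in returns) / n_days for i in range(n_stocks)]
    stds = [
        math.sqrt(sum((row[i] - means[i]) ** 2 for row in returns) / (n_days - 1))
        for i in range(n_stocks)
    ]
    z = [[(row[i] - means[i]) / stds[i] for i in range(n_stocks)] for row in returns]
    return z, stds

def correlation_matrix(z):
    """Sample correlation matrix of already standardized returns."""
    n_days, n = len(z), len(z[0])
    return [
        [sum(row[i] * row[j] for row in z) / (n_days - 1) for j in range(n)]
        for i in range(n)
    ]

def leading_eigenvector(matrix, iterations=500):
    """Power iteration; assumes a unique dominant eigenvalue and a start
    vector that is not orthogonal to the dominant eigenvector."""
    n = len(matrix)
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

def eigenportfolio_returns(returns, v, stds):
    """F_t = sum_i (v_i / sigma_i) * R_it for every day t."""
    return [sum(v[i] / stds[i] * row[i] for i in range(len(v))) for row in returns]

# Hypothetical daily returns: rows are days, columns are three stocks.
returns = [
    [0.010, 0.012, -0.004],
    [-0.020, -0.015, 0.006],
    [0.005, 0.007, 0.001],
    [0.015, 0.010, -0.008],
    [-0.010, -0.012, 0.003],
    [0.002, -0.001, 0.000],
]

z, stds = standardize(returns)
corr = correlation_matrix(z)
v = leading_eigenvector(corr)
factor = eigenportfolio_returns(returns, v, stds)
print("leading eigenvector:", v)
print("eigenportfolio returns:", factor)
```

In practice one would likely use a linear-algebra library for the full eigendecomposition; the pure-Python version only keeps the sketch free of dependencies.<br /><br />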
We further neglect the drift in returns. The model we implement is $dX_t=k(m-X_t)dt+\sigma dW_t$. The mean reversion time is $\tau = 1/k$. Only stocks with a mean reversion time within 20 days are used. With the s-score $s=\frac{X_t-m}{\sigma_{eq}}$, where $\sigma_{eq}=\sigma/\sqrt{2k}$ is the equilibrium standard deviation of $X_t$, we go short at +1.25 and get out at +0.75 (similarly for long). Trading cost is 10 bps. The model is two-times levered per side or four-times levered gross (industry standard).<br /><br /><h4>Momentum strategy</h4>S&P500 industry sector ETFs, S&P500 ETF and SPY. 60-day and 5-day exponential moving averages (EMA) are used. The signal is long if the 5d EMA has been above the 60d EMA for the previous 4 or more trading days. In all other scenarios the signal is short. The trade is not rebalanced and a cost of 10 bps is assumed.<br /><br /><h4>In-sample analysis</h4>The 2005-2007 in-sample period shows the mean-reversion strategy performing much better than momentum, with an IR of 1.28. The equally weighted strategy has an IR of 0.49.<br /><br /><h4>Optimization and out-of-sample results</h4>There are returns to be made by dynamically optimizing the weights of different strategies. We can use quadratic programming with the objective function and constraints $$\min_x \frac{1}{2}x^THx+f^Tx \quad \text{s.t.} \quad Ax \le b, \quad A_{eq}x=b_{eq}, \quad lb \le x \le ub.$$<br />An important input into the process is the lower and upper bounds for each variable. Using expected returns and allocation targets, we can customize the optimization process to best suit our portfolio specifications. The goal of this optimization is to maximize the Sharpe ratio of the diversified portfolio with a penalty for marginal risk contribution. The portfolio was optimized at the end of each month using the returns from the previous 252 trading days. No transaction cost was used other than a flat 10 bps per trade. The diversified strategy has an out-of-sample IR of 2.27 vs a static allocation IR of 1.56. The mean reversion strategy has a beta exposure. 
Optimization can be used to control beta, volatility and leverage as well as drawdowns.<br /><br /><h4>Conclusion</h4><ul><li>Potential benefit of including both mean-reversion and momentum in a portfolio.</li><li>The beta risk was not hedged using SPY, but this can be done.</li><li>A momentum signal using PCA eigenportfolios is not apparent at the individual stock level.</li><li>Potentially greater alpha at finer time scales.</li><li>Varying time scales with signal decay for both momentum and mean reversion can be useful.</li></ul>Scaling by correlation matrix (posted 2015-09-16)<br /><br /><span style="color: black;"><span style="color: black;">We analyze here the effect of scaling a signal by the inverse of the correlation matrix. We start by assuming that the two assets </span><span style="color: green;">$A_1$</span><span style="color: black;"> and </span><span style="color: green;">$A_2$</span><span style="color: black;"> have unit variance. This reduces the covariance matrix to the correlation matrix. We assume a simple correlation matrix of the form </span><span style="color: green;">$$\begin{bmatrix} 1 & c \\ c & 1 \end{bmatrix}.$$</span><span style="color: black;"> Now let's say we have generated signals </span><span style="color: green;">$\mu_1$</span><span style="color: black;"> and </span><span style="color: green;">$\mu_2$</span><span style="color: black;"> for the two assets before scaling. 
This means that the unscaled </span><span style="color: black;">portfolio can be written as </span><span style="color: green;">$$\mu_1 A_1 + \mu_2 A_2.$$</span><span style="color: black;"> Now the inverse of the correlation matrix is </span><span style="color: green;">$$\frac{1}{1-c^2}\begin{bmatrix} 1 & -c \\ -c & 1\end{bmatrix}.$$</span><span style="color: black;"> This makes the scaled signal (<span style="color: #38761d;">$\Sigma^{-1}\mu$</span>) </span><span style="color: green;">$$\frac{\mu_1-c\mu_2}{1-c^2}A_1+\frac{\mu_2-c\mu_1}{1-c^2}A_2.$$</span><span style="color: black;"> We can see that the 'scaled signal' depends on both the 'original signal' (</span><span style="color: green;">$\mu_1$</span><span style="color: black;"> and </span><span style="color: green;">$\mu_2$</span><span style="color: black;">) and the correlation value (</span><span style="color: green;">$c$</span><span style="color: black;">). Another way to look at the 'scaled signal' is to write the portfolio as </span><span style="color: green;">$$\mu_1\left[\frac{1}{1-c^2}A_1-\frac{c}{1-c^2}A_2\right] + \mu_2\left[\frac{1}{1-c^2}A_2-\frac{c}{1-c^2}A_1\right].$$</span><span style="color: black;"> This is another way of saying that we trade the same original signal but replace the assets </span><span style="color: green;">$A_1$</span><span style="color: black;"> and </span><span style="color: green;">$A_2$</span><span style="color: black;"> with the spreads </span><span style="color: green;">$\left[\frac{1}{1-c^2}A_1-\frac{c}{1-c^2}A_2\right]$</span><span style="color: black;"> and </span><span style="color: green;">$\left[\frac{1}{1-c^2}A_2-\frac{c}{1-c^2}A_1\right]$</span><span style="color: black;">. In the table below we look at this 'spread' for different values of the correlation coefficient </span><span style="color: green;">$c$</span><span style="color: black;">. 
We also show the 'altered' signal value for the assets <span style="color: #38761d;">$A_1$</span> and <span style="color: #38761d;">$A_2$</span>.</span></span><br />$$<br />\begin{array}{c|cc|cc}<br />c & \text{spread for $\mu_1$} & \text{spread for $\mu_2$} & \text{signal on $A_1$} & \text{signal on $A_2$} \\<br />\hline<br />+0.9 & 5.3A_1-4.7A_2 & 5.3A_2-4.7A_1 & 5.3\mu_1-4.7\mu_2 & 5.3\mu_2-4.7\mu_1 \\<br />+0.5 & 1.3A_1-0.7A_2 & 1.3A_2-0.7A_1 & 1.3\mu_1-0.7\mu_2 & 1.3\mu_2-0.7\mu_1 \\<br />+0.1 & 1.0A_1-0.1A_2 & 1.0A_2-0.1A_1 & 1.0\mu_1-0.1\mu_2 & 1.0\mu_2-0.1\mu_1 \\<br />0.0 & A_1 & A_2 & \mu_1 & \mu_2\\<br />-0.1 & 1.0A_1+0.1A_2 & 1.0A_2+0.1A_1 & 1.0\mu_1+0.1\mu_2 & 1.0\mu_2+0.1\mu_1 \\<br />-0.5 & 1.3A_1+0.7A_2 & 1.3A_2+0.7A_1 & 1.3\mu_1+0.7\mu_2 & 1.3\mu_2+0.7\mu_1 \\<br />-0.9 & 5.3A_1+4.7A_2 & 5.3A_2+4.7A_1 & 5.3\mu_1+4.7\mu_2 & 5.3\mu_2+4.7\mu_1<br />\end{array}<br />$$<br /><span style="color: black;">For high absolute correlations, as long as </span><span style="color: green;">$\mu_1$</span><span style="color: black;"> and </span><span style="color: green;">$\mu_2$</span><span style="color: black;"> are comparable, the total portfolio positions stay within limits. But if </span><span style="color: green;">$\mu_1$</span><span style="color: black;"> and </span><span style="color: green;">$\mu_2$</span><span style="color: black;"> differ substantially, huge positive and negative positions can be created, which may be undesirable. This is a likely scenario, as signals are based on recently updated information while the correlations are estimated over a slow (long) window. </span><br /><br />What if we add a third asset <span style="color: #38761d;">$A_3$</span> with signal <span style="color: #38761d;">$\mu_3$</span> which is uncorrelated with the first two assets? 
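<br /><br />The numbers in the table, and the three-asset question just posed, can be checked numerically. The sketch below is an illustration under assumptions: the signal values are made up, and the small Gauss-Jordan solver named solve is a hypothetical helper used only to avoid external dependencies.<br /><br />

```python
def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    # Augmented matrix [matrix | rhs] with float entries.
    aug = [[float(a) for a in row] + [float(b)] for row, b in zip(matrix, rhs)]
    for col in range(n):
        # Move the row with the largest entry in this column to the pivot position.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def spread_coefficients(c):
    """Own-asset and cross-asset coefficients 1/(1-c^2) and -c/(1-c^2)."""
    return 1.0 / (1.0 - c * c), -c / (1.0 - c * c)

c = 0.9
own, cross = spread_coefficients(c)  # first table row, before rounding

# Assumed signals: opposite views on the two correlated assets, plus a third
# asset that is uncorrelated with the first two.
mu = [1.0, -1.0, 0.5]
corr3 = [
    [1.0, c, 0.0],
    [c, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
scaled = solve(corr3, mu)  # inverse-correlation-scaled signal

print("spread coefficients:", own, cross)
print("scaled signal:", scaled)
```

With these assumed inputs the scaled positions in the two correlated assets come out at roughly +10 and -10, ten times the raw signals, while the position in the uncorrelated asset stays at its raw value of 0.5.<br /><br />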
With the third asset included, the correlation matrix is <span style="color: #38761d;">$$\begin{bmatrix} 1 & c & 0 \\ c & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix},$$</span> and the inverse of this matrix is <span style="color: #38761d;">$$\frac{1}{1-c^2}\begin{bmatrix}1 & -c & 0\\ -c & 1 & 0 \\ 0 & 0 & 1-c^2\end{bmatrix}.$$</span> This results in the following 'altered' portfolio <span style="color: green;">$$\frac{\mu_1-c\mu_2}{1-c^2}A_1+\frac{\mu_2-c\mu_1}{1-c^2}A_2+\mu_3A_3.$$ <span style="color: black;">This shows that the signal of the uncorrelated asset is not changed.</span></span>Pairs trading the commodity futures curve - Antti Nikkanen (posted 2015-09-16)<br /><span style="color: #38761d;">Notes on Antti Nikkanen's Master's thesis, Aug 2012</span><br /><br /><h2>Ch1. Introduction</h2>The thesis develops a commodity futures trading strategy that exploits the roll returns of commodity futures as its main driver of excess return. To minimize the volatility of returns, a pairs trading methodology is used to trade the futures curve, with a Sharpe ratio of 3. Liquidity is taken into account with a trading cost of 3.3 bps. Commodities are still poorly understood because of the lack of good data and because commodity futures are derivative securities, short-maturity claims on real assets, with pronounced seasonality in price levels and volatility. <br /><br /><h2>Ch2. Literature Review</h2>Hong and Yogo (2012) show that the <span style="color: red;">aggregate basis</span> (ratio of futures price to commodity price) is the most important predictor of commodity returns. The main factor behind the fluctuation of the aggregate basis is <span style="color: red;">hedging pressure</span> (how much producers short commodity futures to hedge their long positions in the underlying spot). 
<br /><br />Erb and Harvey (2006) show that roll returns explain more than 90% of the long-run cross-sectional variation of commodity futures returns over 1982-2004. The time-series variation of futures returns is mostly explained by spot price movements. To become spot neutral, the author creates spreads. <br /><br />Fuertes and Miffre (2010) show a tactical position of shorting contangoed futures and going long backwardated futures. They also include momentum. <br /><br />Gorton and Rouwenhorst (2005) state that commodity futures returns are negatively correlated with equity and bond returns. But this low correlation exists only in 'normal' markets. The spread strategy reduces correlation even in 'abnormal' markets. <br /><br /><h2>Ch3. Theory</h2>Commodity markets do not fit the CAPM (Bodie and Rosansky 1980) because it is difficult to distinguish between systematic risk/return and unsystematic risk/return. Also, the price depends on demand and supply factors, not on perceived adequate risk premiums. <br /><br />Some stocks (like the Finnish mining company Talvivaara) closely follow the price of the underlying commodity (nickel). But many companies, especially oil companies such as ExxonMobil, have hedged away their oil exposure. Commodity ETFs may have large tracking errors, e.g. USO is an oil ETF but massively lagged the movements in oil prices after the 2008 crash because it rolled the portfolio in times of negative roll returns. GLD, on the other hand, tracks spot gold quite closely. <br /><br />Less than 1% of futures contracts result in delivery of the underlying asset. Commodity futures do not represent direct exposures to actual commodities. They are bets on expected future spot prices (Gorton and Rouwenhorst 2005). 
The relationship between the futures and spot price is <span style="color: red;">$F=Se^{(r+c-y)(T-t)}$</span>, where $r$ is the risk-free rate, $c$ is the storage cost (storage facilities, insurance, inspections, transportation and maintenance, spoilage and financing), and $y$ is the convenience yield (ability to profit from local supply-demand imbalances, leasing of gold to jewelry manufacturers).<br /><br /><h4>Economics of backwardation and contango</h4>Upward-sloping (contango) and downward-sloping (backwardation) curves are determined by demand, supply and seasonal changes. For a hedger who is inherently long (a petroleum producer long crude through its exposure to oil exploration, development, refining and marketing), speculators are going to take the long risk if the futures price is sufficiently discounted vs the spot price, i.e. the curve is in backwardation (Anson 2009). Contango occurs for commodities in which the hedger is inherently short the commodity (e.g. aircraft manufacturers that do not have aluminum mines and are willing to purchase futures contracts for future aluminum delivery). Hence, the speculator's profits are determined by the hedgers' demand for risk capital, not by the long-term price trends of the commodity markets (Anson 2009).<br /><br /><div class="separator" style="clear: both; text-align: center;"><a href="http://2.bp.blogspot.com/-t8iG1orGmWY/VfCg0x-dhzI/AAAAAAAAAAY/7daV8mJ2v4c/s1600/spot_vs_future.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="255" src="http://2.bp.blogspot.com/-t8iG1orGmWY/VfCg0x-dhzI/AAAAAAAAAAY/7daV8mJ2v4c/s320/spot_vs_future.png" width="320" /></a></div><br /><span style="color: #38761d;">Hicks' rational expectations hypothesis</span> states that the price of an asset for delivery in the future must be the market's current forecast of the spot price on the future delivery date (the spot does not move in the absence of any further information). 
This has not proven useful in practice. <span style="color: #38761d;">Storage models</span> have been better at explaining observed prices; they state that the relationship between the spot and futures price depends on storage levels and expected future storage levels (i.e. inventory). This means the spot price is also expected to move through to maturity. A difficult-to-store commodity (NG) has a steep forward curve. When inventories are high relative to demand, the curve will be upward-sloping, and when inventories are tight, downward-sloping (Till, Feldman 2006). The difficult-to-store commodities (HO, HG, LC, LH) have the highest average excess returns versus easy-to-store commodities. <br /><br /><h4>Commodity futures returns composition</h4>The commodity futures return is the sum of the spot return, the risk-free rate and the roll return. Commodity markets are prone to sudden spot price rises but show a mean-reverting tendency over longer periods. <br /><br /><h4>CTAs</h4>CTAs are generally trend following, in contrast to market timing strategies, where statistical techniques are used to predict the trends before they become apparent. Managed futures strategies are either technical or fundamental, applied in either a systematic or a discretionary manner. Most are technical and systematic. <span style="color: #38761d;">Bridgewater, an exception, is fundamental and systematic, e.g. 
in 2008 they spotted the possibility of either an inflationary or a deflationary deleveraging through the contraction in private credit growth, a declining stock market and widening credit spreads, and adjusted their positions based on the inflationary deleveragings of 1920s Germany and 1980s Latin America and the deflationary deleveragings of the Great Depression in the 1930s and Japan in the 1990s (Schwager 2012).</span><br /><span style="color: #38761d;"></span><br /><h4>A hedge against inflation</h4>In inflationary periods, long commodity futures positions usually benefit, while stock and bond returns are negatively impacted because the purchasing power of money declines and the earning power of corporations erodes. <br /><br /><h4>Pairs Trading</h4><span style="color: black;">The Johansen test can check the cointegration of multiple time series at once. Pairs trading is a relative strategy and does not care about the absolute value of the assets. With stocks, it is more common that just one of the assets is over- or underpriced (Gatev, Goetzmann, Rouwenhorst 2006). On a futures curve in contango, even the underpriced contracts usually have a negative expected return. </span><br /><br />The main reason to pairs trade the futures curve is to hedge price movement risk and capture only the roll-return part of the commodity futures return. The strategy could be made more profitable by adjusting it dynamically.<br /><br />For two time series to move together there needs to be an error-correction mechanism, which corrects prices and hence produces mean reversion. Usually the order of integration is first determined with a unit root test before running an actual cointegration test (it is crucial to check with common sense and graphics). The Augmented Dickey-Fuller test takes care of the autocorrelation in the differenced series. 
The Johansen test is based on the error-correction representation of the VAR equation: it tests for reduced rank and then uses Granger's representation theorem to obtain the cointegration vector. <br /><br /><h2>Ch4. Empirical work</h2>The sample runs from 1991 to 2012, at daily frequency, for the 12 nearest contracts of 20 commodities. A transaction cost of 3.3 bps per leg per trade is assumed, and contracts with open interest below 20000 are not traded. <br /><br /><h4>Methodology</h4><ol><li>Determine the shape (contango vs backwardation) by taking the first five differences between adjacent contracts and averaging them: $$\frac{1}{5}\sum_{i=1}^5(f_i-f_{i+1}).$$</li><li>If the result is positive (backwardation), go long the 'most' backwardated contract (maximum absolute slope), which is equivalently the one furthest off its path with respect to its cointegration with the other points on the curve. The position is taken in the further contract. </li><li>The short position is determined by taking the smallest value of the differenced contracts and going short the further contract.</li><li>The pair is chosen only if both contracts have open interest above 20000.</li><li>If the curve is in contango, the process is the same but reversed: take a short position in the largest absolute difference and a long position in the smallest absolute difference. </li><li>At the start of each month the portfolio is set up for the next 30 days, with equal weights.</li></ol>All the commodity curves are found to be cointegrated. The information ratio is 3.1 for monthly rebalancing. All assets show positive returns. This can be bifurcated between roll returns (alpha generation) and hedged returns (to reduce volatility). Feeder cattle is invested only 3% of the time, while CL is invested 100% of the time. The daily traded strategy is similar, with higher trading costs but good returns. <br /><br /><h4>Improvements</h4><ol><li>The current strategy is suboptimal in terms of when to trade. </li><li>Entry should be based on price deviations from the equilibrium level. 
</li><li>Trading the best 5 instead of all would produce better results. </li><li>Choose the 'hedging pair' from the signed difference of the futures prices and not the absolute price difference. This would capture the rare instances where the futures curve has elements of both backwardation and contango.</li></ol>Mhttp://www.blogger.com/profile/04296475349435748706noreply@blogger.com0tag:blogger.com,1999:blog-3299057174197331019.post-10430258162283205822015-09-16T21:22:00.001-07:002015-09-16T21:22:40.283-07:00Four Essays in Stat Arb - Jozef Rudy<span style="color: #38761d;">These are my notes on the PhD thesis 'Four Essays in Statistical Arbitrage in Equity Markets' by Jozef Rudy. Hoping to implement some of these eventually.</span><br /><span style="color: #38761d;"><br /></span><br /><h2>Ch 1 - Introduction</h2>This is just a summary chapter. The work is mostly about pairs trading and its modifications, concentrating on daily trading but also applying high-frequency data and other modifications. There is also a chapter on mean reversion strategies, which fit under statistical arbitrage.<br /><br />The standard market approach is daily sampling (Gatev 2006). In the standard form, the edge such strategies provide seems to be dissipating. Going to higher frequency can potentially yield more information (Aldridge 2009). A nonstandard half-daily sampling frequency and the use of ETFs can further help performance.<br /><br /><h2>Ch 2 - Literature Review</h2><div>Nunzio Tartaglia is credited with developing pairs trading at Morgan Stanley in the 1980s. It was hugely successful, but profits have come down recently. That is why one needs to go to higher frequencies (Marshall et al. 2010). Similarly, Shulmeister (2007) finds that technical strategies are profitable, but only at higher frequencies. That motivates the half-daily timeframe. </div><div><br /></div><div>Engle and Granger (1987) brought cointegration into the limelight. Johansen (1988) developed the critical test. 
For a pair, the simpler method is to first calculate the beta using $P_{1t}=\beta P_{2t}+\epsilon_t$. Then check the residual using the Augmented Dickey-Fuller unit root test (ADF) at 95% confidence using </div><div>$$\Delta \epsilon_t = \phi+\gamma\epsilon_{t-1}+\sum_{i=1}^{p}\alpha_i\Delta \epsilon_{t-i}+u_t.$$</div><div>We include the most significant lags iteratively and then test the null hypothesis of no cointegration, $\gamma=0$, against the alternative $\gamma<0$. </div><div><br /></div><div>For more than two assets one needs to use the Johansen method. The non-parametric distance method (Gatev 2006) and the stochastic approach (Mudchanatongsuk 2008) have also been used.<br /><br />Time-adaptive models like the Kalman filter have been shown to be superior to rolling-window OLS methods due to the forward-looking methodology of the former. Double exponential smoothing-based prediction models can give results comparable to the Kalman filter but run an order of magnitude faster.<br /><br />'Market neutral' hedge funds are generally pairs-trading kinds of funds.<br /><br /><h2>Ch 3 - Stats Arb. and HF data </h2></div><div>The main innovation is to apply the statistical arbitrage technique of pairs trading to high-frequency equity data (Eurostoxx 50 stocks). This is done from 5-minute intervals (IR~3) to daily frequency (IR~1). Pairs are chosen based on the best in-sample IR and the highest in-sample t-stats of the ADF test of the residuals of the cointegrating regression sampled at daily frequency. The 5 best pairs are chosen. The simplest method is the Engle and Granger (1987) cointegration approach. 
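<br /><br />The two-step procedure above (hedge-ratio regression without a constant, then a unit root test on the residual) can be sketched as follows. This is a minimal illustration in plain Python, not the thesis code: the function names and the simulated price series are hypothetical, and the unit root regression omits the lagged differences, so it is a plain Dickey-Fuller test rather than the full ADF. The resulting t-statistic should be compared with Engle-Granger critical values, not the normal table.

```python
import math
import random

def hedge_ratio(y, x):
    """OLS slope through the origin for y_t = beta * x_t + eps_t (no constant)."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)

def df_t_stat(resid):
    """t-statistic of gamma in d_eps_t = phi + gamma * eps_{t-1} + u_t.

    Simplification (assumption): no lagged-difference terms are included.
    """
    lagged = resid[:-1]
    delta = [b - a for a, b in zip(resid[:-1], resid[1:])]
    n = len(delta)
    mean_lag = sum(lagged) / n
    mean_delta = sum(delta) / n
    sxx = sum((v - mean_lag) ** 2 for v in lagged)
    sxy = sum((v - mean_lag) * (d - mean_delta) for v, d in zip(lagged, delta))
    gamma = sxy / sxx
    phi = mean_delta - gamma * mean_lag
    ssr = sum((d - phi - gamma * v) ** 2 for v, d in zip(lagged, delta))
    std_err = math.sqrt(ssr / (n - 2) / sxx)
    return gamma / std_err

# Synthetic example: x is a random walk, y = 2 * x plus stationary noise.
random.seed(0)
x = [100.0]
for _ in range(299):
    x.append(x[-1] + random.gauss(0.0, 1.0))
y = [2.0 * v + random.gauss(0.0, 0.5) for v in x]

beta = hedge_ratio(y, x)
resid = [a - beta * b for a, b in zip(y, x)]
t_stat = df_t_stat(resid)
print(f"beta={beta:.4f}, DF t-stat={t_stat:.2f}")
```

A strongly negative t-statistic is evidence against a unit root in the residual, i.e. in favor of cointegration.<br /><br />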
To make the beta parameter adaptive, the following techniques can be used: rolling OLS, the DESP model and the Kalman filter.<br /><br /><h4>Cointegration model</h4></div><div>Take pairs from the same industry based on economic reasoning and apply an OLS regression to them:</div><div>$$Y_t=\beta X_t + \epsilon_t$$</div><div>Then test the residuals of the OLS regression for stationarity using the Augmented Dickey-Fuller unit root test.<br /><br /><h4>Rolling OLS</h4></div><div>Similarly, we can calculate a rolling beta using rolling OLS. This approach suffers from the 'ghost effect', the 'lagging effect' and the 'drop-out effect'. The window can be optimized for maximum in-sample IR; this was around 200 periods, and the same window was used out of sample.<br /><br /><h4>Double Exponential smoothing prediction model</h4></div><div>We first calculate $\beta_t=Y_t/X_t$. We then do double smoothing by:</div><div>$$S_t = \alpha \beta_t+(1-\alpha)S_{t-1}$$</div><div>$$T_t=\alpha S_t + (1-\alpha)T_{t-1}$$</div><div>Using these, the prediction of beta at time period $t+1$ is </div><div>$$\hat{\beta}_{t+1} = \Bigg[2S_t-T_t\Bigg] + k \Bigg[\frac{\alpha}{1-\alpha}(S_t-T_t)\Bigg].$$</div><div>$k$ is the number of look-back periods. The optimized values of $\alpha$ and $k$ are 0.8126 and 30.<br /><br /><h4>Time-varying parameter model with Kalman filter</h4></div><div>This is better suited than OLS to adaptive parameter estimation. The measurement equation is<br />$$Y_t=\beta_t X_t+\epsilon_t$$<br />and the state equation is<br />$$\beta_t=\beta_{t-1}+\eta_t.$$<br />The idea behind adding the second equation is the intuition that beta has some structure, i.e. autocorrelation, which can be added as information for better estimation. The noise ratio is to be optimized, yielding $3\times 10^{-7}$.<br /><br /><h3>The pair trading model</h3></div><div>Choosing the pairs within an industry makes us immune to industry-wide shocks. The spread between the pairs is calculated as $z_t=P_{Y_t}-\beta_{t}P_{X_t}$. 
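<br /><br />As a rough illustration of how the time-varying beta and the spread $z_t$ fit together, here is a minimal scalar Kalman filter sketch for the measurement and state equations given earlier. The function name, the initial values `beta0` and `p0`, the default `noise_ratio`, and the choice to compute the spread with the beta known before the current observation are all assumptions made for this example, not details taken from the thesis.

```python
def kalman_beta(y, x, noise_ratio=1e-5, beta0=0.0, p0=1.0):
    """Scalar Kalman filter for y_t = beta_t * x_t + eps_t, beta_t = beta_{t-1} + eta_t.

    noise_ratio is Q/H (state noise variance over measurement noise variance);
    H is normalized to 1. All defaults are hypothetical.
    """
    h = 1.0
    q = noise_ratio * h
    beta, p = beta0, p0
    betas, spreads = [], []
    for y_t, x_t in zip(y, x):
        p_pred = p + q                      # predicted state variance
        spread = y_t - beta * x_t           # innovation, using beta known before y_t
        gain = p_pred * x_t / (x_t * x_t * p_pred + h)
        beta = beta + gain * spread         # updated beta estimate
        p = (1.0 - gain * x_t) * p_pred     # updated state variance
        betas.append(beta)
        spreads.append(spread)
    return betas, spreads

# Toy data: y is exactly twice x, so the filtered beta should approach 2.
x = [10.0 + i for i in range(30)]
y = [2.0 * v for v in x]
betas, spreads = kalman_beta(y, x)
print(betas[-1], spreads[-1])
```

A larger `noise_ratio` lets beta move faster; a smaller one gives a smoother beta.<br /><br />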
We did not include a constant in any of the models. This spread is normalized by subtracting the mean and dividing by the standard deviation. Entry is at 2 standard deviations and exit near 0.5 standard deviations. Once the entry is triggered we wait one period before we enter. We choose a money-neutral investment by putting equal money on the two sides (irrespective of the $\beta$). There is no rebalancing. When the normalized spread returns to its long-term mean, it is caused by a combination of two things: real reversal of the spread and adaptation of beta to a new equilibrium value. This can lead to an incomplete reversal in dollar value even when the spread has totally reversed. </div><div><br /></div><div>In-sample indicators are used with the objective of identifying out-of-sample performance:</div><div>1) t-stat from the ADF test on the residuals of the OLS regression.</div><div>2) the information ratio</div><div>3) half-life of mean reversion. </div><div>The half-life is given by $\ln(2)/k$, where $k$ is the median unbiased estimate of the strength of mean reversion in the OU equation</div><div>$$dz_t = k(\mu-z_t)dt+\sigma dW_t$$</div><div>where $z_t$ is the value of the spread and $\sigma$ is the standard deviation. The higher the $k$, the faster the spread tends to revert to its long-term mean. In-sample IR is also used as a metric (IR 2 means the strategy is profitable every month, IR 3 means the strategy is profitable every day). IR is overestimated if the returns are auto-correlated.<br /><br /></div><h3>Out of sample performance </h3><div>A trading cost of 30 bps one way is assumed. The best result comes out for the 30-minute interval. 
Kalman is the best of fixed beta, rolling OLS, DESP and Kalman, and has the smoothest beta (Table 3-3).<br /><br /><h3>Further investigations</h3></div><h4>Relationship between the in-sample t-stats and the out-of-sample information ratio</h4><div>The in-sample t-stats for the fit are positively correlated with the out-of-sample information ratio for frequencies up to 10 minutes. Beyond this the correlation is statistically indistinguishable from 0. </div><h4>Relationship between t-stats for different high-frequency and pairs</h4><div>That trading pairs have similar t-stats across all frequencies is shown by the first principal component explaining almost all of the variance (after standardizing the t-stats of the ADF test for all pairs). This has the following implication: once a pair has been found to be co-integrated at a certain frequency, it tends to be co-integrated across all frequencies. </div><h4>Does cointegration in daily data imply higher frequency cointegration </h4><div>The correlation between t-stats (of the ADF test) of daily data and 5-minute data has an interval of [-0.03,0.33] using bootstrapping. Hence, co-integration found at daily frequency implies there is co-integration at 5-min interval as well.<br /><h4>Does in-sample information ratio and the half-life of mean reversion indicate what the out-of-sample information ratio will be?</h4> Using bootstrapping, the confidence bounds indicate that the in-sample information ratio can positively predict the out-of-sample information ratio to a certain extent. Also, there is a negative relation between the half-life of mean reversion and the subsequent out-of-sample information ratio.<br /><br /><h3>A diversified pair trading strategy</h3></div><div>Using the indicators presented above, the best 5 pairs are selected. Best in-sample IR gives attractive out-of-sample performance. Half-life of mean reversion does not work out. 
In-sample t-stats of the ADF test of the cointegrating regression work as an indicator only for 5 to 10 minute strategies. A combination is worse than the individual indicators. Finally, a daily IR of 1.34 and a high-frequency IR of 3.24 come out better than a simple long position. </div><br /><h2>Ch 4 - Profitable Pair Trading: A comparison using the S&P 100 constituent stocks and the 100 Most liquid ETFs</h2>The greatest known risk to pairs trading is a stock going bankrupt. ETFs can avoid that. But are they equally profitable? It turns out they are more profitable than stocks based on an adaptive long-short strategy (IR of 1 vs 0), extending the in-sample period (1.7 vs 0.2) and preselecting pairs based on in-sample IR (2.93 vs 0.46). The ratio can be made time-adaptive via the Kalman filter. The pairs trading strategy in its basic form might be becoming unprofitable. <br /><br />Datastream is used to get data for the 100 most liquid ETFs and the S&P 100 stocks. In-sample periods of 3/4 and 5/6 of the data are used. After screening for cointegration, 428 ETF pairs and 693 stock pairs are evaluated.<br /><br /><h4>Methodology</h4>Bollinger bands are used with a 20-day moving window and 2 standard deviation bands for entry/exit triggers, in general. These parameters are optimized for maximum in-sample IR and differ from one pair to another. <br /><br /><h4>Model</h4>The spread is calculated from prices using an adaptive beta from the Kalman filter. The noise ratio $Q/H$ is optimized: increasing the ratio makes the beta more adaptive and decreasing it makes the beta smoother. No constant is used, to reduce the number of parameters. We invest the same amount of dollars on each side of the trade. Once invested, we wait for the spread to revert. The initial money-neutral positions are not dynamically rebalanced. <br /><br /><h4>Out of sample results </h4>With 75% in-sample, the IRs for ETFs and stocks are 1.06 and 0.08 respectively. These increase to 1.71 and 0.22 respectively for 83% in-sample. 
The ETFs used are index trackers, thus they contain lower idiosyncratic risk than shares. Index divergence is more likely to reverse than stock divergence, where the reason could be more fundamental. The much better results of ETFs could also be a result of stronger autocorrelations of ETF pairs compared to shares. Lower traded volumes (only marginally) also make the ETF market less competitive.<br /><br /><h4>Results for the best 50 pairs</h4>The correlation between in-sample and out-of-sample IR is 0.24 and 0.14 for ETFs and stocks respectively. This motivates trading the better-performing in-sample pairs out of sample. This increases the IR to 1.58 and 0.13 for the 75% in-sample case for ETFs and stocks respectively, and to 2.93 and 0.46 for the 83% in-sample case. <br /><br /><h4>Conclusions</h4><ol><li>ETFs are better than stocks because of the non-existence of idiosyncratic risk in ETFs.</li><li>Decreasing the out-of-sample period improves performance. Hence, re-estimating the model once per week will improve the results. </li><li>In-sample IR predicts out-of-sample IR.</li></ol><br /><h2>Ch 5 - Mean Reversion based on Autocorrelation: A comparison using the S&P 100 constituents and the 100 most liquid ETFs</h2>A simple strategy based on the normalized previous period's return and the actual conditional autocorrelation can give traders an edge. ETFs are more suitable than stocks, and half-daily frequency improves the performance. <br /><br /><h4>Introduction</h4><ol><li>Form pairs with 30 days trailing conditional correlation above the threshold of 0.8</li><li>Eliminate pairs with a previous day's normalized spread returns smaller than 1.</li><li>Select pairs with first order autocorrelation within certain bounds.</li></ol>Two different samplings, daily and half-daily, are used, with 4-year in-sample and out-of-sample periods.<br /><br />Contrarian profits, explained by the overreaction hypothesis causing negative autocorrelation, have decreased in recent periods (Khandani and Lo 2007). 
Higher frequencies still have some juice (Dunis et al 2010). Market-neutral strategies have been shown to be exposed to general market factors. S&P 100 stocks and 100 ETFs are used, with each investment held for exactly one trading period. <br /><br /><h4>Methodology</h4>The JPMorgan (1996) method is used to calculate conditional (time-varying) volatility and conditional correlation (cutoff 0.8) over a period of 30 days. $$cov(r_A, r_B)_t=\lambda cov(r_A,r_B)_{t-1}+(1-\lambda)r_A r_B,$$ where $\lambda$ is the constant 0.94, corresponding to 30 days. The return of the spread is simply the difference of the returns of the constituents. The conditional autocorrelation of the pair is calculated as $$\rho_t=\frac{cov(r_t,r_{t-1})_t}{\sigma_t \sigma_{t-1}},$$ where $r_t$ is the return of the pair spread. The conditional covariance of the pair is calculated as $$cov(r_t,r_{t-1})_t=\lambda cov(r_t,r_{t-1})_{t-1}+(1-\lambda)r_t r_{t-1}.$$ The normalized return of the spread is simply $$R_t=\frac{r_t}{\sigma_t}.$$ We only trade pairs with normalized returns above 1. If the autocorrelation is negative we bet on a reversal; otherwise we bet the pair will continue to move in the same direction as in the current period, with each pair held for only one period. The 5 best pairs with the highest normalized returns are chosen. <br /><br /><h4>Trading results</h4>A trading cost of 20 bps per pair trade is assumed. The net-of-cost IR for the in-sample and out-of-sample top 1, 5, 10 and 20 pairs for different autocorrelation ranges is negative throughout for stocks. The results are positive both in-sample and out-of-sample for ETFs (5, 10, 20 pairs) for the range -0.4 to 0 (but not -1 to -0.4). <br /><br />For half-daily frequency the results are better but still not good enough for shares. For ETFs the results are stupendous for the full negative autocorrelation range. The positive autocorrelation range is not that productive.<br /><br />The out-of-sample results are consistent until 2009, after which the equity curve is flat. 
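<br /><br />The exponentially weighted recursions from the methodology above can be sketched in a few lines of plain Python. This is only an illustration: the function names are hypothetical, and seeding each recursion with the first observed product is an assumption made here, since the notes do not say how the recursion is initialized.

```python
import math

LAMBDA = 0.94  # decay constant quoted in the notes

def ewma_cov(a, b, lam=LAMBDA):
    """cov_t = lam * cov_{t-1} + (1 - lam) * a_t * b_t.

    Seeded with the first product a_0 * b_0 (an assumption).
    """
    cov = a[0] * b[0]
    out = [cov]
    for a_t, b_t in zip(a[1:], b[1:]):
        cov = lam * cov + (1.0 - lam) * a_t * b_t
        out.append(cov)
    return out

def conditional_autocorr(r, lam=LAMBDA):
    """rho_t = cov(r_t, r_{t-1})_t / (sigma_t * sigma_{t-1}) for t >= 1."""
    var = ewma_cov(r, r, lam)                # sigma_t squared
    lag_cov = ewma_cov(r[1:], r[:-1], lam)   # entry t-1 holds cov(r_t, r_{t-1})_t
    rho = []
    for t in range(1, len(r)):
        denom = math.sqrt(var[t] * var[t - 1])
        rho.append(lag_cov[t - 1] / denom if denom > 0 else 0.0)
    return rho

# Toy spread returns that flip sign every period: strongly negative autocorrelation.
spread_returns = [0.01, -0.01] * 10
rho = conditional_autocorr(spread_returns)
sigma_last = math.sqrt(ewma_cov(spread_returns, spread_returns)[-1])
normalized_last = spread_returns[-1] / sigma_last  # R_t = r_t / sigma_t
print(rho[-1], normalized_last)
```

With a negative `rho` the rule in the notes bets on a reversal next period, and only when the absolute normalized return is large enough.<br /><br />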
Adding more pairs makes the equity curve more consistent. <br /><br /><h2>Ch 6 - Profitable Mean Reversion after large price drops: A story of Day and Night in the S&P500, 400 Mid Cap and 600 Small Cap Indices</h2>Open-to-close (day) and close-to-open (night) returns carry information. The worst-performing shares during the day (resp. night) are bought and held during the night (resp. day). The alpha is not explained by the Fama and French 3-factor or the Carhart 5-factor model. <br /><br /><h4>Literature review</h4>Contrarian returns have been declining (Khandani and Lo 2007). Most strategies use close-to-close information and don't take the opening prices into account. The existence of contrarian profits can be explained by the overreaction hypothesis (Lo and MacKinlay 1990), with a negative autocorrelation assumption. De Bondt (1985) shows that with 3-year rebalancing past losers beat past winners, with the outperformance continuing as late as 5 years after the portfolios have been formed. Predictability of short-term returns is exploited either by momentum or by reversion. Serletis and Rosenberg (2009) show that the Hurst exponents for the four major US stock market indices during 1971-2006 display mean-reverting behavior. Bali (2008) finds that the speed of mean reversion is higher during periods of large falls in prices. <br /><br />De Gooijer et al. (2009) find a non-linear relationship between the overnight price and the opening price. Cliff et al. (2008) show that night returns are positive while day returns are 0. The effect is partly driven by higher opening prices which decline during the first trading hour of the session. <br /><br /><h4>Financial Data</h4>The constituent stocks of the S&P 500, S&P 400 MidCap and S&P 600 SmallCap are used, with adjusted prices from 2000 to 2010 and a 5 bps trading cost one way. We calculate open-to-close day returns and close-to-open night returns. 
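<br /><br />The two return series just described can be computed from daily open and close prices as in the following minimal sketch. The function name and the prices are hypothetical, and simple (not log) returns are assumed.

```python
def day_night_returns(opens, closes):
    """Split returns into day (open-to-close) and night (close-to-open) parts.

    opens[i] and closes[i] are the adjusted open and close of session i.
    day[i] covers open[i] -> close[i]; night[i - 1] covers close[i - 1] -> open[i].
    """
    day = [c / o - 1.0 for o, c in zip(opens, closes)]
    night = [opens[i] / closes[i - 1] - 1.0 for i in range(1, len(opens))]
    return day, night

# Hypothetical prices for three sessions.
opens = [100.0, 102.0, 101.0]
closes = [101.0, 100.0, 103.0]
day, night = day_night_returns(opens, closes)
print(day, night)
```

There is one fewer night return than day returns, because the first session has no previous close.<br /><br />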
The average return of holding the shares during the day and during the night is very similar for the constituent stocks of the S&P 500 index and is slightly positive for both. For the S&P 400 MidCap the daily returns are positive and the overnight returns negative, similar to the S&P 600 SmallCap. These differences are not profitable after trading costs. <br /><br /><h4>Trading Strategy</h4>Exploit the mean-reverting behavior of the largest losers either during the day or the night. Version 1 (day holding) buys the n worst-performing shares during the close-to-open period (decision period), with shares bought at the market open and sold at the market close, equally weighted. Version 2 (night holding) buys the n worst-performing shares during the open-to-close period (decision period). The benchmark strategy buys the n worst losers based on full-day returns. <br /><br /><h4>Strategy Performance</h4>For the S&P 600 small caps, the first two deciles (stocks with the largest decline during the decision period) produce high IRs and the last two negative IRs (a short strategy would work, which is not examined here). This holds true for both day and night strategies. There is a clear structure going from the top to the bottom deciles. Overreaction is not as strong for mid-cap stocks as it is for small caps, but the pattern is similar and the extreme deciles are profitable. <br /><br />The benchmark strategy (close-to-close decision period with the subsequent close-to-close as holding period) has been unprofitable for the small-cap, mid-cap and S&P 500 cross sections more recently. Version 1 and version 2 have been more profitable. <br /><br />Park (1995) claims that the profitability of the mean reversion strategy disappears once the average bid-ask price is used instead of the closing price, i.e. the most significant part of the close-to-close contrarian strategy is caused by the bid-ask bounce and is not achievable in practice. 
The two versions shown here are better than the benchmark (close-to-close) and hence this strategy is immune to bid-ask bounce.<br /><br /><h4>Multi-factor Models </h4>Style factors:<br /><ul><li>CAPM model by Sharpe (1964) - market returns.</li><li>Fama and French 3-factor model (1992) - Mkt, small-big, value-growth.</li><li>Adj. Carhart's 5 factor model (1997) - Mkt, small-big, value-growth, Momentum: High returns - low returns (M2 to M12), reversion: low returns - high returns (M1). </li></ul>$\alpha$ comes out positive in each case. The momentum factor turns out to be negative while the reversal factor comes out positive, as expected. <br /><br /><h2>Ch 7 - General Conclusions</h2>Two ways to improve trading results:<br /><ol><li>Using more data - higher frequency, bigger universe. Even including opening prices can be hugely beneficial. Getting the opening price and processing it instantly is a challenge. </li><li>Using advanced modeling - Kalman can be fast and efficient vs OLS. Factor-neutralizing the pairs ratio (not only industry neutral as done here) can further improve the results. Neural networks and SVMs can be used to predict the future direction of spreads instead of using a fixed standard deviation level for the spread entry specification. </li></ol>Delving more into model complexity, as opposed to data complexity, would be more beneficial.Mhttp://www.blogger.com/profile/04296475349435748706noreply@blogger.com0tag:blogger.com,1999:blog-3299057174197331019.post-4860547457766257602015-09-07T18:20:00.000-07:002015-10-06T17:16:17.679-07:00Option trading: Pricing and Volatility strategies and techniques - Euan SinclairTraders are pragmatic, interested in results. But <span style="color: #38761d;">focusing on the process can take care of the results</span>. Good traders learn this. But good traders are intellectually parsimonious due to the demands of trading. 
But more knowledge brings more adaptability in uncertain times.<br /><br />Derivatives traders don't need technical or fundamental analysis but sound knowledge of market structure and arbitrage relationships. Causality needs to be investigated in every model.<br /><br /><h2>Ch1. History</h2><div>Options are not a modern invention. They have a longer history than either stocks or bonds. Options are legal contracts, and hence subject to changes in the legal system (e.g. Dutch Tulips crash case in 1636). The South Sea bubble crash of 1720 involved a form of call options. The first exchange to list standardized contracts was the CBOE in 1973. The Black-Scholes-Merton model was published the same year. Options may well have been a tool in speculative bubbles, but were not the root cause. They are indispensable for modern risk management. </div><div><br /></div><h2>Ch2. Introduction to Options</h2><div>One must simply know all the details of the instrument's specifications. For example, the fact that FXP gives twice the negative of FXI's daily return does not mean that the compounded returns over a period of time will have the same relationship. Key words are: options, right not obligation, underlying, premium, maturity. Options can be created out of thin air, as long as there is the ability to collateralize them. They have nonlinear payoffs. </div><div><br /></div><h4>Specifications for an option contract</h4><div><ol><li>Option type - calls and puts. </li><li>Underlying asset - certain number of stocks, indices (times a multiple), futures.</li><li>Strike price - exercise price</li><li>Expiration date - last date on which the option exists.</li><li>Exercise style - American and European. Bermudan (on specific days).</li><li>Contract unit - multiplier. Need to be aware of the effects of corporate actions. </li></ol><h4>Uses of options</h4></div><div>Replication of options using the underlying is possible but expensive, so options are not redundant. 
The subtle difference between the option and the replicating portfolio in the underlying is where professional traders make money.<br /><br /><ol><li>Hedging - A position in the underlying can be protected from falls by buying a protective put. The presence of hedging activity shows the fallacy in methods that use the number of outstanding puts or calls to predict the direction of any underlying security.</li><li>Speculation - If we think stocks will fall we can buy a put. Out-of-the-money puts give greater leverage. </li><li>Creation of structured products - e.g. equity-linked notes. Investors are torn between fear and greed. Equity-linked notes are an ideal product: they promise the principal and give an upside if the index is above a certain percentage.</li><li>Volatility trading - A position in options and the underlying can be used to trade changes in volatility (and not direction or returns).</li><li>Structured product arbitrage - Many financial products contain option-like features, e.g. convertible bonds. These can be replicated, hedged against or speculated on using options.</li></ol><h4>Market structure</h4><div>An options trade can be placed with a broker after completing the securities account, options account and Options Clearing Corporation risk disclosure agreements. Market or limit orders for calls or puts can be placed with the details provided. The main exchanges in the US are Boston, Chicago Board, International Securities, NASDAQ Options, NYSE Alternext and Philadelphia. These markets are linked on a real-time basis. Ticks are generally either $0.05 or $0.01 in the US. There is also a private inter-dealer market called the 'call-around market'. The United States equity options market is served by a single clearing house, the Options Clearing Corporation (OCC), which the exchanges collectively own. The appropriate cash transfer happens the next business day. Transaction costs include broker and exchange commissions. Margins are of two types - strategy-based margin and portfolio-based margin. 
</div><div><br /></div><div></div><br /><h2>Ch3. Arbitrage Bounds for Option Prices</h2></div><div>The law of one price is behind these bounds. Sometimes what appears to be an arbitrage is merely a situation with larger than anticipated transaction costs, or unconsidered risk. The futures price of a stock is related to the spot price by $F=Se^{rT}$, where $r$ is the risk-free rate. This follows from the absence of arbitrage. Different borrowing and lending rates give a no-arbitrage band instead of a single value. Dividends and storage costs should be properly accommodated in the stock price. If interest rates are positively correlated with the underlying, the futures are slightly more valuable than the forwards.<br /><br />We can use this information to get bounds on options, which if violated can be exploited.<br /><br /><ol><li>American options are always at least as expensive as European ones, for both calls and puts. $c\le C$ and $p\le P$.</li><li>A call can't cost more than the underlying. $c\le S$.</li><li>A put can never be worth more than the strike price (discounted to the present for European). $P\le X$ and $p\le Xe^{-rt}$.</li><li>The minimum value of a call option is $c\ge S-Xe^{-rt}$. $C \ge Max(0,S-X)$.</li><li>The minimum value of a put option is $p \ge Xe^{-rt}-S$. $P \ge Max(0,X-S)$.</li></ol></div>Mhttp://www.blogger.com/profile/04296475349435748706noreply@blogger.com0tag:blogger.com,1999:blog-3299057174197331019.post-51290343067782252242015-08-23T08:36:00.001-07:002015-08-27T03:37:56.772-07:00Range-based estimation of Stochastic Volatility ModelsAlizadeh, Brandt and Diebold (2001)<br /><br />The authors show theoretically, numerically and empirically that the range is not only a highly efficient volatility proxy, but also approximately Gaussian and robust to microstructure noise. 
Two-factor models - one factor persistent and the other mean reverting - do a better job of describing simultaneously the high- and low-frequency dynamics of volatility, explaining both the autocorrelation of volatility and the volatility of volatility.<br /><br /><div>Volatility is not constant. It is both time-varying and <span style="color: #38761d;">predictable</span>. Gaussian quasi-maximum likelihood estimation (QMLE) for estimating stochastic volatility falls by the wayside because the usual volatility proxies - log absolute or squared returns - are non-Gaussian. The range - the difference between the highest and lowest log security prices - is a much more efficient estimator, due to its near normality.<br /><br /></div>Mhttp://www.blogger.com/profile/04296475349435748706noreply@blogger.com0tag:blogger.com,1999:blog-3299057174197331019.post-80030443823517629492015-08-16T07:40:00.000-07:002016-12-17T12:15:29.494-08:00Expected Returns on Major Asset Classes - Ilmanen (2012)<script type="text/x-mathjax-config"> MathJax.Hub.Config({tex2jax: {inlineMath: [['$','$'], ['\\(','\\)']]}}); </script> <script src="http://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML" type="text/javascript"> </script><span style="color: #38761d;">This article is written by Ilmanen from AQR. It is a very good summary of fundamental ideas (mostly smart beta) floating in different asset classes. </span><br /><br />Ch1 Introduction<br />Asset class expected returns and risk premia are time varying and somewhat predictable. 
<br /><br />Ch2 Equity Risk Premium<br />Ch3 Bond Risk Premium<br />Ch4 Credit Risk Premium<br />Ch5 Alternative Risk PremiumMhttp://www.blogger.com/profile/04296475349435748706noreply@blogger.com0tag:blogger.com,1999:blog-3299057174197331019.post-69625112205710776262015-08-15T12:58:00.001-07:002015-11-27T20:44:02.742-08:00Algorithmic Trading - Ernest Chan<script type="text/x-mathjax-config"> MathJax.Hub.Config({tex2jax: {inlineMath: [['$','$'], ['\\(','\\)']]}}); </script> <script src="http://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML" type="text/javascript"> </script><span style="color: #38761d;">Profits are not derived from some subtle, complicated cleverness of the strategy but from an intrinsic inefficiency in the market that is hidden in plain sight. </span><br /><br />Two kinds of strategies - <span style="color: red;">Mean reversion</span> (ADF test, Hurst exponent, variance ratio test, half-life, Johansen test) using linear, Bollinger band and Kalman filter methods - both temporal and cross-sectional. <span style="color: red;">Momentum</span> (roll returns, forced asset sales and purchases, news, sentiment, order flow). <span style="color: red;">Nuances</span> - data-snooping bias, survivorship bias, primary vs consolidated quotes, venue dependence of currency quotes, short-sale constraints, construction of futures continuous contracts, closing vs settlement prices, regime shifts. Kelly criterion for <span style="color: red;">risk management</span>, risk indicators, Monte Carlo simulations. <span style="color: red;">Lessons</span> - never manually override, under-leveraged is better, strategy performance mean reverts, overconfidence in a strategy is a poison pill.<br /><h2>Ch1 - Backtesting and automated execution</h2>Statistical significance of the numbers is important to establish. Regime shifts cannot be ignored; they need to be investigated. 
A good backtesting platform is the lifeblood of a productive endeavour. Common pitfalls:<br /><ol><li>Look-ahead bias - using future information to construct the signal. Trading and backtesting should be on the same platform.</li><li>Data-snooping bias - use out-of-sample testing and cross-validation; make the model as simple as possible with as few parameters as possible, assuming a simple Gaussian distribution and linear price prediction and allocation formulas, e.g. $rank_s=\sum_i^n sign(i) rank_s(i)$. Use a walk-forward test as the final, truly out-of-sample test. After all this, be happy if live trading generates a Sharpe ratio better than half its backtest value.</li><li>Stock splits and dividend adjustments - earnings.com and csidata.com</li><li>Survivorship bias in stock databases - get delisted stock data - kibot.com, tickdata.com, crsp.com</li><li>Primary vs consolidated stock prices - use tradable data. Be skeptical in a healthy way.</li><li>Venue dependence of currency quotes - get the data from the venue where it will be traded.</li><li>Short-sale constraint - hard-to-borrow stocks (huge short interest) should be modeled realistically. This data is broker dependent. The backtest will be inflated otherwise. </li><li>Futures continuous contracts - take care of roll jumps.</li><li>Futures close versus settlement prices - use synchronous prices, settlement is preferred.</li></ol><h4>Statistical significance of backtesting: Hypothesis testing</h4>Hypothesis testing: we compute the probability $p$ of observing a test statistic at least as large as the one measured, under the null hypothesis that the true value is zero. A very low $p$ rejects the null hypothesis and gives credence to the backtest number. 
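<br /><br />As a rough illustration of this tail probability under a Gaussian assumption, the sketch below computes the test statistic (per-period Sharpe ratio times $\sqrt{n}$) and its one-sided $p$-value. The function names and the sample data are made up for the example, and treating the returns as independent and normal is an assumption, not something a real backtest guarantees.

```python
import math

def normal_upper_tail(z):
    """P(Z >= z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def sharpe_significance(period_returns):
    """Return (test_statistic, p_value) for the null 'true mean return is zero'.

    Assumes independent, roughly Gaussian per-period returns.
    """
    n = len(period_returns)
    mean = sum(period_returns) / n
    var = sum((r - mean) ** 2 for r in period_returns) / (n - 1)
    sharpe = mean / math.sqrt(var)        # per-period, not annualized
    test_stat = sharpe * math.sqrt(n)
    return test_stat, normal_upper_tail(test_stat)

# Hypothetical backtest returns, only to show the call.
stat, p = sharpe_significance([0.01, -0.005] * 50)
print(stat, p)
```

A small $p$ here only says the mean return is unlikely to be zero under the Gaussian assumption; it says nothing about look-ahead or data-snooping bias in how the returns were produced.<br /><br />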
Three methods to evaluate the probability distribution for the statistical significance of a backtest on a finite sample size:<br /><br /><ol><li>Gaussian distribution: a Sharpe ratio ($\times \sqrt{n}$, with $n$ the number of periods in the backtest) of at least 2.326 means $p$ is at most 0.01.</li><li>Use Monte Carlo to generate simulated historical price data and feed them into our strategy to determine the empirical probability distribution of profits.</li><li>Generate sets of simulated trades, with the same number of long and short entry trades and the same average holding period as in the backtest, but distributed randomly in time. </li></ol><div>Failure to reject the null might inspire insights, while a successful rejection may be a slightly weaker proposition. Seriously flawed strategies:<br /><br /><ul><li>Annualized returns of 30% and Sharpe of 0.3 and drawdown duration of 2 years.</li><li>Strategy worse than buy and hold.</li><li>Survivorship biased dataset.</li><li>Neural net with 100 nodes with Sharpe 6.</li><li>High frequency strategies with high Sharpe, not taking into account market response.</li></ul><h4>Will a backtest be predictive of future returns?</h4></div><div>Regime shifts are important to detect, by observing the market and statistically if possible. </div><div><ul><li>Decimalization of US stocks in 2001. Profits of statistical arbitrage strategies decreased and profits of high frequency strategies increased.</li><li>The 2008 crisis decreased average daily trading volume and caused a subsequent decrease in average volatility, but with an increasing frequency of sudden outbursts. There was a general decrease in the profits of mean reverting strategies. A multi-year bear market in momentum strategies started as well.</li><li>2007 obsolescence of the NYSE block trade and removal of the old uptick rule for short sales. 
</li></ul><h2>Ch2 - The Basics of Mean Reversion</h2></div><div>Financial price series are geometric random walks; it is the returns that distribute around a mean of zero, but we cannot trade returns directly (though anti-serial correlation of returns, which is the same as mean reversion of prices, can be traded). We can manufacture many portfolio price series that are <span style="color: red;">mean-reverting</span> in prices (tested using the ADF test, the Hurst exponent and the variance ratio test), even though the price series of the individual components are not. This is called <span style="color: red;">cointegration</span>, and it can be tested using the CADF test and the Johansen test. This kind of mean reversion is called <span style="color: #38761d;">time-series mean-reversion</span>. The other type is <span style="color: #38761d;">cross-sectional mean reversion</span> (short-term relative returns of the instruments are serially anti-correlated).<br /><br />A <span style="color: blue;">mean-reverting series</span> is one where the change in the price series in the next period is proportional to the difference between the mean price and the current price. The ADF test tests whether we can reject the null hypothesis that the proportionality constant is zero. A <span style="color: blue;">stationary price-series</span> has a variance of log prices that increases more slowly than that of a geometric random walk, i.e. as a sublinear function of time. That is, the variance scales as $\tau^{2H}$, where $\tau$ is the time separating two price measurements and $H$ is the Hurst exponent; if $H$ is less than 0.5 the price series is stationary. The variance ratio test can be used to see whether we can reject the null hypothesis that the Hurst exponent is actually 0.5.<br /><br /><h4>ADF test</h4></div><div>If a price series is mean reverting, then if the price level is higher than the mean, the next move will be downward and vice versa. 
We can describe the price change dynamics via $$\Delta p_t = \lambda p_{t-1} + (\mu + \beta t) + \sum_{i=1}^{k} \alpha_i\Delta p_{t-i}+\epsilon_t,$$ where $\Delta p_t = p_t-p_{t-1}$, etc. The ADF test has the null hypothesis $\lambda=0$. If the null hypothesis can be rejected, the price series is not a random walk and mean reverts. Since we expect mean reversion, the test statistic $\lambda/SE(\lambda)$ has to be negative. Once the model is fit we can use the Durbin-Watson statistic (approximately $2(1-\rho)$, with $\rho$ the lag-1 autocorrelation of the residuals) to check whether the residuals are autocorrelated.<br /><br /><h4>Hurst exponent and variance ratio test</h4></div><div>A stationary price series means that the prices diffuse more slowly than a geometric random walk would. The variance for a time period $\tau$ is defined as $Var(\tau)=\langle|z(t+\tau)-z(t)|^2\rangle$, where $z=\log(p)$. For the geometric random walk we know this variance is $\sim\tau$, but for a mean reverting or trending process it is $\sim \tau^{2H}$, where $H$ is the Hurst exponent. $H=0.5$ for a random walk, $H>0.5$ for a trending series and $H<0.5$ for a mean reverting series. $H$ serves as an indicator of the degree of mean reversion or trendiness. The statistical significance of $H$ can be provided by the variance ratio test, which tests whether the following ratio is equal to 1. $$\frac{Var[z_t-z_{t-\tau}]}{\tau Var[z_t-z_{t-1}]}.$$<br /><br /><h4>Half life of mean reversion</h4></div><div>In practical trading we can be successful with less demanding tests. We just need $\lambda$ negative enough to make a trading strategy practical, even if we can't reject the null hypothesis. $\lambda$ is a measure of how long it takes for a price to mean revert. Converting the difference equation to a continuous form (ignoring the trend and the lagged differences) $$dp_t=(\lambda p_{t-1}+\mu)dt + d\epsilon,$$ which solves to $$E[p_t] = p_0 e^{\lambda t} - \frac{\mu}{\lambda}\left(1-e^{\lambda t}\right),$$ so the deviation from the long-run mean $-\mu/\lambda$ decays in proportion to $e^{\lambda t}$. The expected time for half decay is $-\log(2)/\lambda.$ Notice that $\lambda$ is negative. 
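<br /><br />A minimal sketch of the half-life estimate, assuming we simply regress $\Delta p_t$ on $p_{t-1}$ with an intercept (no trend or lagged differences, as in the continuous form above). The function name and the toy series are illustrative only.

```python
import math

def half_life(prices):
    """Estimate lambda by OLS of (p_t - p_{t-1}) on p_{t-1}; return (lambda, half_life)."""
    lagged = prices[:-1]
    delta = [b - a for a, b in zip(prices[:-1], prices[1:])]
    n = len(lagged)
    mean_x = sum(lagged) / n
    mean_y = sum(delta) / n
    sxx = sum((x - mean_x) ** 2 for x in lagged)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(lagged, delta))
    lam = sxy / sxx                        # slope of the regression
    if lam >= 0:
        raise ValueError("lambda is not negative: no mean reversion detected")
    return lam, -math.log(2.0) / lam

# Toy series that decays toward 100 by 10 percent of the gap per step.
toy = [100.0 + 10.0 * 0.9 ** t for t in range(50)]
print(half_life(toy))
```

The returned half-life is in units of the bar length of the input series (days for daily bars).<br /><br />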
The half-life determines the natural lookback period of our strategy as well: using some small multiple of the half-life avoids brute-force optimization of the lookback period.<br /><br /><h4>A linear Mean-reverting trading strategy</h4></div><div>Once the tests confirm mean reversion, and the half-life is appropriate for our expected holding period, we determine the normalized deviation of the price from its moving average (with lookback period equal to the half-life) and maintain a number of units of the asset negatively proportional to this normalized deviation. Given a price series that passed the stationarity statistical tests, or at least one with a short enough half-life, we can be assured that we can eventually find a profitable trading strategy, maybe just not the one that we had backtested.<br /><br /><h3>Cointegration</h3></div><div>We can proactively create a portfolio of individual price series so that the market value series of this portfolio is stationary. This is the notion of cointegration. The most common combination is a pair.</div><div><br /></div><h4>Cointegrated Augmented Dickey-Fuller Test (CADF) </h4><div>Since we do not know a priori what hedge ratio to use to combine the pair, the usual mean reversion tests cannot be applied directly. Using the Engle and Granger (1987) process we first determine the optimal hedge ratio by running a linear regression fit between the two price series, use this hedge ratio to form a portfolio, and then finally run a stationarity test on this portfolio. The order of the price series will change the hedge ratio (the two ratios will not be exact reciprocals). Generally only one of those ratios is correct: the one that yields the most negative t-statistic.<br /><br /><h4>Johansen Test</h4></div><div>For more than two variables we need to use the Johansen test. We first present the matrix form of the equation $$\Delta P_t = \Gamma P_{t-1} + M + \sum_{i=1}^{k}A_i \Delta P_{t-i}+\pmb{\epsilon}_t. $$ If $\Gamma=\pmb{0}$ (or each eigenvalue is 0), we do not have cointegration. 
If the rank of $\Gamma$ is $r$ and the number of price series is $n$, then the number of independent portfolios that can be formed by various linear combinations of the cointegrating price series is equal to $r$. The Johansen test does the analysis based on the trace statistic and the eigenvalue statistic. The null hypothesis $r=0$ (no cointegrating relationship) should be rejected to find mean reversion, followed by $r\le 1$, ..., up to $r \le n-1$. If all these hypotheses are rejected, then we have $r=n$. The eigenvectors found can be used as our hedge ratios for the individual price series to form a stationary portfolio. <br /><br />Unlike the CADF test, whose hedge ratio depends on the order of the price series (the two ratios are not generally reciprocal), the Johansen test is independent of the order of the price series. The eigenvector with the largest eigenvalue gives the strongest cointegrating relationship and has the shortest half-life for mean reversion.<br /><br /><h4>Linear Mean-reverting trading on a portfolio</h4></div><div>We determine the portfolio vector and accumulate units of the portfolio proportional to the negative z-score of the 'unit' portfolio's price, determined by the Johansen eigenvector. In practice, of course, we cannot really enter and exit an infinitesimal number of shares whenever the price moves by an infinitesimal amount. To avoid data snooping, we should determine the Johansen vector in a moving-window fashion (unlike in the book). The lookback can be the half-life. The shorter the half-life, the more significant the results.<br /><br /><h4>Pros and cons of mean-reverting strategies</h4></div><div>Portfolio trading is the most profitable and offers the most opportunities. There is also often a good fundamental story behind a mean-reverting pair. The Canadian and Australian markets are cointegrated because both are commodity-based economies. GDX and GLD cointegrate because the value of gold-mining companies is very much based on the value of gold. Even when a cointegrating pair falls apart, we can often understand the reason. 
And with understanding comes remedy. This availability of fundamental reasoning is in contrast to many momentum strategies whose only justification is that there are investors who are slower than we are in reacting to the news, i.e. there are greater fools out there. Mean-reverting strategies also span a great variety of time scales. </div><div><br /></div><div>Unfortunately, the seemingly high consistency of mean-reverting strategies is exactly what may lead to their sudden breakdown. This often happens when the leverage is at its maximum after an unbroken string of successes. Hence, risk management is particularly important and difficult, since the usual stop losses cannot be logically deployed. </div><div><br /></div><h2>Ch3 - Implementing Mean Reversion Strategies</h2><div>In practice, we do not necessarily need true stationarity or cointegration in order to implement a successful mean reversion strategy. We can capture short-term or seasonal mean reversion (during a specific period of the day or under specific conditions), and liquidate our positions before the prices go to their next equilibrium level. Conversely, not all stationary series will lead to great profits, particularly when the half-life is long. A more practical version of the implementation uses Bollinger bands. A Kalman filter can be used for better estimation of the hedge ratio.<br /><br /><h4>Trading pairs using price spreads, log price spreads, or ratios</h4></div><div>Consider the 'unit' portfolio time series $y$ as the trading signal, which is just a weighted sum of prices. Instead of prices we could also use log prices (with different estimated coefficients, of course, and only if the result is stationary). Unlike with prices, the coefficients on log prices do not represent numbers of shares of the constituents. To understand the log price relationship we take a time difference of this equation. 
The difference of this series gives a linear combination of returns.<br /><br />A price-based portfolio's constants represent numbers of shares, while a return-based series' constants represent the market values of the assets, together with a cash component implicitly included. Note that a cash component must be implicitly included because the constants are the market values, and there is no other way that the market value of the portfolio can change with time. This cash does not show up in the difference equation because it is constant from $t-1$ to $t$ and is rebalanced then. This requires the trader to constantly rebalance the portfolio, which is necessitated by using the log of prices.<br /><br />The ratio $p_1/p_2$ does not necessarily form a stationary series, but may have an advantage when the underlying pair is not truly cointegrating but short-term mean reversion is present. This also keeps the hedge ratio at 1. This may come in handy in currency trading, where ratios of currency pairs may have a real meaning.<br /><br /><h4>Bollinger Bands</h4></div><div>The linear strategy deployed till now is not practical as it does not limit the deployed capital. Bollinger bands can be used to set the entry and exit Z-scores. The performance of the example improves but additional parameters are introduced.<br /><br /><h4>Does Scaling-in work?</h4></div><div>Scaling-in (averaging-in) is the idea that one invests more as the price deviates more from the mean (assuming mean reversion happens). This is what a linear mean-reversion strategy does. It reduces price impact and can make profits even when the price never reverts to its mean. Multiple entries and exits using Bollinger bands can mimic this. Schoenberg and Corwin (2010) show that entering or exiting at two or more Bollinger bands is never optimal, with the implicit assumption that the probabilities of price changes are constant, which is not the case. 
Practically, scaling in may well outperform the all-in method out-of-sample.</div><div><br /></div><h4>Kalman Filter as Dynamic Linear Regression</h4><div>What is the best way to estimate the hedge ratio when it can vary with time? A moving-window regression can show ghost effects as data points enter and drop out of the window. Exponential weighting (EWM) can improve this, but it is not clear whether it is optimal. The Kalman filter is an optimal linear algorithm that updates the expected value of a hidden variable based on the latest value of an observable variable. If we assume the noises are Gaussian and the relationships are linear, it is the best filter available. We need to identify the observable and hidden variables and the observation and state transition model matrices. In the measurement equation $$y_t=x_t \beta_t + \epsilon_t,$$ $y_t$ is the observable price, and $x_t$, the other price series augmented with ones (an $N\times 2$ matrix), is the observation model matrix. $\beta_t$ is the $2\times 1$ hidden variable denoting both the slope and the intercept. $V_{\epsilon}$ is the variance of the Gaussian noise $\epsilon_t$. Next we make the crucial assumption that the regression coefficient at time $t$ is the same as that at time $t-1$ plus noise $$\beta_t=\beta_{t-1}+\omega_{t-1},$$ where $\omega$ is also a Gaussian noise with covariance $V_{\omega}$, i.e. the state transition model here is just the identity matrix.<br /><br />The Kalman filter can now iteratively generate the expected value of the hidden variable $\beta$ given an observation at time $t$, which includes not only the dynamic hedge ratio between the two assets, but also the 'moving average' of the spread. It also generates an estimate of the standard deviation of the forecast error of the observable variable, which can be used in place of the moving standard deviation of the Bollinger band! 
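<br /><br />The predict/update recursion for this dynamic regression can be sketched in plain Python for the two-coefficient case. This is only a sketch under the assumptions stated in this section (random-walk coefficients, Gaussian noise); the function name, the default noise settings and the hand-written 2x2 algebra are illustrative choices, not the book's code.

```python
def kalman_hedge_ratio(x, y, delta=0.0001, v_eps=0.001):
    """Track y_t = slope_t * x_t + intercept_t with random-walk coefficients.

    Returns one tuple per observation:
    (slope, intercept, forecast_error, forecast_error_variance).
    """
    v_w = delta / (1.0 - delta)            # state noise variance on each coefficient
    beta = [0.0, 0.0]                      # [slope, intercept], started at zero
    R = [[0.0, 0.0], [0.0, 0.0]]           # covariance of the coefficient estimates
    out = []
    for xt, yt in zip(x, y):
        obs = (xt, 1.0)                    # observation row: price and a constant
        # Predict: coefficients unchanged, covariance grows by the state noise.
        R = [[R[0][0] + v_w, R[0][1]], [R[1][0], R[1][1] + v_w]]
        y_hat = obs[0] * beta[0] + obs[1] * beta[1]
        Rx = [R[0][0] * obs[0] + R[0][1] * obs[1],
              R[1][0] * obs[0] + R[1][1] * obs[1]]
        Q = obs[0] * Rx[0] + obs[1] * Rx[1] + v_eps   # forecast error variance
        err = yt - y_hat                              # forecast error
        # Update: move the coefficients along the Kalman gain.
        K = [Rx[0] / Q, Rx[1] / Q]
        beta = [beta[0] + K[0] * err, beta[1] + K[1] * err]
        R = [[R[i][j] - K[i] * Rx[j] for j in range(2)] for i in range(2)]
        out.append((beta[0], beta[1], err, Q))
    return out
```

In a pairs strategy, the forecast error plays the role of the deviation from the moving mean and the square root of its variance plays the role of the Bollinger band width.<br /><br />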
We also need to specify $V_{\epsilon}$ and $V_{\omega}$.<br /><br />Let $R(t|t-1)=Cov(\beta_t-\hat{\beta}(t|t-1))$ measure the covariance of the error of the hidden variable estimates. Given the quantities $\hat{\beta}(t-1|t-1)$ and $R(t-1|t-1)$ at time $t-1$, the prediction steps are $$\hat{\beta}(t|t-1)=\hat{\beta}(t-1|t-1) \quad (\mbox{State prediction})$$ $$R(t|t-1)=R(t-1|t-1)+V_{\omega} \quad (\mbox{State Covariance prediction})$$ $$\hat{y}(t)=x(t)\hat{\beta}(t|t-1)\quad (\mbox{Measurement prediction})$$ $$Q(t)=x(t)^TR(t|t-1)x(t)+V_{\epsilon}\quad (\mbox{Measurement Variance prediction}),$$ where $\epsilon(t)=y(t)-x(t)\hat{\beta}(t|t-1)$ is the forecast error for $y(t)$ given observations at $t-1$, and $Q(t)$ is $Var(\epsilon(t))$, measuring the variance of the forecast error. After observing the measurement at time $t$, the Kalman filter update equations are $$\hat{\beta}(t|t)=\hat{\beta}(t|t-1)+K(t)\epsilon(t)\quad (\mbox{State update})$$ $$R(t|t)=R(t|t-1)-K(t)x(t)R(t|t-1)\quad (\mbox{State Covariance update}),$$ where $K(t)$ is the Kalman gain given by $$K(t)=R(t|t-1)x(t)/Q(t).$$ To start the recursions, we assume $\hat{\beta}(1|0)=0$ and $R(0|0)=0.$ $V_{\omega}$ and $V_{\epsilon}$ need to be provided or estimated from data (Rajamani and Rawlings 2009). Following Montana we assume $V_{\omega}=\frac{\delta}{1-\delta}I$, where $\delta$ is between 0 and 1. If $\delta=0$ it becomes an OLS regression, while with $\delta=1$ $\beta$ will fluctuate wildly. The optimal values can be obtained from training data; we pick $\delta=0.0001$ and $V_{\epsilon}=0.001.$<br /><br /><h4>Kalman Filter as market-making model</h4></div><div>We are concerned here with a single mean-reverting price series, intending to find its mean price and standard deviation. This is a favorite model for market makers to update their estimate of the mean price of an asset (Sinclair 2010). So the mean price $m_t$ is the hidden variable and the price $y_t$ is the observable variable. 
$$y_t = m_t + \epsilon_t \quad (\mbox{Measurement equation})$$ $$m_t=m_{t-1}+\omega_{t-1}\quad (\mbox{State equation})$$ $$m(t|t)=m(t|t-1)+K(t)(y(t)-m(t|t-1))\quad (\mbox{State update})$$ The variance of the forecast is $$Q(t)=Var(m(t))+V_{\epsilon},$$ the Kalman gain is $$K(t)=R(t|t-1)/(R(t|t-1)+V_{\epsilon}),$$ and $$R(t|t)=(1-K(t))R(t|t-1)\quad (\mbox{State Variance update}).$$ To make these equations more practical, practitioners make further assumptions about the measurement error $V_{\epsilon}$, which measures the uncertainty in the observed transaction price. <i>If the trade size is large the uncertainty is small, and vice versa</i>. So $V_{\epsilon}$ becomes a time dependent function, specifically of the trade size $T$: $$V_{\epsilon}=R(t|t-1)\left( \frac{T}{T_{max}}-1\right)$$<br />If $T=T_{max}$ there is no uncertainty, the Kalman gain is 1 and the mean price estimate is exactly equal to the observed price! $T_{max}$ can be some fraction of the total trading volume of the previous day. This is similar to the volume-weighted average price (VWAP) approach to determining the mean price or fair value, but with time weighting as well.<br /><br /><h4>The danger of data errors</h4></div><div>Data errors are particularly insidious in both backtesting and executing mean-reverting strategies. 'Outliers' inflate the backtest of a mean-reversion strategy (Thomas Falkenberry 2002). But they suppress the backtest performance of a momentum strategy. In live trading they produce wrong trades for both strategies. </div><div><br /></div><div><h2>Ch4 - Mean Reversion of Stocks and ETFs</h2></div><div>Stocks from the same sector are good candidates for forming pairs, and diversification is easy. Simple mean-reverting strategies actually work better for ETF pairs and triplets than for stocks. In the short term (seasonally), most stocks exhibit mean-reverting properties under normal circumstances (when there isn't any news on the stock). Over the long term stock prices follow a geometric random walk. 
Index arbitrage is another familiar mean-reverting strategy: stocks vs futures, stocks vs ETFs. Profits have decreased, so the strategy has to be modified. Cross-sectional mean reversion is prevalent in baskets of stocks. The statistical tests for time series mean reversion are largely irrelevant for cross-sectional mean reversion. Because mean reversion is so attractive and so easy to find, its profits have decreased.<br /><br /><h4>The difficulties of trading stock pairs</h4></div><div>Daily-frequency mean reversion is a thing of the past. The intraday and seasonal mean-reverting properties are still exploitable. Out-of-sample cointegration is difficult to find. It is difficult to consistently make profits in mean reversion unless one has a fundamental understanding of each of the companies and can exit a position in time before bad news on one of them becomes public. The law of large numbers does not come to the rescue (due to lack of independence) because the small profits gained by the 'good' pairs are completely overwhelmed by the large losses of the pairs that have gone 'bad'. Further, there are short-sale constraints resulting in <i>short squeezes</i>. The new alternative uptick rule also creates uncertainty in both backtesting and live trading. Once the circuit breaker is triggered, we are essentially forbidden to send short market orders. Since profits have decreased it becomes imperative to enter and exit positions intraday to capture the best prices. Avoiding overnight positions also avoids the changes in fundamental valuations that plague longer-term positions. The bid-ask spread and size have become very small due to the prevalence of dark pools, icebergs, high frequency trading and the decimalization of US stock prices. So pairs traders, who act as a type of market maker, find that their market-making profits have decreased as well. 
But other countries and US ETFs are still profitable.</div><div><br /></div><h4>Trading ETF Pairs (and Triplets) </h4><div>Once found to be cointegrating, ETF pairs are less likely to fall apart in out-of-sample data (vs stocks), because the fundamental economics of a basket changes more slowly than that of a single company (e.g. EWA-EWC, the Australian and Canadian ETFs). We need to find ETFs that are exposed to common economic factors, e.g. country ETFs, sector ETFs (retail fund RTH vs consumer staples fund XLP). </div><div><br /></div><div>Another kind of ETF pair is between a commodity ETF and an ETF of companies that produce that commodity, e.g. GLD vs GDX. They cointegrated until 2008, after which the oil shock made oil a big part of mining expenses. Introducing USO to form a triplet, we find them cointegrated again. Oil fund USO and energy sector fund XLE do not cointegrate because USO tracks oil futures and not spot oil. Mean reversion trading of such pairs would be much less risky if the commodity fund held the actual commodity rather than the futures. </div><div><br /></div><h4>Intraday Mean Reversion: Buy-on-Gap Model</h4><div>Daily prices are indeed geometric random walks. But there are many seasonal mean reversions occurring at intraday time frames, even for stocks.<br /><br />Select all stocks near the market open whose returns from their previous day's low to today's open are lower than minus 1 standard deviation, where the standard deviation is based on the daily close-to-close returns of the past 90 days. These are the 'gapped down' stocks. Apply a <span style="color: red;">momentum filter</span> by requiring their open prices to be higher than the 20-day moving average of the closing prices. Buy the top 10 stocks in this list and liquidate the position at the end of the day. A short strategy can be constructed similarly. The rationale is that for an up-trending stock, if the stock is down before the open, panic selling will depress it further but it will appreciate over the course of the day. 
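<br /><br />The selection rule above can be sketched as follows. The data layout (a dict of daily `(low, close)` bars per symbol), the function name, and the choice to rank the 'top 10' by most negative gap return are assumptions made for this illustration; the notes do not spell out the ranking.

```python
import math

def buy_on_gap_candidates(history, today_open, entry_z=1.0, top_n=10,
                          ret_lookback=90, ma_lookback=20):
    """Return up to top_n symbols that gapped down but sit above their moving average.

    history: dict symbol -> list of (low, close) daily bars, oldest first, up to yesterday.
    today_open: dict symbol -> today's opening price.
    """
    picks = []
    for sym, bars in history.items():
        closes = [close for _, close in bars]
        if len(closes) < max(ret_lookback + 1, ma_lookback):
            continue                       # not enough history for the statistics
        first = len(closes) - ret_lookback
        rets = [closes[i] / closes[i - 1] - 1.0 for i in range(first, len(closes))]
        mean = sum(rets) / len(rets)
        std = math.sqrt(sum((r - mean) ** 2 for r in rets) / (len(rets) - 1))
        prev_low = bars[-1][0]
        open_px = today_open[sym]
        gap_ret = open_px / prev_low - 1.0
        moving_avg = sum(closes[-ma_lookback:]) / ma_lookback
        # Gap-down filter plus the momentum filter described in the text.
        if gap_ret < -entry_z * std and open_px > moving_avg:
            picks.append((gap_ret, sym))
    picks.sort()                           # most negative gap first (assumed ranking)
    return [sym for _, sym in picks[:top_n]]
```

Positions picked this way would be entered near the open and liquidated at the close of the same day.<br /><br />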
Usually, a stock that has dropped a little bit has a better chance of reversal than one that has dropped a lot, because the latter drops are often due to negative news, which is permanent and less likely to revert. The fact that a stock is higher than its long-term moving average attracts selling pressure from larger players with longer horizons. This demand for liquidity at the open may exaggerate the downward pressure on the price, but liquidity-driven moves are more likely to revert when such demand vanishes. The long-only strategy may present some risk management challenges and have low capacity.<br /><br />For a realistic backtest one can use pre-open prices (e.g. at ARCA) to determine the trading signals. Also, execution at exactly the open price cannot be guaranteed. This induces signal noise. Intraday data can be used for more realistic numbers. Primary exchange prices should be used rather than consolidated prices. Short-sale strategies suffer from the short-sale constraint pitfall. This strategy is well known among traders and there are many variations on the same theme. A hedged version can be traded which is long the stocks but short the index futures. Sector restrictions can be applied. The buying period can be extended beyond the market open. Intraday profit caps can be imposed. The lesson is: price series that do not exhibit mean reversion when sampled with daily bars can still exhibit strong mean reversion during specific periods. This is conditional seasonality at work at a shorter time scale.<br /><br /><h4>Arbitrage between an ETF and its component stocks</h4></div><div>Index arbitrage trades on the difference in value between a portfolio of stocks constituting the index and the futures on that index. If the stocks are weighted the same as in the index construction, the cointegration is too tight to be exploited. Sophisticated traders can still profit by trading intraday, at high frequency. 
In order to increase these differences, we can select only a subset of the stocks in the index to form the portfolio. The same idea can be applied to an ETF and its constituents. One selection method is to just pick all the stocks that cointegrate individually with the ETF with 90 percent probability using the Johansen test. Then we form a portfolio of these stocks with equal weights. We <i>reconfirm</i> using the Johansen test that this long-only portfolio still cointegrates with the ETF (SPY e.g.). We are using log prices, so the weights are the capital on each stock, as we expect to rebalance every day. After the cointegration is confirmed in-sample, we can backtest the linear mean reversion strategy. We can't test all the stocks and the index together via the Johansen test because the test can take only a limited number of symbols and would admit long-short positions, which we may not want because they may double-short some stocks, increasing specific risk.<br /><br />Another method of constructing a long-only portfolio is to first test each stock vs the index using the Johansen test. The resulting subset is then fed to a constrained optimization method (e.g. genetic algorithm or simulated annealing) to minimize the average absolute difference between the stock portfolio price and the index price series. The variables of optimization are the hedge ratios, with the constraint that all weights are positive. The short-sale constraint is less harmful here as there is enough diversification.<br /><br /><h4>Cross-sectional mean reversion: A linear long-short model</h4></div><div>In this type of so-called "cross-sectional" mean reversion strategy, the short-term returns of individual stocks revert relative to the average return of the basket. These strategies generally don't work for futures and currencies. We rely on the serial anti-correlation of these relative returns to generate profits. We expect the under-performers to outperform and vice versa. We should not expect profits from each stock, as some may serve as hedges. 
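<br /><br />A minimal sketch of such a linear long-short weighting: each stock is weighted by the negative of its return relative to the universe average, scaled so that the absolute weights sum to one. The function name and the toy returns are made up for the example.

```python
def cross_sectional_weights(returns):
    """Weights proportional to minus (return - average return), gross exposure 1."""
    n = len(returns)
    avg = sum(returns) / n
    deviations = [r - avg for r in returns]
    gross = sum(abs(d) for d in deviations)
    if gross == 0:
        return [0.0] * n                   # all returns equal: nothing to trade
    return [-d / gross for d in deviations]

# Yesterday's winner is shorted, yesterday's loser is bought.
print(cross_sectional_weights([0.02, 0.0, -0.02]))
```

By construction the weights sum to zero, so the portfolio is dollar neutral.<br /><br />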
Proposed by Khandani and Lo (2007), the weights are $$w_i=-\frac{(r_i-\langle r_j\rangle)}{\sum_k |r_k-\langle r_j\rangle|},$$ where $r_i$ is the daily return of the $i^{th}$ stock and $\langle r_j\rangle$ is the average daily return of all the stocks in the index. We rebalance every day so that the gross market value is one dollar. Usually backtesting on a smaller-cap universe will generate even higher returns.<br /><br />The return of this strategy can be enhanced by using the returns from the previous close to today's open to determine the weights for entry at the open. All the positions will be liquidated at the market close, thus turning it into an intraday strategy. This open-to-close strategy will have double the transaction cost and will have signal noise like the buy-on-gap model.<br /><br />There are possibly other factors that are better at predicting cross-sectional mean reversion of stock prices than the relative returns that we have used. One popular variable is the price-earnings ratio from the last quarter, or maybe projected earnings estimated by analysts or by the companies themselves. If price moves are justified by fundamentals, mean reversion will not occur, and we should avoid shorting such stocks if we use the P/E ratio to rank the stocks.<br /><br /><h2>Ch5 - Mean Reversion of Currencies and Futures</h2></div><div>Most CTAs are momentum based. Most currency and futures pairs will not cointegrate, and most portfolios of currencies or futures do not exhibit cross-sectional mean reversion. Mean reversion opportunities are limited but not non-existent, e.g. futures calendar spreads and volatility futures vs stock index futures. A currency portfolio must be valued in the same base currency and rollover interest must be accounted for.</div><div><br /></div><h4>Trading Currency cross-rates</h4><h4><span style="font-weight: normal;">Commodity currencies (Australian dollar, Canadian dollar, South African Rand, Norwegian Krone) may be cointegrated. 
</span> <span style="font-weight: normal;">Liquidity in currencies is higher than in the corresponding stock index ETFs. Higher leverage can be employed. There are no short-sale constraints. And currencies trade around the clock, so we can employ stop losses in a meaningful way (if a market is closed for a long period stop losses are useless, as the market can gap up or down when it reopens). </span></h4><h4><ol><li><span style="font-weight: normal;">In the pair AUD.ZAR, AUD is the base currency and ZAR is the quote currency.</span></li><li><span style="font-weight: normal;">A quote of 5 for AUD.ZAR means it takes 5 ZAR to buy 1 AUD. </span></li><li><span style="font-weight: normal;">Buying 100 AUD.ZAR means buying 100 AUD and selling 500 ZAR. </span></li><li><span style="font-weight: normal;">Very few brokers offer AUD.ZAR as a cross-rate. So we have to buy 100 USD.ZAR and sell 100 USD.AUD to effectively buy 100 AUD.ZAR</span></li><li><span style="font-weight: normal;">USD.ZAR/USD.AUD is the synthetic pair for AUD.ZAR</span></li></ol><div><span style="font-weight: normal;">The non-local currencies should be regularly converted to the base currency USD to remove currency risk. In order to interpret the eigenvector from the Johansen test as capital weights, <span style="color: #38761d;">the two price series must have the same quote currency</span>. The trades can then be placed appropriately for the right order of each currency pair. If we find cointegration between two entirely different cross-rates, care should be taken to calculate the returns correctly. 
The key step in backtesting currency arbitrage strategies is not the complexity of the strategies, but the right way to prepare the data series for cointegration tests, and the right formula to measure returns!</span></div><div><span style="font-weight: normal;"><br /></span></div><div><h4>Rollover interests in currency strategy</h4></div><div><span style="font-weight: normal;">A feature of trading currency cross-rates is the differential interest rate earned or paid if the cross-rate position is held overnight (till or beyond 5 PM ET). If we are long the currency pair B.Q, the interest we earn is $i_B-i_Q$. When $i_Q>i_B$ we pay this interest, and it is called the rollover interest. With $T+2$ settlement, if day $T+3$ is a weekend or holiday for either currency, the extra days are added to the interest, so a position held past 5 PM ET on Wednesday accrues the weekend's rollover interest as well. USD.CAD and USD.MXN settle at $T+1$, so for them it is a position held past 5 PM ET on Thursday that accrues the weekend interest. </span></div><div><span style="font-weight: normal;"><br /></span></div><div><span style="font-weight: normal;">For intraday positions the rollover interest is zero. For a long-short dollar-neutral equity portfolio or a futures position, the financing cost is taken to be zero. In the case of currency cross-rates, we should add the rollover interest to the percentage change of the cross-rate, i.e. we need to modify the return calculation for cross-rate strategies. </span><br /><span style="font-weight: normal;"><br /></span><br /><h4>Trading Futures Calendar Spread</h4></div><div><span style="font-weight: normal;">In reality, calendar spreads do not generally mean-revert. To understand why, we need to understand the drivers of the returns of futures in general. Roll returns and spot returns constitute the total returns of a future. An ETF of commodity producers (XLE) may cointegrate with the spot prices but not with the futures prices because of the presence of the roll returns. 
If we assume that spot and roll returns are truly constant throughout time ($F(t,T)=Ce^{\alpha t}e^{\gamma (t-T)}$), we can use linear regression to estimate their values. The spot return $\alpha$ can be found by regressing the (log) spot price directly on time, but to find the roll return $\gamma$ we need to regress the (log) futures price on the time to maturity. The fit will be different for different maturities and hence $\gamma$ will vary with time. In general, average roll returns are much larger in magnitude than spot returns.</span><br /><span style="font-weight: normal;"><br /></span><span style="font-weight: normal;">Volatility, and hence the VIX index, is mean reverting. But the VX futures are not; they just inexorably decline, all due to roll return. This roll return has been mostly negative. If we define the spread of log prices of the two legs and maintain the market value of the two legs to be the same at every period, it turns out to be $\gamma(T_1-T_2)$, which is simply the roll return and is independent of the spot price. Hence, the spread captures returns, but only the roll-return component. This log spread series mean-reverts. </span><br /><span style="font-weight: normal;"><br /></span><br /><h4>Do calendar spreads mean-revert?</h4></div><div><span style="font-weight: normal;">We may expect the calendar spread components to be cointegrated and hence mean-revert, but in reality roll returns derail our intuition. The difference of log prices of the two legs, maintaining the market value of the two legs to be the same at every period, is $\gamma(T_1-T_2)$, which is simply the roll return and is independent of the spot. Hence, we are considering only the roll-return part of the total returns. The log spread series indeed mean-reverts. We apply the strategy with a holding period of 61 trading days for each pair, rolling 10 days before expiration, with the two contracts 1 year apart. We get an IR of 1.3 for CL. 
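<br /><br />As a short worked check of the statement above (a sketch that assumes the constant-return model $F(t,T)=Ce^{\alpha t}e^{\gamma (t-T)}$ holds exactly, and that assumes $T_1$ labels the near contract and $T_2$ the far one, which these notes do not state), taking logs of the two legs gives $$\log F(t,T_2)-\log F(t,T_1)=\big[\log C+\alpha t+\gamma(t-T_2)\big]-\big[\log C+\alpha t+\gamma(t-T_1)\big]=\gamma(T_1-T_2).$$ The constant $C$ and the spot term $\alpha t$ cancel, so under these assumptions the log spread depends only on the roll return $\gamma$ and the gap between the two expirations. 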
</span><br /><span style="font-weight: normal;"><br /></span><span style="font-weight: normal;">Seasonality is often a prominent feature of commodities. For a particular market, only calendar spreads of certain months (and a certain number of months apart) mean-revert. The same spread mean-reversion strategy can be applied to VIX calendar spreads, but it does not work! However, the back/front ratio for VX is mean reverting (verified by the ADF test), but only from 2008, when a regime change happened. This gives an IR of 1.5 from Oct 2008 to Apr 2012.</span><br /><span style="font-weight: normal;"><br /></span><br /><h4>Futures Inter-market Spreads</h4></div><div><span style="font-weight: normal;">It is almost impossible to find futures with different underlyings whose spread is mean reverting. The prices need to be synchronous for the mean reversion to be tested (taking care of multipliers). The possible candidates are </span></div><div><ul><li><span style="font-weight: normal;">Crack spread - the 3:2:1 ratio does not have a mean reverting behavior, fails ADF</span></li><li><span style="font-weight: normal;">CL:BZ - fails ADF. BZ has outperformed CL due to the increase in production in the US, the pipeline bottleneck at Cushing and geopolitical concerns like the Iranian embargo, which affected Europe and hence BZ more than the US.</span></li><li><span style="font-weight: normal;">basket of CL, BZ, RB and HO</span></li></ul><h4>Volatility Futures versus Equity Index Futures</h4></div><div><span style="font-weight: normal;">Volatility is anti-correlated with the equity market index: when the market goes down, volatility shoots up, and to a lesser extent, vice versa. There appear to be two regimes in a plot of VX vs ES futures - before 2008 and after 2008 (low vol but with greater range!). Applying linear regression or the Johansen test to a mixture of both regimes would not be correct. Hence, we apply the Engle-Granger procedure to post-2008 data, after multiplying the prices by the contract multipliers. The IR is 1.4 for the two years 2010 to 2012. 
There is also a VX-ES momentum strategy discussed in the next chapter.</span><br /><span style="font-weight: normal;"><br /></span><br /><h2>Ch6 - Interday Momentum Strategies</h2></div><div><span style="font-weight: normal;">Causes of momentum - persistence of roll returns, particularly of their sign (futures), slow diffusion of news, forced sales or purchases by funds, market manipulation by high frequency traders. There are time-series and cross-sectional versions. Interday momentum suffers from a recently discovered weakness, by which intraday momentum is less affected.</span><br /><span style="font-weight: normal;"><br /></span><br /><h4>Tests for Time series Momentum</h4></div><div><span style="font-weight: normal;">Time series momentum means that past returns are positively correlated with future returns. The correlation between past n-day and future m-day returns can be tested. Alternatively, the correlation of the signs of the returns can be checked. For long term trends we can use the Hurst exponent or the variance ratio test to check whether the random walk hypothesis can be rejected. We present numbers for TU, the two year treasury future.</span><br /><span style="font-weight: normal;"><br /></span><span style="font-weight: normal;">In computing the correlations of pairs of returns, we must take care not to use overlapping data. If the look-back is greater than the holding period, we have to shift forward by the holding period to generate a new returns pair. If the holding period is greater than the look-back, we have to shift forward by the look-back period. With a one day shift the t-stats would be artificially high, but the correlations would still be correct. Hence, to estimate the right p-value, non-overlapping windows are essential. The combinations of interest are a look-back of 60 or 250 days with a holding period of 10 to 25 days. The Hurst exponent is 0.44, yet the variance ratio test does not reject the random walk hypothesis - the time series exhibits momentum and mean reversion at different time frames. 
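<br /><br />The shifting rule described above can be sketched in Python. This is a minimal illustration, not the book's code: the helper names <code>pearson</code> and <code>lookback_holding_correlation</code> and the synthetic sine-plus-trend price path are assumptions made for the example. Sample dates are spaced by the shorter of the look-back and the holding period, as the paragraph describes, and the function also reports how many return pairs went into the correlation, since that count is what drives the p-value.

```python
import math


def pearson(xs, ys):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def lookback_holding_correlation(prices, lookback, holddays):
    """Correlate past `lookback`-day returns with future `holddays`-day returns.

    Hypothetical helper: sample dates are spaced by the shorter of the two
    periods, following the shifting rule described in the notes.
    Returns (correlation, number of return pairs).
    """
    step = min(lookback, holddays)
    past, future = [], []
    for t in range(lookback, len(prices) - holddays, step):
        past.append(prices[t] / prices[t - lookback] - 1.0)
        future.append(prices[t + holddays] / prices[t] - 1.0)
    return pearson(past, future), len(past)


# Synthetic price path, used only to show the call (not market data).
prices = [100.0 + 10.0 * math.sin(0.3 * k) + 0.1 * k for k in range(101)]
corr, n_pairs = lookback_holding_correlation(prices, lookback=20, holddays=5)
print(n_pairs, round(corr, 3))
```

With a one-day step the same loop would produce many more pairs from the same data, which is why the t-statistic would look better than it should. 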
</span><br /><span style="font-weight: normal;"><br /></span><br /><h4>Time Series Strategies</h4><span style="font-weight: normal;">The paper by Moskowitz, Yao and Pedersen 2012 presents a momentum strategy with a 12-month look-back and a 1-month holding period. This can be rolled over every day with a 1/25th fraction of the capital invested each day. For TU the IR is 1.0 with very low margin. </span><br /><span style="font-weight: normal;"><br /></span><span style="font-weight: normal;">Why do many futures returns exhibit serial correlations and why do they occur only at a fairly long time scale? The explanation lies in the roll returns. Typically a future stays in contango or backwardation over a long period of time. The spot return however can vary rapidly. So, over the longer term, if roll returns dominate the spot returns we will get serial correlation (corn is an exception). Hence, using lagged roll returns as the signal might be cleaner. Applying this raises the IR of TU to 2.1, with reduced drawdowns as well!</span><br /><br /><span style="font-weight: normal;">Other possible entry signals are: buy when the price reaches an N-day high, when the price exceeds the N-day moving average or exponential moving average, when the price exceeds the upper Bollinger band, or when the number of up days exceeds the number of down days in a moving period. Alexander Filter - buy when the daily return moves up at least x percent, and then sell and go short if the price moves down at least x percent from a subsequent high. </span><br /><span style="font-weight: normal;"><br /></span><span style="font-weight: normal;">Sometimes a combination of mean-reverting and momentum strategies may work better. One example strategy on CL - buy at the market close if the price is lower than that of 30 days ago and is higher than that of 40 days ago; vice versa for shorts. There are mutual funds selling diversified momentum indicators. 
The true test is always genuine out-of-sample testing.</span><br /><span style="font-weight: normal;"><br /></span><br /><h4>Extracting Roll Returns through Futures versus ETF arbitrage</h4></div><div><span style="font-weight: normal;">If in contango, buy the underlying and short the future; and vice versa if in backwardation. This arbitrage strategy is likely to result in a shorter holding period and a lower risk, since in the previous strategy we needed to hold the future for a long time before the noisy spot return could be averaged out.</span><br /><span style="font-weight: normal;"><br /></span><span style="font-weight: normal;">The logistics of buying and especially shorting the underlying asset are not simple. ETFs for many precious metals can be found, but in contrast to owning futures, owning an ETF (e.g. GLD) actually incurs financing cost, which generally eats up the roll returns. ETFs and futures settle at different times, so the asynchronicity is a pitfall. </span><br /><span style="font-weight: normal;"><br /></span><span style="font-weight: normal;">Outside precious metals it is difficult to find ETFs that hold the underlying commodities. But ETFs containing commodity producing companies often cointegrate with the spot price of those commodities. We can use these ETFs as a proxy for the spot prices. An example is the arbitrage between the energy sector ETF XLE and the ETF USO (used because the WTI crude oil future CL has a different closing time). Short USO and go long XLE whenever CL is in contango, and vice versa for backwardation. The IR is 1.0 from 2006 to 2012.</span><br /><span style="font-weight: normal;"><br /></span><span style="font-weight: normal;">VX does not have a tradable underlying commodity, only a basket of options, which is very hard to replicate. But we can find an index highly correlated or anti-correlated with the spot returns. In the case of VIX, the familiar ETF SPY fits the bill, because it has insignificant roll returns. 
</span><br /><span style="font-weight: normal;"><br /></span><br /><h4>Volatility futures versus equity index futures: redux</h4><span style="font-weight: normal;">VX is highly anti-correlated with ES. We can use the large roll return magnitude of VX and the small roll return magnitude of ES to develop a momentum strategy. If the price of the front contract of VX is higher than that of VIX by 0.1 point (contango) times the number of trading days until settlement, short 0.4 front contracts of VX and short 1 front contract of ES, holding for a day. Vice versa for backwardation. VX forward prices do not fall on a straight line, so the curve can't be used to estimate the roll returns as for other commodities. The hedge ratio is based on the regression fit between the VX and ES prices (not between returns!). This gives a Sharpe of 1 from 2010 to 2012.</span><br /><span style="font-weight: normal;"><br /></span><br /><h3>Cross - sectional strategies</h3></div><div><span style="font-weight: normal;">If we believe that commodities' spot prices are positively correlated with economic growth or some other macroeconomic indices, we can just buy a portfolio of futures in backwardation, and simultaneously short a portfolio of futures in contango, leaving us with favorable roll returns. Daniel and Moskowitz 2011 described cross-sectional momentum with a longer holding period. Ranking on 12-month returns and holding for 1 month, the long-short portfolio gives good performance, but not during the great financial crisis. The same strategy works for universes of world stocks, currencies, international stocks and US stocks. </span><br /><span style="font-weight: normal;"><br /></span><span style="font-weight: normal;">We can rank the stocks by many factors other than lagged returns as well. Total returns can be decomposed into spot return and roll return. Similarly, they can be decomposed into market return and factor return. 
A cross-sectional portfolio will eliminate the market component. We can rank based on fundamentals like earnings growth, book to price ratio or some linear combination thereof. Or it could be statistical factors like those from PCA. All these factors except PCA change very slowly, resulting in long holding periods. For futures we could use GDP growth, inflation rate or PCA. </span><br /><span style="font-weight: normal;"><br /></span><br /><h3>News Sentiment as fundamental factor</h3></div><div><span style="font-weight: normal;">With machine-readable news it is now possible to programmatically capture all news items, not just earnings and merger and acquisition activities. A sentiment score can be assigned based on the price impact of the article. The aggregate of these sentiment scores from multiple news articles on a stock was found to be predictive of its future return. Hafez and Xie (2012) give a sorting based on RavenPack's sentiment score with an IR of 5.3 before cost. This also suggests that slow diffusion of news is a cause of stock momentum. Other vendors are Recorded Future, thestocksonar.com, Thomson Reuters News Analytics. Newsware offers a low-cost version of news feeds. Bloomberg Event-Driven Trading, Dow Jones Elementized News Feed, and Thomson Reuters Machine Readable News are the lower latency and better coverage options. </span><br /><span style="font-weight: normal;"><br /></span><span style="font-weight: normal;">The general mood of society, measured from 'Twitter' feeds, is predictive of the market index itself (Bollen, Mao, Zeng 2010).</span><br /><span style="font-weight: normal;"><br /></span><br /><h3>Mutual Funds Asset Fire Sale and Forced Purchases</h3></div><div><span style="font-weight: normal;">Coval and Stafford (2007) found that mutual funds experiencing large redemptions are likely to reduce or eliminate their existing stock positions. This is not surprising as mutual funds are mostly fully invested with no cash reserves. 
Funds experiencing large capital inflows increase their existing positions rather than investing in newer ideas. The 'fire sale' by poorly performing mutual funds causes the stocks to experience negative returns; it is contagious and causes further redemptions at other funds. The same situation occurs in reverse for stocks held by superbly performing mutual funds with large capital inflows. This order flow based momentum is applicable at all time scales.</span><br /><span style="font-weight: normal;"><br /></span><span style="font-weight: normal;">A factor can be constructed to measure the selling (buying) pressure on a stock based on the net percentage of funds holding it that experienced redemptions (inflows). This can be defined as </span><span style="font-weight: normal;">$$Pressure(i,t)=\frac{\sum_j \mathcal{I}(Buy_{i,j,t}|flow_{j,t}>5\%) - \sum_j\mathcal{I}(Sell_{i,j,t}|flow_{j,t}<-5\%)}{\sum_j \mathcal{I}(Own_{i,j,t-1})}$$ where $\mathcal{I}$ is an indicator function, $j$ runs over funds, and the denominator counts the funds that owned stock $i$ at the end of quarter $t-1$. Weighting Buy by NAV may give better results. Coval and Stafford (with quarterly updates) found that the market-neutral portfolio formed by shorting the stocks with the highest selling pressure and buying the stocks with the highest buying pressure generates annualized returns of about 17 percent before transaction costs. </span><br /><span style="font-weight: normal;"><br /></span><span style="font-weight: normal;">Furthermore, capital flows into and out of mutual funds can be predicted with good accuracy based on their past performance and capital flows (herd-like behavior of retail investors). Based on these predictions we can predict the future values of the pressure factors noted above, i.e. we can front-run the mutual funds by selling the stocks they currently own ahead of their forced sales. This front running strategy generates another 17 percent annualized before transaction costs. 
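<br /><br />The pressure factor defined above can be illustrated with a small Python sketch for a single stock and quarter. The input layout (a set of prior holders, a dict of trade directions, and a dict of fractional fund flows) and the function name <code>pressure</code> are assumptions made for the example; Coval and Stafford work from quarterly holdings data, and nothing here reproduces their data handling.

```python
def pressure(owners_prev, trades, flows, threshold=0.05):
    """Net share of prior holders forced to buy (+) or sell (-) one stock.

    owners_prev: funds holding the stock at t-1 (hypothetical input layout)
    trades:      fund -> +1 if it bought the stock in quarter t, -1 if it sold
    flows:       fund -> net capital flow in quarter t as a fraction of assets
    """
    if not owners_prev:
        return 0.0
    # Buys only count when the fund had an inflow above the threshold.
    forced_buys = sum(1 for fund, side in trades.items()
                      if side > 0 and flows.get(fund, 0.0) > threshold)
    # Sells only count when the fund had an outflow beyond the threshold.
    forced_sells = sum(1 for fund, side in trades.items()
                       if side < 0 and flows.get(fund, 0.0) < -threshold)
    return (forced_buys - forced_sells) / len(owners_prev)


# Made-up example: four prior holders, one flow-driven buy, two flow-driven sells.
owners_prev = {"A", "B", "C", "D"}
trades = {"A": 1, "B": -1, "C": -1, "D": 1}
flows = {"A": 0.10, "B": -0.08, "C": -0.20, "D": 0.01}
print(pressure(owners_prev, trades, flows))  # prints -0.25
```

Fund D buys the stock but its inflow is only 1 percent, so it does not count as a flow-driven purchase; the result is (1 - 2) / 4. 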
</span><br /><span style="font-weight: normal;"><br /></span><span style="font-weight: normal;">Finally, since these stocks experience such selling and buying pressure for liquidity-driven reasons, and their prices are suppressed or elevated without any fundamental reason, they often mean-revert after the pressure is over. Indeed, buying stocks that experienced the most selling pressure in the t-4 up to t-1 quarters, and vice versa, generates another 7 percent annualized return. </span><br /><span style="font-weight: normal;"><br /></span><span style="font-weight: normal;">Combining all three strategies (momentum, front running, and mean reverting) generates a total return of about 41 percent before transaction costs. The slippage might be significant because of the delay in mutual fund holdings data at the end of the quarter. The data from the Center for Research in Security Prices (CRSP) costs 10K a year. </span><br /><span style="font-weight: normal;"><br /></span><span style="font-weight: normal;">Apart from mutual funds, index funds and levered ETFs ignite similar momentum as well. In fact, forced asset sales and purchases by hedge funds can also lead to momentum in stocks, as in the August 2007 quant funds meltdown.</span><br /><span style="font-weight: normal;"><br /></span><br /><h3>Pros and Cons of momentum strategies</h3></div><div><span style="font-weight: normal;">Momentum strategies are diametrically opposite to mean-reverting strategies. Starting with the cons, it is harder to create profitable momentum strategies and they tend to have lower Sharpe than mean-reverting strategies because</span><br /><br /><ol><li><span style="font-weight: normal;">They have long look-back periods, so independent trading signals are few and far between, leading to lower Sharpe. </span></li><li><span style="font-weight: normal;">Momentum crashes - these strategies perform miserably for several years after crashes. 
During this period momentum is replaced by mean reversion. Momentum crashes are caused by the strong rebound of short positions following a market crisis.</span></li><li><span style="font-weight: normal;">The duration over which news-driven momentum remains in force gets progressively shorter as more traders catch on to it. This constant shortening of the holding period has no predictable schedule.</span></li></ol><div><span style="font-weight: normal;">Looking at the pros:</span></div><div><ol><li><span style="font-weight: normal;">Ease of risk management - there are two common types of exit strategies for momentum: time-based and stop-loss. Stop-losses are consistent with momentum strategies. If momentum has changed direction we should enter the opposite position, and hence this change of momentum serves as a natural stop-loss. Stop-losses are not consistent with mean-reversion. Hence, momentum losses are always limited. </span></li><li><span style="font-weight: normal;">Momentum strategies can thrive in risky environments as well. For mean-reverting strategies the upside is limited by the mean they will revert to, but the downside can be unlimited. For momentum strategies, the upside is unlimited, while the downside is limited. The more often 'black swan' events occur, the more likely it is that a momentum strategy will benefit from them. The thicker the tails of the return distribution curve, or the higher the kurtosis, the better that market is for momentum strategies. </span></li><li><span style="font-weight: normal;">Most futures and currencies exhibit momentum, allowing us to truly diversify the risk across different asset-classes and countries. </span></li></ol><div><span style="font-weight: normal;"><br /></span></div></div><div></div></div></h4><h2>Ch7 - Intraday Momentum Strategies</h2><div>Time series momentum typically operates over long horizons - a month or longer - resulting in lower Sharpe and lower statistical significance due to infrequent independent trading signals. 
These strategies also suffer from underperformance after crashes. Short term intraday strategies do not suffer from these drawbacks. Apart from roll returns, the other three causes of momentum also operate at the intraday time frame. An additional cause of intraday momentum is the 'triggering of stops', which underlies breakout strategies. </div><div><br /></div><div>Intraday momentum can be triggered by specific events beyond just price actions, like corporate news such as earnings announcements, analyst recommendation changes, or macro-economic news. Intraday momentum can also be triggered by the actions of large funds, e.g. daily rebalancing of leveraged ETFs leads to short-term momentum. Finally, the imbalance of bid and ask sizes, changes in order flows, or a nonuniform distribution of stop orders can all induce momentum in prices. </div><div><br /></div><h3>Opening Gap strategy</h3><div>Buy when the instrument gaps up, and short when it gaps down. It works best for Dow Jones STOXX 50. This produces an IR of 1.4 from 2004 to 2012. For currencies the daily "open" and "close" need to be defined differently: the close is set to 5 PM ET and the open to 5 AM ET (London open). The same strategy for GBPUSD has an IR of 1.3 from 2007-2012. Overnight or weekend gaps trigger momentum because information accumulates that has not yet been acted upon. The execution of stop orders often leads to momentum because a cascade effect may trigger stop orders placed further away from the open price as well. Alternatively, there may be significant events that occurred overnight.<br /><br /><h3>News driven momentum strategy</h3></div><div>Slow diffusion of news creates momentum that lasts a few days, hours, or seconds after earnings announcements and other corporate and macroeconomic news. Post earnings announcement drift (PEAD) still exists but its duration has shrunk. 
As recently as 2011, we could make good returns by entering at the market open after an earnings announcement made after the previous close, buying the stock if the return is very positive and shorting it if the return is very negative, and liquidating the position at the day's close. Earnings.com has such data. We can use the 90-day moving standard deviation of the previous-close-to-next-day's-open return as the benchmark for deciding whether the announcement is 'surprising' enough to generate the post announcement drift. For a universe of S&P 500 stocks an IR of 1.5 is available from 2011-2012. This can be levered up 4 times as it is an intraday strategy. Holding the positions overnight is not rewarding; the overnight returns are negative. 10-20 years ago PEAD lasted 1-2 days; more recently the momentum has shortened. </div><div><br /></div><h3>Drift due to other events</h3><div>Other events include earnings guidance, analyst ratings and recommendation changes, same store sales, and airline load factors (provided by Dow Jones Newswire and delivered by Newsware in machine readable form). See Hafez 2011 for a comprehensive list. Mergers and acquisitions can also support news momentum strategies. It is interesting to note that the acquiree's stock price falls more than the acquirer's after the initial announcement of the acquisition.<br /><br />Index composition changes generate buying and selling and create momentum. When a stock is added to an index there is buying pressure immediately after the announced change. These drift horizons have shrunk from days to intraday.<br /><br />Macroeconomic events such as the Federal Open Market Committee's rate decisions or the release of the latest consumer price index do not produce any significant momentum in EURUSD. 
Clare and Courtenay 2001 report that UK macroeconomic data releases and Bank of England interest rate announcements induced momentum in GBPUSD for up to 10 minutes.<br /><br /><h3>Levered ETF Strategy</h3></div><div>The constant leverage requirement has some counter-intuitive consequences. If there is a big drop, one would need to substantially reduce the positions in the levered portfolio to keep the leverage constant, and vice versa after a big rise; the same applies to a leveraged ETF. These rebalancing trades happen at the market close and produce momentum. As a strategy, buy DRN (a leveraged real estate ETF) if the return from the previous day's close to 15 minutes before the market close is greater than 2 percent, and sell if the return is smaller than -2 percent. Exit at the market close. This gives an IR of 1.8 from 2011-2012.<br /><br />As the aggregate assets of these ETFs increase, the returns of the strategy increase. The total AUM of levered ETFs was 19 billion (Cheng and Madhavan 2009), which can create a big order at the close. Rodier, Haryanto, Shum and Hejazi 2012 have updated this analysis.<br /><br />The flow of investors' cash also affects the momentum. A large inflow will cause positive momentum in the underlying's price. A large inflow into a short leveraged ETF will cause negative momentum.<br /><br /><h3>High frequency strategies</h3></div><div>Most of them extract information from the order book, e.g. if the bid size is much bigger than the ask size, expect the price to tick up, and vice versa (Maslov and Mills 2001). The effect is stronger for lower volume stocks. Books on microstructure (Arnuk and Saluzzi 2012, Durbin 2010, Harris 2003, Sinclair 2010) describe many high-frequency momentum strategies. 'Ratio trades' can be used for momentum profits in markets that fill orders on a pro-rata basis such as eurodollar futures on CME. 
'Ticking' or 'quote matching' can be used when the bid-ask spread is bigger than two ticks, and there is an expectation of an uptick.<br /><br />'Momentum ignition' creates an illusion of buying pressure (or vice versa). This works for markets with time priority for orders. 'Flipping' can be used to generate an artificial imbalance. Private data feeds from exchanges like ITCH from Nasdaq, EDGX from Direct Edge, PITCH from BATS can be used to detect flippers.<br /><br />These strategies and defenses show that high-frequency traders can profit only from slower traders. Due to this, quote sizes have decreased and large orders are broken into smaller orders. 'Stop hunting' strategies exploit the short-term momentum when a support or resistance level is breached. These levels are either reported daily by banks or are just round numbers in the proximity of the current price levels. The momentum arises because a large number of stop orders are placed at or near the support and resistance levels.<br /><br />Order flow information is a good predictor of price movements, because market makers can distill important fundamental information from order flow information, and set bid-ask accordingly. The urgency of using market orders indicates that the information is new and not widely known. For stocks and futures we can monitor and record every tick and determine whether a transaction took place at the bid or the ask. We can then compute the cumulative or average order flow over some look-back period and use that to predict whether the price will move up or down.<br /><br /><h2>Ch8 - Risk Management</h2></div><div>Risk aversion - an average human being needs to have the potential for making 2 to compensate for the risk of losing 1 (which is why a Sharpe of 2 is so appealing, Kahneman 2011). This dislike for risk is not rational. The goal should be maximizing long term equity growth. 
The key concept is the prudent use of leverage, which can be optimized using the Kelly formula or numerical methods that maximize the compounded growth rate. In the short term, drawdown control is much more important; drawdowns can be limited by stop-losses, but these are problematic. The other way is constant proportion portfolio insurance, which tries to maximize the upside of the account in addition to preventing large drawdowns. Finally, leading indicators of risk can be used to stop trading when the risk of loss is high, as an effective loss-avoidance technique.</div><div><br /></div><div><h3>Optimal Leverage</h3></div><div>When managing our own money, maximizing net worth over the long term is important, and short-term draw-downs and volatility of returns are not. </div><div><br /></div><div>Kelly Formula</div><div><br /></div><div>Optimization of Expected Growth Rate using simulated returns</div><div><br /></div><div>Optimization of historical growth rate</div><div><br /></div><div>Maximum drawdown</div><div><br /></div><div>Constant Proportion portfolio Insurance</div><div><br /></div><div>Stop Loss</div><div><br /></div><div>Risk Indicators</div>Mhttp://www.blogger.com/profile/04296475349435748706noreply@blogger.com0tag:blogger.com,1999:blog-3299057174197331019.post-83347595393325863452015-07-28T18:23:00.000-07:002015-08-16T07:41:35.736-07:00Momentum: Jagadeesh and Titman 2001<script type="text/x-mathjax-config"> MathJax.Hub.Config({tex2jax: {inlineMath: [['$','$'], ['\\(','\\)']]}}); </script> <script src="http://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML" type="text/javascript"> </script>Momentum based on 3-12 month returns and on earnings is consistently profitable. The best performers are no riskier than the worst performers. Hence, standard risk adjustments tend to increase the return spread between the winners and losers.<br /><br />The cause is overreaction or underreaction to information. 
Returns reverse at short horizons (weeks to a month) and at long horizons (years, up to 5 years), while momentum appears at 3-12 months. Momentum returns are seasonal: negative in January and positive in every other month.Mhttp://www.blogger.com/profile/04296475349435748706noreply@blogger.com0tag:blogger.com,1999:blog-3299057174197331019.post-29668224976251356082015-07-28T16:55:00.000-07:002015-08-16T07:41:28.375-07:00A New Anomaly: The Cross-Sectional Profitability of Technical Analysis - Han, Yang, Zhou 2013<script type="text/x-mathjax-config"> MathJax.Hub.Config({tex2jax: {inlineMath: [['$','$'], ['\\(','\\)']]}}); </script> <script src="http://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML" type="text/javascript"> </script>A momentum portfolio sorted by volatility generates better profits than the well-known return-based momentum strategies. Its correlation with them is low as well. These excess returns are not explained by market timing, investor sentiment, default and liquidity risk. Similar results hold if the portfolios are sorted based on other <i>proxies of information uncertainty (size, distance to default, credit rating, analyst forecast dispersion, earnings volatility)</i>. The higher the noise-to-signal ratio, or the more uncertain the information, the more profitable the technical analysis.<br /><br />Strategy: Buy or remain long the portfolio today when yesterday's price is above its 10-day MA price, and invest in the risk-free asset otherwise. 
This is compared against buy-and-hold, for the top decile.<br /><br /><br />Mhttp://www.blogger.com/profile/04296475349435748706noreply@blogger.com0tag:blogger.com,1999:blog-3299057174197331019.post-62180369317361861492015-07-27T18:09:00.001-07:002015-08-16T07:41:22.525-07:00Momentum and Autocorrelation in Stock Returns - Lewellen<script type="text/x-mathjax-config"> MathJax.Hub.Config({tex2jax: {inlineMath: [['$','$'], ['\\(','\\)']]}}); </script> <script src="http://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML" type="text/javascript"> </script>The paper studies the role of size and BM factors in stock momentum. Both are negatively auto-correlated and cross-serially correlated over intermediate horizons. The excess covariance of stocks with each other, and not under-reaction, explains momentum in the portfolios.<br /><br />Firm-specific returns, investor under-reaction and belated overreaction do not explain a significant component of momentum. Size and BM factor based momentum is strong and distinct, showing that momentum can't be attributed solely to firm-specific returns - there must be multiple sources of momentum. Momentum shows up in individual stocks and size quintiles, but vanishes at the market level.<br /><h4>Sources of Momentum</h4><div>Profits depend on both auto-correlations and the lead-lag relationship. The portfolio weight of asset $i$ in month $t$ is</div><div>$$w_{i,t}=\frac{1}{N}(r_{i,t-1}-r_{m,t-1})$$</div><div>where $r_{m,t}$ is the equal-weighted market index return in month $t$. Assume returns have unconditional mean $\mu=E[r_t]$ and autocovariance matrix $\Omega=E[(r_{t-1}-\mu)(r_t-\mu)^T]$. 
The portfolio return in month $t$ equals:<br />$$\pi_t=\sum_i w_{i,t}r_{i,t}=\frac{1}{N}\sum_i (r_{i,t-1}-r_{m,t-1})r_{i,t}.$$<br />Hence, the expected profit is<br />$$E[\pi_t] = \frac{1}{N}E\Bigg[\sum_i r_{i,t-1}r_{i,t}\Bigg]-\frac{1}{N}E\Bigg[r_{m,t-1}\sum_i r_{i,t}\Bigg] = \frac{1}{N} \sum_i (\rho_i+\mu_i^2)-(\rho_m+\mu_m^2),$$<br />where $\rho_i$ and $\rho_m$ are the autocovariances of asset $i$ and the equal-weighted index, respectively. We now use the fact that the average autocovariance equals $tr(\Omega)/N$ and the autocovariance of the market portfolio equals $\varsigma^T\Omega\varsigma/N^2$, where $\varsigma$ is the vector of ones, and write $\sigma_{\mu}^2=\frac{1}{N}\sum_i\mu_i^2-\mu_m^2$ for the cross-sectional variance of the unconditional mean returns:<br />$$E[\pi_t]=\frac{1}{N}tr(\Omega)-\frac{1}{N^2}\varsigma^T\Omega\varsigma+\sigma_{\mu}^2=\frac{N-1}{N^2}tr(\Omega)-\frac{1}{N^2}[\varsigma^T\Omega\varsigma-tr(\Omega)]+\sigma_{\mu}^2.$$<br />This decomposition says that momentum can arise in three ways:<br />1) Stocks might be positively autocorrelated (first term) - meaning stocks with high returns today are expected to have higher returns tomorrow.<br />2) Cross-serial correlations might be negative (second term) - meaning a firm with a high return today predicts that other firms will have low returns in the future. This is related to excess covariance among stocks.<br />3) Unconditional means might differ across stocks (third term) - stocks with high unconditional means tend to have high returns both in the past and in the future.<br /><br />This decomposition is not unique. </div>Mhttp://www.blogger.com/profile/04296475349435748706noreply@blogger.com0tag:blogger.com,1999:blog-3299057174197331019.post-32907453284132785112015-07-25T19:33:00.000-07:002015-09-07T18:20:18.008-07:00Anticipating Correlations - Engle<script type="text/x-mathjax-config"> MathJax.Hub.Config({tex2jax: {inlineMath: [['$','$'], ['\\(','\\)']]}}); </script> <script src="http://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML" type="text/javascript"> </script>These are my notes on Robert Engle's book 'Anticipating Correlations - a new paradigm for risk management'. 
Engle is a Nobel laureate, recognized for his contributions to the ARCH/GARCH family of volatility models.<br /><h2>Ch1: Correlation Economics </h2><div>Movements in asset prices are not independent. If they were, it would be possible to construct a portfolio with negligible volatility. Estimating the correlations for a big cross-section is a Herculean task, especially once it is recognized that these correlations vary over time. Hence, forward-looking correlation estimation is needed for optimal risk management, portfolio selection and hedging. The main method developed is dynamic conditional correlation (DCC).<br /><br />Correlations are high between stocks in the same industry sector and lower otherwise. The correlation between different asset classes is lower. For equities of different countries, the data should be adjusted for non-synchronous trading (e.g. by averaging over more than one day) before computing correlations.<br /><br />Changes in asset prices and correlations reflect changing forecasts of future payments. A news item affects all asset prices to a greater or lesser extent, depending on their correlations. The most important reason why these correlations change over time is that firms change their line of business. A second important factor is that the characteristics of the news change (e.g. a change in the magnitude of the news). 
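<br /><br />The remark above about non-synchronous data for equities of different countries can be illustrated with a small simulation. This is only a sketch under stated assumptions: the two markets, the half-day shock structure, the volatilities and the seed are all hypothetical and are not taken from Engle's book. Market B closes at midday, so its daily return mixes yesterday's afternoon shock with today's morning shock, and the daily correlation with market A is understated. Aggregating over several days recovers most of the common movement.

```python
import numpy as np

rng = np.random.default_rng(1)   # illustrative seed
n_days = 5000

# Hypothetical common factor, split into two half-day shocks per day
f_am = rng.normal(0.0, 0.007, n_days)
f_pm = rng.normal(0.0, 0.007, n_days)

# Market A closes at the end of the day: it sees both halves of day t
ret_a = f_am + f_pm + rng.normal(0.0, 0.003, n_days)

# Market B closes at midday: it sees the afternoon of t-1 and the morning of t
ret_b = np.empty(n_days)
ret_b[0] = f_am[0]
ret_b[1:] = f_pm[:-1] + f_am[1:]
ret_b += rng.normal(0.0, 0.003, n_days)

def block_sum(x, k):
    """Sum returns over non-overlapping k-day blocks.

    Summing and averaging give the same correlation, since correlation
    is invariant to rescaling.
    """
    n = len(x) // k * k
    return x[:n].reshape(-1, k).sum(axis=1)

corr_daily = np.corrcoef(ret_a, ret_b)[0, 1]
corr_5day = np.corrcoef(block_sum(ret_a, 5), block_sum(ret_b, 5))[0, 1]
print(f"daily: {corr_daily:.2f}, 5-day: {corr_5day:.2f}")
```

In this toy setup the 5-day correlation comes out clearly higher than the daily one, because only one of the ten half-day shocks in a 5-day block fails to overlap between the two markets.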
</div>Mhttp://www.blogger.com/profile/04296475349435748706noreply@blogger.com0tag:blogger.com,1999:blog-3299057174197331019.post-5730219657948855382015-07-18T07:42:00.002-07:002015-08-16T07:40:59.209-07:00Modeling return dynamics via decomposition<script type="text/x-mathjax-config"> MathJax.Hub.Config({tex2jax: {inlineMath: [['$','$'], ['\\(','\\)']]}}); </script> <script src="http://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML" type="text/javascript"> </script>The paper Modeling Financial Returns Dynamics via Decomposition - Anatolyev and Gospodinov (2010) points out that predicting excess returns for stocks is much more difficult than simply predicting the direction of change. They therefore decompose returns into a sign (direction) component and a magnitude component and model the two jointly, using a copula for their interaction. This lets them incorporate important non-linearities. <br /><h4>Introduction</h4><div>Valuation ratios (dividend-price, earnings-price) and yields on short- and long-term treasury and corporate bonds appear to possess some predictive power at short horizons for timing the market. New variables with incremental predictive power, such as the share of equity issues in total new equity and debt issues, the consumption-wealth ratio, relative valuations of high- and low-beta stocks, and estimated factors from large economic datasets, can also be used (<span style="color: red;">Lettau and Ludvigson 2008 review paper</span>). </div><br /><span style="color: red;">This paper, instead of trying to identify better predictors, looks for better ways of using predictors. </span>This is done by decomposing returns into sign and magnitude. Sign has better predictability. 
We aim to predict the expected returns and the following <b>decomposition model</b> is proposed:<br />$$E[r_t|F_{t-1}]=E[|r_t|sign(r_t)|F_{t-1}]=f(|r_t|)\times g(sign(r_t))\times \text{interaction copula}.$$<br />The magnitude is modeled using a multiplicative error model, the sign by a dynamic binary choice model, and their interaction by a copula. This way we are able to model hidden nonlinearities absent from the regression setup. Magnitudes and signs show substantial dependence over time, while returns themselves show hardly any! For example, magnitude behaves like volatility, which shows significant dependence. One important aspect of the bivariate analysis is that, in spite of a large unconditional correlation between the multiplicative components, they appear to be conditionally very weakly dependent.<br /><br />This opens avenues for strategies as well (Anatolyev and Gerko 2005). The decomposition model is better than predictive regression, which in turn is better than a buy-and-hold strategy - both in and out of sample. The decomposition model also produces unbiased forecasts.<br /><h3>Methodological Framework</h3><div>The key identity is</div><div>$$r_t=c+|r_t-c|sign(r_t-c)=c+|r_t-c|(2\mathbb{I}[r_t>c]-1)$$<br />and hence,<br />$$E[r_t|F_{t-1}]=c-E[|r_t-c| | F_{t-1}]+2E[|r_t-c|\mathbb{I}[r_t>c]|F_{t-1}],$$<br />where $c$ is a user-defined constant that can be used to account for transaction costs or for different dynamics of small, large positive and large negative returns. For example, it would be 0 for modeling recessions and expansions using GDP, 3% for modeling the output gap, and 2% for forecasting inflation. $F_{t-1}$ is all the information available up to time $t-1$, which in practice consists of data such as lagged returns, volatility, volume and other predictive variables available at time $t-1$. 
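<br /><br />As a quick numerical check of the key identity above, the sketch below splits simulated returns into magnitude and direction and rebuilds them. The Gaussian returns, the seed, the threshold value and the helper names `decompose` and `recompose` are illustrative assumptions, not part of the paper.

```python
import numpy as np

def decompose(r, c=0.0):
    """Split returns into magnitude |r - c| and direction indicator I[r > c]."""
    magnitude = np.abs(r - c)
    direction = (r > c).astype(float)
    return magnitude, direction

def recompose(magnitude, direction, c=0.0):
    """Rebuild returns via r = c + |r - c| * (2 * I[r > c] - 1)."""
    return c + magnitude * (2.0 * direction - 1.0)

rng = np.random.default_rng(0)               # illustrative seed
r = rng.normal(0.0005, 0.01, size=10_000)    # hypothetical daily returns
c = 0.001                                    # hypothetical threshold

mag, ind = decompose(r, c)
r_rebuilt = recompose(mag, ind, c)

# Sample analogue of E[r] = c - E|r - c| + 2 E[|r - c| I[r > c]]
mean_from_parts = c - mag.mean() + 2.0 * (mag * ind).mean()
print(np.allclose(r_rebuilt, r), abs(mean_from_parts - r.mean()) < 1e-12)
```

The same two arrays, `mag` and `ind`, are what a volatility model and a direction model would be fitted to separately.<br /><br />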
Toy example, where predictive variables are based on realized volatility $RV_{t-1}$:<br />a) For the <u>direct regression model</u>: $E[r_t]=\alpha+\beta RV_{t-1}$ gives an $R^2$ of 0.39%<br />b) For the <u>decomposition model</u>: $E[|r_t|]=\alpha_{|r|}+\beta_{|r|}RV_{t-1}$ and $Pr[r_t>0]=\alpha_{\mathbb{I}}+\beta_{\mathbb{I}}RV_{t-1}$. Assuming the two components are stochastically independent gives $E[r_t]=\alpha_r+\beta_rRV_{t-1}+\gamma_r RV^2_{t-1}$ (with $c=0$, $E[r_t]=(2Pr[r_t>0]-1)E[|r_t|]$, the product of two linear functions of $RV_{t-1}$), showing that nonlinearities are covered in the decomposition model, giving an $R^2$ of 0.72%.<br />c) Further adding $\mathbb{I}[r_{t-1}>0]$ and $RV_{t-1}\mathbb{I}[r_{t-1}>0]$ to the regressor list increases the $R^2$ to 1.21%.<br /><br />It is important to note that it is the augmentation of the sign component which delivers nonlinear dependence, improving the prediction. The driving force behind the predictive ability of the decomposition model is the predictability in the two components. The interaction term is less significant. This is the main theme of this work.<br /><h4>Marginal distributions and Copula model</h4></div><div><b>a)</b> <b><i>Volatility model</i></b>: The absolute return $|r_t-c|$ is a positively valued variable and is modeled using the multiplicative error framework of Engle (2002) $$|r_t-c|=\psi_t\eta_t,$$ where $\psi_t=E[|r_t-c||F_{t-1}]$ and $\eta_t$ is a positive multiplicative error with $E[\eta_t|F_{t-1}]=1$ and conditional distribution $\mathbb{D}$. $\psi_t$ can be modeled using the logarithmic autoregressive conditional duration (LACD) specification as<br />$$ln\psi_t=\omega_v+\beta_vln\psi_{t-1}+\gamma_vln|r_{t-1}-c|+\rho_v\mathbb{I}[r_{t-1}>c]+\pmb{x}^T_{t-1}\pmb{\delta}_v.$$ The second last term allows for regime-specific volatility dependence while the last term represents macroeconomic predictors of volatility. 
$\mathbb{D}$ can be modeled as a constant-parameter Weibull distribution (or other distributions with the shape parameter vector $\varsigma$ a function of the past).<br /><div><br /></div><b>b) <i>Direction model</i></b>: The indicator $\mathbb{I}[r_t>c]$ has a conditional Bernoulli distribution $\mathbb{B}(p_t)$ with probability mass function $f_{\mathbb{I}[r_t>c]}(v)=p^v_t(1-p_t)^{1-v}, v\in \{0,1\}$, where $p_t$ denotes the conditional 'success' probability $Pr(r_t>c|F_{t-1})=E[\mathbb{I}[r_t>c]|F_{t-1}]$. Christoffersen and Diebold (2006) show a remarkable result that if data are generated by $r_t=\mu_t+\sigma_t\epsilon_t$, where $\mu_t=E[r_t|F_{t-1}]$, $\sigma^2_t=Var[r_t|F_{t-1}]$, and $\epsilon_t$ is a homoskedastic martingale difference with unit variance (i.e. returns can be modeled as a GARCH-type process) and distribution function $\mathbb{F}_{\epsilon}$, then<br />$$Pr[r_t>c|F_{t-1}]=1-\mathbb{F}_{\epsilon}\left(\frac{c-\mu_t}{\sigma_t}\right).$$<br />This suggests that time-varying volatility can generate sign predictability as long as $c-\mu_t\ne0$. Furthermore, Christoffersen et al. (2007) derive a Gram-Charlier expansion of this distribution and show that $Pr[r_t>c|F_{t-1}]$ depends on the third and fourth conditional cumulants of the standardized errors $\epsilon_t$. Hence, sign predictability would arise from time variability in second and higher-order moments. This leads us to parametrize $p_t$ as a dynamic logit model:<br />$$p_t=\frac{e^{\theta_t}}{1+e^{\theta_t}}\quad\text{with}\quad\theta_t=\omega_d+\phi_d\mathbb{I}[r_{t-1}>c]+\pmb{y}^T_{t-1}\pmb{\delta}_d,$$<br />where the last term denotes macroeconomic variables (valuation ratios, interest rates) and realized measures (variance, bipower variation, realized third and fourth moments of returns).<br /><h4>c) C<i>opula model</i><span style="font-weight: normal;">: To construct the bivariate conditional distribution of $R_t=[|r_t-c|, \mathbb{I}[r_t>c]]^T$, copula theory is used. 
In particular,</span><span style="font-weight: normal;">$$F_{R_t}(u,v)=C(F_{|r_t-c|}(u), F_{\mathbb{I}[r_t>c]}(v))$$where $F$ denotes the CDF and $C(u,v)$ is a copula. Most common choices are Frank, Clayton or Farlie-Gumbel-Morgenstern copulas. Once the three ingredients of the joint distribution of $R_t$, i.e. the volatility model, the direction model and the copula are specified, the parameter vector can be estimated by maximum likelihood. </span></h4><h4>Conditional mean prediction in decomposition model</h4>The main interest is the mean forecast of returns<br />$$E[r_t|F_{t-1}] = c - E[|r_t-c||F_{t-1}]+2E[|r_t-c|\mathbb{I}[r_t>c]|F_{t-1}]$$<br />In terms of inference<br />\[\hat{r}_t=c-\hat{\psi}_t+2\hat{\xi}_t\]<br />Under conditional independence or if conditional dependence is weak we have<br />$$\xi_t=E[|r_t-c||F_{t-1}]E[\mathbb{I}[r_t>c]|F_{t-1}]=\psi_tp_t.$$<br />so,<br />$$\hat{r}_t=c+(2\hat{p}_t-1)\hat{\psi}_t.$$<br />Under the general case of dependence, the copula estimation is essential.<br /><h3>Empirical Analysis</h3>TBD</div><br />Mhttp://www.blogger.com/profile/04296475349435748706noreply@blogger.com0tag:blogger.com,1999:blog-3299057174197331019.post-8123980537443574792015-06-28T15:13:00.000-07:002015-08-16T07:40:20.785-07:00Commodity index investing<script type="text/x-mathjax-config"> MathJax.Hub.Config({tex2jax: {inlineMath: [['$','$'], ['\\(','\\)']]}}); </script> <script src="http://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML" type="text/javascript"> </script><h4>Commodity Index investing and commodity futures prices - Stoll and Whaley (2009)</h4>Provide a comprehensive evaluation of whether commodity index investing is a disruptive force in commodity futures market in general. Institutional investors are active in it because of low correlation with stocks and bonds using managed futures, ETF, ETN and OTC return swaps. 
The main conclusions are: a) commodity index investment is not speculation; b) rolls have little futures price impact, and inflows and outflows from index investment do not cause prices to change; c) the failure of wheat futures to converge to the cash price at expiration has not undermined the futures contract's effectiveness as a risk management tool.<br /><br /><h4>Limits to Arbitrage and Commodity Index investment: front-running the Goldman roll - Mou (2011)</h4>Rolling causes price impact. Front-running the roll has shown an IR of 4.4 from 2000 to 2010. Profitability is positively correlated with the size of index investment and negatively correlated with the amount of arbitrage capital employed. The paper discusses pre-rolling 10 to 1 business days before the GSCI roll.<br /><br /><h4>Speculators, Index investors, and commodity prices - Greely, Currie (2008)</h4>Index investors take on price risk (away from producers); they do not bring any information and hence do not affect prices. Speculators bring information about supply and demand and do affect prices. These activities in turn lower the cost of capital for commodity producers.<br />Mhttp://www.blogger.com/profile/04296475349435748706noreply@blogger.com0tag:blogger.com,1999:blog-3299057174197331019.post-70278921120958749622015-06-20T14:00:00.001-07:002015-08-16T07:40:14.488-07:00Developing high frequency equity trading models: Infantino and Itzhaki (2010)<script type="text/x-mathjax-config"> MathJax.Hub.Config({tex2jax: {inlineMath: [['$','$'], ['\\(','\\)']]}}); </script> <script src="http://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML" type="text/javascript"> </script><span style="color: #38761d; font-family: Arial, Helvetica, sans-serif;">Seconds to minutes horizon. PCA based equity market neutral reversal strategy combined with regime switching gives handsome results. 
</span><br /><h4><span style="font-family: Arial, Helvetica, sans-serif;">Ch 1: Introduction</span></h4><span style="font-family: Arial, Helvetica, sans-serif;">We want a short-term valuation and a way to identify the regime, i.e. whether the market will act in line with or against the valuation. With so much noise, we should not expect high precision in our solutions. We only need to be slightly precise to generate decent alpha in a high frequency environment, with approximate holding periods on the order of seconds to minutes. By the fundamental law of active management, $IR = IC \sqrt{Breadth}$, where $IR$ is the information ratio, $IC$ is the information coefficient (correlation between predicted and real values) and $Breadth$ is the number of independent decisions made on a trading strategy in one year. An $IC$ of 0.05 is huge!</span><br /><span style="font-family: Arial, Helvetica, sans-serif;"><br />Ultra high frequency traders (millisecond technology players) make their profits by providing liquidity. They do not attempt to correct the mispricing in the high frequency domain (second to minute), due to their shorter holding periods (Jurek, Yang 2007).</span><br /><span style="font-family: Arial, Helvetica, sans-serif;"><br /></span><span style="font-family: Arial, Helvetica, sans-serif;">The model is a mean-reversion model as described in Khandani and Lo (2007) - 'what happened to the quants?' - to analyze the quant meltdown of August 2007. The weight of security $i$ at date $t$ is given by</span><br /><span style="font-family: Arial, Helvetica, sans-serif;">$$w_{i,t}=-\frac{1}{N}(R_{i,t-k}-R_{m,t-k})$$</span><br /><span style="font-family: Arial, Helvetica, sans-serif;">where $R_{m,t-k}=\frac{1}{N}\sum_{i=1}^{N} R_{i,t-k}$. This is a market neutral strategy. Daily re-balancing corresponds to $k=1$. These produce huge IRs at daily frequency and even more impressive numbers as the frequency is increased from 60 minutes to 5 minutes. 
This assumes every security has a CAPM beta close to 1 (which will be addressed using PCA).</span><br /><span style="font-family: Arial, Helvetica, sans-serif;"><br /></span><span style="font-family: Arial, Helvetica, sans-serif;">Avellaneda and Lee (2010) describe statistical arbitrage with holding periods from seconds to weeks. Pairs trading is the fundamental idea, based on the expectation that one stock tracks the other after controlling for beta in the following relationship, for stocks P and Q</span><br /><span style="font-family: Arial, Helvetica, sans-serif;">$$\frac{dP_t}{P_t}=\alpha dt+\beta \frac{dQ_t}{Q_t}+dX_t,$$</span><br /><span style="font-family: Arial, Helvetica, sans-serif;">where $X_t$ is the mean reverting process to be traded on. The stock returns can be decomposed into systematic and idiosyncratic components by using PCA, giving</span><br /><span style="font-family: Arial, Helvetica, sans-serif;">$$\frac{dP_t}{P_t}=\alpha dt+\sum_{j=1}^{n}\beta_j F_t^{(j)}+dX_t,$$</span><br /><span style="font-family: Arial, Helvetica, sans-serif;">where $F^{(j)}_t$ represent the risk factors of the market/cluster under consideration.</span><br /><span style="font-family: Arial, Helvetica, sans-serif;"><br /></span><span style="font-family: Arial, Helvetica, sans-serif;">These ideas will be merged and utilized in a slightly different sense in this paper.</span><br /><h4><span style="font-family: Arial, Helvetica, sans-serif;">Ch 2: The model</span></h4><span style="font-family: Arial, Helvetica, sans-serif;"><span style="background-color: #ffe599;">Log returns and cumulative returns</span>: This is only the predictive part of the step. Regime switching will be tackled in the next chapter. We use log returns ($ln(1+r_t)$, assumed to be normal) as compounding of returns is easy and normality holds when compounding returns over a larger period. Further, prices have a log-normal distribution and log returns are a close approximation of real returns, i.e. 
$ln(1+r) \approx r$, for $|r| \ll 1$. Also, by using cumulative returns we take advantage of the CLT, and build a model to predict cumulative returns with the cumulative returns of the principal components.</span><br /><span style="font-family: Arial, Helvetica, sans-serif;"><br /></span><span style="font-family: Arial, Helvetica, sans-serif;"><span style="background-color: #ffe599;">PCA</span>: We use PCA for valuation, with OLS for predictive modeling. This is statistical in nature, as the 'identity' of the risk factors is not cared about. At the seconds time frame, instead of Debt to Equity ratio, Current ratio and Interest coverage, it is the positioning and flow of hedge funds, brokers and asset managers which is much more of a driving factor. Orthogonality of the principal components also avoids multi-collinearity in OLS. PCA has also been shown to identify market factors without bias to market capitalization. Finally, PCA implicitly uses the variance-covariance matrix of returns, giving a different threshold for each stock's reversion, based on a different combination of PCs for each of them. This addresses the basic flaw of having to assume a general threshold for the entire universe, with a CAPM based beta close to one for every security.</span><br /><span style="font-family: Arial, Helvetica, sans-serif;"><br /></span><span style="font-family: Arial, Helvetica, sans-serif;"><span style="background-color: #ffe599;">Model description</span>: The steps are -</span><br /><span style="font-family: Arial, Helvetica, sans-serif;">1) Define the stock universe - 50 stocks randomly chosen from S&P500. 1000 will need clustering techniques. 
Collect top-of-the-book bid-ask quotes from the tick data for each trading day (2009).</span><br /><span style="font-family: Arial, Helvetica, sans-serif;">2) Intervalize dataset - one-second intervals using the first mid-price quote of the second.</span><br /><span style="font-family: Arial, Helvetica, sans-serif;">3) Calculate log-returns - calculate log returns on the one-second mid-prices.</span><br /><span style="font-family: Arial, Helvetica, sans-serif;">4) PCA - For N assets and T time steps, demean the returns $X$, place the eigenvectors corresponding to the k largest eigenvalues (of the covariance matrix $\Sigma$) as columns into $\Phi$, and then calculate the dimensionally reduced returns $D$ of the principal components.</span><br /><span style="font-family: Arial, Helvetica, sans-serif;">$$D = [\Phi^T(X-M)^T]^T$$</span><br /><span style="font-family: Arial, Helvetica, sans-serif;">where $M$ is the mean vector of $X$.</span><br /><span style="font-family: Arial, Helvetica, sans-serif;">5) Build prediction model - Following Campbell, Lo and MacKinlay (1997), we regress future accumulated log returns on the last H-period sums of the dimensionally reduced returns in the principal component space:</span><br /><span style="font-family: Arial, Helvetica, sans-serif;">$$r_{t+1}+\dots+r_{t+H}=\beta_1\sum_{i=0}^{H} D_{t-i,1}+\dots+\beta_{k}\sum_{i=0}^{H} D_{t-i,k}+\eta_{t+H,H}$$</span><br /><span style="font-family: Arial, Helvetica, sans-serif;">Which can be represented in matrix form as:</span><br /><span style="font-family: Arial, Helvetica, sans-serif;">$$S = \hat{D_t}B.$$</span><br /><span style="font-family: Arial, Helvetica, sans-serif;">To form the mean-value signal we add back the mean</span><br /><span style="font-family: Arial, Helvetica, sans-serif;">$$\hat{S}=S + M_t$$</span><br /><span style="font-family: Arial, Helvetica, sans-serif;">The base assumption is that the principal components explain the main risk factors that should drive the stock's returns in a systematic way, and the 
residuals are the noise we will try to get rid of. If we see that the last H-period accumulated log-returns have been higher than the signal, we assume that the stock is overvalued and thus place a sell order. Thus the final signal is $\hat{S} - \sum_{i=0}^{H} r_{t-i}$.</span><br /><span style="font-family: Arial, Helvetica, sans-serif;"><br /></span><span style="font-family: Arial, Helvetica, sans-serif;">Since this is a liquidity providing strategy, trading costs should hurt relatively less. A lag of 1 second is assumed.</span><br /><span style="font-family: Arial, Helvetica, sans-serif;"><br /></span><span style="font-family: Arial, Helvetica, sans-serif;"><span style="background-color: #ffe599;">Results</span> : The strategy shows a negative Sharpe of -1.84 for 2009 with a drawdown of -65% at an annualized volatility of 16.6%. Returns are positive for the first quarter and then negative.</span><br /><h4><span style="font-family: Arial, Helvetica, sans-serif;">Ch3: Regime switching model</span></h4><span style="font-family: Arial, Helvetica, sans-serif;">The mean reversion model by itself is not profitable at all times. Changes in market behavior have to be determined beyond the fair value (sentiment?). The two main regimes in which the market works are momentum and mean-reversion. Under the momentum regime we expect the returns to further diverge from the theoretical returns. The adaptive market hypothesis is particularly applicable to the high frequency world, where 'poker' is played and irrationality would be common, Lo and Mueller (2010).</span><br /><span style="font-family: Arial, Helvetica, sans-serif;"><br /></span><span style="font-family: Arial, Helvetica, sans-serif;">Since the principal components are the main risk factors, they are the ones that can explain the two regimes. The momentum regime is related to the sprouting of dislocations in the market - measured by the cross sectional volatility of the principal components, $\sigma_D(t)$. 
The key observation is: <i>as the short term changes in $\sigma_D$ appeared to be more pronounced (identified by very narrow peaks in the $\sigma_D$ time series), cumulative returns from the basic mean-reversion strategy seemed to decrease (i.e. momentum sets in).</i> Changes in $\sigma_D$ over time are defined by $\psi = d\sigma_D/dt$ and the cumulative returns of the basic strategy by $\rho(t)$. We define the measure $E_H$ as:</span><br /><span style="font-family: Arial, Helvetica, sans-serif;">$$E_H(t)=\sqrt{\sum_{i=0}^H[\psi(t-i)]^2}.$$</span><br /><span style="font-family: Arial, Helvetica, sans-serif;">There is a pretty consistent negative correlation between $E_H(t-1)$ and $\rho(t)$. This allows for the identification of the strength of the mean reversion strategy in the next second. The sprouting of principal components dislocation at time t triggers momentum at time t+1. The regime switching strategy would then follow $E_H(t)-E_H(t-1)$ at time t. If this value is greater than zero, we understand that the dislocation is increasing and we trade on 'momentum'; otherwise we stick to 'mean-reversion' behavior. We see that 'momentum' seems to be linked to the 'acceleration' of $\sigma_D$.</span><br /><span style="font-family: Arial, Helvetica, sans-serif;"><br /></span><span style="font-family: Arial, Helvetica, sans-serif;"><span style="background-color: #ffe599;">Results</span>: After applying the regime switching conditioning the Sharpe is +7.67 (2009) with a max drawdown of -1.45% at 10.03% annualized volatility.</span><br /><h4><span style="font-family: Arial, Helvetica, sans-serif;">Ch4: Potential improvements</span></h4><span style="font-family: Arial, Helvetica, sans-serif;">1) Clustering - A different set of stocks may give us very different PCs. 
To counter that we can cluster the stocks into smaller buckets, each characterized by their PCs.</span><br /><span style="font-family: Arial, Helvetica, sans-serif;">2) Selection of Eigenvectors - Selection of eigenvectors could be better. One can get a time series of the number of eigenvectors that maximizes the Sharpe at a given time, and then run an auto-regressive model to determine the number of eigenvectors to use in future.</span><br /><span style="font-family: Arial, Helvetica, sans-serif;">3) NLS - Instead of taking a simple sum of the returns, one can do a weighted sum using beta functions, changing it from an OLS to an NLS problem. </span><br /><span style="font-family: Arial, Helvetica, sans-serif;">4) Other - Numerical speed can be gained by using SVD instead of covariance matrix computations, by using the Levenberg-Marquardt algorithm for NLS, and by using GPUs.</span><br /><h4><span style="font-family: Arial, Helvetica, sans-serif;">Ch5: Conclusions</span></h4><span style="font-family: Arial, Helvetica, sans-serif;">Alpha source found between ultra high frequency and traditional statistical arbitrage environment. </span><br /><div><br /></div>Mhttp://www.blogger.com/profile/04296475349435748706noreply@blogger.com0tag:blogger.com,1999:blog-3299057174197331019.post-33381823765278974982015-06-19T04:12:00.000-07:002015-08-16T07:40:09.962-07:00Trends in Quantitative Finance: Fabozzi, Focardi and Kolm (2006)<script type="text/x-mathjax-config"> MathJax.Hub.Config({tex2jax: {inlineMath: [['$','$'], ['\\(','\\)']]}}); </script> <script src="http://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML" type="text/javascript"> </script><div style="text-align: justify;"><span style="color: #38761d;">As the last few lines of the preface for this article say, it is "an excellent and comprehensive survey of the challenges one meets in using quantitative methods for portfolio construction and forecasting." 
This is for a non-technical audience but extremely relevant to technical people too, as an appetizer. The right 'putting pin on the board' article to read, before you indulge in your own personal niche research.</span></div><h4 style="text-align: justify;">Ch 1: Forecasting financial markets</h4><div style="text-align: justify;">A price/return process is predictable if its distribution depends on the present information set and is unpredictable if its distribution is time-invariant. The market's partial predictability is theoretically inevitable. Markets are not made up of rational agents but of agents practicing bounded rationality.</div><h4 style="text-align: justify;">Ch 2: General Equilibrium Theories - concepts and applicability</h4><div style="text-align: justify;">Market equilibrium is not unique. The same asset in various states can have different equilibrium values.</div><h4 style="text-align: justify;">Ch 3: Extended framework for Applying Modern Portfolio Theory</h4><div style="text-align: justify;">Robust mean-variance optimization is computationally efficient and can be formulated as a second-order cone problem. Approaches are - empirical, factor based, clustering, Bayesian, stochastic volatility. Departure from normality is addressed using Monte Carlo. The chapter concentrates on dispersion and downside risk.</div><h4 style="text-align: justify;">Ch 4 : Equity Tactical strategies (2d-2m)</h4><div style="text-align: justify;">CAPM (2 fund separation theorem) is an unconditional, static model. The world is conditional, dynamic. Bounded rationality and asymmetry of information create repeated patterns.</div><div style="text-align: justify;"><br /></div><div style="text-align: justify;">1) <span style="background-color: #ffe599;">delayed response</span>: Leader companies are affected by news first, which then diffuses to lagging companies, Kanas and Kouretas (2005). 
Bhargava and Malhotra (2006) use co-integration to give a definitive answer on the distribution of agent responses to the same information. </div><div style="text-align: justify;">2)<span style="background-color: white;"> </span><span style="background-color: #ffe599;">momentum</span>: 3 to 12 months. Lo and MacKinlay (1990) analyze a tool to detect whether momentum depends on individual-asset or cross-auto-correlation effects.</div><div style="text-align: justify;">3) <span style="background-color: #ffe599;">Reversal</span>: less used and more potential. Timing the end of momentum is a necessary ingredient. Jegadeesh and Titman (1993, 2001) and Lewellen (2005) document it.</div><div style="text-align: justify;">4) <span style="background-color: #ffe599;">Co-integration and Mean Reversion</span>: Returns, being stationary, can't be co-integrated; prices can. Identifying pairs is the common approach but not the most successful. Common trends and co-integration relationships should be explored overall.</div><div style="text-align: justify;"><br /></div><div style="text-align: justify;">Static models are not predictive, dynamic ones are. Only simple dynamic models can be statistically estimated. Complex models are restricted by limited data and by risk and transaction cost considerations.</div><div style="text-align: justify;"><br /></div><h4 style="text-align: justify;">Ch 5: Equity Strategic Ideas (mths - yrs)</h4><div style="text-align: justify;"><span style="background-color: #ffe599;">Aggregation over time</span>: the fractal idea is not completely right. Regressive and auto-regressive models are defined by the time horizon of correlation and auto-correlation decay. GARCH does not remain invariant after time aggregation. Regime-shift models also exhibit a time scale. High frequency correlations can be used to estimate long term correlations. 
</div><div style="text-align: justify;"><span style="background-color: #ffe599;">Market behavior at different time horizon</span>: Days and weeks – depends on trading practices and how traders react to news, long run – quantity of money, global economic performance. </div><div style="text-align: justify;"><span style="background-color: #ffe599;">Recognizing regime shifts</span>: discrete shift (e.g. Hamilton model) vs continuous shift (e.g. GARCH). </div><div style="text-align: justify;"><span style="background-color: #ffe599;">Estimating Approximate models on moving windows</span>: Once a regime break is detected, the moving window should become agile to account for that. A static model causes exponentially diverging prices, which empirically follow a Pareto distribution. A dynamic model can do proper long term modeling. Linear models can't capture long term regime shifts, but only periodic movements with a fixed period.</div><div style="text-align: justify;"><br /></div><div style="text-align: justify;">Nonlinear coupling of two dynamic models has been successful – GARCH, Hamilton, Markov. </div><div style="text-align: justify;"><br /></div><div style="text-align: justify;"><span style="background-color: #ffe599;">Mean Reversion of log of prices</span>: compounding – If the auto-correlation is less than 1, the process oscillates around a trend.</div><div style="text-align: justify;"><span style="background-color: #ffe599;">Variance ratio test</span>: If the variance grows less than linearly with the horizon, there is mean reversion (Lo and MacKinlay 1988). The test differentiates a trend-stationary process from a random walk with a linear trend. 
The variance of a random walk keeps on growing with time, but the variance of a trend-stationary process remains constant.</div><div style="text-align: justify;"><span style="background-color: #ffe599;">Central tendency </span>- Stock prices can only have a linear or a stochastic trend.</div><div style="text-align: justify;"><span style="background-color: #ffe599;">Time diversification</span> – Less risky in the long run than in the short run. </div><div style="text-align: justify;"><br /></div><h4 style="text-align: justify;">Ch 6: Machine Learning</h4><div style="text-align: justify;">Machine learning is progressive learning. AI is useful, but its merits and limits are now more clearly understood. The link between algorithmic process and creative problem solving is the concept of searching. </div><div style="text-align: justify;"><span style="background-color: #ffe599;">Neural Network</span> – To generalize well, the number of layers and nodes has to be restricted. Hertz, Krogh and Palmer (1991) give a mathematical introduction. Used widely with mild success. </div><div style="text-align: justify;"><span style="background-color: #ffe599;">SVM</span> – Performance generally superior to NN. </div><div style="text-align: justify;">Classification and regression trees – satisfactory results using CART.</div><div style="text-align: justify;"><span style="background-color: #ffe599;">Genetic Algorithms</span> – satisfactory to remarkable prediction. Used in asset allocation (Armano, Marchesi and Murru 2005). Used to select predictors for NN (Thawornwong and Enke 2004).</div><div style="text-align: justify;"><span style="background-color: #ffe599;">Text mining </span>- Automatic text handling has shown promise. </div><div style="text-align: justify;">The possibility of replacing intuition and judgment is remote. ML offers more extensive and nonlinear handling of a broader set of data. 
 </div><div style="text-align: justify;"><br /></div><h4 style="text-align: justify;">Ch 7: Model Selection, Data snooping, over-fitting, and model risk </h4><div style="text-align: justify;">Alpha ideas need creativity, but testing and analysis of models is a well-defined method with a scientific foundation. Simpler models are always better than complex models with equal explanatory power. There is a trade-off between model complexity and forecasting ability. To avoid looking for chance patterns, one must stick rigorously to the paradigm of statistical tests. Exceptional patterns (over a smaller subset) are generally spurious. Good practice calls for testing any model against a surrogate random sample generated with the same statistical characteristics as the empirical sample. Out-of-sample testing should be the norm. Trying a new model on already used data will 'always involve some data snooping'. Re-sampling and cross-validation should be used for parameter selection. </div><div style="text-align: justify;"><br /></div><div style="text-align: justify;">If the market is evolutionary (changing slowly), the model can be calibrated on moving windows. If it has regime shifts, Markov switching models can be used, or alternatively random coefficient models (Longford 1993), which average the results of different models. </div><div style="text-align: justify;"><br /></div><h4 style="text-align: justify;">Ch 8: Predictive Models of Return</h4><div><div style="text-align: justify;">Risk and returns are partially predictable. The more difficult question is how return predictability can be turned into a profit - to keep the risk-return trade-off positive. 
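<br /><br />A minimal sketch of what "partially predictable" can look like, combined with the out-of-sample discipline from Ch 7: a one-factor lagged regression is fitted on the first half of a synthetic sample and scored on the second half. Everything here is an assumption for illustration and not taken from the book: the data-generating process, the slope of 0.2, and the names <code>ols_fit</code> and <code>out_of_sample_r2</code>.

```python
import random


def ols_fit(x, y):
    """Return (intercept, slope) of a one-predictor least-squares fit."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    slope = cov_xy / var_x
    return mean_y - slope * mean_x, slope


def out_of_sample_r2(x, y, intercept, slope, benchmark):
    """1 - SSE(model) / SSE(benchmark) on data not used for fitting."""
    sse_model = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    sse_bench = sum((b - benchmark) ** 2 for b in y)
    return 1.0 - sse_model / sse_bench


rng = random.Random(0)  # fixed seed so the run is repeatable
n_obs = 10000
# Hypothetical predictive factor observed at time t.
factor = [rng.gauss(0.0, 1.0) for _ in range(n_obs)]
# Synthetic return at t+1: weak dependence on the factor (assumed slope 0.2) plus noise.
next_return = [0.2 * f + rng.gauss(0.0, 1.0) for f in factor]

# Fit on the first half only; the second half is held out.
half = n_obs // 2
intercept, slope = ols_fit(factor[:half], next_return[:half])
# Benchmark forecast: the mean return of the fitting sample.
train_mean = sum(next_return[:half]) / half
r2_oos = out_of_sample_r2(factor[half:], next_return[half:], intercept, slope, train_mean)

print(f"estimated slope: {slope:.3f}")
print(f"out-of-sample R-squared: {r2_oos:.4f}")
```

With these assumed parameters the true R-squared is only about 4%, so even a correctly specified model leaves most of the variation in returns unexplained. Whether such an edge survives costs and risk is the harder question raised above.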
</div></div><div><div style="text-align: justify;"><br /></div></div><div style="text-align: justify;">1) <span style="background-color: #ffe599;">Regressive models</span> - regress returns on predictive factors.</div><div style="text-align: justify;">Two types of dependence of Y on X - first, both the distribution and the expected value of Y depend on X - second, the distribution of Y depends on X, but the expected value of Y does not. The concept of regression does not imply any notion of time - time dependence must be checked, e.g. correlation between noise terms. R-squared measures the goodness of fit.</div><div style="text-align: justify;">a) <span style="background-color: #fff2cc;">Static regressive models</span> - not predictive, like CAPM, but uncover unconditional dependence.</div><div style="text-align: justify;">b) <span style="background-color: #fff2cc;">Predictive regressive models</span> - lagged predictive regression. Distributed lag models uncover rates of change.</div><div style="text-align: justify;"><br /></div><div style="text-align: justify;">2) <span style="background-color: #ffe599;">Linear Auto-regressive models</span> - a variable is regressed on its own lagged past values. The single-variable case is called AR and the multi-variable case is called VAR. Vector auto-regressive models can capture cross-auto-correlations, but should be used with few factors to keep the number of parameters small. Two types:</div><div style="text-align: justify;">a) <span style="background-color: #fff2cc;">stable VAR models</span> - generate stationary processes. The response of a stable VAR model to each shock is a sum of exponential decays (roots with modulus less than 1) - e.g. EWMA, with the most recent shock having the most effect. Some solutions may be oscillatory and some damped, but all remain stationary and exhibit auto-correlation. </div><div style="text-align: justify;">b) <span style="background-color: #fff2cc;">unstable VAR models</span> - explosive (root modulus greater than 1) or integrated (root equal to 1). 
In an integrated process, shocks accumulate and never decay (e.g. log price), and error terms are auto-correlated. Generally, the first difference gives a stationary process. A VAR process with individually integrated series may have a linear combination that is stationary - this is called <b>co-integration</b> (Engle and Granger 1987). Regression of non-stationary (e.g. trending) time series can't rely on R-squared or detrending. If both sides are integrated processes, test for co-integration to ascertain that the regression is meaningful. For n integrated processes with k co-integrating relationships (of possible 1 to n-1), there will be n-k common trends such that every other solution can be expressed as a linear regression on these common trends. </div><div style="text-align: justify;"><br /></div><div style="text-align: justify;">3) <span style="background-color: #ffe599;">Dynamic Factor Models</span> - predictive regressive models where the predictive factors follow a VAR model. These are compact formulations of state-space models with observable and hidden state variables. The motivation is generally to reduce dimensionality. Both integrated and stable processes can be modeled. The solutions are sums of exponentials. The ability to mix levels and differences adds to forecasting possibilities. </div><div style="text-align: justify;"><br /></div><div style="text-align: justify;">4) <span style="background-color: #ffe599;">Hidden-variable models</span> - the best known linear state-space models are ARCH/GARCH. Nonlinear state-space models include the MS-VAR (Markov switching vector auto-regressive) family, e.g. the Hamilton model (Hamilton 1989) - one random walk with drift for periods of economic expansion and another with a smaller drift for periods of economic recession - a regime-switching model. </div><div style="text-align: justify;"><br /></div><div style="text-align: justify;"><h4>Ch 9: Model Estimation</h4>Determining the parameters of the model. Robust estimation is becoming more important to discount noisy inputs. 
Being a function of the data, an estimator is a random variable and hence has a sampling distribution. Finite data size is a challenge. Estimation always has a probabilistic interpretation. Three methods:<br />1) Least-squares estimation - OLS method of best projection on a subspace.<br />2) Maximum likelihood - maximizing the likelihood of the sample given an assumption about the underlying distribution.<br />3) Bayesian estimation - the posterior distribution is proportional to the prior distribution multiplied by the likelihood. One needs to know the form of the distribution.<br /><br />Robust estimation<br /><span style="background-color: #ffe599;">Matrices</span> - PCA can be used to reduce dimension, select a smaller set of factors and reduce noise.<br /><span style="background-color: #ffe599;">Regression models</span> - the optimal linear estimator is OLS; if residuals are normal it coincides with ML. Auto-correlation of residuals does not invalidate OLS, but makes it less efficient. It can be made more robust by making it less sensitive to outliers. Variable selection can be done by penalized objective functions (Ridge, Lasso, elastic net). Clustering makes estimates more robust by averaging - shrinkage, random coefficient models, contextual regressions (Sorensen, Hua, Qian 2005).<br /><span style="background-color: #ffe599;">Vector auto-regressive models</span> - multivariate OLS. Under specific assumptions and limits ML can be used. Unstable VAR estimation is generally concerned with co-integration detection and uses state-of-the-art ML methods (Johansen 1991, Banerjee and Hendry 1992). Bayesian VAR (BVAR) can be used, e.g. the Litterman (1986) method.<br /><span style="background-color: #ffe599;">Linear hidden-variable models</span> - Kalman filter. 
Two main methods - ML based and subspace based.<br /><span style="background-color: #ffe599;">Nonlinear hidden-variable models</span> - MS-VAR can be estimated using the EM algorithm.<br /><br /><h4>Ch 10: Practical considerations with optimizers</h4>Software is available, but understanding the procedure is important for robustness and accuracy. Fortunately, in finance most problems have one unique optimal solution. Standard forms are - linear, quadratic, convex, conic, nonlinear. Markowitz's optimization is quadratic. Quadratic, convex and conic problems have unique solutions.<br /><br />Solving the optimization problem - formulate the problem, choose an optimizer, solve the problem. Re-sampled optimization can be used.<br /><br /><h4>Ch 11: Industry survey</h4>US $4 trillion in assets under management. 2005. US and European managers.<br /><br /><span style="background-color: #ffe599;">Equity return forecasting techniques</span> - Simple methods with economic intuition are preferred. Momentum and reversal are most widely used. Regression using predictors like financial ratios is the bread and butter. There is a desire to combine company fundamentals with sentiment. Auto-regressive, co-integration, state-space, regime-switching and nonlinear methods (NN, DT) are central in some companies. Growing interest in high-frequency data. <br /><span style="background-color: #fff2cc;">Models based on exogenous predictors</span> - operating efficiency, financial strength, earnings quality, capital expenditure etc. as the core bottom-up equity model. Fundamentals are combined with momentum/reversal models.<br /><span style="background-color: #fff2cc;">Momentum and reversal models</span> - use multiple time horizons, most widely used, turnover is a concern - weighting and penalty functions are used to mitigate it.<br /><span style="background-color: #fff2cc;">co-integration models</span> - performance sensitive to liquidity and volatility. 
Many firms use it as it models short-term dynamics and long-term equilibrium.<br /><span style="background-color: #fff2cc;">Markov-switching/regime-switching models</span> - not used widely as market timing is difficult to predict.<br /><span style="background-color: #fff2cc;">auto-regressive models</span> - Not been widely evaluated! A step ahead of momentum models. Caution against over-fitting is advised.<br /><span style="background-color: #fff2cc;">state-space models</span> - not widely used.<br /><span style="background-color: #fff2cc;">nonlinear models</span> - potential in NN, DT. Some already use them.<br /><span style="background-color: #fff2cc;">models of higher-moment dynamics</span> - e.g. GARCH, used by very few.<br /><br /><span style="background-color: #ffe599;">Model risk mitigation techniques</span> - mostly covariance matrix estimation.<br /><span style="background-color: #fff2cc;">Bayesian estimation</span> - hard to implement explicitly but used implicitly.<br /><span style="background-color: #fff2cc;">Shrinkage/Averaging</span> - widely used.<br /><span style="background-color: #fff2cc;">random coefficient models</span> - hardly used. Randomization of data is used to guard against over-fitting.<br /><br /><span style="background-color: #ffe599;">Optimization techniques</span> - half the firms use them, the other half distrust them.<br />Robust optimization - re-sampling methods.<br />Multistage stochastic optimization - hardly used.<br /><br /><h4>Ch 12: Today and tomorrow</h4>Changing markets are a reality. Supervision is required. Regression is king today. Text mining may be becoming hot.</div>