The latest joint residential seminar was held in Windsor over two days and included many interesting talks and tutorials around the theme of "New Data, New Methods." The conference included keynote presentations from Campbell Harvey (Duke University) on a protocol for Quantitative Research, providing a great foundation for discussion around research methodologies and processes; and Allan Timmermann (University of California, San Diego) on break risk in asset returns.
Here we provide a brief overview of the talks; more details, including speaker bios, slides and papers, can be found at the bottom of this page (login to view them).
List of Talks
Global Market Inefficiencies - Sohnke Bartram
Dr Bartram presented a paper that focused on measuring the degree to which markets are efficient by determining the returns to a fair-value estimate.
The fair value estimate explains pricing using balance sheet, income statement and cash flow statement information at each point in time, using almost 26,000 price observations from 1993 to 2016. The anomaly associated with this mispricing signal is then evaluated across six regions. The Q5 (undervalued) signal exhibits high book-to-price, low beta, and generally negative price momentum exposures, resulting in a quintile spread of around 0.2% per month in the US to above 1% in Emerging markets. The results indicate that Asia Pacific and Emerging markets in particular are less efficient and hence offer more opportunity to benefit from mispricing. In general, high transaction costs lead to higher alpha, confirming the inefficiency hypothesis.
The alpha is robust in Fama-MacBeth regressions controlling for common factors, using both OLS and Theil-Sen time-series approaches to estimate betas, confirming about 0.4% additional return for Emerging markets versus the US and global developed markets. The research then experiments with holding periods and analyses alpha decay; Emerging markets again offer an opportunity to benefit from lower trading frequency, as they exhibit slow alpha decay and high transaction costs, so longer holding periods improve net performance greatly.
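The Theil-Sen approach mentioned above is simply the median of all pairwise slopes, which makes beta estimates far less sensitive to outliers than OLS. A minimal numpy sketch (the function name and data are illustrative, not from the paper):

```python
import numpy as np

def theil_sen_beta(x, y):
    """Theil-Sen estimator: the median of all pairwise slopes.
    Robust to outliers, unlike an OLS beta."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    slopes = []
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            if x[j] != x[i]:
                slopes.append((y[j] - y[i]) / (x[j] - x[i]))
    return np.median(slopes)

# Illustration: true slope 1.5, with one corrupted observation.
rng = np.random.default_rng(0)
x = rng.normal(size=100)
y = 1.5 * x + 0.1 * rng.normal(size=100)
y[0] += 50.0                     # a single large outlier
beta_ts = theil_sen_beta(x, y)   # stays close to 1.5
```

The O(n^2) pairwise loop is fine for monthly return histories; for very long series a randomised subset of pairs is the usual shortcut.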
Quant Research Protocol - Campbell Harvey
There has been amazing growth in the power of our ability to search for and fit models. Methods such as IBCC (Independent Bayesian Classifier Combination) appear to offer real promise, but as the search space grows so do the challenges of controlling the research process. While it will always be hard for investors to judge research quality, perhaps a research protocol or checklist could help?
Does the model have a solid economic foundation?
Does it look resilient to economic change, or crowding?
Were data handled and features selected in a principled way?
Is the cross-validation process credible?
Does the model consider structural change and resilience?
How do your methods handle the curse of dimensionality?
Does the research culture reward quality, or only success?
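On the cross-validation point above, a credible process for financial data fits only on the past and tests on the immediately following block, so no future information leaks into the training set. A minimal sketch (the function name and parameters are our own, not from the talk):

```python
import numpy as np

def expanding_window_splits(n_obs, n_folds, min_train):
    """Yield (train_idx, test_idx) pairs for walk-forward validation:
    the training window always ends strictly before the test block."""
    test_size = (n_obs - min_train) // n_folds
    for k in range(n_folds):
        train_end = min_train + k * test_size
        test_end = min(train_end + test_size, n_obs)
        yield np.arange(train_end), np.arange(train_end, test_end)

splits = list(expanding_window_splits(100, 4, 20))
for train_idx, test_idx in splits:
    assert train_idx.max() < test_idx.min()   # no look-ahead
```

Shuffled k-fold splits, by contrast, let each fold train on the future of its own test set, which is exactly the kind of leakage a protocol should catch.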
What's Wrong with Robo-Advisors? - Andrew Rudd
Having observed the industry for the last 25 years, Andrew summarised issues that have arisen with robo-advising. He highlighted that characterising investor risk preferences requires more than quantifying the standard deviation of returns. He recommended the use of a more behaviourally based questionnaire to elicit investor utility. He also raised his concern that the advising role is usually separated from other key aspects of financial planning - such as taxation, retirement, and estate planning.
He encouraged i) a wider use of asset classes, ii) a greater focus on inflation, and iii) greater consideration of “crisis” events when building portfolios.
Andrew argued that the characteristics of a typical robo-advised portfolio do not match those of portfolios designed for retirement planning, and recommended that data such as savings and spending rates be incorporated into recommended retirement portfolios.
In summary, Andrew felt it was better to use a combination of human and robo-advising, rather than either in isolation. To replicate the former, he thought machine learning could be applied to robo-advising to ensure the discipline adapts to events that impact investors.
Deep Learning Models for Time Series Data - Kevin Webster
Deep learning approaches are at the cutting edge of applications from handwriting and speech recognition to driving cars and playing games. This concentrated review spanned RNNs, LSTMs, autoregressive networks, masked and dilated convolutions, attention alignment scores, and latent variable modelling. The slides contain links to leading packages, key papers, and approachable blog discussions of these fast-evolving areas in machine learning.
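To give a flavour of one building block from the list, here is a sketch of a dilated causal convolution: output at time t depends only on the present and past, and stacking layers with doubling dilations grows the receptive field exponentially, as in WaveNet-style architectures. The code is our own illustration, not from the talk:

```python
import numpy as np

def dilated_causal_conv(x, weights, dilation):
    """1-D causal convolution with dilation: y[t] combines
    x[t], x[t-d], x[t-2d], ... and never looks ahead."""
    y = np.zeros(len(x))
    for t in range(len(x)):
        for i, w in enumerate(weights):
            j = t - i * dilation
            if j >= 0:
                y[t] += w * x[j]
    return y

# Feed a unit impulse through three layers with dilations 1, 2, 4:
# the impulse response spreads over 8 steps (receptive field 1+1+2+4).
x = np.zeros(16)
x[0] = 1.0
h = x
for d in (1, 2, 4):
    h = dilated_causal_conv(h, [0.5, 0.5], d)
```

Real implementations (e.g. in PyTorch or TensorFlow) express this as a strided tensor operation with learned weights; the loop above just makes the causality and dilation pattern explicit.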
Crypto-Currency Volumes - Daniele Bianchi
Cryptocurrency markets are an interesting setting for studying the informativeness of trading volume for future prices because they are highly fragmented, attract a diverse set of traders and investors, and trading is continuous and unregulated. This paper examines the in-sample predictive ability of trading volume for high-frequency future returns. Daniele's research follows previous work suggesting a predictive relationship exists in other markets, but depends on trading motives (hedging v. speculation) and the likelihood that trading is informed. The reported findings fail to establish a direct predictive role of trading volume in cryptocurrency data. However, return reversal is significant, and trading volume interacts significantly with contemporaneous returns in predicting future returns. The results suggest a profitable trading rule based on return reversal and trading volume over the sample period (01/17-05/18). However, the likelihood of failure of cryptocurrencies is also a potentially significant and important factor.
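A hypothetical rule in the spirit of these findings might bet against the previous period's return, but only when the previous period's volume was unusually high. A sketch (the function name, threshold and conditioning are our own assumptions, not the paper's specification):

```python
import numpy as np

def reversal_volume_signal(returns, volume, vol_quantile=0.8):
    """Position at time t: opposite sign to the t-1 return,
    taken only when t-1 volume exceeds its vol_quantile level."""
    returns = np.asarray(returns, float)
    volume = np.asarray(volume, float)
    threshold = np.quantile(volume, vol_quantile)
    signal = np.zeros_like(returns)
    # only information available at t-1 is used to set the t position
    signal[1:] = -np.sign(returns[:-1]) * (volume[:-1] > threshold)
    return signal
```

Note that using the full-sample volume quantile is an in-sample simplification, matching the in-sample nature of the evidence reported; a live rule would need a trailing estimate of the threshold.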
Extracting Meaning from Unstructured Text - Steven Young
Identifying aspects of sentiment is the most common objective of the analysis of financial narratives (text), but applications reported in the academic literature generally have very short horizons. Steven's talk addressed a range of questions, including whether we can be more creative in textual analysis, for example by identifying misreporting by companies, CEO personality traits, changing management concerns, conflicting narratives, risk exposures, or financial constraints. On human timescales, the possibility that these more subtle features of the investment environment leave a footprint in financial narratives is potentially interesting. Computational analysis can be used to identify topics and cases for human evaluation, rather than just being viewed as self-contained models.
To successfully apply cutting-edge computational linguistics methods, carefully designed preprocessing, stemming and disambiguation steps and a suitable dictionary are necessary, and annotation and tagging are highly desirable. All should be chosen to reflect your domain knowledge and the research objectives at hand.
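A toy illustration of such a pipeline, using only the Python standard library and a made-up negative-word dictionary (a real application would use a finance-specific lexicon such as Loughran-McDonald, plus proper stemming and disambiguation):

```python
import re
from collections import Counter

# Illustrative word lists -- NOT a real finance lexicon.
NEGATIVE = {"loss", "impairment", "litigation", "decline"}
STOPWORDS = {"the", "a", "of", "and", "in", "to", "we"}

def preprocess(text):
    """Lowercase, tokenise, drop stopwords -- the kind of step the
    talk stresses must be designed with domain knowledge in mind."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]

def negative_tone(text):
    """Fraction of retained tokens that appear in the negative dictionary."""
    tokens = preprocess(text)
    counts = Counter(tokens)
    hits = sum(counts[w] for w in NEGATIVE)
    return hits / len(tokens) if tokens else 0.0
```

Even this crude bag-of-words score shows why dictionary choice dominates results: change the word lists and the tone measure changes with them.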
Dynamic Bayesian Forecasts for FX Rates - Rainer Schussler
2^9 = 512 vector autoregressive (VAR) models were constructed. Each model applies different priors, which effectively impose differing soft constraints on the VAR parameters and consequently on model complexity. Dynamic model selection is used to choose among these, so that forecasts are based on the model that has recently been most successful; the most appropriate definition of "recent" is also determined endogenously by the model and turns out to be on the scale of 3-5 months.
Monthly G10 FX data from 1973-2016 is used, with interest rate parity, recent market returns, yield curve slope, and the oil price, creating a multivariate-t forecast distribution for all rates. Roughly half of the time the winning model is a random walk with no structure at all, so the model is identifying relatively brief pockets of predictability. It averages a Sharpe ratio of around 0.9.
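The selection mechanism can be illustrated with a discounted-likelihood sketch: score each model by an exponentially discounted sum of its past log predictive likelihoods and forecast with the current leader, with the discount factor playing the role of the "recent" window. This is our own stylised illustration, not the paper's exact algorithm:

```python
import numpy as np

def select_model(log_pred_lik, delta=0.95):
    """Pick, at each date, the model with the highest exponentially
    discounted sum of past log predictive likelihoods. Delta near 1
    looks far back; smaller values emphasise recent fit."""
    T, M = log_pred_lik.shape
    score = np.zeros(M)
    choice = np.zeros(T, dtype=int)
    for t in range(T):
        choice[t] = np.argmax(score)      # decided before seeing period t
        score = delta * score + log_pred_lik[t]
    return choice

# Model 0 fits well early, model 1 late; a fast discount switches quickly.
log_pred_lik = np.array([[1.0, 0.0], [1.0, 0.0], [1.0, 0.0],
                         [0.0, 1.0], [0.0, 1.0], [0.0, 1.0]])
chosen = select_model(log_pred_lik, delta=0.5)
```

Choosing delta itself by the same discounted-likelihood criterion is the natural way to make the "recent" horizon endogenous.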
Break Risk - Allan Timmermann
Combining multiple time series gives a huge increase in the ability to detect breaks or regime changes, provided that the breaks occur at the same times across the different series, even if the nature of the break differs. The model presented for stock returns allows both mean and covariance to shift at breaks and uses reversible-jump MCMC, because a birth/death process is required for breaks. It is nonetheless highly efficient: conditional on break times the distribution is conjugate, so large universes can be handled. Priors ensure plausible R^2s, Sharpe ratios and smoothness.
A century of monthly US stock data suggests 10 breaks, each detected within about 3 months, and once made the break estimates appear relatively stable. Examining industries; size and momentum; or size and value portfolios suggests breaks at similar dates.
Sparse Macro Factors - David Rapach
Sparse PCA has dimension-reduction performance comparable to PCA, while improving interpretability by, in their case, allowing only 12 nonzero loadings per component. Applying this technique to 120 macro variables in FRED-MD, they obtain 10 apparently meaningful factors, including factors relating to term structure, inflation, manufacturing output, housing, unemployment, and optimism. Fitting a VAR(1) model to these, the researchers obtain innovations for the factors, and a three-pass regression suggests that some of these innovations are more successful predictors of returns than the innovations associated with traditional PCA components.
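One simple route to a sparse leading component is the truncated power method: ordinary power iteration on the covariance matrix, but keeping only the largest-magnitude loadings at each step. This is a sketch of the general technique, not the authors' implementation:

```python
import numpy as np

def sparse_first_pc(X, n_nonzero, n_iter=200):
    """Leading sparse principal component of data matrix X via
    truncated power iteration: after each multiplication by X'X,
    only the n_nonzero largest-magnitude loadings are retained."""
    C = X.T @ X
    v = np.ones(C.shape[0]) / np.sqrt(C.shape[0])
    for _ in range(n_iter):
        v = C @ v
        keep = np.argsort(np.abs(v))[-n_nonzero:]
        mask = np.zeros_like(v)
        mask[keep] = 1.0
        v = v * mask
        v /= np.linalg.norm(v)
    return v

# Two correlated "signal" columns among four weaker noise columns:
# the sparse loading vector should pick out exactly the signal pair.
rng = np.random.default_rng(0)
f = rng.normal(size=200)
X = np.column_stack([f, 0.9 * f + 0.1 * rng.normal(size=200)]
                    + [0.5 * rng.normal(size=200) for _ in range(4)])
v = sparse_first_pc(X, n_nonzero=2)
```

The hard cap on nonzero loadings is what buys interpretability: each factor is a small, nameable combination of macro series rather than a blend of all 120.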
Optimal data identification in nonstationary systems - Jakob Krause
Any time series modelling task must consider the trade-off between using longer series, which may be less relevant, and shorter ones, which give noisier estimates. Modelling a nonstationary time series as Brownian motion with time-varying standard deviation allows the window length to be chosen in a principled manner, whether the process evolves gradually or is subject to discrete breaks.
One demonstration considered the pro-cyclical impact of bank regulators using shorter periods for Expected Shortfall estimation in response to rising volatility. Another attempted to improve the construction of a minimum variance portfolio.
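The underlying trade-off is easy to see in a stylised example: after a volatility break, a short trailing window tracks the new level while a long one is still contaminated by pre-break data. Here deterministic alternating "returns" of known standard deviation stand in for Brownian increments, so the bias is exact rather than sampled:

```python
import numpy as np

# Stylised increments whose standard deviation jumps from 1 to 3 at t=500.
x = np.concatenate([np.tile([1.0, -1.0], 250), np.tile([3.0, -3.0], 250)])

def rolling_vol(series, window):
    """Trailing standard deviation over the last `window` observations."""
    return np.array([series[t - window:t].std()
                     for t in range(window, len(series) + 1)])

# Estimates at the end of the sample (t = 1000, true sigma = 3):
short = rolling_vol(x, 50)[-1]    # window entirely post-break: exact
long_ = rolling_vol(x, 800)[-1]   # 300 pre-break points drag it down
```

Here the long window yields sqrt((300*1 + 500*9)/800) = sqrt(6) ≈ 2.45 against the true 3.0; with noisy data the short window would instead pay in variance, which is precisely the trade-off the talk formalises.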
Forecasting tomorrow's covariance - Ruy Ribeiro
Covariance estimated from 5-minute returns today is tough to beat, and most methods of incorporating older data do worse. The authors claim an improvement on the scale of 15-25% by using well-shrunk contributions from the covariance estimates for the last 20 days, flags indicating membership of 10 sectors, and a market factor. Six additional factors such as size and value are also considered but appear to add rather little. The key techniques used are LASSO and HAR, with estimation performed on the log covariance matrix to ensure the forecast remains PSD.
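The log-covariance trick works because the matrix exponential of any symmetric matrix is symmetric positive-definite, so forecasts made in log-space always map back to a valid covariance matrix. A sketch via eigendecomposition (our own illustration, not the authors' code):

```python
import numpy as np

def logm_sym(S):
    """Matrix logarithm of a symmetric positive-definite matrix."""
    w, V = np.linalg.eigh(S)
    return V @ np.diag(np.log(w)) @ V.T

def expm_sym(L):
    """Matrix exponential of a symmetric matrix; the result is
    always symmetric positive-definite by construction."""
    w, V = np.linalg.eigh(L)
    return V @ np.diag(np.exp(w)) @ V.T

S = np.array([[1.0, 0.3],
              [0.3, 2.0]])
L = logm_sym(S)
# Any linear model applied to L (here, shrinkage toward zero)
# still maps back to a valid covariance matrix.
S_shrunk = expm_sym(0.5 * L)
```

This is why the forecasting model can be fit freely in log-space: no constraint handling is needed to keep the output a usable covariance.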