13 January, 2017

A list of quant funds / tickers

Keeping Track of Quant Funds
Columbia Threadneedle publishes a list of quant funds/Bloomberg tickers which presumably it holds in portfolio. Probably this is a regulatory requirement. These are not representative of quant funds and they do not appear to be in any way the best quant funds.

I have cut and pasted a list from Jan 17, just in case the link goes inactive.


Columbia Threadneedle Jan 2017 Quant Fund Indices.

Columbia Threadneedle Quant Fund Indices (updated monthly)

12 January, 2017

Backtest overfitting, charlatanism and touchy-feely math heuristics

A few readings on overfitting

The basis for these papers is sound. A significant amount of overfitting is done in the name of so-called quant or smart beta funds. In typical statistical or econometric models, methodologies to avoid this have evolved over the years, from statistical testing procedures and information criteria to various forms of cross-validation (and bootstrap confidence interval derivations), but almost all of these methods require an actual forecast. Yet in many algorithmic trading strategies there is no forecast, only an allocation of weight for the strategy. (We will discuss in a later post why this approach of estimating a weight directly, rather than estimating a forecasting model which is then adapted to produce a weight, is in many ways much more efficient.) The strategy itself is judged by a number of factors, chief among them the Sharpe ratio. The issue is that Sharpe ratios, when optimised in-sample (IS), may be spurious, and the resulting strategies lead to poor out-of-sample (OOS) performance.
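
As a toy illustration of the point (my own sketch, not taken from any of the papers), the following simulates a batch of pure-noise strategies, selects the one with the best in-sample Sharpe ratio, and checks it out of sample. All names and parameters are illustrative.

```python
# Toy illustration: selecting the best in-sample Sharpe ratio among
# pure-noise strategies and checking it out of sample.
import numpy as np

rng = np.random.default_rng(0)

n_strategies = 200          # number of candidate strategies tried
n_is, n_oos = 252, 252      # one year in-sample, one year out-of-sample (daily)

# Daily returns of strategies with zero true edge.
returns = rng.normal(0.0, 0.01, size=(n_strategies, n_is + n_oos))

def ann_sharpe(r):
    """Annualised Sharpe ratio of a daily return series."""
    return np.sqrt(252) * r.mean() / r.std(ddof=1)

is_sharpes = np.array([ann_sharpe(r[:n_is]) for r in returns])
best = is_sharpes.argmax()

print(f"Best in-sample Sharpe   : {is_sharpes[best]:.2f}")
print(f"Its out-of-sample Sharpe: {ann_sharpe(returns[best, n_is:]):.2f}")
# The in-sample winner typically shows a Sharpe well above 1 purely by luck,
# while its out-of-sample Sharpe is scattered around zero.
```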

While p-hacking, overfitting and irreproducibility have been studied for some time, including in finance (Lo and MacKinlay, Halbert White's Bootstrap Reality Check test, Romano-Wolf, etc.), the topic seems to have recently been rediscovered by a group of mathematicians and by Campbell Harvey. We will review the recent papers, try to put them in context and explain the minimal contribution they make to current understanding. Finally, we will look into Campbell Harvey's more recent Lucky Factors analysis, which not only properly contextualises his analysis but also helps advance the theory of the search for data-mined (equities) quant factors.



Bailey et al - Financial Charlatanism
There are many approaches to dealing with this overfitting. Bailey et al consider one approach to strategies which leads to some interesting heuristics but is in some sense far from practicable. In their paper, they take a large-sample approach to Sharpe ratios. Given a distribution of returns which is normal iid, they quote Andy Lo's result on the asymptotic sampling distribution of the Sharpe ratio (as is typical, it is asymptotically normal). But given that strategies maximise IS Sharpe ratios, it is more relevant to consider the maximum Sharpe ratio (rather than $E[\widehat{SR}]$ and $\mathrm{stdev}[\widehat{SR}]$). In particular, they look at the asymptotic distribution of the maximum of Sharpe ratios on iid $N(0,1)$ returns, i.e., those for which the expected value is zero. Using a standard result from EVT they find the distribution of $E[\max_n x_n]$ for $x_n$ distributed as the z-score of an $N(0,1)$ iid sample. This is tantamount to considering multiple independent trials which are meant to depict strategies.

Given this EVT-style distribution and its time-scaling, the authors are able to put forward a minimal backtest length, for a given number of independent strategies, such that the maximum Sharpe ratio across strategies stays below 1. Again, they consider strategies that should have a Sharpe ratio of 0 (OOS it is expected to be zero) and then look at the expected maximum in-sample Sharpe. For a given set of N strategies they want to find the minimum backtest length (minBTL) such that the max Sharpe ratio is below 1. Despite the headline claim that you need at least a certain number of data points to backtest a given number of strategies, it is not clear whether this is really achieved within the scope of their definitions.
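
The flavour of the argument can be sketched with the textbook approximation that the expected maximum of N independent standard normals is roughly sqrt(2 ln N). This is my rough reconstruction, not the authors' exact expression.

```python
# Rough sketch of the minimum-backtest-length idea (an approximation, not the
# paper's exact formula).  Under the null, the annualised Sharpe estimate of a
# strategy backtested over y years is roughly N(0, 1/y); the expected maximum
# of N independent standard normals is roughly sqrt(2 ln N), so
#   E[max IS Sharpe] ~ sqrt(2 ln N) / sqrt(y),
# and requiring this to stay below a cap of 1 gives y > 2 ln N.
import numpy as np

def min_backtest_years(n_trials, sharpe_cap=1.0):
    """Years of data needed so the expected max in-sample Sharpe over
    n_trials independent null strategies stays below sharpe_cap."""
    return 2.0 * np.log(n_trials) / sharpe_cap**2

for n in (10, 100, 1000):
    print(f"{n:5d} trials -> at least {min_backtest_years(n):.1f} years")
```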

While instructive, there is almost nothing in the presentation which is actually applicable. Strategies are rarely chosen from a discrete set and then optimised. The only discrete strategies I can conceive of are moving average rules (22 days or 35 days?), which are sometimes constrained to have only a few parameter choices.

As the authors say, anything continuous is econometric and beyond their scope of study. I would claim that all models ever considered are econometric (EWMAs are econometric; vol scaling is econometric even if the underlying rules are single MA rules). Moreover, how can we define independence for strategies, as though they were random samples? Is a crossing moving average rule of 30-60 days independent of a crossing moving average rule of 15-20 days? We have no guidance, no formal definitions. In fact, I would say that the paper is not even what we might deem to be mathematics, since the results are merely heuristic. Interesting, but close to useless.

The paper is partly a rant against overfitting with so-called technicals. It is partly diatribe. While I am altogether sympathetic to this critique, that the PAMS should be a vehicle for diatribe is, IMHO, quite inappropriate.


Harvey-Liu, Backtesting
Harvey takes quite a different tack and grounds his approach in more familiar multiple-testing results. He appears unaware of the significant literature on multiple testing in finance (partly because the papers were in the econometrics literature rather than the financial econometrics literature). Nonetheless, Harvey's explanations are sound and reasonable. Harvey considers Sharpe ratios to have t-distributions, so that each Sharpe ratio is effectively a test that the expected excess return of a strategy is above zero, and with this he can define a p-value. Harvey then considers the theory of multiple tests.
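
A minimal sketch of the mapping as I read it: with T return observations, the t-statistic for the hypothesis of zero mean excess return is roughly the per-period Sharpe ratio times sqrt(T). The sample sizes below are illustrative.

```python
# Sketch: mapping a Sharpe ratio to a t-statistic and a one-sided p-value
# (test that the mean excess return is above zero).
import numpy as np
from scipy import stats

def sharpe_pvalue(annual_sharpe, n_years, periods_per_year=12):
    """p-value for an annualised Sharpe ratio estimated from monthly data."""
    n_obs = int(n_years * periods_per_year)
    per_period_sr = annual_sharpe / np.sqrt(periods_per_year)
    t_stat = per_period_sr * np.sqrt(n_obs)       # t-stat for mean > 0
    return stats.t.sf(t_stat, df=n_obs - 1)       # one-sided p-value

print(f"p-value for SR=0.5 over 10y of monthly data: "
      f"{sharpe_pvalue(0.5, 10):.3f}")
```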

We first review testing and errors. A Type I error is a false positive; a Type II error is a false negative. So if, in reality, a hypothesis is false but using our method we claim it to be true, this is a Type I error. If the hypothesis is true and we mistakenly claim it to be false, it is a Type II error.

To make it more complicated, in statistics we usually have a null hypothesis, $H_0$, which we want to test. If we mistakenly reject $H_0$ when it is actually true, it is a Type I error. We usually control Type I errors by controlling the significance level of a test statistic. If we choose 5% significance we are saying we are OK with making such an error in one out of every 20 trials. If we choose 1% we are only OK with making an error in one out of every 100 trials.

The problem with multiple testing is that we cannot perform 100 trials and just choose the one test which happens to pass. The probability that at least one test has a p-value below our significance cutoff increases dramatically as we increase the number of tests. The testing procedure is valid for just a single test. If one wants to consider multiple, possibly correlated, tests, then the resulting p-values are incorrect and need some adjustment.
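
To make the "at least one lucky test" point concrete, a quick sketch: with k independent tests each run at the 5% level, the chance of at least one false positive is 1 - 0.95^k.

```python
# Probability of at least one false positive across k independent tests,
# each run at a 5% significance level.
alpha = 0.05
for k in (1, 10, 20, 100):
    p_any = 1 - (1 - alpha) ** k
    print(f"{k:4d} tests -> P(at least one false positive) = {p_any:.2f}")
# 1 test -> 0.05, 10 tests -> 0.40, 20 tests -> 0.64, 100 tests -> 0.99
```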

Harvey, given the sample size and a predetermined number of models under consideration (no concept of independent strategies here, a la Bailey), works on an adjustment to all Sharpe ratios. Three standard adjustments are considered: Bonferroni, Holm and Benjamini-Hochberg-Yekutieli. The first two are adjustments to p-values to prevent multiple (possibly correlated) tests from producing even one Type I error. If the number of Type I errors is denoted by $N_r$, then we are controlling the Family-Wise Error Rate, $FWER = P\{N_r \geq 1\}$. The BHY adjustment is meant to control the False Discovery Rate, $FDR = E[N_r/R]$, where $R$ is the total number of rejections (discoveries) and $N_r$ is the number of false positives among them.

Using any of the above p-value adjustment methods (take raw p-values and alter them to control either FWER or FDR, at the cost of the power of the test statistics), Harvey takes Sharpe ratios and, after adjusting the p-values, reapplies the t-distribution to derive adjusted Sharpe ratios which haircut multiple tests and are meant to be more robust OOS.
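
A hedged sketch of the haircut idea (my reconstruction, not Harvey and Liu's code): adjust the raw p-values with statsmodels, then map the adjusted p-values back through the t-distribution to obtain haircut Sharpe ratios. The sample size and Sharpe values are made up.

```python
# Sketch of the Sharpe "haircut": raw Sharpes -> p-values -> multiple-testing-
# adjusted p-values -> adjusted (haircut) Sharpes.  Numbers are illustrative.
import numpy as np
from scipy import stats
from statsmodels.stats.multitest import multipletests

n_obs = 120                                    # e.g. 10 years of monthly data
annual_sharpes = np.array([0.4, 0.6, 0.8, 1.0, 1.2])
per_period = annual_sharpes / np.sqrt(12)

t_stats = per_period * np.sqrt(n_obs)
raw_p = stats.t.sf(t_stats, df=n_obs - 1)      # one-sided p-values

for method in ("bonferroni", "holm", "fdr_by"):   # fdr_by = Benjamini-Yekutieli
    _, adj_p, _, _ = multipletests(raw_p, alpha=0.05, method=method)
    adj_t = stats.t.isf(np.minimum(adj_p, 0.9999), df=n_obs - 1)
    haircut_sr = np.maximum(adj_t, 0) / np.sqrt(n_obs) * np.sqrt(12)
    print(method, np.round(haircut_sr, 2))
```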

The paper is quite intuitive and is worth a read. It does not offer much new to the scientific literature but is a good introduction. 

White and others
Unfortunately, Harvey was not aware of the significant work by White and others on adjustments to p-values which correct for multiple testing while retaining higher power. It turns out the Holm and Bonferroni methods are quite extreme and result in a large loss of power; consequently, in using them one makes lots of Type II errors and rejects strategies that actually work. White takes this into account by estimating the correlation between the tested hypotheses via the bootstrap (a rough sketch of the idea follows the list below). The resulting method can be used for any number of models to determine whether at least one of them is significant, has a positive excess return, etc. White's method has been used and reused:

  • To debunk day-of-the-week effects
  • To debunk hundreds of technical trading models
  • To show that momentum does indeed offer significant positive returns
  • To determine the set of models which outperform (Romano-Wolf). In fact, this was a major innovation and improved quite considerably upon White's method
  • To determine model confidence sets: which models do indeed outperform others, so we can rank them, and which groups have indeterminate ranking (Model Confidence Sets)
  • To debunk all the equities factors being farmed almost continuously
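
As referenced above, here is a rough sketch of the bootstrap idea behind White's Reality Check as I understand it: resample the performance series of all candidate strategies jointly (preserving their correlation), recentre under the null of no outperformance, and compare the observed best performance to the bootstrap distribution of the best. This is a simplified iid-bootstrap version, not White's exact procedure (he uses a stationary bootstrap); the example data are made up.

```python
# Simplified sketch of the Reality Check idea (iid bootstrap rather than the
# stationary bootstrap White actually uses).
import numpy as np

def reality_check_pvalue(returns, n_boot=2000, seed=0):
    """returns: (T, K) array of excess returns for K candidate strategies."""
    rng = np.random.default_rng(seed)
    T, K = returns.shape
    observed_max = returns.mean(axis=0).max()
    centred = returns - returns.mean(axis=0)        # impose the null of no edge
    boot_max = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, T, size=T)            # joint resampling keeps the
        boot_max[b] = centred[idx].mean(axis=0).max()   # cross-correlation
    return (boot_max >= observed_max).mean()

# Example: 50 noise strategies plus one with a small genuine edge.
rng = np.random.default_rng(1)
rets = rng.normal(0, 0.01, size=(1000, 51))
rets[:, 0] += 0.001
print(f"Reality Check p-value: {reality_check_pvalue(rets):.3f}")
```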

References

Harvey-Liu, Backtesting
Harvey-Liu, Lucky Factors
Hansen-Lunde-Nason, The Model Confidence Set

23 February, 2014

Islamic Finance - Links of note

Just a quick link to some of my Islamic Finance publications (which are not otherwise contained on the Islamic Finance Resources website/blog):

Opalesque Islamic Finance Intelligence: Opalesque IFI Archive

There are several articles in there which now get referred to quite a lot (on Istijrar and Existential Risk primarily). The link to the existential risk column is here.

Most of this work was done prior to 2009, when I joined Nomura, although publication may have come later.


19 August, 2013

DSGEs don't define priors!

I cannot think of a more difficult problem than forcing economic data to confirm economic models. Most economists have effectively ceded this and given in to some effective mantle of Bayesianism.

Hence the rise of DSGEs and their use as a means of forming a prior to a Bayesian VAR model.

Rather than use MLE, which is known to be unwieldy when fitting nonlinearly constrained VARs, economists now use the underlying model, the DSGE, to define a prior. (And so do the finance types, e.g. Ang-Piazzesi and the, in the context of data fitting, largely useless machinery of affine models for no-arbitrage restrictions.)

Now let us be clear: a DSGE can in no way define a prior. As an equation, it can at best define a manifold in parameter space. This manifold can be the maximum-likelihood surface or the effective mode of the distribution, but it cannot tell you the metric or anything more about the actual prior distribution.
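
A toy illustration of the point (mine, not drawn from any DSGE paper): a constraint curve in parameter space does not determine a density on it. "Uniform" in one parametrisation of the same curve is not uniform in another, so the restriction alone cannot pin down a prior.

```python
# Toy illustration: the same one-dimensional manifold (the curve
# theta2 = theta1**2 for theta1 in [0, 1]) admits many different "uniform"
# priors depending on how it is parametrised; the constraint alone does not
# determine a density.
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Parametrisation A: theta1 uniform on [0, 1], theta2 = theta1**2.
t1_a = rng.uniform(0, 1, n)

# Parametrisation B: theta2 uniform on [0, 1], theta1 = sqrt(theta2).
t1_b = np.sqrt(rng.uniform(0, 1, n))

# Both samples lie exactly on the same curve, but imply different priors
# on theta1 (and hence on anything downstream of it).
print("P(theta1 < 0.5) under A:", (t1_a < 0.5).mean())   # ~0.50
print("P(theta1 < 0.5) under B:", (t1_b < 0.5).mean())   # ~0.25
```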

So how do economists even begin to think of using DSGEs to form priors for their VARs?

Go figure.

I will post links to books/papers on the topic later on. I am certain that in no way can it truly make complete sense to anyone (unless we can determine in more detail, flesh out, the notion of informative and noninformative "directions" in parameter spaces; even this will be challenging, in that it will be easier to have informative/noninformative subspaces rather than manifolds!).

14 July, 2013

Why must we have priors?

Letter to Andrew Gelman

Andrew
While I am absolutely sympathetic to the Bayesian agenda, I am often troubled by the requirement of having priors. We must have priors on the parameters of an infinite number of models we have never seen before, and I find this troubling. There is a similarly troubling problem in economics with utility theory. Utility is on consumables; to be complete, a consumer must assign utility to all sorts of things they would never have encountered. More recent versions of utility theory instead make consumption goods a portfolio of attributes: Cadillacs are x units of luxury, y of transport, etc., and we can automatically have personal utilities over all these attributes.

I don't ever see parameters. Some models have a few and some have hundreds. Instead, I see data. So I don't know how to have an opinion on parameters themselves. Rather, I think it far more natural to have opinions on the behaviour of models. The prior predictive density is a good and sensible notion; also, if we have conditional densities, as for VARs, then the prior conditional density. You have opinions about how variables interact and about the forecast of some subset conditioning on the remainder. That this may or may not give enough information to pin down a proper prior in parameter space is all the better. To the extent it does not, we must arbitrarily pick one (e.g. a reference prior, or a maxent prior subject to the data/model prior constraints). Without reference to actual data I do not see much point in trying to have any opinion at all.
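
For concreteness, a minimal sketch of what I mean by forming opinions through the prior predictive density rather than on parameters directly; the AR(1) model and the priors here are made up for illustration.

```python
# Minimal sketch of a prior predictive check: draw parameters from a candidate
# prior, simulate data from the model, and judge the prior by the behaviour of
# the simulated data rather than by the parameter values themselves.
import numpy as np

rng = np.random.default_rng(0)

def prior_predictive_paths(n_draws=1000, horizon=40):
    """Simulate AR(1) paths y_t = phi * y_{t-1} + sigma * eps_t, phi and sigma drawn from a prior."""
    phi = rng.uniform(-1, 1, n_draws)          # candidate prior on persistence
    sigma = np.abs(rng.normal(0, 1, n_draws))  # candidate prior on shock scale
    paths = np.zeros((n_draws, horizon))
    for t in range(1, horizon):
        paths[:, t] = phi * paths[:, t - 1] + sigma * rng.normal(size=n_draws)
    return paths

paths = prior_predictive_paths()
# The opinion is formed on observables: e.g. how dispersed does this prior say
# the series should be after 40 periods?
print("Prior predictive std at t=40:", np.round(paths[:, -1].std(), 2))
```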

Please do let me know your thoughts. 

Best
Nick

Nick Firoozye


11 July, 2013

Why aren't there more Bayesian Econometricians?

From a letter to Bryan Caplan, ccing Andrew Gelman.

Bryan,

As a prologue, I cced Andrew Gelman, who is a professor of Poli Sci at Columbia and a very prominent Bayesian statistician. He too has written on the scarcity of Bayesian economics, and I cced him because I thought he would find this email of interest given his very public advocacy of all methods Bayesian in science (and the social sciences!); having had interactions with Andrew before, I know he may even want to post this on his own blog (see http://andrewgelman.com/ -- it is truly high quality).

The comments section on your post (the blog post http://econlog.econlib.org/archives/2009/11/why_arent_acade.html#91157) is closed, which is most natural given its age.

I thought nonetheless I should write to you about Bayesian econometricians as an addendum. Obviously some commentators were wrongly disposed to think economists do not have the bandwidth to understand Bayes' rule. It is a much deeper historical (hysteresis) problem. I would not say that Phillips or Granger do not have the bandwidth, far from it. Surprisingly, they are wrong-headed about their philosophy, as though the Neyman-Pearson-Fisher philosophy (repeated experiments, really now? In economics? Guffaw! LoL!) was the most applicable to economics alone, even though it is increasingly out of favour in the physical sciences. This is so wrong it is hard to fathom. Meanwhile, most economists I speak with know full well how Bayes' rule works even if they have never actually used it in practice!

In physics, many prominent physicists are active supporters of the Many Worlds Interpretation (see http://en.wikipedia.org/wiki/Many-worlds_interpretation) as opposed to the dominant Copenhagen Interpretation -- Hawking among others. Edwin Jaynes claimed that the Copenhagen school, with its god-like Observers, was a form of Mind Projection Fallacy. The big difference is that Jaynes and Jeffreys (and Hawking) all believe that probability is in the mind of the beholder; it measures our own lack of information. For Einstein, God does not roll dice: reality is all deterministic even if the equations are not.

I don't see why economics should be different. We do not know how economic agents will act. Does that make them "random"? As though randomness were some physical property assigned to possible states of the world? No, not in any way. Randomness is in the mind of the person who does not know all. De Finetti and Ramsey, Savage and Jeffreys (and Richard Jeffrey, see e.g. http://en.wikipedia.org/wiki/Richard_Jeffrey) were all onto something very real, even when it was quashed by the (then) orthodoxy of the Fisherian school with all its triple negatives ("we cannot fail to reject the null hypothesis of X" -- tell me, who really knows what that means?). Bayes must be accepted if only because Ockham's razor tells us the interpretation is far easier in the long run. Meanwhile, interpreting probabilities using DBAs (Dutch Book Arguments) a la Ramsey gives them a very real (risk-neutral) pricing interpretation. Uncertainty then is an inability to find a unique mid. It all makes sense in a self-consistent economics-interpretation sort of way. Why rely on ontological arguments for the existence of some other-worldly probability that has physical meaning when it need only be a unique price, a la information markets!?

What is truly shocking is that statistics departments have come full circle and are full of Bayesians or agnostics (e.g. Donald Rubin and Brad Efron), who use both Bayesian and (modern) frequentist methods (e.g. bootstrap, cross-validation, frequentist nonparametrics) whenever they suit the problem at hand. But some of the best Bayesian stats goes on outside statistics departments (e.g. in Poli Sci, see Andrew Gelman's website, http://andrewgelman.com/, or Simon Jackman, http://jackman.stanford.edu/blog/, or in Physics and Comp Sci departments, usually in areas of Machine Learning, e.g. David MacKay, Judea Pearl, et al). Nonetheless, this world outside economics is Bayes-friendly! (See e.g. http://videolectures.net/ for a plethora of pro-Bayes lectures from CS and other areas, and see http://www.youtube.com/watch?v=8oD6eBkjF9o for the reception that the likes of Google give authors who advocate Bayes' rule -- author S. B. McGrayne of The Theory That Would Not Die.) Bayesian methods now form the basis for much of Machine Learning, probably the most successful version of Artificial Intelligence there is, giving good reason for their prominence in CS departments.

But econometrics departments are full of die-hard frequentists -- Phillips and Granger and Gourieroux et al, with their shockingly complex assumptions, complex mathematics, large-sample theory (when was an economics sample ever that large? The only large sample is the one they generate using their computers!), etc. -- surprisingly wedded to their philosophy.

We can probably count the Bayesian econometricians on the fingers of one hand -- Zellner, Sims, Koop, Lancaster -- maybe that's it? (See e.g. http://en.wikipedia.org/wiki/Bayesian_econometrics.) I don't understand why! Some areas (e.g. I(0)/I(1) stationarity, cointegration theory) do not lend themselves to Bayesian analysis (Sims famously called it a prior on a set of measure zero, to which Phillips replied that the Jeffreys prior (for a hypothesis, not a model??? I never heard of a Jeffreys prior for a hypothesis before or after this paper!) should in fact have a Dirac mass appropriately placed). In the end, Sims' analysis is more elegant, less formalistic, easier to understand and allows plenty of explosive forecasts in the predictive densities! Some Bayesians have tried to wed the two, although it makes for unnatural priors concentrated on cones (of rank-one matrices!). Much easier to forget it all and use a Minnesota prior!
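
Since the Minnesota prior comes up: a bare-bones sketch (mine, and heavily simplified) of what it amounts to in practice, namely ridge-style shrinkage of VAR coefficients toward a random walk, with a conjugate normal prior giving a closed-form posterior mean. The hyperparameters and data are made up.

```python
# Bare-bones sketch of a Minnesota-style prior on a VAR(1): shrink each
# equation's coefficients toward a random walk (own lag = 1, other lags = 0),
# which with a normal prior reduces to ridge regression.  Heavily simplified;
# hyperparameters and data are illustrative.
import numpy as np

rng = np.random.default_rng(0)

# Fake data: a 3-variable VAR(1) on near-unit-root series.
T, k = 200, 3
Y = np.cumsum(rng.normal(size=(T, k)), axis=0)
X, y = Y[:-1], Y[1:]

lam = 0.2                                   # overall tightness
prior_mean = np.eye(k)                      # random-walk prior: coefficient matrix = I
prior_prec = np.eye(k) / lam**2             # tighter shrinkage as lam -> 0

# Posterior mean of the coefficient matrix, equation by equation
# (ridge / conjugate-normal formula, assuming unit residual variance).
A_post = np.linalg.solve(X.T @ X + prior_prec,
                         X.T @ y + prior_prec @ prior_mean)

print("Posterior own-lag coefficients:", np.round(np.diag(A_post), 2))
```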

And surprisingly, virtually every economist and finance PhD I have ever spoken with talks regularly about their priors and posteriors in reference to models, to visions of reality, etc. All of their principal-agent models have some sort of prior or posterior in them even if Bayesian updating is not explicit. It figures well into the way they see reality. Most theoretical economists/finance PhDs will have complicated (DSGE) models for reality which, when it comes down to it, justify a few RHS variables in a simple regression; they do an OLS, or a GLS if they're really sophisticated, and are done with it. They opt for ease (and linearity) in their PhD theses! In fact, I think F. Black was one of the few theoreticians who didn't seem to take on Bayes wholeheartedly (see his paper on Noise -- http://www.e-m-h.org/Blac86.pdf). He speaks of knowledge all the time but forgets Bayes! It is clear that epistemological interpretations are far easier than others.

But the econometricians are all die-hard. Why? It is econometricians who are the truly egregious sinners here, not wanting to state what they believe because, in the end, it sounds just plain ridiculous. This state of affairs is enough to make one opt for the religious fervour of the Bayesians, as one can see in the zeal of Zellner or the stridency of Sims (to alliterate!). I can only imagine it is because the math is harder. As a former math (asst) prof who used to do lots of convergence theory (weak-star in the sense of measures, weak in this or that Sobolev sense, etc.), there is an appeal to the mostly impenetrable and completely unintuitive analysis. Quasi-MLE and simulated method of moments. Wow. Amazing what you can do with millions of computer-generated observations. Too bad we only have one historical path in reality -- the stock market only did as it did once and we can't seem to generate any other reasonable histories. As a realist, I see little sense in the methods.

For me, a PhD in Applied Math (in nonlinear PDE) who never used a statistical method before learning the (frequentist) methods in fashion on Wall St, Bayesian theory and philosophy came as a major relief. No more triple negatives. No more having to explain the craziness of hypothesis testing and why you chose 10% rather than 5% or 1% as a significance level (because I got the results I wanted????). I only wish I could be more of a Bayesian in practice!!!

I might even liken the use of Bayesian methods in economics to the use of Sabermetrics in baseball. There will come a time when the revolution happens and the frequentists will just have to stand back and give in, or be swamped.

Should we make it an active plan to promote Bayesian methods in economics departments? To grow a new generation of econometricians who are freed from the shackles of the overly formalistic large-sample theory favoured by frequentists? Should all intro courses in econometrics be taught out of Koop (http://www.amazon.co.uk/Bayesian-Econometrics-Gary-Koop/dp/0470845678) or Lancaster (http://www.amazon.co.uk/Introduction-Modern-Bayesian-Econometrics/dp/1405117206/ref=sr_1_4?s=books&ie=UTF8&qid=1373363994&sr=1-4&keywords=bayesian+econometrics)? Should we actively work on the demise of a school of philosophy (frequentist theory, or its "bastard child", propensity theory) which is so obviously lacking in any sensible interpretation?

Any comments or answers to my queries would be most welcome.

Best regards,

Nick Firoozye

_____________________________________________

Jaynes on the Mind

Edwin Jaynes: some refs 

The major thesis: randomness is all in the mind. Epistemological interpretations of probability. Bayesian objectivist and sometimes subjectivist (a la de Finetti and Ramsey). A big believer in Jeffreys and the use of his objectivist priors or maxent methods for fixing (maximally uninformative) priors.


The book:
Probability Theory: The Logic of Science: Principles and Elementary Applications Vol 1 [Hardcover] (only volume 1 available)

The entire book as PDF:

The website:
http://bayes.wustl.edu/ (worth a look, links to all).

 
 
http://en.wikipedia.org/wiki/Edwin_Thompson_Jaynes Real strident fellow. Think he may have been a Republican? Or maybe it was just the fashion of the times!

Jaynes' attack on Copenhagen Interpretation

Refs on Interpretations of Quantum

Note that many interpretations, including the now-popular many-worlds interpretation, do not require a belief in physical randomness and are instead fully deterministic.

 
Interpretation | Author(s) | Deterministic? | Wavefunction real? | Unique history? | Hidden variables? | Collapsing wavefunctions? | Observer role? | Local? | Counterfactual definiteness? | Universal wavefunction exists?
Ensemble interpretation | Max Born, 1926 | Agnostic | No | Yes | Agnostic | No | None | No | No | No
Copenhagen interpretation | Niels Bohr, Werner Heisenberg, 1927 | No | No | Yes | No | Yes | Causal | No | No | No
de Broglie–Bohm theory | Louis de Broglie, 1927, David Bohm, 1952 | Yes | Yes | Yes | Yes | No | None | No | Yes | Yes
von Neumann interpretation | John von Neumann, 1932, John Archibald Wheeler, Eugene Wigner | No | Yes | Yes | No | Yes | Causal | No | No | Yes
Quantum logic | Garrett Birkhoff, 1936 | Agnostic | Agnostic | Yes | No | No | Interpretational | Agnostic | No | No
Many-worlds interpretation | Hugh Everett, 1957 | Yes | Yes | No | No | No | None | Yes | No | Yes
Popper's interpretation | Karl Popper, 1957 | No | Yes | Yes | Yes | No | None | Yes | Yes | No
Time-symmetric theories | Yakir Aharonov, 1964 | Yes | Yes | Yes | Yes | No | No | Yes | No |
Stochastic interpretation | Edward Nelson, 1966 | No | No | Yes | No | No | None | No | No | No
Many-minds interpretation | H. Dieter Zeh, 1970 | Yes | Yes | No | No | No | Interpretational | Yes | No | Yes
Consistent histories | Robert B. Griffiths, 1984 | Agnostic | Agnostic | No | No | No | Interpretational | Yes | No | No
Objective collapse theories | Ghirardi–Rimini–Weber, 1986, Penrose interpretation, 1989 | No | Yes | Yes | No | Yes | None | No | No | No
Transactional interpretation | John G. Cramer, 1986 | No | Yes | Yes | No | Yes | None | No | Yes | No
Relational interpretation | Carlo Rovelli, 1994 | No | No | Agnostic | No | Yes | Intrinsic | Yes | No | No