Quantitative Finance Research is in Trouble: The p-hacking issue

Research in quantitative finance is not like research in other fields. In an article published last year on recent trends in empirical finance, Marcos Lopez de Prado exposes some of the fundamental issues that the financial research community faces nowadays and explains why the field is turning into a mass of “cold fusion claims” where it is becoming more and more difficult to advance towards true knowledge. In today’s post I am going to talk about the current state of empirical finance, the huge problem of p-hacking, and how you might want to approach research in finance in order to avoid going down “rabbit holes” created by misuse of the scientific method. I will also give you some advice on how to avoid p-hacking your own results.


What is the problem with research in quantitative finance? The field we work in has two extremely important characteristics that set it apart from other research domains. The first is that quantitative finance relies very heavily on hypothetical testing (seeing how things would have worked in the past) to get results. The second is that there is an incentive to avoid sharing knowledge, since sharing it may both invalidate it going forward and negate the monetary gain of the person who had it in the first place. But why do these two characteristics cause such important problems in actual research?

The use of hypothetical testing has an important problem: the entire process is not available to those who review research articles. For example, let’s suppose that I performed a search through 1 billion trading systems and found 1 that worked very well. I then look into that single system, look at the systems around it and design a search process that yields the system I found plus, say, another one thousand that look rather similar. When I write my research article I don’t say that I first searched through a billion systems; I only say that I searched through a thousand, and I make up some ad-hoc reason for that selection. When the probability that my results come from chance is evaluated against just that one-thousand-system search, the statistical hypothesis tests say that my p value is really low, so the article reviewers can rest assured that I didn’t find the system just by chance.


What I just did is p-hacking. Since I obtained a result that in reality required 1 billion trials but told people I had only done one thousand, I am introducing a heavy bias that would probably make my results irrelevant as a whole. The case I just described is clear-cut, but the same thing may happen in subtler ways. Say you have been looking for a system for 5 years and you only found it when you started using some particular setup that you hadn’t thought of before. You may only consider the last few trials as relevant (you may even consider that you needed those years of experience to arrive at your result), but you are also doing a subconscious form of p-hacking, where your results appear relevant only because you are ignoring all the previous research efforts you carried out. All that previous research might well contain the number of trials needed to arrive at the end result just by chance.
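To put rough numbers on the example above, here is a minimal sketch of how much the trial count matters. It assumes the trials are independent (real system searches usually are not, so treat this as an illustration rather than an exact correction) and uses the Šidák formula, which gives the probability that at least one of N no-edge trials reaches a given single-test p value. The function name and the 1e-7 p value are my own choices for illustration; they do not come from any article.

```python
import math


def sidak_adjusted_p(p_single: float, n_trials: int) -> float:
    """Probability that at least one of n_trials independent trials with no
    real edge reaches a p value as low as p_single (Sidak correction).

    Computes 1 - (1 - p_single) ** n_trials using log1p/expm1 so that very
    small p values do not get lost to floating-point rounding.
    """
    return -math.expm1(n_trials * math.log1p(-p_single))


# Hypothetical single-test p value of the "best" system (assumed for illustration).
p_best = 1e-7

reported = sidak_adjusted_p(p_best, 1_000)          # what the article claims
actual = sidak_adjusted_p(p_best, 1_000_000_000)    # what was really searched

print(f"adjusted p for 1 thousand trials: {reported:.6f}")
print(f"adjusted p for 1 billion trials:  {actual:.6f}")
```

Under these assumptions the same system looks very unlikely to be a chance finding when only one thousand trials are declared, and almost certain to be a chance finding once the full billion trials are counted.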

The pressure to publish that people in formal quantitative research work under does not make this any easier. If you want to publish research you need your p values to be low, and this may involve forms of conscious and unconscious p-hacking where the relevance of repeated testing is played down in order to reach the needed p values. Since you cannot get into someone’s head to know how many trials they did (maybe not even they know), it is nearly impossible to know whether the statistical tests and p values reported in financial research articles are real or just the result of crafted experimental methodologies filled with data-mining bias. This bias could be attenuated by accounting for the number of trials and by duplicating all research efforts on random data, but this can become nearly impossible, as the number of trials is not always known and the entire research methodology is not always strictly reproducible.
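As an illustration of what duplicating a research effort on random data can look like, here is a minimal sketch. It assumes that each “system” is simply a series of daily returns, that random data means Gaussian noise with zero mean, and that the search picks the system with the highest annualized Sharpe ratio. The function name and all parameters are hypothetical; a real check would rerun your actual search procedure on randomized data instead.

```python
import math
import random
import statistics


def best_sharpe_on_noise(n_strategies: int, n_days: int = 252, seed: int = 0) -> float:
    """Best annualized Sharpe ratio found among n_strategies series of pure
    noise, that is, among systems that have no edge by construction."""
    rng = random.Random(seed)
    best = -math.inf
    for _ in range(n_strategies):
        # Daily returns drawn from zero-mean Gaussian noise (assumed 1% daily volatility).
        returns = [rng.gauss(0.0, 0.01) for _ in range(n_days)]
        sharpe = statistics.mean(returns) / statistics.stdev(returns) * math.sqrt(252)
        best = max(best, sharpe)
    return best


# The more systems the search goes through, the better the best one looks,
# even though none of them has any real edge.
for n in (1, 10, 100, 1000):
    print(n, round(best_sharpe_on_noise(n), 2))
```

If the best result of your real search is not clearly better than what the same search finds on noise, the result is most likely a product of data-mining bias.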


Another important problem is that the lack of incentive to share knowledge turns quantitative finance into a “lone wolf” field, something that would be really strange in other scientific domains. While average papers in fields like physics and chemistry may have 4-6 authors, in quantitative finance there are rarely more than 2, and many papers are published by a single author. This means that no other party can vouch for the research methodology and no one is able to guard the author against subconscious p-hacking. Working in teams has great advantages, since it helps establish a methodology that is far more reproducible and that can avoid important methodological pitfalls that would be easy to fall prey to when conducting research alone. While a lone researcher might be able to “fool” him or herself, a team of researchers is much more likely to see any bias issues and promptly correct them. Dishonest behavior (conscious p-hacking) is also much rarer when working in groups.

If you want to avoid these problems in your own research, I would suggest you establish a clear research methodology and then have that methodology examined by some peers; you might not even have to share any results. If you are doing things in a manner that may lead to excessive amounts of bias, your peers might be able to let you know. Of course, it is much better if you have collaborators who will take all the steps with you, as people usually put in a much greater share of effort when their skin is also in the game. You will also want to do confirmatory research and have peers reproduce your results without your intervention in order to ensure that your methodology is reproducible. If you would like to learn more about establishing a research methodology for trading and how you too can create systems with a statistically sound process in mind, please consider joining Asirikuy.com, a website filled with educational videos, trading systems, development and a sound, honest and transparent approach towards automated trading.
