The Numerai hedge fund: Paying anyone who can lend a crystal ball

Do you have data-mining skills that no one is paying you to use? Do you have a GPU cluster sitting idle in the corner because your last machine learning job has already finished? Do you want to make some extra cash using your absurdly good feature engineering skill set? If you are a professional or hobbyist data scientist and you want to make some additional money in finance – without knowing anything about trading – then the answer is certainly Numerai, a new hedge fund idea that seeks to pool the power of individuals to create better and better machine learning strategies for trading the financial markets. In today’s post I want to talk about Numerai, how it works, what my experience with it has been and how you too can join this effort to test your machine learning skills.


Numerai is a strange project. Instead of hiring data scientists to build models for its private use, the team at Numerai decided that it would be better to simply open the curtains and let anyone attempt to make valuable predictions. They decided that the value of machine learning lies not in the models themselves but in the predictions that come out of them, provided those predictions have at least some useful accuracy. This means that Numerai is not interested in your model – in your crystal ball – but only in what your crystal ball has to say. You can keep it hidden and never share it, and just share those glimpses of the future that can make the people at Numerai some dough.

Of course, the problem with this approach is that Numerai cannot share the raw inputs used to make the predictions, because the people with the crystal balls could then simply sell their predictions to someone else or trade on them themselves. For this reason Numerai created an encrypted data set, so that it can share data and get useful predictions while the people with the models have no valuable use for those predictions beyond handing them to Numerai. Numerai then pays the people with the top predictions, who can make as much as 40K USD per year. Most participants make far less than that: most make no money at all, and most of those who do make somewhere between 100 and 300 USD per year.


The cost savings are brilliant. Instead of employing data scientists who probably cost at least 150-250K per year each, you are effectively paying most people pennies on the dollar to spend their own time and money bringing you valuable predictions. The reason this works, which is very interesting, is that it plays on a fundamental human instinct to be better than others and better than oneself, to prove to yourself that you can beat a challenge that many others find hard. This is why you are probably willing to spend a lot of time on the Numerai problem: you want to learn what makes that data tick and then be able to say “look at what I achieved against so many others”. You cannot get that sense of competition from most jobs. Here you compete on the same data against what are probably some brilliant professional data scientists, academics and hobbyists.

The process for participating is very simple: sign up, download the training data, which is clean (no missing values) and has correctly labeled inputs, use it to train a machine learning algorithm, use that algorithm to make predictions over a testing set (a set of unlabeled inputs) and upload those predictions to the Numerai website. Only models in the top few percent get to control capital on the strength of their low logloss, which is how Numerai ranks your performance. Some other models control capital not by virtue of a low logloss but by virtue of their originality (which Numerai does not define, but I guess it just means that your model’s predictions have very low correlations with those of others). You can tell these models by the blue dot next to them and their much higher earnings per year relative to their neighbors.
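To make the two yardsticks mentioned above a bit more concrete, here is a minimal Python sketch of how a logloss and a correlation between two sets of predictions can be computed. This is only an illustration under my own assumptions: the function names, the clipping constant and the sample numbers are hypothetical, the logloss shown is the standard binary cross-entropy, and the correlation is just my guess at what "originality" might mean. It is not Numerai's actual scoring code.

```python
import math


def log_loss(y_true, y_prob, eps=1e-15):
    """Mean binary cross-entropy (logloss); lower is better.

    y_true: iterable of 0/1 labels.
    y_prob: iterable of predicted probabilities that the label is 1.
    eps:    clipping constant (an assumption) that keeps log() finite.
    """
    y_true, y_prob = list(y_true), list(y_prob)
    if not y_true or len(y_true) != len(y_prob):
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(y_true)


def pearson(a, b):
    """Pearson correlation between two equally long prediction lists."""
    a, b = list(a), list(b)
    if not a or len(a) != len(b):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined for constant predictions")
    return cov / math.sqrt(var_a * var_b)


if __name__ == "__main__":
    # Made-up labels and predictions, purely for illustration.
    labels = [0, 1, 1, 0]
    my_preds = [0.40, 0.60, 0.55, 0.45]
    other_preds = [0.48, 0.52, 0.51, 0.47]
    print("my logloss:", log_loss(labels, my_preds))
    print("correlation with another model:", pearson(my_preds, other_preds))
```

A model that always predicts 0.5 gets a logloss of ln(2), about 0.693, so any useful model has to score below that. If originality really is about low correlation, then a model whose predictions barely move together with everyone else's would be the one that stands out, even if its logloss is not the very best.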


The results I have obtained at Numerai so far put me close to controlling capital, but I have yet to polish my work enough to get there. It is relatively easy to get into the top 200 but it becomes really hard to break into the top 100. In a future post I will talk more about how to process and make predictions using the Numerai data so that you too can give this a try if you want to test your data-mining skills. If you want to learn more about machine learning in trading and how you too can build strategies to trade using continuously trained algorithms please consider joining, a website filled with educational videos, trading systems, development and a sound, honest and transparent approach towards automated trading.
