Sensitivity, Specificity and Return improvements in our Random Forest model

In my last post I discussed some of the characteristics of our RandomForest classification model for the prediction of OS success, and why this model is able to improve average out-of-sample returns in testing sets despite being a rather mediocre classifier on its own. Today we are going to go a bit deeper into this model and look at how a threshold parameter changes the specificity, the sensitivity and how effective the model's predictions are at actually improving average returns. We will talk a bit about what these variations mean and why what might appear to be a more solid classifier is not actually a better choice for improving the average return in the OS. Thanks to Fd for suggesting this topic and for sending me relevant links about the relationship between these variables and probability thresholds.


When talking about classifiers there are usually two variables that are very interesting to study. The sensitivity is how likely your classifier is to classify something as positive when it is in fact positive, while the specificity is how likely it is to classify something as negative when it is in fact negative. In the case of classifiers used to predict trading profitability, a high sensitivity means that you have a very high chance of classifying things that are profitable as profitable, while a high specificity means that you are able to classify things that lose money as losers. Usually you can't have the best of both worlds in complex problems: either you classify almost everything that is positive correctly and end up with a lot of errors when classifying negative outcomes, or the other way around.
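
To make the two definitions concrete, here is a minimal sketch that counts the four possible outcomes and computes both values. The function name and the toy labels are made up for illustration and are not taken from our model.

```python
def sensitivity_specificity(actual, predicted):
    """Return (sensitivity, specificity) for two equal-length boolean sequences.

    True means "profitable", False means "unprofitable".
    """
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a and p)          # winners called winners
    fn = sum(1 for a, p in pairs if a and not p)      # winners called losers
    tn = sum(1 for a, p in pairs if not a and not p)  # losers called losers
    fp = sum(1 for a, p in pairs if not a and p)      # losers called winners
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Toy example (hypothetical data): 4 real winners, 6 real losers.
actual = [True, True, True, True, False, False, False, False, False, False]
predicted = [True, False, False, False, False, False, False, False, False, True]

sens, spec = sensitivity_specificity(actual, predicted)
print(f"sensitivity={sens:.3f} specificity={spec:.3f}")
```

In this toy case the classifier finds only one of the four winners but correctly rejects five of the six losers, which is the same low-sensitivity, high-specificity shape discussed in this post.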

In general we can tune the relationship between the sensitivity and the specificity by changing the probability threshold that a classifier uses to make a decision. Classifiers like random forests can give us the probability that a sample belongs to a specific class, and we can move this decision boundary in order to affect the specificity and the sensitivity. For example, instead of assigning a system to "profitable" whenever its probability of being profitable is above 50%, we can assign it only if the probability is above 55% or 60%. We have then effectively moved the decision boundary so that we will tend to classify more profitable systems as unprofitable. As long as the probabilities themselves do not change, this gives us a lower sensitivity, since fewer samples from the overall population end up in the profitable class.
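
Moving the decision boundary can be sketched in a few lines. The probabilities below are hypothetical. With a library such as scikit-learn they would typically come from the `predict_proba` method of a fitted random forest, but that is an assumption for illustration and not a description of how our model is built.

```python
def classify(probabilities, threshold=0.5):
    """Label a system as profitable only if its probability exceeds the threshold."""
    return [p > threshold for p in probabilities]


# Hypothetical probabilities of being profitable for five systems.
probs = [0.35, 0.48, 0.52, 0.57, 0.63]

at_50 = classify(probs, 0.50)
at_60 = classify(probs, 0.60)

print("threshold 0.50:", sum(at_50), "systems classified as profitable")
print("threshold 0.60:", sum(at_60), "systems classified as profitable")
```

Raising the threshold can only shrink the set of systems labelled profitable, which is why sensitivity falls and specificity rises as the threshold increases.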


The first image above shows what happens when we perform this exercise. As we vary the threshold from 0.3 to 0.6 we can see how the sensitivity and specificity change, and also how the relative improvement in the average daily return changes. It is evident that having a high sensitivity is not useful for our classifier, as this leads to very small improvements, while going above the 95% mark in specificity does not provide any improvement either. From this curve the best improvement in OS returns happens at a threshold of 0.5, where we have a sensitivity close to 8% and a specificity of about 93%.
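
A threshold sweep of this kind can be sketched as follows. The probabilities and returns are toy values invented for the example, "profitable" is assumed to mean a return above zero, and the improvement is measured as the plain difference between the mean return of the selected systems and the mean return of all systems, which may differ from the exact measure used for the plot.

```python
def sweep_thresholds(probs, returns, thresholds):
    """For each threshold, report sensitivity, specificity and the change in
    mean return when trading only the systems predicted as profitable."""
    baseline = sum(returns) / len(returns)
    actual = [r > 0 for r in returns]  # assumption: profitable means return > 0
    positives = sum(actual)            # the toy data contains both classes,
    negatives = len(actual) - positives  # so neither count is zero
    rows = []
    for t in thresholds:
        predicted = [p > t for p in probs]
        tp = sum(a and p for a, p in zip(actual, predicted))
        tn = sum((not a) and (not p) for a, p in zip(actual, predicted))
        selected = [r for r, p in zip(returns, predicted) if p]
        # With nothing selected there is nothing to trade, so no improvement is defined.
        improvement = sum(selected) / len(selected) - baseline if selected else None
        rows.append({
            "threshold": t,
            "sensitivity": tp / positives,
            "specificity": tn / negatives,
            "n_selected": len(selected),
            "improvement": improvement,
        })
    return rows


# Hypothetical model probabilities and out-of-sample mean daily returns.
probs = [0.25, 0.35, 0.40, 0.45, 0.48, 0.52, 0.55, 0.58]
returns = [-0.04, -0.02, 0.01, -0.01, 0.02, -0.03, 0.03, 0.01]

rows = sweep_thresholds(probs, returns, [0.3, 0.4, 0.5, 0.6])
for row in rows:
    print(row)
```

The toy numbers are not meant to reproduce our curve. They only show the mechanics: sensitivity never rises and specificity never falls as the threshold goes up, and at a high enough threshold no systems are left to trade.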

What this means is that improving returns relies more strongly on classifying losers correctly than on classifying winners correctly. You have a better chance of improving your returns if you are able to avoid losers than if you are able to correctly identify winners. The reason seems to be that avoiding the worst false positives matters most, since these bring very significant losses that cannot be compensated by an increase in the number of profitable systems that are correctly classified as such. When you have low sensitivity values with high specificities it seems that you are able to cut the worst part of the distribution of bad returns, which in turn allows a big improvement in your overall out-of-sample returns. It is also no coincidence that the highest posterior probability is found at this same threshold of 0.5, which is also the point where our accuracy is highest.
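
If we read "posterior probability" as the chance that a system classified as profitable really is profitable (the positive predictive value, which is an interpretation on my part), it follows from the sensitivity, the specificity and the fraction of systems that are actually profitable through Bayes' rule. The sketch below uses the sensitivity and specificity quoted above, while the prevalence values are purely hypothetical, since this post does not state that fraction.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """P(really profitable | classified as profitable), via Bayes' rule."""
    true_pos = sensitivity * prevalence                    # winners flagged as winners
    false_pos = (1.0 - specificity) * (1.0 - prevalence)   # losers flagged as winners
    if true_pos + false_pos == 0:
        return float("nan")  # nothing is ever classified as profitable
    return true_pos / (true_pos + false_pos)


# Sensitivity and specificity from the post; prevalence values are hypothetical.
for prevalence in (0.3, 0.4, 0.5):
    ppv = positive_predictive_value(0.08, 0.93, prevalence)
    print(f"prevalence={prevalence:.1f} -> PPV={ppv:.3f}")
```

The point of the exercise is that even a small false positive rate weighs heavily when few systems are selected, so the posterior probability depends strongly on how many losers slip through.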


However, note that increasing the specificity further, to close to 100%, actually tends to decrease your returns, because at this point you are reducing the sensitivity so much that you barely have anything to trade. This also makes your posterior probability drop sharply from its high near 50%. Of course the above are results on testing sets, and we do not yet know whether these statistics will really hold when we take them to live trading under completely new conditions. If you would like to learn more about this sort of model and how you too can use it for trading decisions please consider joining, a website filled with educational videos, trading systems, development and a sound, honest and transparent approach towards automated trading strategies.
