Let me say clearly that, in my opinion, the Yahoo article is indefensible for a number of reasons (to mention a few: far too small a sample size, no robustness analysis, no mention of the number of trials that were run), so on this I agree with Mr Litle.

In general, to make a prediction of any value, one has to identify certain features (variables) that, combined in a certain way, have some predictive power over future events.

In today’s markets, dominated by high-frequency algos, the room for profit for non-HF (and, more importantly, non-HF-aware) guys is, generally speaking, reduced.

However, in my experience this doesn’t necessarily have to be the case: simply put, as in any business, you have to adapt to your competitors, and in this case one way of doing so is to pay more attention to, and improve, the execution side of your trading.

Lately I have been looking for a more systematic way to get around overfitting, and in my quest I found it useful to borrow some techniques from the machine learning field.

Expanding on what was discussed here (and here), it seems intuitive that the more features a model has, the more, generally speaking, it might be subject to overfitting.
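To make that intuition concrete, here is a minimal sketch (not taken from the original posts) that fits a plain least-squares model with no intercept to a target that is pure noise, using more and more features that are also pure noise. The sample sizes, the feature counts and all function names are assumptions chosen for illustration. Since there is nothing to learn, any improvement in the in-sample error is overfitting by construction.

```python
import random

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_ols(X, y):
    """Least-squares coefficients via the normal equations (no intercept)."""
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)

def mse(X, y, beta):
    """Mean squared prediction error of the linear model beta on (X, y)."""
    errors = [yi - sum(b * xi for b, xi in zip(beta, r)) for r, yi in zip(X, y)]
    return sum(e * e for e in errors) / len(errors)

rng = random.Random(0)
n_obs, max_features = 60, 40  # assumed sizes, for illustration only

def noise_matrix(rows, cols):
    return [[rng.gauss(0, 1) for _ in range(cols)] for _ in range(rows)]

# Features and targets are independent noise: there is no real relationship.
X_train, X_test = noise_matrix(n_obs, max_features), noise_matrix(n_obs, max_features)
y_train = [rng.gauss(0, 1) for _ in range(n_obs)]
y_test = [rng.gauss(0, 1) for _ in range(n_obs)]

results = {}
for k in (1, 5, 20, 40):
    train_k = [row[:k] for row in X_train]  # use only the first k features
    test_k = [row[:k] for row in X_test]
    beta = fit_ols(train_k, y_train)
    results[k] = (mse(train_k, y_train, beta), mse(test_k, y_test, beta))
    print(k, "features: in-sample MSE", round(results[k][0], 3),
          "out-of-sample MSE", round(results[k][1], 3))
```

Because each feature set contains the previous one, the in-sample error can only go down as features are added, while the out-of-sample error has no reason to follow it.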

This is a quick follow-up on my previous post on Quantile normalization.

Something else to note is that if your performance measure makes use of the standard deviation (as is the case for the Sharpe ratio), trimming the tails of the returns from its computation is likely to result in an overestimation of the performance.
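As a rough illustration of this effect, the sketch below computes a per-period Sharpe ratio (risk-free rate assumed to be zero, no annualisation) on synthetic Gaussian returns, once on the full sample and once after discarding the most extreme 5% of observations in each tail. The return parameters, the 5% cut-off and the function names are assumptions for illustration, not the procedure used in the original post.

```python
import random
import statistics

def sharpe(returns):
    """Per-period Sharpe ratio, assuming a zero risk-free rate."""
    return statistics.mean(returns) / statistics.stdev(returns)

def trim_tails(returns, fraction):
    """Drop the lowest and highest `fraction` of the observations."""
    ordered = sorted(returns)
    k = int(len(ordered) * fraction)
    return ordered[k:len(ordered) - k] if k else ordered

rng = random.Random(1)
# Synthetic returns: small positive drift, 1% volatility per period (assumed values).
returns = [rng.gauss(0.001, 0.01) for _ in range(2000)]
trimmed = trim_tails(returns, 0.05)

full_sharpe = sharpe(returns)
trimmed_sharpe = sharpe(trimmed)
print("std full:", statistics.stdev(returns), "std trimmed:", statistics.stdev(trimmed))
print("Sharpe full:", full_sharpe, "Sharpe trimmed:", trimmed_sharpe)
```

Symmetric trimming leaves the mean roughly where it was but shrinks the standard deviation in the denominator, which is why the trimmed figure looks better than the strategy really is.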

First of all, I prefer the [term] over-fitting to curve-fitting, because curve-fitting is a term from non-linear regression analysis. A trivial example of underfitting could be buying a random stock from the stock universe at a random point in time and holding it for a random time period.

In one way or another, trading is mainly about predicting the future from the past, and the main question is how likely our bet is to be successful.

What this post is really about is not so much describing exactly how to get an answer to overfitting, but rather trying to understand what the question we are asking really is and what shape we can expect the answer to have.

A couple of days ago I was reflecting on what the moving average of a price really is.

In the example above, the next closing price would be P6, and of course we can’t use the information given by P6 to trade on day 6.
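The example referred to is not reproduced in this excerpt, so the sketch below assumes a five-day simple moving average over closes P1 to P5, with P6 as the next close; the price values and function names are made up for illustration. It shows one way to keep P6 out of the value used to trade on day 6: lag the moving average by one day.

```python
def trailing_ma(prices, window):
    """Simple moving average as known at each day's close (None until enough data)."""
    out = []
    for i in range(len(prices)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(prices[i + 1 - window:i + 1]) / window)
    return out

def tradable_ma(prices, window):
    """Lag the average by one day, so the value used on day t only contains closes up to t-1."""
    ma = trailing_ma(prices, window)
    return [None] + ma[:-1]

# Hypothetical closes P1..P6; P6 is only known after day 6 has ended.
prices = [10, 11, 12, 13, 14, 20]

ma_at_close = trailing_ma(prices, 5)
ma_for_trading = tradable_ma(prices, 5)

# Index 5 is day 6: the unlagged value already contains P6, the lagged one does not.
print("MA including P6:", ma_at_close[5])
print("MA usable on day 6:", ma_for_trading[5])
```

A back-test that uses the unlagged series to trade on the same day is quietly peeking at a close that was not yet available.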

When trying to analyse market data it is common practice to use techniques borrowed from different fields to transform the data. Examples of these are probability distributions of price returns, moving averages, autocorrelations, rolling volatility estimates, data mining techniques and pretty much any kind of operation we use to treat the data/build an indicator.

To understand whether a strategy is able to perform in the future, the first question to ask is probably whether our strategy really showed great performance in the historical back-test, or whether all it was doing was describing past data accurately.

Of course, depending on the trading strategy in use, you may want to change the actual percentage value (if you are using an MR strategy, which usually has a high win %, you might want to use a higher threshold), but you get the idea.

Without exaggeration, diversification is one of the most powerful tools available to a trader, allowing one to enhance many characteristics of a system with relatively little effort.

From these, I then created a “diversified system” composed of the two sub-strategies.
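The actual sub-strategies are not described in this excerpt, so the following is only a sketch of the general idea: two synthetic, independent return streams with the same assumed drift and volatility are combined with equal weights, and the per-period Sharpe ratio (zero risk-free rate assumed) is compared before and after. The parameters, the 50/50 weighting and the names are illustrative assumptions.

```python
import random
import statistics

def sharpe(returns):
    """Per-period Sharpe ratio, assuming a zero risk-free rate."""
    return statistics.mean(returns) / statistics.stdev(returns)

rng = random.Random(2)
n_periods = 2500

# Two hypothetical sub-strategies: same edge, same volatility, independent noise.
strategy_a = [rng.gauss(0.002, 0.01) for _ in range(n_periods)]
strategy_b = [rng.gauss(0.002, 0.01) for _ in range(n_periods)]

# Equal-weight "diversified system": half the capital in each sub-strategy.
diversified = [0.5 * a + 0.5 * b for a, b in zip(strategy_a, strategy_b)]

sharpe_a = sharpe(strategy_a)
sharpe_b = sharpe(strategy_b)
sharpe_div = sharpe(diversified)

print("Sharpe A:", sharpe_a)
print("Sharpe B:", sharpe_b)
print("Sharpe diversified:", sharpe_div)
print("std A:", statistics.stdev(strategy_a),
      "std diversified:", statistics.stdev(diversified))
```

With uncorrelated streams of equal quality, the combined mean stays the same while the volatility falls by roughly a factor of the square root of two, so the Sharpe ratio rises by about the same factor. With positively correlated sub-strategies the improvement would be smaller.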
