One problem that can arise when using a supervised learning model is known as overfitting. When backtesting a strategy, we use historical data to see how the strategy would have worked in the past. By testing different methodologies, we can find an optimal strategy and use it to make predictions on new data. A poorly designed backtest, however, can create a false sense of security, and the strategy it selects may not perform well in production. Overfitting occurs when we model our past data too well, fitting its noise so closely that the model has a difficult time generalizing to the future.
Today, Tom Sosnoff and Tony Battista are joined by Dr. Data (Michael Rechenthin, Ph.D.) from the Research Team as he explains all about overfitting. Dr. Data discusses how and when overfitting happens and what you can do to avoid it. He explains that a very simple model may look the best, but when you add more complexity, such as additional variables, the risk of overfitting becomes greater. Dr. Data finishes up the discussion by explaining the steps you should take to make sure you are not overfitting.
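To make the idea concrete, here is a minimal sketch in Python (using NumPy) of how added complexity can flatter a backtest. It is not taken from the episode. The data is synthetic, and the function name `in_and_out_of_sample_mse`, the linear trend, the noise level, and the polynomial degrees are all assumptions chosen for illustration. A simple model and a more complex one are fit to the "past" portion of a noisy series and then scored on a held-out "future" portion.

```python
import numpy as np


def in_and_out_of_sample_mse(degree, seed=0):
    """Fit a polynomial of the given degree to the 'past' part of a
    synthetic series and return (in-sample MSE, out-of-sample MSE).

    Everything here is illustrative: the true signal is an assumed
    linear trend plus Gaussian noise, not real market data.
    """
    rng = np.random.default_rng(seed)

    # Synthetic series: linear trend plus noise (an assumption).
    t = np.linspace(0.0, 1.5, 45)
    y = 2.0 * t + rng.normal(0.0, 0.1, size=t.size)

    # "Past" data used for the backtest; the rest is held out as "future".
    past = t <= 1.0

    # Fit only on the past, then predict over the whole series.
    coeffs = np.polyfit(t[past], y[past], degree)
    squared_error = (np.polyval(coeffs, t) - y) ** 2

    return squared_error[past].mean(), squared_error[~past].mean()


if __name__ == "__main__":
    for degree in (1, 8):
        mse_in, mse_out = in_and_out_of_sample_mse(degree)
        print(f"degree {degree}: in-sample MSE = {mse_in:.4f}, "
              f"out-of-sample MSE = {mse_out:.4f}")
```

The more complex fit can never score worse than the simple one on the past data, because it contains the simple model as a special case. That is exactly why in-sample results alone are a poor guide. Holding out data the model never saw is one common way to check whether the apparent improvement carries over to new data.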