There is definitely “folk knowledge” that neural networks shouldn’t be used for very small datasets.

At the same time, there is a strong incentive to always use deep learning, because then you are doing work in applied artificial intelligence, which means you are very cool and smart.

And like in the paper, it will probably sort of work okay, just not as well as a simpler baseline, and with a big waste of computational resources.

And as for the limits of the M3 competition, there will be an M4 competition in December open to anyone.

Maybe every methods paper (mine included) should be required to include a section called “Problems Where Our Method Won’t Work.” Not just a Limitations section giving a bunch of hypothetical objections, but a list of the sorts of problems where applying the method will give a bad answer.

It does seem like there’s a trend among the young people these days to just throw deep nets at everything, even low-dimensional tabular datasets. The wiser millennials, though, know that you should also try gradient boosting.
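A minimal sketch of that baseline, using scikit-learn on a synthetic low-dimensional tabular problem (the dataset and parameters here are illustrative, not from the paper):

```python
# "Try gradient boosting first" on a small, low-dimensional tabular
# problem -- the setting where deep nets are often beaten by simpler
# methods. Assumes scikit-learn is installed.
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

# Synthetic tabular data: 200 rows, 5 features.
X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Default-parameter gradient boosting as the baseline to beat.
gbm = GradientBoostingRegressor(random_state=0).fit(X_tr, y_tr)
print(f"held-out R^2: {gbm.score(X_te, y_te):.3f}")
```

A few lines, no GPU, and it sets the bar any fancier method has to clear.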

First, their choice of statistical methods and data was ridiculous. For the statistical methods, they chose “the six most accurate methods of the M3 Competition,” i.e. methods developed explicitly for time-series forecasting. Then they chose datasets *used in that same competition* for the comparison. (Talk about overfitting!) The datasets had between 81 and 126 observations – which is fine, but exactly where we expect statistical methods to overfit less than ML.
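For context, the statistical side of that comparison is dominated by the exponential-smoothing family – tiny models with almost nothing to overfit on an 81-point series. A minimal sketch of simple exponential smoothing (the alpha here is hand-picked for illustration; the competition methods estimate it from the data):

```python
# Simple exponential smoothing: the one-step-ahead forecast is an
# exponentially weighted average of past observations. One parameter.
def ses_forecast(series, alpha=0.5):
    """Return the one-step-ahead forecast for the series."""
    level = series[0]
    for y in series[1:]:
        # New level = weighted average of the latest observation
        # and the previous level.
        level = alpha * y + (1 - alpha) * level
    return level

print(ses_forecast([10, 12, 11, 13]))  # -> 12.0
```

With one parameter against ~100 observations, there is very little room for the method to chase noise – which is the point of the comparison being slanted.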

Second, their choice of ML methods and parameters was at best strange. The number of researcher degrees of freedom is off the charts – the packages used, the choice of parameters (they often used the defaults, but sometimes didn’t; they sometimes justified these departures with citations, but sometimes did not). The preprocessing was chosen using a strange heuristic that assumed all ML models should use the same preprocessing, based on what worked best for one method.

Finally, the ML methods they used were not appropriate for the comparison. Not only did they not choose methods particularly suited to time-series forecasting, they didn’t even use anything like the current generation of those model types. As an example, for two methods they used the RSNNS package, a wrapper for SNNS. The current version of SNNS seems to be from 1995(!!!!) and doesn’t support standard activations like ReLU in its mlp or rbf methods.

https://arxiv.org/abs/1806.06850

which shows that some NNs are essentially polynomial regressions, and polynomial regressions have gone out of favor for many good reasons. There was also a study with medical records (I can’t find the link) in which, I believe, a logistic regression outperformed all the ML algorithms.
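A toy illustration of that claim (my own one-dimensional sketch, not the paper’s construction): the input-output map of a small smooth-activation network on a bounded interval can be matched closely by an ordinary polynomial regression.

```python
# A tiny fixed "network" (one hidden layer of 3 tanh units) versus a
# degree-7 polynomial fit to its outputs on [-1, 1].
import numpy as np

rng = np.random.default_rng(0)

# Random fixed weights standing in for a trained network.
W1, b1 = rng.normal(size=(3, 1)), rng.normal(size=3)
W2, b2 = rng.normal(size=3), rng.normal()

def net(x):
    """Forward pass of the 1-input, 3-hidden-unit tanh network."""
    h = np.tanh(x[:, None] * W1.T + b1)
    return h @ W2 + b2

x = np.linspace(-1, 1, 200)
y = net(x)

# Ordinary least-squares polynomial regression on the network's outputs.
coeffs = np.polyfit(x, y, deg=7)
y_poly = np.polyval(coeffs, x)

print(f"max |net - poly|: {np.abs(y - y_poly).max():.2e}")
```

The polynomial tracks the network almost exactly on this interval – consistent with the paper’s point that, at least locally, such networks and polynomial regressions are doing the same kind of thing.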

There are clearly a lot of good ideas in machine learning algorithms, but in my view also a lot of exaggerated claims, and a lack of understanding of why they work or don’t work and of their limitations. What’s more, one of the big claims is that understanding doesn’t matter, only the ability to forecast – but if they don’t even do that all that well, then ….

But I give them credit – the ML field comes up with the best names.
