[R] Study on 299 data sets shows that non-linear SVMs/ANNs outperform linear SVMs/ANNs in only around 20 per cent of cases
When do non-linear versions of algorithms such as SVMs or neural networks outperform their linear counterparts at a statistically significant level? We investigated this question by running experiments on 299 data sets in OpenML. The results show that the non-linear models are significantly better in only around 20 per cent of cases. We also dug deeper into which types of data sets this happens for, by looking at the number of instances and features and by building meta-learning models.
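To make "better at a statistically significant level" concrete, here is a minimal sketch of one way to compare two classifiers on a single data set. It is not the protocol from the paper: the exact sign-flip (permutation) test, the function name `paired_sign_flip_pvalue`, and the per-fold accuracies are all assumptions chosen for illustration.

```python
from itertools import product
from statistics import mean


def paired_sign_flip_pvalue(scores_a, scores_b):
    """One-sided exact sign-flip test on paired scores.

    Returns the fraction of sign assignments whose mean difference is at
    least as large as the observed mean of (a - b). Small values suggest
    that model A beats model B by more than chance would explain.
    """
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    observed = mean(diffs)
    hits = 0
    total = 0
    # Under the null hypothesis each paired difference is equally likely
    # to be positive or negative, so enumerate every sign pattern.
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        if mean(s * d for s, d in zip(signs, diffs)) >= observed - 1e-12:
            hits += 1
    return hits / total


# Hypothetical per-fold accuracies, made up for illustration only.
nonlinear_acc = [0.91, 0.89, 0.93, 0.90, 0.92, 0.88, 0.91, 0.90, 0.92, 0.89]
linear_acc = [0.90, 0.89, 0.91, 0.90, 0.91, 0.89, 0.90, 0.90, 0.91, 0.88]

p_value = paired_sign_flip_pvalue(nonlinear_acc, linear_acc)
print(f"p-value (non-linear better than linear): {p_value:.4f}")
```

One caveat on this sketch: cross-validation folds share training data, so fold scores are not fully independent and the p-value should be read as a rough indication. The enumeration is also exponential in the number of folds, which is fine for 10 folds (1024 sign patterns) but not for many more.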
Benjamin Strang, Peter van der Putten, Jan N. van Rijn and Frank Hutter. Don’t Rule Out Simple Models Prematurely: a Large Scale Benchmark Comparing Linear and Non-linear Classifiers in OpenML. In: Seventeenth International Symposium on Intelligent Data Analysis (IDA), 2018