[R] GENDIS: GENetic DIscovery of Shapelets (to classify timeseries)
In the time series classification domain, shapelets are small subseries that are discriminative for a certain class. It has been shown that by projecting the original dataset to a distance space, where each axis corresponds to the distance to a certain shapelet, classifiers are able to achieve state-of-the-art results on a plethora of datasets. To find these shapelets, the current state-of-the-art method (in terms of predictive performance) performs a brute-force search that quickly becomes intractable for larger datasets.
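To make the distance-space projection concrete, here is a minimal NumPy sketch. It assumes the common definition where the distance between a series and a shapelet is the smallest Euclidean distance over all equal-length windows; the function names `shapelet_distance` and `shapelet_transform` are made up for illustration and are not taken from the GENDIS code.

```python
import numpy as np

def shapelet_distance(series, shapelet):
    """Smallest Euclidean distance between the shapelet and any
    equal-length window of the series (assumed definition)."""
    series = np.asarray(series, dtype=float)
    shapelet = np.asarray(shapelet, dtype=float)
    # All contiguous windows of len(shapelet), shape (n_windows, len(shapelet)).
    windows = np.lib.stride_tricks.sliding_window_view(series, len(shapelet))
    return float(np.sqrt(((windows - shapelet) ** 2).sum(axis=1)).min())

def shapelet_transform(X, shapelets):
    """Project each series onto one axis per shapelet."""
    return np.array([[shapelet_distance(s, sh) for sh in shapelets] for s in X])

# Toy data: the first series contains a bump, the second is flat.
X = np.array([
    [0, 0, 1, 2, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
], dtype=float)
shapelets = [np.array([1.0, 2.0, 1.0]), np.array([0.0, 0.0])]

D = shapelet_transform(X, shapelets)  # shape (n_series, n_shapelets)
print(D)
```

The first column separates the two toy series (the bump shapelet matches the first one exactly and not the second), which is the sense in which a shapelet is discriminative. Any standard classifier can then be trained on `D` instead of the raw series.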
Therefore, we propose a genetic algorithm that searches for an entire set of shapelets directly. This results in a more scalable algorithm that is competitive with the current state of the art. Moreover, the number of shapelets needed to achieve this competitive performance is several orders of magnitude smaller than what the current state of the art requires.
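For readers who want a feel for what "evolving a whole set of shapelets" can look like, here is a toy sketch. It is not the GENDIS algorithm: the fitness function (training accuracy of a nearest-centroid classifier in distance space), the mutation and crossover operators, and every name and parameter below are assumptions picked to keep the example short.

```python
import numpy as np

rng = np.random.default_rng(0)

def shapelet_distance(series, shapelet):
    windows = np.lib.stride_tricks.sliding_window_view(series, len(shapelet))
    return np.sqrt(((windows - shapelet) ** 2).sum(axis=1)).min()

def transform(X, shapelets):
    return np.array([[shapelet_distance(s, sh) for sh in shapelets] for s in X])

def fitness(shapelets, X, y):
    """Assumed fitness: training accuracy of a nearest-centroid classifier."""
    D = transform(X, shapelets)
    classes = np.unique(y)
    centroids = np.array([D[y == c].mean(axis=0) for c in classes])
    dists = np.linalg.norm(D[:, None, :] - centroids[None, :, :], axis=2)
    return float((classes[dists.argmin(axis=1)] == y).mean())

def random_shapelet(X, min_len=3, max_len=6):
    """Cut a random subseries out of a random training series."""
    series = X[rng.integers(len(X))]
    length = rng.integers(min_len, max_len + 1)
    start = rng.integers(0, len(series) - length + 1)
    return series[start:start + length].copy()

def mutate(individual, X):
    """Replace one shapelet of the set with a fresh random one."""
    child = [sh.copy() for sh in individual]
    child[rng.integers(len(child))] = random_shapelet(X)
    return child

def crossover(a, b):
    """Uniform crossover: take each slot from either parent."""
    return [(sa if rng.random() < 0.5 else sb).copy() for sa, sb in zip(a, b)]

def evolve(X, y, n_shapelets=2, pop_size=10, generations=10):
    # Each individual is a complete set of shapelets, not a single shapelet.
    population = [[random_shapelet(X) for _ in range(n_shapelets)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=lambda ind: fitness(ind, X, y), reverse=True)
        elite = population[: pop_size // 2]  # keep the better half
        children = []
        while len(elite) + len(children) < pop_size:
            i, j = rng.choice(len(elite), size=2, replace=False)
            children.append(mutate(crossover(elite[i], elite[j]), X))
        population = elite + children
    return max(population, key=lambda ind: fitness(ind, X, y))

def make_data(n=20, length=30):
    """Synthetic data: class 1 has a bump at a random position."""
    X = rng.normal(0, 0.1, size=(n, length))
    y = np.array([0, 1] * (n // 2))
    for i in np.flatnonzero(y == 1):
        start = rng.integers(0, length - 5)
        X[i, start:start + 5] += np.array([1, 2, 3, 2, 1])
    return X, y

X, y = make_data()
best = evolve(X, y)
best_score = fitness(best, X, y)
print(len(best), "shapelets, training accuracy:", best_score)
```

The point of the sketch is the representation: because an individual is an entire shapelet set, fitness is measured on the set as a whole, so the search never has to enumerate and score every candidate subseries on its own.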
The implementation follows the scikit-learn interface, so you can simply call its `fit` and `transform` methods. Documentation is available.
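As a rough illustration of what a `fit`/`transform` interface buys you, the sketch below plugs a stand-in transformer into a scikit-learn pipeline. `RandomShapeletTransformer` is a hypothetical class written for this example (it just samples random subseries and does no genetic search); it is not the GENDIS class, and its constructor parameters are assumptions. Check the project docs for the real class name and arguments.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

class RandomShapeletTransformer(BaseEstimator, TransformerMixin):
    """Hypothetical stand-in: random subseries as shapelets, no genetic search."""

    def __init__(self, n_shapelets=5, shapelet_len=5, random_state=0):
        self.n_shapelets = n_shapelets
        self.shapelet_len = shapelet_len
        self.random_state = random_state

    def fit(self, X, y=None):
        X = np.asarray(X, dtype=float)
        rng = np.random.default_rng(self.random_state)
        rows = rng.integers(0, X.shape[0], size=self.n_shapelets)
        starts = rng.integers(0, X.shape[1] - self.shapelet_len + 1,
                              size=self.n_shapelets)
        self.shapelets_ = np.array(
            [X[r, s:s + self.shapelet_len] for r, s in zip(rows, starts)]
        )
        return self

    def transform(self, X):
        X = np.asarray(X, dtype=float)
        # windows has shape (n_series, n_windows, shapelet_len)
        windows = np.lib.stride_tricks.sliding_window_view(
            X, self.shapelet_len, axis=1
        )
        diffs = windows[:, :, None, :] - self.shapelets_[None, None, :, :]
        # Minimum over windows gives shape (n_series, n_shapelets)
        return np.sqrt((diffs ** 2).sum(axis=3)).min(axis=1)

# Synthetic data: class 1 has a bump at positions 10..14.
rng = np.random.default_rng(1)
X = rng.normal(0, 0.1, size=(40, 30))
y = np.array([0, 1] * 20)
X[y == 1, 10:15] += np.array([1, 2, 3, 2, 1])

# Use the transformer on its own...
features = RandomShapeletTransformer().fit(X, y).transform(X)

# ...or chain it with any classifier.
clf = make_pipeline(RandomShapeletTransformer(), LogisticRegression())
clf.fit(X, y)
train_accuracy = clf.score(X, y)
print(features.shape, train_accuracy)
```

Anything that exposes `fit` and `transform` this way can be dropped into `Pipeline`, `GridSearchCV`, and cross-validation helpers without extra glue code.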
You can find the code (with an example notebook and tutorial) on GitHub.
submitted by /u/givdwiel