[P] Open source library to perform entity embeddings on categorical variables using Convolutional Neural Networks [+ Unit Tests, Code Coverage and Continuous Integration]
For the past two years I have been working as a machine learning developer, mostly with tabular data, and I've developed a tool that performs entity embeddings on categorical variables using CNNs with Keras. I tried to make it easy to use and flexible enough for the most common scenarios (regression, binary and multi-class classification), but if you find another need or an issue to be fixed, do not hesitate to ask.
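For anyone unfamiliar with the idea, here is a minimal, library-independent sketch of what an entity embedding is: each category gets an integer index, and that index selects a row of a dense vector table. This is not the library's API. The function names, the `min(50, (n + 1) // 2)` sizing rule of thumb, and the random initialisation are assumptions for illustration only. In a real model (for example one built with Keras) the table would be a trainable layer whose rows are learned by backpropagation.

```python
import random


def build_vocabulary(values):
    """Map each distinct category to an integer index, in first-seen order."""
    vocab = {}
    for value in values:
        if value not in vocab:
            vocab[value] = len(vocab)
    return vocab


def embedding_size(n_categories, cap=50):
    """Rule-of-thumb vector size (an assumption): half the cardinality, capped."""
    return min(cap, (n_categories + 1) // 2)


def init_embedding_table(n_categories, dim, seed=0):
    """One small random vector per category; a real model would train these."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.05, 0.05) for _ in range(dim)]
            for _ in range(n_categories)]


def embed(values, vocab, table):
    """Replace each categorical value with its dense vector (a table lookup)."""
    return [table[vocab[value]] for value in values]


# Hypothetical categorical column
weekdays = ["mon", "tue", "wed", "mon", "sun", "tue"]

vocab = build_vocabulary(weekdays)
dim = embedding_size(len(vocab))
table = init_embedding_table(len(vocab), dim)
vectors = embed(weekdays, vocab, table)

print(vocab)
print(dim, len(vectors))
```

Rows with the same category always map to the same vector, which is what lets a trained table place similar categories close together.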
I also added some cool stuff to the project, such as unit tests, code coverage with Codacy, continuous integration with Travis CI, automatic deployment to PyPI, and auto-generated documentation with Sphinx and ReadTheDocs. If any of you are interested in how to set up your project with these features, feel free to use it as a base project.
I'm looking forward to any reviews of the source code. Any tips to improve readability or even performance are really welcome and much appreciated.
GitHub: https://github.com/bresan/entity_embeddings_categorical
PyPI: https://pypi.org/project/entity-embeddings-categorical/
Code coverage (currently at 97%): https://coveralls.io/github/bresan/entity_embeddings_categorical?branch=master
Thanks and I hope it can help somebody out there 🙂
submitted by /u/CrazyCapivara