[P] How to easily deploy XGBoost models in C++ production environments
Hi all!
Good evening! I would like to share a tool I developed for my work in the particle physics community.
We are probably not the only ones who like to prototype and train ML models with xgboost in Python and then have to deploy them in a high-performance, multithreaded C++ production environment. This is exactly the step that my tool, which I call “FastForest”, aims to make a breeze!
The mission of the library is to be:
- Easy: deploying your xgboost model should be as painless as it can be
- Fast: thanks to efficient structure-of-arrays data structures for storing the trees, this library goes very easy on your CPU and memory
- Safe: the FastForest objects are immutable, which makes them an excellent choice in multithreaded environments
- Portable: FastForest has no dependency other than the C++ standard library
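To illustrate what a structure-of-arrays layout for a decision tree can look like, here is a minimal sketch. It is not FastForest's actual code or API: the `SoATree` struct, its member names, the leaf encoding, and `makeExampleTree` are all hypothetical and chosen only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical structure-of-arrays layout for a single decision tree.
// Internal node i is described by the i-th entry of each node array,
// so each array is stored contiguously in memory.
struct SoATree {
    std::vector<int> featureIndex;  // feature tested at node i
    std::vector<float> threshold;   // go left if feature < threshold
    std::vector<int> left;          // child node index, or -(leaf + 1) for a leaf
    std::vector<int> right;         // same encoding as left
    std::vector<float> leafValue;   // response stored at each leaf

    // const member function: evaluation never modifies the tree,
    // so one tree can be shared read-only between threads.
    float evaluate(const std::vector<float>& features) const {
        assert(!featureIndex.empty());
        int node = 0;  // start at the root
        while (node >= 0) {
            const std::size_t f = static_cast<std::size_t>(featureIndex[node]);
            node = features[f] < threshold[node] ? left[node] : right[node];
        }
        // A negative value encodes a leaf: -1 -> leaf 0, -2 -> leaf 1, ...
        return leafValue[static_cast<std::size_t>(-node - 1)];
    }
};

// Builds a tiny two-node tree by hand (assumed example data):
//   node 0: feature 0 < 0.5 ? leaf 0 : node 1
//   node 1: feature 1 < 2.0 ? leaf 1 : leaf 2
SoATree makeExampleTree() {
    SoATree t;
    t.featureIndex = {0, 1};
    t.threshold = {0.5f, 2.0f};
    t.left = {-1, -2};
    t.right = {1, -3};
    t.leafValue = {-1.0f, 0.5f, 2.0f};
    return t;
}
```

With this example tree, an input of `{1.0f, 3.0f}` fails both cuts and ends up in the last leaf. Since `evaluate` is `const` and only reads the arrays, a fully constructed tree can be evaluated from several threads without locking, which is the idea behind the "Safe" point above.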
I hope this is useful for someone else too 🙂 I think it might be, because so far the only solutions to this C++ deployment problem that I found on the web are either to use the sparsely documented xgboost C API or to transform your models into hardcoded C++ (which is cool as well, of course). What I tried to write here is a very lightweight and clean C++ solution that can load models dynamically.
Check it out here: https://github.com/guitargeek/XGBoost-FastForest
submitted by /u/jonas_aka_guitargeek