[D] Concat Model with fast.ai for Metadata Enhanced Text Classification
My team is working on nontrivial multiclass text classification problems involving noisy datasets and usually more than 20 different target labels. We needed a robust way to combine both text features (ideally, leveraging some kind of pretrained word vectors) and extra-linguistic metadata.
So we built a Concat Model based on fast.ai that combines ULMFiT for the text with categorical or continuous input features, and uses the combined representation for classification. This article contains more details about the problem we’re solving and the results. Wanna take a look at the code? Check out this public Kaggle kernel and let us know what you think 🙂 Do you know other ways of exploiting both types of features?
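For anyone who wants the gist before opening the kernel, here is a rough sketch of the concatenation idea. It is not the code from the kernel: it uses plain NumPy instead of fast.ai, a random vector stands in for the ULMFiT encoder output, and every name and size in it (`ConcatHead`, `TEXT_DIM`, `EMB_DIM`, and so on) is made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# All sizes below are illustrative assumptions, not values from the post.
N_CLASSES = 20      # the post mentions "more than 20" labels
TEXT_DIM = 400      # hypothetical size of the pooled text-encoder output
N_CATEGORIES = 5    # hypothetical cardinality of one categorical feature
EMB_DIM = 3         # hypothetical embedding width for that feature
N_CONT = 2          # hypothetical number of continuous features


def softmax(logits):
    # Subtract the row-wise max for numerical stability.
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


class ConcatHead:
    """Concatenate text, categorical and continuous features, then classify."""

    def __init__(self):
        # Lookup table mapping each category index to a dense vector.
        self.cat_embedding = rng.normal(size=(N_CATEGORIES, EMB_DIM))
        in_dim = TEXT_DIM + EMB_DIM + N_CONT
        # Untrained linear classifier over the concatenated features.
        self.weight = rng.normal(scale=0.01, size=(in_dim, N_CLASSES))
        self.bias = np.zeros(N_CLASSES)

    def forward(self, text_vec, cat_idx, cont):
        cat_vec = self.cat_embedding[cat_idx]              # (batch, EMB_DIM)
        joint = np.concatenate([text_vec, cat_vec, cont], axis=1)
        return softmax(joint @ self.weight + self.bias)    # (batch, N_CLASSES)


batch = 4
text_vec = rng.normal(size=(batch, TEXT_DIM))  # stand-in for a ULMFiT encoder output
cat_idx = rng.integers(0, N_CATEGORIES, size=batch)
cont = rng.normal(size=(batch, N_CONT))

probs = ConcatHead().forward(text_vec, cat_idx, cont)
print(probs.shape)  # (4, 20)
```

In a real setup the text vector would come from a fine-tuned language model encoder and the weights would be trained end to end. The sketch only shows how the three feature types end up in one input to the classifier.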