[R] Research roadblock. Help with Extreme Multi-Label Classification
I have been working on BioASQ Task A, the large-scale semantic indexing of PubMed abstracts. It is supposed to be my Master’s thesis, but I have hit a roadblock.
The current state of the art, if we look only at the micro-F score, is 68.8%, while I can’t seem to get past the 60% mark. I am currently using pre-trained biomedical FastText word vectors with a bidirectional GRU, whose output branches into two parts. The first branch computes a document vector using an attention mechanism, while the second applies a CNN followed by k-max pooling to get another document representation. Both vectors are concatenated with some additional hand-crafted features and fed to the output layer, which has 28,472 units (the total number of labels) with sigmoid activation and binary cross-entropy loss. After training this architecture on 3 million abstracts, I am getting a micro-F score of 58.2%.
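To be explicit about the metric, here is a minimal sketch of how I understand the micro-F score, assuming it means micro-averaged F1 pooled over all (abstract, label) pairs. The function names and the 0.5 threshold on the sigmoid outputs are placeholders for illustration, not something prescribed by the challenge.

```python
def micro_f1(y_true, y_pred):
    """Micro-averaged F1 over all (document, label) pairs.

    y_true, y_pred: lists of sets of label ids, one set per document.
    """
    tp = fp = fn = 0
    for gold, pred in zip(y_true, y_pred):
        tp += len(gold & pred)   # labels predicted and correct
        fp += len(pred - gold)   # labels predicted but wrong
        fn += len(gold - pred)   # labels missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def predict_labels(scores, threshold=0.5):
    """Turn per-label sigmoid scores into a set of predicted label ids.

    The threshold is an assumed placeholder value.
    """
    return {label for label, s in enumerate(scores) if s >= threshold}
```

Because the counts are pooled before computing precision and recall, frequent labels dominate the score, so the way scores are turned into label sets (one global threshold, per-label thresholds, or a top-k cut) affects it directly.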
I have tried a number of other methods and architectures, but none of them work any better. It is extremely frustrating: I have made no progress at all this month, and I am growing more anxious every day as my deliverable deadline approaches. It would be an immense help if anyone could point me in the right direction on how to proceed: what to read, what to change, and so on. I did read about label-wise attention networks but cannot work out how to implement them in Keras. A small hint or some pseudocode would be greatly appreciated.
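For reference, this is my current understanding of label-wise attention, written as a plain NumPy forward pass for a single document rather than Keras. The names (`H`, `U`, `W`, `b`) and shapes are my own assumptions and are not taken from any particular paper's code, so please correct me if the idea is wrong.

```python
import numpy as np


def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def label_wise_attention(H, U, W, b):
    """Forward pass of label-wise attention for one document.

    H: (T, d) token states, e.g. the BiGRU outputs for T tokens
    U: (L, d) one attention query vector per label
    W: (L, d) one output weight vector per label
    b: (L,)   one output bias per label
    Returns (probs, A): probs is (L,), A is (L, T).
    """
    scores = U @ H.T                   # (L, T): relevance of each token to each label
    A = softmax(scores, axis=1)        # normalise over tokens, separately per label
    V = A @ H                          # (L, d): one document vector per label
    logits = (W * V).sum(axis=1) + b   # each label scores only its own vector
    probs = 1.0 / (1.0 + np.exp(-logits))
    return probs, A
```

If this is right, then in Keras the `U @ H.T` step looks like a `Dense(L, use_bias=False)` applied at every time step, followed by a softmax over the time axis, but that mapping is my guess. My other worry is size: with 28,472 labels the attention matrix `A` has 28,472 × T entries per document.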