[R] Classifying nodes in a Knowledge Graph by inducing a decision tree of discriminative walks
While deep learning and embedding techniques are becoming increasingly popular for tasks on (knowledge) graphs, they are often not interpretable, and interpretability is key in critical domains such as health care. We propose a simple technique called KG Path Tree, which is competitive with the current state of the art (we compare it to RDF2Vec and (Relational) Graph CNN) while remaining interpretable.
A KG Path Tree is a single decision tree in which each internal node tests for the presence of a certain walk in a sample's graph neighborhood. Our walks have a specific form: a walk of length `l` starts with a root, followed by `l - 2` wildcards (`*`), and ends with a named entity. For example, `root -> * -> * -> * -> Ghent` would match the walk `Gilles Vandewiele -> studiedAt -> Ghent University -> locatedIn -> Ghent` when classifying `Gilles Vandewiele`. The final decision tree can then be used to classify unseen samples. The path from the tree's root to the prediction can easily be displayed (local explanation), and the model as a whole can be inspected (global explanation).
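To make the walk test concrete, here is a minimal Python sketch. It is not the released implementation. It assumes the graph is given as (subject, predicate, object) triples and that predicates count as walk elements, as in the example above. The names `build_adjacency`, `has_walk`, `Node` and `predict`, the hand-built one-node tree, and the labels `"positive"` and `"negative"` are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


def build_adjacency(triples):
    """Map each subject to its outgoing (predicate, object) pairs."""
    adjacency = defaultdict(list)
    for subject, predicate, obj in triples:
        adjacency[subject].append((predicate, obj))
    return adjacency


def has_walk(adjacency, root, length, target):
    """Return True if some walk of `length` elements starting at `root`
    has `target` as its last element (everything in between is a wildcard)."""
    frontier = {root}  # entities reachable at the current (odd) position
    position = 1       # 1-based position of the frontier within the walk
    while position < length:
        edges = [(p, o) for node in frontier for p, o in adjacency.get(node, [])]
        if position + 1 == length:  # the walk ends on a predicate
            return any(p == target for p, _ in edges)
        frontier = {o for _, o in edges}
        position += 2
    return target in frontier


@dataclass
class Node:
    """Internal nodes test one walk; leaves only carry a label."""
    length: int = 0
    target: str = ""
    present: Optional["Node"] = None  # branch taken when the walk is found
    absent: Optional["Node"] = None   # branch taken otherwise
    label: Optional[str] = None       # set only on leaves


def predict(tree, adjacency, root):
    """Classify `root` and record every test on the way (local explanation)."""
    explanation = []
    node = tree
    while node.label is None:
        found = has_walk(adjacency, root, node.length, node.target)
        explanation.append((node.length, node.target, found))
        node = node.present if found else node.absent
    return node.label, explanation


# Toy graph taken from the example walk in the text.
triples = [
    ("Gilles Vandewiele", "studiedAt", "Ghent University"),
    ("Ghent University", "locatedIn", "Ghent"),
]
adjacency = build_adjacency(triples)

# Hand-built tree with a single test: root -> * -> * -> * -> Ghent.
tree = Node(length=5, target="Ghent",
            present=Node(label="positive"), absent=Node(label="negative"))

label, explanation = predict(tree, adjacency, "Gilles Vandewiele")
print(label, explanation)
```

The returned `explanation` list is the sequence of walk tests along the path to the leaf, which is what would be shown as a local explanation. How the tree itself is induced (which walks are chosen as splits) is not covered by this sketch.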
All code can be found on GitHub.