[D] Classifying malware based on API calls
I am new to machine learning. After working through TensorFlow's tutorial on building a classifier for IMDb reviews, I want to build my own binary classifier (malicious/benign) for files such as .exe or .apk.
I was wondering whether I can take the same approach as the TensorFlow IMDb tutorial, i.e. train on a set of texts where each text has a label (pos/neg).
In the context of classifying malware, those texts would be sequences of system API calls, e.g.:
Set 1 [func1() func2() func3() func4() func5() func6() … etc.] Label -> malicious
Set 2 [func1() func3() func4() func5()] Label -> benign
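To make the idea concrete, here is a rough sketch of how I imagine turning the two sets above into the kind of padded integer sequences the IMDb tutorial feeds to its model. The names `build_vocab`, `encode`, the `PAD`/`UNK` IDs and `max_len=8` are my own placeholders, not anything from TensorFlow, and I left out the `()` after each call name for simplicity.

```python
from typing import Dict, List

# Reserved IDs (my own convention): 0 pads short traces, 1 marks unseen calls.
PAD, UNK = 0, 1


def build_vocab(traces: List[List[str]]) -> Dict[str, int]:
    """Assign each distinct API call an integer ID in order of first appearance."""
    vocab: Dict[str, int] = {}
    for trace in traces:
        for call in trace:
            if call not in vocab:
                vocab[call] = len(vocab) + 2  # IDs 0 and 1 are reserved
    return vocab


def encode(trace: List[str], vocab: Dict[str, int], max_len: int) -> List[int]:
    """Map calls to IDs, truncate to max_len, then right-pad with PAD."""
    ids = [vocab.get(call, UNK) for call in trace][:max_len]
    return ids + [PAD] * (max_len - len(ids))


# The two example sets from above; 1 = malicious, 0 = benign.
traces = [
    ["func1", "func2", "func3", "func4", "func5", "func6"],
    ["func1", "func3", "func4", "func5"],
]
labels = [1, 0]

vocab = build_vocab(traces)
encoded = [encode(t, vocab, max_len=8) for t in traces]

for ids, label in zip(encoded, labels):
    print(ids, "->", label)
```

Each row has the same length and keeps the calls in their original order, so in principle it could be passed to an embedding layer the same way tokenized reviews are.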
The order of the API calls matters, by the way, and I have heard that handling this calls for a recurrent network such as an LSTM.
I would love to hear whether this is a sensible way to go about it. I would most likely target Android applications.
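As a quick check on why order matters, this small sketch compares an order-blind "bag of calls" with bigram counts on two made-up traces that contain the same calls in reverse order. The helper names `bag_of_calls` and `ngrams` are hypothetical, and n-grams are just a simple stand-in for order-aware features here, not a replacement for an LSTM.

```python
from collections import Counter
from typing import List


def bag_of_calls(trace: List[str]) -> Counter:
    """Count how often each call occurs, ignoring order entirely."""
    return Counter(trace)


def ngrams(trace: List[str], n: int = 2) -> Counter:
    """Count every run of n consecutive calls, which keeps local order."""
    return Counter(tuple(trace[i:i + n]) for i in range(len(trace) - n + 1))


# Two made-up traces: same calls, opposite order.
a = ["func1", "func2", "func3"]
b = ["func3", "func2", "func1"]

same_bag = bag_of_calls(a) == bag_of_calls(b)
same_bigrams = ngrams(a) == ngrams(b)

print("identical as bags of calls:", same_bag)
print("identical as bigram counts:", same_bigrams)
```

The bag of calls cannot tell the two traces apart while the bigram counts can, which is the kind of information a sequence model is supposed to pick up.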