[P] Machine learning application to identify “risky” words

So I am working on a project to create a model that extracts the words in a sentence that are related to risk (the extracted words will then be stored in an array). I have a large dataset (around 27k lines).

Some examples of such words: injury, collision, police, hit, fatal, etc.

I am doing this with Python and the scikit-learn library. Any suggestions on how to approach this?

So far, I have managed to apply TF-IDF to the data and print each word with its TF-IDF score, but I'm not sure if this is useful at all.
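
For reference, the TF-IDF step looks roughly like this (a minimal sketch; the file name and column name are placeholders, not my actual data):

```python
import numpy as np
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer

# Load the ~27k lines of text (file and column names are placeholders)
df = pd.read_csv("incidents.csv")
sentences = df["text"].astype(str).tolist()

# Fit TF-IDF over the whole corpus
vectorizer = TfidfVectorizer(lowercase=True, stop_words="english")
tfidf = vectorizer.fit_transform(sentences)

# Print each word with its average TF-IDF score across the corpus
mean_scores = np.asarray(tfidf.mean(axis=0)).ravel()
for word, score in sorted(zip(vectorizer.get_feature_names_out(), mean_scores),
                          key=lambda ws: ws[1], reverse=True):
    print(f"{word}\t{score:.4f}")
```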

It does output the "risky" words, but it also outputs all the other words that I do not need. The only way I can filter the risk words out is by typing them into a separate file and comparing word by word, but there is no machine learning in that, and I would really like to apply some sort of machine learning (maybe Naive Bayes?). I am willing to label some data if it helps, and make this supervised instead of unsupervised as it currently is; a rough sketch of what I have in mind is below.
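
If I labeled individual words as risky/not risky, I imagine the supervised setup could look something like this. This is only a rough sketch with made-up labels, and the character n-gram features are just one possible choice (single words don't carry much context on their own), not a settled design:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Tiny made-up labeled set: 1 = risky word, 0 = not risky
words  = ["injury", "collision", "police", "hit", "fatal",
          "meeting", "report", "tuesday", "driver", "coffee"]
labels = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]

# Character n-grams let the model generalize to unseen but similar-looking words
model = make_pipeline(
    CountVectorizer(analyzer="char_wb", ngram_range=(2, 4)),
    MultinomialNB(),
)
model.fit(words, labels)

# Flag whichever tokens of a new sentence the model predicts as risky
sentence = "The driver suffered a fatal injury in the collision"
tokens = sentence.lower().split()
risky = [t for t, pred in zip(tokens, model.predict(tokens)) if pred == 1]
print(risky)
```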

Any help is appreciated 🙂

submitted by /u/abdane