
[D] Ways to classify text with very little training data?

What techniques does this group feel work best when classifying text with small amounts of training data?

I ask because I recently put together a tutorial that shows how to use TensorFlow Data Pipelines with BERT for NLP classification. It gets 85% accuracy, but it tends to work best when there are at least 200 examples of a particular class.

I am not sure this technique will work if I have only one or two training examples for a class. For example, suppose I have a piece of text that says “I was in a line today for 3 hours”, I have only one or two examples like it, and I am trying to classify it as “Long wait times”. I think the problem gets worse with engineering text or text that is specific to a corporation, where it would be difficult to generate examples or to get Mechanical Turk workers to label them correctly.
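To make the one-or-two-examples setting concrete, here is a minimal sketch of a nearest-example baseline in plain Python. This is not the BERT pipeline from the tutorial. It is a swapped-in illustration that compares word-count vectors with cosine similarity and returns the label of the closest labeled example. The function names, the second "Long wait times" example, and the "Rude staff" class are all hypothetical and only there for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify(text, labeled_examples):
    """Return (label, score) of the most similar labeled example."""
    query = Counter(tokenize(text))
    best_label, best_score = None, -1.0
    for example, label in labeled_examples:
        score = cosine(query, Counter(tokenize(example)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label, best_score


# Hypothetical support set: only one or two examples per class.
EXAMPLES = [
    ("I was in a line today for 3 hours", "Long wait times"),
    ("Waited in the queue for over an hour", "Long wait times"),
    ("The agent was rude to me on the phone", "Rude staff"),
]

label, score = classify("Stood in line for 2 hours at the branch", EXAMPLES)
print(label, round(score, 2))
```

A baseline like this only matches on surface word overlap, so it would likely miss paraphrases such as "the queue took forever". That gap is the weakness the question is about, and it is probably larger for company-specific vocabulary.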

What are your thoughts on this? Have you seen better ways to classify text when there is very little data?

submitted by /u/ThinkCritically