[P] How can I build this simple text-based ML tool?

Hello everyone!

I work with spreadsheets a lot, manually doing tasks that are just a bit too complex for rules but that I believe fall well within what ML can handle. In a nutshell, I spend 2+ hours a day going through company names, removing legal terms like “LLC” or “Limited”, and humanizing them.

For instance, I have a spreadsheet with company names and emails.

| Company Name | Email Address |
|---|---|
| Concur Recruitment Limited – 02476 668 204 | sconvery@concurengineering.co.uk |
| Confluent Technology Group | mark.anderson@confluentgroup.com |
| Construction Maintenance and Allied Workers | donmelanson@cmaw.ca |

These would become (currently by hand):

| Company Name | Email Address |
|---|---|
| Concur Engineering | sconvery@concurengineering.co.uk |
| Confluent | mark.anderson@confluentgroup.com |
| CMAW | donmelanson@cmaw.ca |

What we’re doing here is:

  1. Shortening names to their essence
  2. Removing legal terms and words
  3. Looking at domain names (in email addresses) as a clue for the “most human name”
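
For comparison, here is a minimal Python sketch of how far plain rules get on these three steps. Everything in it is an assumption made for illustration: the `LEGAL_TERMS` list, the `PHONE_TAIL` pattern, and the function names are hypothetical, and the only domain clue it uses is "the initials of the name match the domain".

```python
import re

# Hypothetical list of legal terms to drop; extend it for real data.
LEGAL_TERMS = {"llc", "ltd", "limited", "inc", "incorporated",
               "corp", "corporation", "plc", "gmbh"}

# Small connecting words that are skipped when building initials.
SKIP_WORDS = {"and", "of", "the", "&"}

# Assumed pattern: a trailing phone number introduced by a dash.
PHONE_TAIL = re.compile(r"\s*[–-]\s*[\d\s()+]{7,}$")


def strip_legal_terms(name: str) -> str:
    """Remove a trailing phone number and any legal terms from a name."""
    name = PHONE_TAIL.sub("", name)
    words = [w for w in name.split()
             if w.strip(".,").lower() not in LEGAL_TERMS]
    return " ".join(words)


def domain_label(email: str) -> str:
    """Return the first label of the email's domain, lowercased."""
    return email.rsplit("@", 1)[-1].split(".")[0].lower()


def initials(name: str) -> str:
    """Lowercase initials of the name, skipping connecting words."""
    return "".join(w[0] for w in name.split()
                   if w.lower() not in SKIP_WORDS).lower()


def suggest_name(name: str, email: str) -> str:
    """Suggest a cleaned name, using the email domain as a clue."""
    cleaned = strip_legal_terms(name)
    label = domain_label(email)
    # If the domain is the name's initials, prefer the acronym.
    if initials(cleaned) == label:
        return label.upper()
    return cleaned


rows = [
    ("Concur Recruitment Limited – 02476 668 204",
     "sconvery@concurengineering.co.uk"),
    ("Confluent Technology Group", "mark.anderson@confluentgroup.com"),
    ("Construction Maintenance and Allied Workers", "donmelanson@cmaw.ca"),
]
for company, email in rows:
    print(suggest_name(company, email))
```

On the three sample rows this prints "Concur Recruitment", "Confluent Technology Group", and "CMAW", so only the last one matches the hand-cleaned result. The other two depend on judgment (preferring "Engineering" from the domain, or dropping "Technology Group"), which is the part rules do not capture.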

Now, I strongly suspect this is something Google Cloud can handle. Given how little programming Google Cloud ML involves (and its potential integration with Google Sheets), I’d imagine it’s the best vehicle for this tool.

Some questions before I embark upon this journey:

  1. Would you recommend I use Google Cloud ML or another tool?
  2. How much data do you imagine would be needed to train this tool (pairs of uncleaned and cleaned spreadsheets)?
  3. Am I critically misunderstanding something here? This is pretty much my first time practically applying ML.

Thank you very much for all your help!

submitted by /u/ventura__highway