[P] How can I build this simple text-based ML tool?

Hello everyone!

I work with spreadsheets a lot, doing tasks manually that are just a bit too complex for rules, but I believe they certainly fall into what ML can handle. In a nutshell, I spend 2+ hours a day going through company names, removing legal terms like “LLC” or “Limited”, and humanizing them.

For instance, I have a spreadsheet with company names and emails.

Company Name Email Address
Concur Recruitment Limited – 02476 668 204
Confluent Technology Group
Construction Maintenance and Allied Workers

These would become (currently by hand):

Company Name Email Address
Concur Engineering

What we’re doing here is:

  1. Shortening names to their essence
  2. Removing legal terms and words
  3. Looking at domain names (in email addresses) as a clue for the “most human name”
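Before reaching for ML, the three steps above can be roughed out as a plain rule-based baseline, if only to see which rows rules cannot handle. The following is a minimal sketch, not a finished tool: the `LEGAL_TERMS` list, the function names, and the prefix-matching heuristic against the email domain are all assumptions made for illustration, and the example email addresses are made up.

```python
import re

# Hypothetical, deliberately short list of legal terms; a real one would be longer.
LEGAL_TERMS = {"llc", "ltd", "limited", "inc", "incorporated",
               "corp", "corporation", "plc", "gmbh"}


def strip_legal_terms(name: str) -> str:
    """Drop trailing noise after a dash separator, then remove legal terms."""
    # Keep only the part before " - " or " – " (e.g. a trailing phone number).
    name = re.split(r"\s+[–-]\s+", name, maxsplit=1)[0]
    words = [w for w in name.split()
             if w.strip(".,").lower() not in LEGAL_TERMS]
    return " ".join(words)


def domain_hint(email: str) -> str:
    """Return the first label of the email domain, with - and _ as spaces."""
    # Assumption: the first label is the company name. This is wrong for
    # subdomains such as mail.example.com.
    domain = email.rsplit("@", 1)[-1]
    label = domain.split(".")[0]
    return re.sub(r"[-_]+", " ", label).lower()


def humanize(name: str, email: str = "") -> str:
    """Shorten a company name, using the email domain as a clue if present."""
    cleaned = strip_legal_terms(name)
    if not email:
        return cleaned
    hint = domain_hint(email).replace(" ", "")
    words = cleaned.split()
    best = 0
    # Keep the longest leading run of words that the domain label starts with.
    for i in range(1, len(words) + 1):
        joined = "".join(w.lower() for w in words[:i])
        if hint.startswith(joined):
            best = i
    return " ".join(words[:best]) if best else cleaned


if __name__ == "__main__":
    print(humanize("Concur Recruitment Limited – 02476 668 204"))
    print(humanize("Confluent Technology Group", "info@confluent.example"))
```

Rows where a baseline like this gives a poor answer are the ones that would actually need a learned model, and the rows it gets right could double as a first batch of cleaned examples.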

Now, I strongly suspect this is something Google Cloud can do. Since Google Cloud ML seems to require little programming (and might integrate with Google Sheets), I’d imagine it’s the best vehicle for this tool.

Some questions before I embark upon this journey:

  1. Would you recommend I use Google Cloud ML or another tool?
  2. How much data do you imagine would be needed to train this tool (pairs of uncleaned and cleaned spreadsheets)?
  3. Am I critically misunderstanding something here? This is pretty much my first time practically applying ML.

Thank you very much for all your help!

submitted by /u/ventura__highway