Category: Reddit MachineLearning

[D] Named tensors in the new PyTorch version – what are the advantages compared to tsalib?

In the new PyTorch version, there is experimental support for named tensors, which looks like a big deal, for example when vectorizing a pipeline. The idea has been floating around the community for a while, and I think it will greatly help with axis bugs. What I am not sure about is what advantages it brings compared to, say, tsalib.

https://pytorch.org/docs/stable/named_tensor.html

http://nlp.seas.harvard.edu/NamedTensor

https://github.com/ofnote/tsalib

submitted by /u/dev-ai

[D] Why is L2 preferred over L1 Regularization?

I understand that L1 regularization induces sparsity and is thus good for cases where sparsity is required.

But in normal use cases, what are the benefits of using L2 over L1? If the goal is just to keep the weights small, then why can’t we use L4, for example?

I’ve seen mentions of L2 capturing energy, corresponding to Euclidean distance, and being rotation invariant. Could someone explain more explicitly how each of these comes about?
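
The rotation-invariance claim can be checked numerically. The sketch below is only an illustration, not an answer from the thread: `lp_norm` and `rotate_2d` are hypothetical helper names, and the 2-D weight vector and the 45-degree angle are arbitrary choices. It rotates a vector and compares its L1, L2, and L4 norms before and after.

```python
import math

def lp_norm(w, p):
    """Return the Lp norm of a weight vector w."""
    return sum(abs(x) ** p for x in w) ** (1.0 / p)

def rotate_2d(w, theta):
    """Rotate a 2-D vector w counter-clockwise by theta radians."""
    x, y = w
    return (x * math.cos(theta) - y * math.sin(theta),
            x * math.sin(theta) + y * math.cos(theta))

# Arbitrary example: a unit vector along the first axis, rotated by 45 degrees.
w = (1.0, 0.0)
w_rot = rotate_2d(w, math.pi / 4)

l1_before, l1_after = lp_norm(w, 1), lp_norm(w_rot, 1)
l2_before, l2_after = lp_norm(w, 2), lp_norm(w_rot, 2)
l4_before, l4_after = lp_norm(w, 4), lp_norm(w_rot, 4)

print(f"L1: {l1_before:.4f} -> {l1_after:.4f}")  # grows after rotation
print(f"L2: {l2_before:.4f} -> {l2_after:.4f}")  # unchanged
print(f"L4: {l4_before:.4f} -> {l4_after:.4f}")  # shrinks after rotation
```

Only the L2 penalty is the same in both coordinate systems. L1 penalizes the axis-aligned vector least, which is one way to see why it favors sparse solutions, while L4 penalizes it most and so favors spreading weight across coordinates.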

submitted by /u/tshrjn

[D] Learning with “noisy data” (but perfect labels)

There are many works that deal with noisy labels, but has the problem of unreliable data (but reliable labels) been studied? In other words, problems where the data to be classified is imperfect and not always sufficient to determine the class label.

An example would be a model that predicts the city in which a photo was taken. Ground truth labels would be perfect thanks to GPS metadata. If the photo contains the Eiffel Tower, we can predict that the city is Paris. But many pictures contain no useful information; for example a photo of a dog or a McDonald’s is nearly useless for determining the city.

How best to train a classifier when such “noisy examples” (for lack of a better term) are very common?

submitted by /u/viviandefeater

[R] Looking for an ML platform that also allows for integration with business users?

I have the following overall requirement for an ML platform:

  • Ability for the data engineering team to build pipelines and integrate with ERP apps (hybrid on-prem/cloud), and run and monitor models in production, and store results of the models.
  • Ability for the data science team to perform EDA and run experiments, and then push models to production as needed.
  • Ability for business users (who are not technical and do not know how to code) to interact with the data, not just view reports: perform Excel-like calculations, override predictions that they disagree with, run what-if simulations, etc., and then commit any changes back to the data store.

I have seen multiple ML platforms that provide the first two components, but the business user part is always just a dashboarding capability, not a real interface like the one I described.

Does anybody provide anything like this?

submitted by /u/AlexSnakeKing

[R] [D] NLP, Any papers on text summarization on very long (arbitrary length) text?

Hi, I’m catching up on the text summarization scene, and most of the papers I have seen use the CNN, Newsroom, and XSum datasets, but the maximum document size for any of these seems to be ~1000 tokens. Are there any papers that deal with very long (or arbitrary) document lengths?

As I understand it, most of the SOTA is now transformer based, and those models are bound by the number of positional embeddings in use.
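
To make the length limit concrete: a model with a fixed number of positional embeddings can only see that many tokens at once, so a longer document has to be split before it is fed in. The sketch below shows one common workaround, overlapping sliding windows. It is an illustration only, not a method from any particular paper; `sliding_windows`, `max_len`, and `stride` are hypothetical names, and the values 1000 and 800 are arbitrary.

```python
def sliding_windows(tokens, max_len, stride):
    """Split tokens into overlapping windows of at most max_len tokens.

    Consecutive windows start `stride` tokens apart, so they overlap by
    max_len - stride tokens.
    """
    if max_len <= 0 or not 0 < stride <= max_len:
        raise ValueError("need max_len > 0 and 0 < stride <= max_len")
    windows = []
    start = 0
    while True:
        windows.append(tokens[start:start + max_len])
        if start + max_len >= len(tokens):
            break  # this window already reaches the end of the document
        start += stride
    return windows

# Stand-in for a tokenized 2500-token document (integers instead of real tokens).
tokens = list(range(2500))
windows = sliding_windows(tokens, max_len=1000, stride=800)
print([len(w) for w in windows])
```

Each window fits within the model's position limit, but how to merge the per-window summaries into one is a separate problem that this sketch does not address.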

submitted by /u/natural_language_guy

[D] What to expect from technical case study interview?

I just got an offer for a phone interview for a machine learning internship, and part of it is a technical case study. I have looked up examples online and they seem complex. All of my knowledge of machine learning is self-taught and more hands-on, so I really only know HOW to apply machine learning techniques and don’t know much about WHEN to apply them. Can people share some of the things interviewers look for in this kind of case study, and perhaps some material I should learn beforehand? Thank you!

submitted by /u/the_lonk55

[P] Ranking data handlers based on statistics

I have a problem that I’m looking to solve and want some direction.

I have data that I need to process and I have handlers that I need to rank. The handlers are essentially people that may be able to process each piece of data. Each piece of data that comes in is sent to the handlers for them to process. If they are unable to process it, then I send it to the next handler. I do this until one of the handlers successfully processes it or I have exhausted all the handlers.

I have several criteria that I’m using to rank the handlers. For example, how frequently they successfully handle a piece of data, how long they take to process the data, and a score for how well they handled it (successful solutions can be graded and although any is acceptable, we would prefer the one that produces the better answer).

Given a bunch of data with the above statistics, I would like to do two things. First, produce a report that ranks the handlers. This is currently done manually, so I would like to automate this step. Second, I would like to have the dispatcher respond in real time to changes in the statistics. For example, if one of the handlers starts taking longer than normal, then we should deprioritize subsequent requests to that handler until they improve.
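
The two goals above can be sketched with running statistics per handler. This is a minimal sketch under stated assumptions, not a recommended toolkit: it keeps exponentially weighted averages of success, processing time, and quality, and ranks handlers by a simple composite score. All names (`HandlerStats`, `score`, `rank`), the smoothing factor `ALPHA`, the 60-second `time_scale`, the equal weighting of the terms, and the assumption that quality is graded in [0, 1] are hypothetical.

```python
from dataclasses import dataclass

ALPHA = 0.2  # assumed smoothing factor; higher values react faster to recent events

def _ema(old, new):
    """Exponentially weighted moving average update."""
    return (1 - ALPHA) * old + ALPHA * new

@dataclass
class HandlerStats:
    """Running statistics for one handler."""
    success_rate: float = 0.0
    avg_seconds: float = 0.0
    avg_quality: float = 0.0  # assumed to be graded in [0, 1]
    attempts: int = 0
    successes: int = 0

    def record(self, success, seconds, quality=0.0):
        """Fold one processed item into the running averages."""
        first = self.attempts == 0
        self.attempts += 1
        outcome = 1.0 if success else 0.0
        self.success_rate = outcome if first else _ema(self.success_rate, outcome)
        self.avg_seconds = seconds if first else _ema(self.avg_seconds, seconds)
        if success:  # quality is only defined for successful solutions
            if self.successes == 0:
                self.avg_quality = quality
            else:
                self.avg_quality = _ema(self.avg_quality, quality)
            self.successes += 1

def score(stats, time_scale=60.0):
    """Composite score: reward success and quality, penalize slow handling."""
    return stats.success_rate + stats.avg_quality - stats.avg_seconds / time_scale

def rank(handlers):
    """Return handler names ordered from best to worst score."""
    return sorted(handlers, key=lambda name: score(handlers[name]), reverse=True)

handlers = {"alice": HandlerStats(), "bob": HandlerStats()}
for _ in range(5):
    handlers["alice"].record(True, seconds=30.0, quality=0.9)
    handlers["bob"].record(True, seconds=30.0, quality=0.8)
ranking_before = rank(handlers)

# Alice starts taking much longer than normal; her score drops as the average adapts.
for _ in range(5):
    handlers["alice"].record(True, seconds=120.0, quality=0.9)
ranking_after = rank(handlers)

print(ranking_before, ranking_after)
```

The report is just `rank(handlers)` printed on a schedule, and the dispatcher can call `rank` before each request so that a handler who slows down is tried later until their average recovers.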

Are there recommendations for a toolkit or a subset of algorithms that I should be researching? Any pointers are appreciated.

submitted by /u/reddof

[D] Meta-learning for fast convergence for training from scratch?

Meta-learning is good for learning a new class from <10 samples, and it requires some form of pre-training on similar classes.

Are there good recent works on improving convergence for randomly initialized networks using meta-learning? Last time I looked into https://ai.google/research/pubs/pub46116/ and a rejected paper on OpenReview. So far the results are worse than SGD and Adam, or maybe ~0.1% faster in convergence at the cost of ~30% more computation.

submitted by /u/tsauri