Category: Reddit MachineLearning

[D] No Free Lunch theorems do not compare functions that can utilize cost-information from partial solutions. So why care about NFL?

I see No Free Lunch theorems discussed enough that I decided to check my understanding, and sit down with the original paper.

They prove bold (but contextualized) claims, and I feel like the bold claims have really taken on a life of their own (absent context):

“one might expect that hill climbing usually outperforms hill descending if one’s goal is to find a maximum […] such expectations are incorrect”

“the average performance of any pair of algorithms across all possible problems is identical”
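To make the averaging claim concrete, here is a minimal sketch on a tiny search space. It is not code from the paper: the names `X`, `Y`, `ascending`, and `adaptive` are made up for illustration, and both searchers are assumed never to revisit a point, which mirrors the paper's choice to count only distinct function evaluations. It enumerates every function from `X` to `Y` and compares the average best value each searcher finds after `m` evaluations.

```python
from itertools import product

X = [0, 1, 2]  # candidate points (hypothetical toy search space)
Y = [0, 1]     # possible cost values


def run(order_fn, f, m):
    """Evaluate m distinct points chosen by order_fn; return the best value seen."""
    seen = []  # list of (x, f(x)) pairs in visiting order
    for _ in range(m):
        x = order_fn(seen)
        seen.append((x, f[x]))
    return max(y for _, y in seen)


def ascending(seen):
    """Fixed order: always pick the smallest unvisited point."""
    visited = {x for x, _ in seen}
    return min(x for x in X if x not in visited)


def adaptive(seen):
    """Start at 2, then let the last observed value steer the next choice."""
    visited = {x for x, _ in seen}
    remaining = [x for x in X if x not in visited]
    if not seen:
        return 2
    return min(remaining) if seen[-1][1] == 1 else max(remaining)


def average_best(order_fn, m):
    """Average best-so-far value over ALL functions f: X -> Y."""
    functions = list(product(Y, repeat=len(X)))  # each f is a tuple indexed by x
    return sum(run(order_fn, f, m) for f in functions) / len(functions)


if __name__ == "__main__":
    for m in (1, 2, 3):
        print(m, average_best(ascending, m), average_best(adaptive, m))
```

The two columns agree for every `m`, even though `adaptive` uses what it has observed. Both searchers only ever see values at complete points, which is exactly the restriction discussed next.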

Very interesting, to be sure. But this all hinges on a specific assumption:

“[…] our decision to only measure distinct [oracle] function evaluations”

meaning:

“techniques like branch and bound are not included since they rely explicitly on the cost structure of partial solutions.”

I think their framework is interesting and useful for describing algorithms like Simulated Annealing or Genetic Algorithms.

But since it doesn’t apply to an entire class of algorithms (those that can reason from partial solutions), it seems to me that we should really rein in our claims about NFL.

I must be missing something.

submitted by /u/BayesMind

[R] Introducing the CodeSearchNet challenge

“Searching for code to reuse, call into, or to see how others handle a problem is one of the most common tasks in a software developer’s day. However, search engines for code are often frustrating and never fully understand what we want, unlike regular web search engines. We started using modern machine learning techniques to improve code search but quickly realized that we were unable to measure our progress. Unlike natural language processing with GLUE benchmarks, there is no standard dataset suitable for code search evaluation.

With our partners from Weights & Biases, today we’re announcing the CodeSearchNet Challenge evaluation environment and leaderboard. We’re also releasing a large dataset to help data scientists build models for this task, as well as several baseline models showing the current state of the art. Our leaderboard uses an annotated dataset of queries to evaluate the quality of code search tools.”

submitted by /u/youali

[D] Why do effective activation functions have a bounded derivative?

Is there a reason why almost every modern activation function in deep learning has a bounded derivative? ReLU, Swish, tanh, sigmoid and other activation functions mentioned here all have bounded derivatives.

My intuition says it is because we use backprop to train our networks. A bounded derivative should restrict the amount of gradient flow during the backward phase, preventing a blowup of gradients. What do you guys think?
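A quick numerical check of the premise, as a rough sketch: the derivative formulas below are the standard closed forms, while the grid range of [-20, 20] and the helper names are arbitrary choices for illustration. `d_square` is included only as an example of an unbounded derivative for contrast.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def d_relu(x):
    return 1.0 if x > 0 else 0.0


def d_tanh(x):
    return 1.0 - math.tanh(x) ** 2


def d_sigmoid(x):
    s = sigmoid(x)
    return s * (1.0 - s)


def d_swish(x):
    # swish(x) = x * sigmoid(x), so the product rule gives:
    s = sigmoid(x)
    return s + x * s * (1.0 - s)


def d_square(x):
    # x**2 as a contrast case: its derivative grows without bound
    return 2.0 * x


# Evaluation grid on [-20, 20] with step 0.01 (arbitrary choice)
grid = [i / 100 for i in range(-2000, 2001)]


def max_abs(derivative):
    """Largest absolute derivative value observed on the grid."""
    return max(abs(derivative(x)) for x in grid)


if __name__ == "__main__":
    for name, d in [("relu", d_relu), ("tanh", d_tanh), ("sigmoid", d_sigmoid),
                    ("swish", d_swish), ("square", d_square)]:
        print(name, max_abs(d))
```

Widening the grid leaves the first four maxima unchanged, while the value for `d_square` keeps growing with the grid. Since backprop multiplies one such factor per layer, an unbounded factor is the case where a single large pre-activation can inflate the gradient.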

submitted by /u/TheSilenceOfTheBakra

[P] Having a predefined questionnaire, how to write system to extract data.

There is an extremely inefficient process in my city office. Each year the office collects data from citizens, using both an online form and an offline paper form. The paper form is the problem:

  1. The forms are given to the people.
  2. People fill in the forms by hand and return them to the office.
  3. The clerks have about 2-4 weeks to type the forms into the system.
  4. Each form contains control data; if it is incorrect, the form is ignored in further processing.

There are about 15-25K paper forms each year, and the graphics and content change yearly.

I have a template of this year’s form. It’s one A4 page. There are two types of information we want to extract: small boxes that each hold a single digit, and free-text boxes (which can contain any text). I don’t have samples of the data, but I can generate a few.

The forms contain sensitive data and cannot be processed outside the internal network. How would you approach such a problem? I would appreciate any help.

Usually I would just use the Google Vision API for text extraction and then write a decision tree to classify bounding boxes as pieces of information, but in this case I cannot use external services.

This is a non-profit project. If I cannot solve it, they will just type the forms in by hand.

submitted by /u/janiedebica

[R] Deep Learning For Symbolic Mathematics (ICLR 2020 submission)

TL;DR: We train a neural network to compute function integrals, and to solve complex differential equations.

Abstract: Neural networks have a reputation for being better at solving statistical or approximate problems than at performing calculations or working with symbolic data. In this paper, we show that they can be surprisingly good at more elaborated tasks in mathematics, such as symbolic integration and solving differential equations. We propose a syntax for representing these mathematical problems, and methods for generating large datasets that can be used to train sequence-to-sequence models. We achieve results that outperform commercial Computer Algebra Systems such as Matlab or Mathematica. Keywords: symbolic, math, deep learning, transformers

https://openreview.net/forum?id=S1eZYeHFDS&noteId=S1eZYeHFDS

submitted by /u/youali

[D] – Finding research collaborators for medical AI projects outside of my university

As the title says: how can I find researchers / engineers / students (MSc or PhD) to collaborate with on a project outside the sphere of my own university? Is there any community where experts are able to network and form collaborations? I’m super eager to do some research on the side.

I have already been offered involvement in several projects at my university, but none of them interest me (they have mainly been related to either geometric DL for glaucoma or U-Net for lung carcinoma), so I wish to look beyond this city.

More information, if relevant:

  • My domain expertise lies in healthcare/medicine, education, and gaming industry – but I am open to exploring other domains.
  • Since I know some people care: my university is ranked in the worldwide top <40 in engineering. My master’s degree is in machine learning, with a bachelor’s degree in medical engineering (I can verify this).
  • Due to my affiliation with my university, I am eligible to publish to arXiv without an endorsement from a registered author.
  • I work as a machine learning consultant and data science instructor alongside my studies.
  • I’m very familiar with other technologies, such as Kafka and its ecosystem, Flutter, and more.

To clarify: I am looking in particular for people who already know a lot of machine learning and who wish to collaborate on a research project, whether that is deep reinforcement learning, Bayesian deep learning, or something else. That is why I didn’t feel this would be appropriate to post at /r/learnmachinelearning.

submitted by /u/Naveos

[D] Why does backtranslation work?

I think I must be misunderstanding how backtranslation works, because I’m not seeing how it could help. I’ll describe my current understanding, then I’ll ask my question.

The usual setup is that you have some small set B of parallel data between a source and a target language. Your goal is to make a model that takes a sentence in the source language and produces the translated version in the target language.

In addition to the small dataset B, you also have some potentially very large corpus A of monolingual data in the target language. In order to leverage this data, you train a model in the reverse direction, i.e. target to source, by using B with the entries flipped. Then you use this reverse model to translate the entries of A, which gives you A’. Finally, you add A’ to B to get a final set C, on which you then train the source –> target model.
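The data flow described above can be sketched as follows. This is only a toy illustration: `train_word_model` is a hypothetical word-for-word lookup standing in for a real translation system, and the sentences in `B` and `A` are made up.

```python
def train_word_model(pairs):
    """Toy 'model': a word-for-word lookup table learned from aligned pairs.

    Hypothetical stand-in for training a real translation system.
    """
    table = {}
    for src, tgt in pairs:
        for s, t in zip(src.split(), tgt.split()):
            table.setdefault(s, t)  # keep the first translation seen
    return table


def translate(table, sentence):
    """Translate word by word; unknown words become '<unk>'."""
    return " ".join(table.get(w, "<unk>") for w in sentence.split())


# B: small parallel set of (source, target) pairs (made-up examples)
B = [("le chat", "the cat"), ("le chien", "the dog")]

# A: larger monolingual corpus in the TARGET language
A = ["the dog", "the cat", "the bird"]

# Step 1: train the reverse (target -> source) model on B with entries flipped
reverse_model = train_word_model([(tgt, src) for src, tgt in B])

# Step 2: build A' by translating A back into the source language
A_prime = [(translate(reverse_model, tgt), tgt) for tgt in A]

# Step 3: C = B plus the synthetic pairs
C = B + A_prime

# Step 4: train the forward (source -> target) model on C
forward_model = train_word_model(C)

if __name__ == "__main__":
    for pair in C:
        print(pair)
    print(translate(forward_model, "le chat"))
```

Note that in every pair of A’ the target side is real text from A, and only the source side is synthetic and possibly noisy (here one sentence comes back with an `<unk>` token).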

In some sense, this should only help if your target –> source model is good. However, you trained this model only on B. This raises the following questions:

1) if you can build a good target –> source model from just B, why can’t you do the same with source –> target?

2) If you do get some improvements, why can’t you repeat the process? I.e., train the source –> target model using C, then grab some large monolingual corpus from the source language and backtranslate it to make a new set A”, add A” to C and re-train the target –> source model, then make more source –> target examples by backtranslating with the new model. Rinse and repeat until you run out of compute.

Finally, is there a good reference for this kind of stuff? Most papers which use backtranslation are extremely vague about it.

submitted by /u/TheRedSphinx

[P] Curated Papers (early release)

Hi all,

I’m launching Curated Papers, for the first time in this subreddit!

It’s a website that lets you organize lists of academic papers, share curated lists, and discover lists made by others.

Say you need to get into a new field of study. Instead of manually researching which papers to read, going over references, and juggling papers back and forth (research that can take a long time), you could discover a curated list on the subject, made by a researcher from that field.

Of course, it will not entirely replace your need to search for papers, but at least you’ll start from a good basis.

It’s also a kind of social network built around curated lists and academic papers: you can, for example, like, discuss, and follow curated lists, academic papers, or other users, and stay up to date with your interests.

I’d be happy to get your feedback. Did you like it? Do you have any feature requests?

Thanks!

submitted by /u/getlasterror

[R] What do you think about the idea of creating a random forest using DL for tabular data?

There seems to be no reason this is not possible, and I think it is good not only in concept but also in performance. RF does not require its member models to perform well individually; it requires them to overfit as much as possible and to differ from each other as much as possible. This seems achievable with DL for tabular data.
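A rough sketch of that recipe, under loud assumptions: the base learner here is a 1-nearest-neighbour memorizer, used purely as a lightweight stand-in for an overfit neural network, and `fit_forest`, `fit_memorizer`, and all parameter names are hypothetical. Diversity comes from the usual RF ingredients, a bootstrap sample of rows plus a random feature subset per member, and predictions are combined by majority vote.

```python
import random
from collections import Counter


def fit_memorizer(rows, labels, features):
    """Stand-in base learner: 1-nearest-neighbour on a feature subset.

    It memorizes its training data, so it overfits by construction.
    A small neural network could be plugged in here instead.
    """
    data = [([r[j] for j in features], y) for r, y in zip(rows, labels)]

    def predict(row):
        q = [row[j] for j in features]
        nearest = min(data, key=lambda d: sum((a - b) ** 2 for a, b in zip(d[0], q)))
        return nearest[1]

    return predict


def fit_forest(rows, labels, n_members=25, n_features=1, seed=0, fit=fit_memorizer):
    """Bagging with random feature subsets; returns a majority-vote predictor."""
    rng = random.Random(seed)
    members = []
    for _ in range(n_members):
        idx = [rng.randrange(len(rows)) for _ in rows]        # bootstrap rows
        feats = rng.sample(range(len(rows[0])), n_features)   # random feature subset
        members.append(fit([rows[i] for i in idx], [labels[i] for i in idx], feats))

    def predict(row):
        votes = Counter(member(row) for member in members)
        return votes.most_common(1)[0][0]

    return predict


if __name__ == "__main__":
    # Made-up, well-separated two-class data
    rows = [(i, i) for i in range(6)] + [(i + 20, i + 20) for i in range(6)]
    labels = [0] * 6 + [1] * 6
    forest = fit_forest(rows, labels)
    print(forest((2, 2)), forest((22, 22)))
```

Swapping a neural network in for `fit_memorizer` only requires a function with the same `(rows, labels, features)` signature that returns a `predict` callable.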

The nice thing about this is that the model could deal with recursion on tabular data. DL is weak at processing tabular data and tree-based models cannot handle recursion. So it would be nice to be good at both.

I looked for a while but couldn’t find anything about this. I wonder if there’s anything I couldn’t find…

submitted by /u/SunghoYahng