Category: Reddit MachineLearning

[D] The Machine Learning Conference is next week – Free ticket code here

Hi All, as a thank you to this community, we’re giving away 5 free tickets to MLconf SF. The event is almost sold out so first to register gets the tickets: https://www.eventbrite.com/e/mlconf-sf-2019-tickets-52641374769

For free registration use code: slashmlfree

Note: The code expires once the 5 free tickets are registered; after that, they’re gone. First come, first served. If the code stops working, you may still be able to use the 50% off code: slashml19

*Please don’t share the code; we’d like these tickets to go to community members.

**If you’ve already purchased a ticket, you’re not eligible for a refund.

For video from past events see: https://www.youtube.com/channel/UCjeM1xxYb_37bZfyparLS3Q/videos

submitted by /u/shonburton

[R] You can find a lot of interesting things in the loss landscape of your neural network

Just sharing with you a small (and somewhat fun) project I was recently working on, which is about finding different patterns in the loss surface of neural networks. Usually, a landscape around a minimum looks like a pit with random hills and mountains surrounding it, but there exist more meaningful ones, like in the picture below (check the paper for more results). We have discovered that you can find a minimum with (almost) any landscape you like. An interesting thing is that the found landscape pattern remains valid even for a test set, i.e. it is a property that (most likely) remains valid for the whole data distribution.

https://preview.redd.it/t885u6vosow31.png?width=1810&format=png&auto=webp&s=793644af78a5430368e7a1c05d7b38c6b02ec637

Paper: https://arxiv.org/abs/1910.03867
Code: https://github.com/universome/loss-patterns
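
For readers who have not looked at a loss landscape before, here is a minimal sketch of the usual way to look at one: evaluate the loss on a 2D plane through a point in weight space, spanned by two directions. This is not the paper's pattern-finding method. `loss_surface` and `toy_loss` are hypothetical names, and the quadratic loss stands in for a real network's loss.

```python
import random

def loss_surface(loss_fn, center, dir_a, dir_b, radius=1.0, steps=5):
    """Evaluate loss_fn on a steps x steps grid in the plane
    center + a * dir_a + b * dir_b, with a and b in [-radius, radius]."""
    coords = [-radius + 2 * radius * i / (steps - 1) for i in range(steps)]
    return [
        [
            loss_fn([c + a * da + b * db for c, da, db in zip(center, dir_a, dir_b)])
            for b in coords
        ]
        for a in coords
    ]

def toy_loss(params):
    # Stand-in for a network's loss: a simple bowl with its minimum at the origin.
    return sum(p ** 2 for p in params)

rng = random.Random(0)
minimum = [0.0, 0.0, 0.0]                      # assumed location of the minimum
dir_a = [rng.gauss(0, 1) for _ in minimum]     # two random directions in weight space
dir_b = [rng.gauss(0, 1) for _ in minimum]
grid = loss_surface(toy_loss, minimum, dir_a, dir_b)

for row in grid:
    print(" ".join(f"{value:6.2f}" for value in row))
```

The middle cell of the grid is the loss at the minimum itself; plotting the grid as a heat map or contour plot gives the kind of picture shown above.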

submitted by /u/universome

[D] Thoughts on Quantum Artificial Intelligence / Q Supremacy

“Quantum Computing: The Why and How ǀ Jonathan Baker, UChicago” https://www.youtube.com/watch?v=5kTiB_KDUj0

Hey 🙂 So just wanted to start a discussion on what people think about whether Quantum Algorithms will “revolutionize” machine learning algorithms. I’m not a quantum expert, so take my stance with a grain of salt.

I was watching many videos (“Quantum algorithm for solving linear equations” https://www.youtube.com/watch?v=KtIPAPyaPOg, “Seth Lloyd: Quantum Machine Learning” https://www.youtube.com/watch?v=wkBPp9UovVU etc etc) + reading Wikipedia blah. Then I came across the diagram above.

According to Jonathan Baker, there are 3 main future trends for QPCs. I just extrapolated his graphs. The green line is most optimistic, utilising “co-design”??? which I don’t know what that means. The red is less steep, and the blue is just a straight line continuation of the current # of qubit trend. (Notice the log10 scale)

QAOA (the Quantum Approximate Optimization Algorithm family) includes Quantum Linear Regression, and possibly??? (I don’t know) optimisation methods for backprop. The green line shows that by 2025 QPCs can be used for Linear Reg. Red, which is like the average case, is 2035? and the worst case is 2045.

To crack cryptography (ie Shor’s Algorithm), over 10,000 qubits are needed. By the green best case, that will be in 2032. Average is 2045 and worst is 2067.
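
The kind of extrapolation described above can be sketched as follows: a straight line on a log10 plot is exponential growth, so the crossing year follows from a ratio of logarithms. The starting qubit count and the growth factors below are made-up placeholders, not values read off the talk's graph, and `year_reaching` is a hypothetical helper.

```python
import math

def year_reaching(target_qubits, start_year, start_qubits, growth_per_year):
    """Year at which an exponential trend (a straight line on a log10 plot)
    reaches target_qubits, starting from start_qubits in start_year."""
    years_needed = math.log10(target_qubits / start_qubits) / math.log10(growth_per_year)
    return start_year + years_needed

# Hypothetical scenarios: 50 qubits in 2019 and three assumed yearly growth factors.
for label, growth in [("optimistic", 2.0), ("moderate", 1.5), ("slow", 1.25)]:
    print(label, round(year_reaching(10_000, 2019, 50, growth)))
```

Small changes in the assumed growth factor move the crossing year by decades, which is why the three lines in the graph diverge so quickly.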

I was also reading Wikipedia’s “Quantum algorithm for linear systems of equations” https://en.wikipedia.org/wiki/Quantum_algorithm_for_linear_systems_of_equations, and it highlights how solving X * beta = y or A * x = b takes O( log(P) * K^2 ), where K is the condition number and P is the # of coefficients in beta. The best conjugate gradient method takes O( P * K ).

More concretely, the “exponential speedup” (I think???) applies to sparse matrices. If you include error bounds, then you get O( log(P) * K^2 / err ). For dense matrices you get O( sqrt(P) log(P) K^2 ).
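
To get a feel for these formulas, here is a small back-of-the-envelope sketch that plugs numbers into them. It ignores constant factors, assumes base-2 logarithms, and takes the formulas exactly as quoted above, so treat the outputs as rough orders of magnitude only. The function names are hypothetical.

```python
import math

def quantum_sparse_cost(p, k, err=1.0):
    """log(P) * K^2 / err, the sparse-matrix estimate quoted above."""
    return math.log2(p) * k ** 2 / err

def quantum_dense_cost(p, k):
    """sqrt(P) * log(P) * K^2, the dense-matrix estimate quoted above."""
    return math.sqrt(p) * math.log2(p) * k ** 2

def conjugate_gradient_cost(p, k):
    """P * K, the classical conjugate gradient estimate quoted above."""
    return p * k

# (number of coefficients P, condition number K)
for p, k in [(10**6, 10), (10**6, 10**4), (10**3, 10**4)]:
    print(
        f"P={p:>8} K={k:>6}",
        f"sparse={quantum_sparse_cost(p, k):.3g}",
        f"dense={quantum_dense_cost(p, k):.3g}",
        f"cg={conjugate_gradient_cost(p, k):.3g}",
    )
```

Under these formulas the sparse quantum estimate only wins when log(P) * K < P, so a badly conditioned system can erase the advantage.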

The issue I see is that since these methods all include error bounds, comparing them against direct methods isn’t necessarily fair. A better way is to compare against randomized methods, where an “exponential speedup” is also possible by sketching only log(N) rows. It may also be possible to apply the randomized methods together with Q Algos, so in total you might get a staggering “exponential-exponential” speedup, but because Q Algos inherently have error, this will amplify the error a lot.
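
A minimal sketch of the sketching idea, assuming the simplest possible variant: fit a least-squares line on a uniformly sampled handful of rows (a small multiple of log2(N)) and compare with the fit on all rows. Practical randomized solvers usually use random projections or leverage-score sampling rather than uniform sampling, and the synthetic data here is invented for illustration.

```python
import math
import random

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

rng = random.Random(0)
n_rows = 10_000
xs = [i / n_rows for i in range(n_rows)]
# Synthetic data: y = 2x + 1 plus small bounded noise.
ys = [2.0 * x + 1.0 + rng.uniform(-0.01, 0.01) for x in xs]

# Keep only a small multiple of log2(N) rows, chosen uniformly at random.
n_sketch = 4 * math.ceil(math.log2(n_rows))
rows = rng.sample(range(n_rows), n_sketch)

full_fit = fit_line(xs, ys)
sketch_fit = fit_line([xs[i] for i in rows], [ys[i] for i in rows])
print("rows used:", n_sketch, "of", n_rows)
print("full fit:  ", full_fit)
print("sketch fit:", sketch_fit)
```

The sketched fit is approximate, not exact, which is the same kind of error trade-off the quantum algorithms make.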

So what do people think about the potential of Q Algos for ML ?

PS: The graph above is surprisingly a log10 plot (ie x10). This is clearly different from the Moore’s Law graph (x2 for # of transistors), but anyways I’m guessing Qubits don’t follow Moore’s Law

submitted by /u/danielhanchen

[R] Announcing Confident Learning: Finding and Learning with Label Errors in Datasets

Hi, Reddit. I’m excited to share confident learning for characterizing, finding, and learning with label errors in datasets. To promote and standardize future research in learning with noisy labels and weak supervision, I’ve also open-sourced the cleanlab Python package: https://pypi.org/project/cleanlab/

Post: https://l7.curtisnorthcutt.com/confident-learning

Title: Confident Learning: Uncertainty Estimation for Dataset Labels

Abstract: Learning exists in the context of data, yet notions of confidence typically focus on model predictions, not label quality. Confident learning (CL) has emerged as an approach for characterizing, identifying, and learning with noisy labels in datasets, based on the principles of pruning noisy data, counting to estimate noise, and ranking examples to train with confidence. Here, we generalize CL, building on the assumption of a classification noise process, to directly estimate the joint distribution between noisy (given) labels and uncorrupted (unknown) labels. This generalized CL, open-sourced as cleanlab, is provably consistent under reasonable conditions, and experimentally performant on ImageNet and CIFAR, outperforming recent approaches, e.g. MentorNet, by 30% or more, when label noise is non-uniform. cleanlab also quantifies ontological class overlap, and can increase model accuracy (e.g. ResNet) by providing clean data for training.

Paper: https://arxiv.org/abs/1911.00068
Code: https://github.com/cgnorthcutt/cleanlab/
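
To make the "counting to estimate noise" idea from the abstract concrete, here is a simplified sketch of counting confident examples per (given label, likely label) pair. It is not the cleanlab API: `confident_joint` is a hypothetical helper, the per-class threshold (mean predicted probability of a class over examples given that label) is a simplification, and the toy probabilities are invented. It assumes every class has at least one labeled example.

```python
def confident_joint(labels, pred_probs):
    """Count examples by (given label, confidently predicted label).

    labels: list of integer class labels (possibly noisy).
    pred_probs: list of per-class predicted probabilities, one list per example.
    Returns the count matrix and the indices of suspected label errors.
    """
    n_classes = len(pred_probs[0])
    # Per-class threshold: average confidence in class j among examples labeled j.
    thresholds = []
    for j in range(n_classes):
        probs_j = [p[j] for p, y in zip(pred_probs, labels) if y == j]
        thresholds.append(sum(probs_j) / len(probs_j))

    counts = [[0] * n_classes for _ in range(n_classes)]
    suspects = []
    for idx, (p, y) in enumerate(zip(pred_probs, labels)):
        confident = [j for j in range(n_classes) if p[j] >= thresholds[j]]
        if not confident:
            continue  # no class is predicted confidently enough; skip the example
        likely = max(confident, key=lambda j: p[j])
        counts[y][likely] += 1
        if likely != y:
            suspects.append(idx)  # off-diagonal entry: candidate label error
    return counts, suspects

# Toy data: example 2 is labeled 0 but the model is confident it is class 1.
labels = [0, 0, 0, 1, 1, 1]
pred_probs = [
    [0.9, 0.1], [0.8, 0.2], [0.1, 0.9],
    [0.05, 0.95], [0.15, 0.85], [0.4, 0.6],
]
counts, suspects = confident_joint(labels, pred_probs)
print(counts, suspects)
```

Off-diagonal counts estimate how often one class is mislabeled as another; the suspect indices are candidates for pruning before retraining.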

submitted by /u/cgnorthcutt

[P] Machine Learning for Excel. Anyone interested?

Hello everyone! I’m looking into building a machine learning Excel Add-In. Here is how it works (roughly):

  1. Prepare a sheet of training data: one column contains the target or label, and the other columns are features.
  2. Open the add-in. Select relevant columns for training. For each column, choose whether it is categorical or numerical.
  3. The data is submitted to a cluster of servers and the servers automatically try different types of models and hyperparameters to produce the most accurate results.
  4. Then the user can use the add-in to make predictions on new data in another Excel sheet.
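
The steps above can be sketched in a few lines, under heavy simplifying assumptions: rows are plain tuples rather than Excel cells, the "different types of models and hyperparameters" are reduced to a k-nearest-neighbour classifier with a few values of k, and everything runs locally instead of on a server cluster. All function names and the toy data are hypothetical.

```python
from collections import Counter

def encode_rows(rows, column_kinds):
    """Turn raw cell values into numeric feature vectors.

    column_kinds maps column index -> "numerical" or "categorical".
    Categorical columns are one-hot encoded.
    """
    categories = {
        col: sorted({row[col] for row in rows})
        for col, kind in column_kinds.items()
        if kind == "categorical"
    }
    encoded = []
    for row in rows:
        vec = []
        for col, kind in sorted(column_kinds.items()):
            if kind == "numerical":
                vec.append(float(row[col]))
            else:
                vec.extend(1.0 if row[col] == c else 0.0 for c in categories[col])
        encoded.append(vec)
    return encoded

def knn_predict(train_x, train_y, x, k):
    """Majority vote among the k training rows closest to x."""
    nearest = sorted(
        range(len(train_x)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], x)),
    )[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]

def select_k(train_x, train_y, val_x, val_y, candidates=(1, 3, 5)):
    """Pick the k with the best accuracy on held-out validation rows."""
    def accuracy(k):
        hits = sum(knn_predict(train_x, train_y, x, k) == y for x, y in zip(val_x, val_y))
        return hits / len(val_y)
    return max(candidates, key=accuracy)

# Toy sheet: column 0 is categorical, column 1 is numerical.
rows = [("red", 1.0), ("red", 1.2), ("red", 0.9), ("blue", 3.0),
        ("blue", 3.1), ("blue", 2.9), ("red", 1.1), ("blue", 3.2)]
labels = ["small", "small", "small", "large", "large", "large", "small", "large"]

features = encode_rows(rows, {0: "categorical", 1: "numerical"})
train_x, val_x = features[:6], features[6:]
train_y, val_y = labels[:6], labels[6:]
best_k = select_k(train_x, train_y, val_x, val_y)
print(best_k, knn_predict(train_x, train_y, val_x[0], best_k))
```

A real add-in would also need to store the category list from training so that new sheets are encoded the same way at prediction time.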

Would this be useful for people who don’t know how to train machine learning models?

submitted by /u/DomLiu

[D] relation between the learned parameters of two trained neural networks on the same dataset

I was wondering if there is any work that studies the relation of learned weights between two neural nets.

For example, suppose we have a simple regression task, and we trained an MLP with one hidden layer with 20 neurons. If we train another MLP with 15 neurons in the hidden layer, what would the relation of the weight matrices be between these two networks?

I found some related works in the neural network compression literature that start with the bigger model and use matrix pruning with factorization and/or decomposition to reach a smaller model. But I’m not sure if the obtained parameters will be close to the weights a neural network (with the same number of parameters as the smaller model) will learn if trained from scratch. I mean, the fact that we can use pruning methods and get good accuracy doesn’t necessarily mean that that is the true relation between the bigger model and the smaller one. What do you think?
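
One complication worth keeping in mind when comparing weight matrices directly: reordering the hidden neurons of a one-hidden-layer MLP changes the matrices but not the function the network computes. The sketch below illustrates this with a tiny hand-written MLP; the weights are arbitrary made-up numbers and `mlp` is a hypothetical helper.

```python
import math

def mlp(x, w1, b1, w2, b2):
    """One-hidden-layer MLP with tanh activations and a scalar output."""
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w1, b1)
    ]
    return sum(w * h for w, h in zip(w2, hidden)) + b2

# Arbitrary weights for a network with 2 inputs and 3 hidden neurons.
w1 = [[0.5, -1.0], [1.5, 0.3], [-0.7, 0.8]]
b1 = [0.1, -0.2, 0.0]
w2 = [1.0, -2.0, 0.5]
b2 = 0.3

# Reorder the hidden neurons: permute rows of w1, entries of b1 and w2 together.
perm = [2, 0, 1]
w1_p = [w1[i] for i in perm]
b1_p = [b1[i] for i in perm]
w2_p = [w2[i] for i in perm]

x = [0.4, -1.2]
original = mlp(x, w1, b1, w2, b2)
permuted = mlp(x, w1_p, b1_p, w2_p, b2)
print(original, permuted)
```

So two trained networks can implement the same function with very different-looking weight matrices, and any comparison between a 20-neuron and a 15-neuron model has to account for this symmetry first.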

submitted by /u/nodet07

[N] PyTorch-NLP 0.5.0 Released! Here’s to contributing back to open source, hurrah! 🤗

Hi There! 🍪

2 years into PyTorch-NLP and another 6 months from the previous release, I am releasing PyTorch-NLP 0.5.0. Also, with your help, we’ll break 50,000 downloads! Thank you 🙂 I love helping the community because I myself benefit from the hard work of other open source contributors!

As always, the theme of PyTorch-NLP is to be small, extensible and intuitive much like PyTorch is! And the goal is to extend PyTorch with basic NLP utilities.

Here are the release notes highlights: 🐕

Python 3.5 Support, Sampler Pipelining, Finer Control of Random State

Major Updates

  • Updated my README emoji game to be more ambiguous while maintaining a fun and heartwarming vibe.
  • Support for Python 3.5.
  • Extensive rewrite of README.md to focus on new users and building an NLP pipeline. See here.
  • Support for PyTorch 1.2.
  • Added torchnlp.random for finer grain control of random state building on PyTorch’s fork_rng. This module controls the random state of torch, numpy and random.

```python
import random
import numpy
import torch

from torchnlp.random import fork_rng

with fork_rng(seed=123):  # Ensure determinism
    print('Random:', random.randint(1, 2**31))
    print('Numpy:', numpy.random.randint(1, 2**31))
    print('Torch:', int(torch.randint(1, 2**31, (1,))))
```

  • Refactored `torchnlp.samplers` enabling pipelining. For example:

```python
from torchnlp.samplers import DeterministicSampler
from torchnlp.samplers import BalancedSampler

data = ['a', 'b', 'c'] + ['c'] * 100
sampler = BalancedSampler(data, num_samples=3)
sampler = DeterministicSampler(sampler, random_seed=12)
print([data[i] for i in sampler])  # ['c', 'b', 'a']
```

  • Added `torchnlp.samplers.balanced_sampler` for balanced sampling extending PyTorch's `WeightedRandomSampler`.
  • Added `torchnlp.samplers.deterministic_sampler` for deterministic sampling based on `torchnlp.random`.
  • Added `torchnlp.samplers.distributed_batch_sampler` for distributed batch sampling that's more extensible and less restrictive than PyTorch's version.
  • Added `torchnlp.samplers.oom_batch_sampler` to sample large batches first in order to force an out-of-memory error earlier rather than later into training.
  • Added `torchnlp.utils.get_total_parameters` to measure the number of parameters in a model.
  • Added `torchnlp.utils.get_tensors` to measure the size of an object in number of tensor elements. This is useful for dynamic batch sizing and for `torchnlp.samplers.oom_batch_sampler`.

```python
import torch

from torchnlp.utils import get_tensors

# A nested object holding two tensors.
random_object = tuple([{'t': torch.tensor([1, 2])}, torch.tensor([2, 3])])
tensors = get_tensors(random_object)
assert len(tensors) == 2
```
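
The idea behind sampling large batches first can be sketched without any library: reorder the batches so the biggest ones run at the start, where an out-of-memory error costs the least. This is not torchnlp's implementation; `largest_batches_first` is a hypothetical helper and `len` stands in for a real size measure such as a tensor element count.

```python
def largest_batches_first(batches, get_size=len, num_batches=5):
    """Yield the num_batches largest batches first, then the rest in original order."""
    order = sorted(range(len(batches)), key=lambda i: get_size(batches[i]), reverse=True)
    first = order[:num_batches]
    first_set = set(first)
    for i in first:
        yield batches[i]
    for i, batch in enumerate(batches):
        if i not in first_set:
            yield batch

batches = [[1], [1, 2, 3], [1, 2], [1, 2, 3, 4]]
reordered = list(largest_batches_first(batches, num_batches=2))
print(reordered)
```

If the largest batches fit in memory during the first few steps, the remaining smaller batches are unlikely to trigger an out-of-memory error hours into training.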

submitted by /u/Deepblue129