Category: Reddit MachineLearning

[News] Microsoft’s Project Silica succeeds in storing Superman film on a piece of glass.

Zoomed image of 1978 film “Superman” stored in a piece of glass.

Cloud storage has entered our digital lives so seamlessly that we often neither realize we are using it nor appreciate that all this data is physically stored on hardware whose capacity growth is increasingly flattening. The problem is compounded by the fact that the amount of data each of us generates is increasing exponentially.

Even if the infrastructure keeps up with this rising demand, hardware such as disk drives has a lifespan of only around five years. To preserve the data, it must be periodically rewritten onto newer hardware.

One unusual solution is to use the same ultrashort optical pulses used in LASIK surgery to store data in glass by permanently changing its structure. This can keep the data intact for centuries. Quartz glass also doesn’t need energy-intensive air conditioning to keep the material at a constant temperature, or systems that remove moisture from the air; doing without both could lower the environmental footprint of large-scale data storage.

Continue reading: https://latesttechnewswiki.blogspot.com/2019/11/microsoft-stores-superman-in-a-piece-of-glass.html

submitted by /u/Anirban_Hazra

[D] On the Difficulty of Evaluating Baselines: A Study on Recommender Systems

Here is the paper: https://arxiv.org/abs/1905.01395. I read it recently and found it quite interesting. It points out some issues in this research field. In my view, the key claim is that we need standardized benchmarks and that the whole community should converge on well-calibrated results. I didn’t find any discussion of it here, so I am creating this post and look forward to hearing your thoughts.

submitted by /u/jody293

[D] Regarding Encryption of Deep learning models

My team works on deploying models on the edge (Android mobile devices). The data, model, and code all reside on the client device. Is there any way to protect the model from being probed by the client? The data and predictions can be unencrypted. Please let me know your thoughts on this and any resources you can point me to. Thanks!

submitted by /u/aseembits93

[D] Explaining your ICLR reviews

Since the respective review threads are always flooded with people talking about their reviews, I thought I should post some stats on various reviews. These are my guesses about how you should interpret your reviews (based on previous years’ trends). Note that the quantization of reviews from a 1-10 scale to a [1,3,6,8] scale may make historical data less predictive, especially in the presence of rebuttals.

So what do your reviews mean? First, know that the acceptance rate at ICLR has historically been around 30%.

[3,3,3] or below: Unfortunately, you are in the bottom third of papers. Typically, there will only be a handful of papers that the ACs will rescue from this category. If you can convince 2 of your reviewers to bump your ratings up, you have a good shot.

[3,3,6]: Your ratings aren’t ideal, but you are one good rebuttal away from a good shot at acceptance. This year, 20% of papers got this rating (the top 40-60 percentile). If you don’t succeed in raising your scores, your chances are low: historically, only about 10% of the papers in this range have been accepted.

[3,6,6]: Congrats on the good ratings! Although you certainly aren’t guaranteed acceptance, you definitely have a solid shot of acceptance. This year, [3,6,6] made up the 20-30 percentile of reviews. The results for papers in this range are the opposite of the 40-60 percentile range – only about 15% of papers in this range get rejected.

[6,6,6] or above: Congratulations on the likely acceptance! This year, you are in the top 10% of papers. Usually, you could probably sit back and relax – papers in the top 10% of ratings are nearly never rejected. However, the flatness of the ratings this year makes that a much riskier endeavour.

Now, let’s talk about rebuttals. Unfortunately, rebuttals often don’t affect the reviewers’ ratings as much as you’d like. Historically, across all papers, only ~30% of papers have any reviewers update their scores following the rebuttal. Among borderline papers (i.e., papers in the 30-60 percentiles), however, about 50% of papers will have reviewers update their scores.

Overall, this means that roughly a quarter of the papers in the “borderline” category will move into “accepted”. Given that the most common rating delta after rebuttals was +1, I don’t know how reviewers will behave with the new quantized scores.
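To make the concern about +1 deltas concrete, here is a minimal Python sketch. It assumes, purely for illustration, that an old 1-10 score maps to the nearest value on the new [1,3,6,8] scale; that mapping and the helper names `quantize` and `bump_changes_bucket` are hypothetical and are not ICLR's documented procedure.

```python
# Hypothetical illustration: the nearest-value mapping is an assumption,
# not how ICLR actually converts or collects scores.
NEW_SCALE = [1, 3, 6, 8]


def quantize(score):
    """Map a 1-10 score to the closest value on the [1, 3, 6, 8] scale."""
    # On a tie (e.g. 2 or 7), min() keeps the first match, i.e. the lower value.
    return min(NEW_SCALE, key=lambda v: abs(v - score))


def bump_changes_bucket(score, delta=1):
    """Return True if raising `score` by `delta` changes its quantized value."""
    bumped = min(score + delta, 10)  # cap at the top of the old scale
    return quantize(bumped) != quantize(score)


if __name__ == "__main__":
    for old in range(1, 11):
        print(old, quantize(old), bump_changes_bucket(old))
```

Under this assumed mapping, a +1 bump changes the quantized rating for only 3 of the 10 possible starting scores (2, 4 and 7), which is one way to see why historical "+1" behaviour may not carry over to the new scale.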

Good luck on your rebuttals, and remember that the reviewing system has an immense amount of random noise. This year, 47% of reviews were done by reviewers who said that they had not published in the area they’re reviewing. Previously, NeurIPS ran a study showing that when the same papers were given to two separate ACs, 57% of the papers that one AC accepted were rejected by the other. See http://blog.mrtz.org/2014/12/15/the-nips-experiment.html for more reading.
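For intuition about how reviewer noise produces this kind of disagreement, here is a toy Python simulation. It is not a reproduction of the NeurIPS experiment: the Gaussian "true quality" model, the noise level, the 25% acceptance rate, and the function name `simulate_disagreement` are all assumptions chosen for illustration.

```python
import random


def simulate_disagreement(n_papers=1000, accept_rate=0.25, noise=1.0, seed=0):
    """Fraction of papers accepted by committee A that committee B rejects.

    Assumed toy model: each paper has a latent quality ~ N(0, 1), and each
    committee sees that quality plus independent N(0, noise) reviewer noise.
    """
    rng = random.Random(seed)
    quality = [rng.gauss(0.0, 1.0) for _ in range(n_papers)]
    n_accept = int(n_papers * accept_rate)

    def accepted_by_committee():
        # Each call draws fresh noise, i.e. an independent committee.
        scores = [q + rng.gauss(0.0, noise) for q in quality]
        ranked = sorted(range(n_papers), key=lambda i: scores[i], reverse=True)
        return set(ranked[:n_accept])

    a = accepted_by_committee()
    b = accepted_by_committee()
    return len(a - b) / len(a)


if __name__ == "__main__":
    for noise in (0.0, 0.5, 1.0, 2.0):
        print(noise, simulate_disagreement(noise=noise))
```

With `noise=0.0` both committees rank papers identically, so the disagreement is zero; raising `noise` lets you see how quickly two committees diverge under these assumptions.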

submitted by /u/programmerChilli

[D] OpenAI releases GPT-2 1.5B model despite “extremist groups can use GPT-2 for misuse” but “no strong evidence of misuse so far”.

The findings:

  1. Humans find GPT-2 outputs convincing
  2. GPT-2 can be fine-tuned for misuse
  3. Detection is challenging
  4. We’ve seen no strong evidence of misuse so far
  5. We need standards for studying bias

They are going against their own word, but nevertheless, it’s nice to see that they are releasing everything.

Read the full blog post here: https://openai.com/blog/gpt-2-1-5b-release/

GitHub Model: https://github.com/openai/gpt-2

GitHub Dataset: https://github.com/openai/gpt-2-output-dataset

submitted by /u/permalip

[D] Looking for university course selection advice

Looking for course selection advice for my final two semesters of school as a CS major.

So far I’ve only completed a course on Machine Learning (similar level to Stanford’s CS229) and the rest have been CS courses not related to ML. Aside from discrete maths, I haven’t taken any additional math courses; everything else has been self-taught (probability, linear algebra, calculus).

My goal is to graduate with a deep understanding of how ML algorithms work at the mathematical level and to be able to understand most of the maths in ML papers. I’m not looking to do a PhD per se, but I’d like to be more of an “academic ML engineer”. My particular interests are ML and NLP applied in healthcare.

With all that said, which courses should I pick over the next two semesters to optimize my goals? Keep in mind that I’ll be doing applied ML research under a professor in both semesters as well (likely to do with analyzing text in the healthcare setting). In an ideal world I’d take all these courses because they all seem super interesting, but with limited time, I’d rather pick the ones which will give me a solid foundation so I can self-learn the others later on.

Spring 2020 (Pick two):

Computational Linguistics (NLP)

Deep Learning for Data Science

Modern Convex Optimization

Bayesian Statistics

Fall 2020 (Pick two):

Elements of Probability Theory and Random Processes

Mathematical Statistics

Introduction to Optimization Theory

Advanced Analysis

Thanks!

submitted by /u/drhectapus

[D] Location of ranked list of architectures that perform well on ImageNet?

At some point, I swear I came across a ranked table of architectures that perform well on ImageNet, with one of the columns giving the size of the architecture. I have no idea where it went, but it was an incredibly useful resource for arguing for the inclusion of certain smaller networks in project discussions. I distinctly remember it being on Wikipedia, but I searched through the pages for ImageNet and CNN and none of them seem to include this table anymore.

submitted by /u/eddiemcenrue