Category: Reddit MachineLearning

[Research] What is the State of AutoML in 2019?

Written on August 12, 2019. Posted in Reddit MachineLearning.

https://medium.com/ai%C2%B3-theory-practice-business/what-is-the-state-of-automl-in-2019-64167f581dd1

Abstract—Deep learning has penetrated all aspects of our lives and brought us great convenience. However, the process of building a high-quality deep learning system for a specific task is not only time-consuming but also requires lots of resources and relies on human expertise, which hinders the development of deep learning in both industry and academia. To alleviate this problem, a growing number of research projects focus on automated machine learning (AutoML). In this paper, we provide a comprehensive and up-to-date study on the state-of-the-art AutoML. First, we introduce the AutoML techniques in detail according to the machine learning pipeline. Then we summarize existing Neural Architecture Search (NAS) research, which is one of the most popular topics in AutoML. We also compare the models generated by NAS algorithms with human-designed models. Finally, we present several open problems for future research.

submitted by /u/cdossman

[D] For samples of subsets, features are predictive in gradient boost algorithm.

Written on August 12, 2019. Posted in Reddit MachineLearning.

I’m doing a project with my professor. We are analyzing the impact of each feature on model performance, mostly for gradient boosting. My professor told me that for samples of subsets, features are predictive, that this is a problem, and that we can observe it by analyzing the shallow trees created during the process.

What does “samples of subsets features are predictive” mean? I have been searching the internet but couldn’t find anything. Any ideas?

submitted by /u/DoIHAVeaNIdenTItY

[R] A 2019 Guide to Deep Learning-Based Image Compression

Written on August 12, 2019. Posted in Reddit MachineLearning.

Today I look at how deep learning can be used in image compression. The article draws on a comprehensive analysis of several research papers.

https://heartbeat.fritz.ai/a-2019-guide-to-deep-learning-based-image-compression-2f5253b4d811

submitted by /u/mwitiderrick

[R] Video Analysis: Gauge Equivariant Convolutional Networks and the Icosahedral CNN

Written on August 12, 2019. Posted in Reddit MachineLearning.

Ever wanted to do a convolution on a Klein Bottle? This paper defines CNNs over manifolds such that they are independent of which coordinate frame you choose. Amazingly, this then results in an efficient, practical method that achieves state-of-the-art results on several tasks!

https://youtu.be/wZWn7Hm8osA

Abstract: The principle of equivariance to symmetry transformations enables a theoretically grounded approach to neural network architecture design. Equivariant networks have shown excellent performance and data efficiency on vision and medical imaging problems that exhibit symmetries. Here we show how this principle can be extended beyond global symmetries to local gauge transformations. This enables the development of a very general class of convolutional neural networks on manifolds that depend only on the intrinsic geometry, and which includes many popular methods from equivariant and geometric deep learning. We implement gauge equivariant CNNs for signals defined on the surface of the icosahedron, which provides a reasonable approximation of the sphere. By choosing to work with this very regular manifold, we are able to implement the gauge equivariant convolution using a single conv2d call, making it a highly scalable and practical alternative to Spherical CNNs. Using this method, we demonstrate substantial improvements over previous methods on the task of segmenting omnidirectional images and global climate patterns.

Authors: Taco S. Cohen, Maurice Weiler, Berkay Kicanaoglu, Max Welling

Paper: https://arxiv.org/abs/1902.04615

submitted by /u/ykilcher

[R] Building a Better CartPole – DM’s new RL benchmarking suite

Written on August 12, 2019. Posted in Reddit MachineLearning.

Behaviour Suite for Reinforcement Learning

This paper introduces the Behaviour Suite for Reinforcement Learning, or bsuite for short. bsuite is a collection of carefully-designed experiments that investigate core capabilities of reinforcement learning (RL) agents with two objectives. First, to collect clear, informative and scalable problems that capture key issues in the design of general and efficient learning algorithms. Second, to study agent behaviour through their performance on these shared benchmarks. To complement this effort, we open source this http URL, which automates evaluation and analysis of any agent on bsuite. This library facilitates reproducible and accessible research on the core issues in RL, and ultimately the design of superior learning algorithms. Our code is Python, and easy to use within existing projects. We include examples with OpenAI Baselines, Dopamine as well as new reference implementations. Going forward, we hope to incorporate more excellent experiments from the research community, and commit to a periodic review of bsuite from a committee of prominent researchers.

This is a great paper. While the authors focus on comparing different agents, additional value is going to be in debugging algorithm variants. Many researchers already have their own zoos of duct-taped diagnostic envs to try and localise errors, but the community’s been lacking anything ready-made and well-tested.

What is a little disappointing is that they don’t carry this paper through to ‘here we evaluated 17 different agents and this is the best one’, though presumably other contributors will fix that in short order.

submitted by /u/andyljones

[D] How to manage a small Machine Learning lab?

Written on August 12, 2019. Posted in Reddit MachineLearning.

I am building a small machine learning lab at my company (I am a founder). What are the differences between managing an ML team and a dev team? Any pointers to articles, books, etc.?

submitted by /u/_edmar

[D] Interviewing as researcher with big tech companies

Written on August 12, 2019. Posted in Reddit MachineLearning.

Hi all,

I have a couple of interviews coming up for research scientist roles with FAANG-like companies and since it’s my first time going through this process (finishing grad school, never applied to industry positions before) I was wondering how the interviews are conducted and what they look at the most. I can find tons of information on the recruitment process for SWE at big N companies, and how you have to grind leetcode for months to pass the coding bar, but not much info on the more research oriented roles. Do they have the same bar for coding? Do they put more emphasis on your performance during the ML and design interviews? What “signal” do they look for, presumably outside of your research portfolio, which they’d know already by the time you come for an onsite?

Finally, in case anyone here works at the research groups in major tech companies, what has the interview:offer ratio been in your experience? Is it as bad as for developer roles (I read that for Google it is about 1:7)? I would imagine that for research roles it is easier to judge fit from data outside of algo & DS / design interview performance, such as research interests and publication record, but maybe I am naive.

Sorry for the many questions, just a scared grad student trying to understand what his chances are 😀

Thanks.

submitted by /u/urhen1512

[D] What if AlphaZero search was restricted to human level?

Written on August 12, 2019. Posted in Reddit MachineLearning.

In the AlphaZero blog post, there is a graphic comparing the amount of search per decision between human grandmasters, AlphaZero and traditional chess engines. AlphaZero is in the middle. At first glance, the implication seems to be that humans are still better at “intuitive” play than AlphaZero. But then again, AlphaZero is significantly stronger than any human player, and so I wonder how well AlphaZero would perform if restricted to the same amount of search as a human player.

There is a plot in the AlphaGo Zero Nature paper showing that the raw network on Go with no search has an Elo of 3000, almost as high as AlphaGo Fan. That seems to indicate that AlphaZero with the same amount of search as a human player might already be superhuman, especially since I suspect that there might be diminishing returns to search.

Am I missing anything? Might AlphaZero have beaten humans at “intuitive” chess as well?

submitted by /u/samuelknoche

[D] How to load subset of large Oracle table into Dask dataframe?

Written on August 12, 2019. Posted in Reddit MachineLearning.

Here’s what I tried:

dask_rf = dd.from_pandas(pd.read_sql('select ...', conn_cx_Oracle), npartitions=10)

This gives me a ‘large object’ warning and recommends using client.scatter. The problem is that client.scatter appears to require the data to be loaded into a Pandas dataframe first, which defeats the purpose: I’m using Dask precisely because of RAM limitations.

The Oracle table is too large to read using Dask’s read_sql_table because read_sql_table does not filter the table in any way.

Ideas? Is Dask not applicable to my use case?
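One direction that might fit, sketched under stated assumptions: push the filter into the SQL itself and split the filtered query into key ranges, so that no single call ever holds the whole result. The sketch below uses the standard-library sqlite3 module as a stand-in for the Oracle connection; the table name readings, its columns, the filter value > 0, and the helpers partition_bounds and load_partition are all hypothetical. With Dask, each per-range load could presumably be wrapped in dask.delayed (returning a Pandas dataframe) and combined with dd.from_delayed, but that part is not shown or tested here.

```python
import sqlite3


def partition_bounds(lo, hi, n_parts):
    """Split the inclusive key range [lo, hi] into contiguous sub-ranges."""
    step, extra = divmod(hi - lo + 1, n_parts)
    bounds, start = [], lo
    for i in range(n_parts):
        size = step + (1 if i < extra else 0)
        if size == 0:
            continue  # more partitions requested than keys available
        bounds.append((start, start + size - 1))
        start += size
    return bounds


def load_partition(conn, lo, hi):
    """Fetch only the filtered rows whose key falls in [lo, hi]."""
    query = (
        "SELECT id, value FROM readings "
        "WHERE value > ? AND id BETWEEN ? AND ?"
    )
    return conn.execute(query, (0, lo, hi)).fetchall()


# Stand-in database: sqlite3 in memory instead of Oracle (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (id INTEGER PRIMARY KEY, value REAL)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?)",
    [(i, float(i % 5 - 1)) for i in range(1, 101)],
)

bounds = partition_bounds(1, 100, 10)
# The generator pulls one partition at a time rather than the whole result.
chunks = (load_partition(conn, lo, hi) for lo, hi in bounds)
row_count = sum(len(chunk) for chunk in chunks)
```

The key column used for the ranges should be indexed and roughly evenly distributed; otherwise some partitions may end up much larger than others.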

submitted by /u/Professionalsimracin

[P] NLP Problem

Written on August 12, 2019. Posted in Reddit MachineLearning.

Hi all, I am a beginner-stage ML practitioner and need help.

I have a large dataset of free-text work orders for a power plant. I need to identify the cause, symptom and damage from this work order data. I do have the lists of causes, symptoms and damage to be used to classify each work order. The issue is that I do not have a labelled dataset, so I have to take an unsupervised approach, but I am not getting useful results.

Could you suggest approaches I can work on to get better results?

Thanks.

submitted by /u/raesharma
