Author: torontoai

AWS announces the Machine Learning Embark program to help customers train their workforce in machine learning

Written on December 4, 2019. Posted in Amazon.

Today at AWS re:Invent 2019, I’m excited to announce the AWS Machine Learning (ML) Embark program to help companies transform their development teams into machine learning practitioners. AWS ML Embark is based on Amazon’s own experience scaling the use of machine learning inside its own operations as well as the lessons learned through thousands of successful customer implementations. Elements of the program include guided instruction from AWS machine learning experts, a discovery workshop, hand-selected curriculum from the Machine Learning University, an AWS DeepRacer event, and co-development of a machine learning proof of concept at the culmination of the program.

Customers I talk to are eager to get started implementing machine learning in their organizations, but it can be difficult to know where to begin. And, once started, it can be challenging to gain meaningful adoption across the organization. More often, customers are not asking “why” machine learning, but “how.” It’s a cultural shift as much as a technical one. Success involves inspiring and motivating teams to get interested in machine learning, identifying the most impactful projects to tackle, and developing a workforce with the right skills. And, teams new to machine learning need guidance and expertise from more seasoned data scientists who are in short supply. As a result, organizations can often feel like turning the corner on machine learning adoption happens at a glacial pace.

The AWS ML Embark program is designed to help these customers overcome some common challenges in the machine learning journey. To kick off the program, participants will pair their business and technical staff with AWS machine learning experts to join a discovery day workshop to identify a business problem well suited for machine learning. Through this exercise, AWS machine learning experts will help the group work backwards from a problem and align on where machine learning can have meaningful impact.

Next, this cross-functional group will participate in instructor-led, on-site trainings with curriculum modeled after Amazon’s Machine Learning University, which has been refined over the last several years to help Amazon’s own developers become proficient in machine learning. Participants will benefit from hand-selected coursework focused on practical application relevant to their business use cases. At the completion of the training, the AWS ML Embark program offers the option to continue education online and take the AWS Certified Machine Learning – Specialty certification exam to validate their skills.

AWS ML Embark also includes a corporate AWS DeepRacer event to expose a broader group of employees to machine learning with friendly competition and hands-on experience through racing fully autonomous 1/18th scale race cars using reinforcement learning.

Finally, experts from the Amazon ML Solutions Lab mentor participants through the ideation, development, and launch of a proof of concept based on a use case identified in the discovery day workshop. Through the process, the team will gain insight into best practices, ways to avoid costly mistakes, and knowledge based on the overall experience of working with experts who have completed hundreds of machine learning implementations.

At the conclusion of the program, a customer is well prepared to begin scaling newly obtained machine learning capabilities throughout their organization to take on additional machine learning projects and solve new challenges across their business. We’re excited to help customers begin their machine learning journey and can’t wait to see what they’ll do after graduation. Nominations for the program are now being accepted.

About the Author

Michelle Lee is vice president of the Machine Learning Solutions Lab at AWS.

[D] Efficient Partial Dependence Plots with decision trees

Written on December 4, 2019. Posted in Reddit MachineLearning.

Partial Dependence Plots (PDPs) are a standard model inspection technique. It turns out that for decision trees, they can be computed very efficiently. This post explains how PDPs are computed in general, and goes into the details of the optimized version for tree models.

http://nicolas-hug.com/blog/pdps
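
The general (brute-force) computation mentioned above can be sketched in a few lines: for each grid value, overwrite the feature of interest in every sample, predict, and average. This is only an illustrative sketch, not the tree-specific optimization the linked post describes. `toy_predict` is a hypothetical stand-in for any fitted model's prediction function, and the data is synthetic.

```python
import numpy as np

def partial_dependence(predict, X, feature, grid):
    """Brute-force PDP: average prediction with one feature forced to each grid value."""
    X = np.asarray(X, dtype=float)
    averages = []
    for value in grid:
        X_mod = X.copy()
        X_mod[:, feature] = value  # overwrite the feature for every row
        averages.append(predict(X_mod).mean())
    return np.array(averages)

# Hypothetical stand-in for a fitted model: f(x) = 2 * x0 + x1
def toy_predict(X):
    return 2.0 * X[:, 0] + X[:, 1]

rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 2))        # synthetic data, for illustration only
grid_demo = np.linspace(-1.0, 1.0, 5)
pdp_demo = partial_dependence(toy_predict, X_demo, feature=0, grid=grid_demo)
```

The cost is one full pass over the data per grid value, which is the expense a tree-specific method would aim to avoid.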

submitted by /u/Niourf

[D] Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead

Written on December 4, 2019. Posted in Reddit MachineLearning.

A recent paper by Cynthia Rudin argues that we should “Stop Explaining Black Box Machine Learning Models for High Stakes Decisions and Use Interpretable Models Instead”: https://arxiv.org/abs/1811.10154

A summary of the paper can be found here: https://www.kdnuggets.com/2019/11/stop-explaining-black-box-models.html

Thoughts?

submitted by /u/selib

[D] Validating regression models on edge cases?

Written on December 4, 2019. Posted in Reddit MachineLearning.

I’m trying to predict USED car prices, given some x number of parameters.

The R² is > 0.98 on the testing data, but it misses predictions on new data with edge cases by (what I think of as) too much.

Beyond the evaluation metric, how can we validate that a result is good enough, even for an edge case?

Currently, I’m thinking about fitting a linear regression model on a range of different ages and kilometers to predict price. This would give me a reference model against which I could check my edge-case predictions and compare them to a more average case.

I’m really just seeking advice on what to do here. Is the approach good enough? What other approaches are there for validating or sanity-checking whether each individual prediction is good enough?
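
One possible sanity check, sketched here under stated assumptions, is to report the error per slice of a feature (for example car age) next to the global R², since a high overall R² can hide large errors on rare cases. The data below is synthetic and the names `r2_score` and `slice_mae` are illustrative helpers, not part of any particular library.

```python
import numpy as np

def r2_score(y_true, y_pred):
    """Coefficient of determination computed over the whole set."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def slice_mae(y_true, y_pred, feature, edges):
    """Mean absolute error per feature bin; empty bins are skipped."""
    bins = np.digitize(feature, edges[1:-1])
    report = {}
    for b in range(len(edges) - 1):
        mask = bins == b
        if mask.any():
            mae = float(np.abs(y_true[mask] - y_pred[mask]).mean())
            report[(edges[b], edges[b + 1])] = (int(mask.sum()), mae)
    return report

# Synthetic example: 980 typical cars and 20 very old ones (the "edge cases").
rng = np.random.default_rng(0)
age = np.concatenate([rng.uniform(1, 10, 980), rng.uniform(25, 30, 20)])
price = 30000.0 * 0.9 ** age
pred = price + rng.normal(0.0, 100.0, age.size)  # small error on every sample
pred[980:] = price[980:] + 3000.0                # large error on old cars only

overall_r2 = r2_score(price, pred)
report = slice_mae(price, pred, age, edges=[0, 10, 20, 31])
```

In this toy setup the overall R² stays above 0.98 while the error in the oldest age bin is far larger than in the common bin, which is the kind of gap a single global metric does not show.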

submitted by /u/permalip

[Research] StarGAN v2: Diverse Image Synthesis for Multiple Domains

Written on December 4, 2019. Posted in Reddit MachineLearning.

Since the previous post was removed by its owner and I found it interesting, I’m reposting it.

Paper: https://arxiv.org/pdf/1912.01865v1.pdf

submitted by /u/albrinbor

[P] SparkTorch: Distributed training of PyTorch networks on Apache Spark with ML Pipeline support

Written on December 4, 2019. Posted in Reddit MachineLearning.

SparkTorch is a project that I have wanted to do for a while, and after PyTorch released a variety of great updates to the distributed package, I decided to build a package that could easily orchestrate training on Apache Spark. The goal was to be able to easily integrate the training of PyTorch models into the Spark ML Pipeline. This was done by creating a custom estimator that can be saved and loaded for inference (or even additional training). Right now, there are two modes of training: distributed synchronous and Hogwild!. I will be continuing work on the project and would definitely enjoy collaboration.

submitted by /u/lodev12

[R][P] StarGAN v2: Diverse Image Synthesis for Multiple Domains

Written on December 4, 2019. Posted in Reddit MachineLearning.

Figure: Diverse image synthesis results on the CelebA-HQ dataset and our newly collected animal faces (AFHQ) dataset. The first column shows input images while the remaining columns are images synthesized by StarGAN v2.

Figure: StarGAN v2 can transform a source image into an output image reflecting the style (e.g., hairstyle and makeup) of a given reference image. Additional high-quality videos can be found at the link below.

arXiv: https://arxiv.org/abs/1912.01865

github: https://github.com/clovaai/stargan-v2

video: shorturl.at/eACS9

Abstract

A good image-to-image translation model should learn a mapping between different visual domains while satisfying the following properties: 1) diversity of generated images and 2) scalability over multiple domains. Existing methods address either of the issues, having limited diversity or multiple models for all domains. We propose StarGAN v2, a single framework that tackles both and shows significantly improved results over the baselines. Experiments on CelebA-HQ and a new animal faces dataset (AFHQ) validate our superiority in terms of visual quality, diversity, and scalability. To better assess image-to-image translation models, we release AFHQ, high-quality animal faces with large inter- and intra-domain variations. The code, pretrained models, and dataset will be released for reproducibility.

submitted by /u/yunjey

[D] How to cluster data with features as mean μ_i(x) and σ_i(x) vectors?

Written on December 4, 2019. Posted in Reddit MachineLearning.

A variational autoencoder generates μ_i(x) and σ_i(x) from its encoder. I have an image dataset that I would like to encode and then cluster using these generated vectors. Is this possible? How do I do this?
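
One simple option, shown here only as a sketch, is to treat the encoder outputs as ordinary feature vectors, for example by concatenating μ with log σ, and to run a standard clustering algorithm such as k-means on them. The arrays `mu` and `sigma` below are synthetic stand-ins for real encoder outputs of shape (n_samples, latent_dim), and the small `kmeans` helper is an illustrative NumPy implementation, not taken from any library.

```python
import numpy as np

def kmeans(points, k, n_iter=50):
    """Plain Lloyd's k-means with deterministic farthest-point initialisation."""
    centroids = [points[0]]
    for _ in range(1, k):
        # Distance of every point to its nearest already-chosen centroid.
        d = np.min([np.linalg.norm(points - c, axis=1) for c in centroids], axis=0)
        centroids.append(points[np.argmax(d)])
    centroids = np.array(centroids, dtype=float)

    for _ in range(n_iter):
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # leave a centroid in place if its cluster is empty
                centroids[j] = points[labels == j].mean(axis=0)

    # Final assignment against the last centroid positions.
    dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1), centroids

# Synthetic stand-ins for encoder outputs: two groups of 50 images, latent_dim = 4.
rng = np.random.default_rng(0)
mu = np.concatenate([rng.normal(-3.0, 0.5, (50, 4)), rng.normal(3.0, 0.5, (50, 4))])
sigma = rng.uniform(0.1, 0.5, (100, 4))

features = np.hstack([mu, np.log(sigma)])  # one feature vector per image
labels, centroids = kmeans(features, k=2)
```

Clustering on μ alone is another reasonable choice; whether σ adds useful signal depends on the trained model, so both variants are worth comparing.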

submitted by /u/sarmientoj24

Data-Driven Deep Reinforcement Learning

Written on December 4, 2019. Posted in Uncategorized.

One of the primary factors behind the success of machine learning approaches in open world settings, such as image recognition and natural language processing, has been the ability of high-capacity deep neural network function approximators to learn generalizable models from large amounts of data. Deep reinforcement learning methods, however, require active online data collection, where the model actively interacts with its environment. This makes such methods hard to scale to complex real-world problems, where active data collection means that large datasets of experience must be collected for every experiment; this can be expensive and, for systems such as autonomous vehicles or robots, potentially unsafe. In a number of domains of practical interest, such as autonomous driving, robotics, and games, there exist plentiful amounts of previously collected interaction data, which consists of informative behaviours that are a rich source of prior information. Deep RL algorithms that can utilize such prior datasets will not only scale to real-world problems, but will also lead to solutions that generalize substantially better. A data-driven paradigm for reinforcement learning will enable us to pre-train and deploy agents capable of sample-efficient learning in the real world.

In this work, we ask the following question: Can deep RL algorithms effectively leverage previously collected offline data and learn without interaction with the environment? We refer to this problem statement as fully off-policy RL, previously also called batch RL in the literature. A class of deep RL algorithms, known as off-policy RL algorithms, can, in principle, learn from previously collected data. Recent off-policy RL algorithms such as Soft Actor-Critic (SAC), QT-Opt, and Rainbow have demonstrated sample-efficient performance in a number of challenging domains such as robotic manipulation and Atari games. However, all of these methods still require online data collection, and their ability to learn from fully off-policy data is limited in practice. In this work, we show why existing deep RL algorithms can fail in the fully off-policy setting. We then propose effective solutions to mitigate these issues.
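
To make the fully off-policy setting concrete, here is a minimal tabular sketch: Q-learning updates are applied only to a fixed dataset of transitions, with no further interaction with the environment. This is not the deep RL method discussed in the post. The four-state chain environment, the dataset, and all hyperparameters are assumptions chosen for illustration, and because the dataset covers every state-action pair, the example does not reproduce the failure modes the post refers to.

```python
import numpy as np

N_STATES, N_ACTIONS = 4, 2   # actions: 0 = move left, 1 = move right
GOAL = N_STATES - 1          # reaching the last state ends the episode
GAMMA, LR = 0.9, 0.5

# Fixed offline dataset of (state, action, reward, next_state, done) tuples.
dataset = []
for s in range(GOAL):
    for a in range(N_ACTIONS):
        s_next = max(s - 1, 0) if a == 0 else s + 1
        done = s_next == GOAL
        reward = 1.0 if done else 0.0
        dataset.append((s, a, reward, s_next, done))

# Batch Q-learning: sweep repeatedly over the stored transitions only.
Q = np.zeros((N_STATES, N_ACTIONS))
for _ in range(200):
    for s, a, reward, s_next, done in dataset:
        target = reward if done else reward + GAMMA * Q[s_next].max()
        Q[s, a] += LR * (target - Q[s, a])

greedy_policy = Q.argmax(axis=1)  # greedy action per state (goal row is unused)
```

With full coverage of states and actions, the learned greedy policy moves right in every non-terminal state. The harder case is when the dataset covers only part of the state-action space.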

[D] Preferred Networks (creators of Chainer) migrating its research platform to PyTorch from Chainer

Written on December 4, 2019. Posted in Reddit MachineLearning.

Press Release: https://preferred.jp/en/news/pr20191205/

Preferred Networks Migrates its Deep Learning Research Platform to PyTorch

PFN to work with PyTorch and the open-source community to develop the framework and advance MN-Core processor support.

Preferred Networks, Inc. (PFN, Head Office: Tokyo, President & CEO: Toru Nishikawa) today announced plans to incrementally transition its deep learning framework (a fundamental technology in research and development) from PFN’s Chainer™ to PyTorch. Concurrently, PFN will collaborate with Facebook and the other contributors of the PyTorch community to actively participate in the development of PyTorch. With the latest major upgrade v7 released today, Chainer will move into a maintenance phase. PFN will provide documentation and a library to facilitate the migration to PyTorch for Chainer users.

PFN President and CEO Toru Nishikawa made the following comments on this business decision.

“Since the start of deep learning frameworks, Chainer has been PFN’s fundamental technology to support our joint research with Toyota, FANUC, and many other partners. Chainer provided PFN with opportunities to collaborate with major global companies, such as NVIDIA and Microsoft. Migrating to PyTorch from Chainer, which was developed with tremendous support from our partners, the community, and users, is an important decision for PFN. However, we firmly believe that by participating in the development of one of the most actively developed frameworks, PFN can further accelerate the implementation of deep learning technologies, while leveraging the technologies developed in Chainer and searching for new areas that can become a source of competitive advantage.”

Rest of article…

submitted by /u/hardmaru
