
Machine teaching: How people’s expertise makes AI even more powerful

Most people wouldn’t think to teach five-year-olds how to hit a baseball by handing them a bat and ball, telling them to toss the objects into the air in a zillion different combinations and hoping they figure out how the two things connect.

And yet, this is in some ways how we approach machine learning today — by showing machines a lot of data and expecting them to learn associations or find patterns on their own.

For many of the most common applications of AI technologies today, such as simple text or image recognition, this works extremely well.

But as the desire to use AI for more scenarios has grown, Microsoft scientists and product developers have pioneered a complementary approach called machine teaching. This relies on people’s expertise to break a problem into easier tasks and give machine learning models important clues about how to find a solution faster. It’s like teaching a child to hit a home run by first putting the ball on the tee, then tossing an underhand pitch and eventually moving on to fastballs.

“This feels very natural and intuitive when we talk about this in human terms but when we switch to machine learning, everybody’s mindset, whether they realize it or not, is ‘let’s just throw fastballs at the system,’” said Mark Hammond, Microsoft general manager for Business AI. “Machine teaching is a set of tools that helps you stop doing that.”

Machine teaching seeks to gain knowledge from people rather than extracting knowledge from data alone. A person who understands the task at hand — whether how to decide which department in a company should receive an incoming email or how to automatically position wind turbines to generate more energy — would first decompose that problem into smaller parts. Then they would provide a limited number of examples, or the equivalent of lesson plans, to help the machine learning algorithms solve it.

In supervised learning scenarios, machine teaching is particularly useful when little or no labeled training data exists for the machine learning algorithms because an industry or company’s needs are so specific.



In difficult and ambiguous reinforcement learning scenarios — where an algorithm has trouble figuring out which of millions of possible actions it should take to master tasks in the physical world — machine teaching can dramatically shortcut the time it takes an intelligent agent to find the solution.

It’s also part of a larger goal to enable a broader swath of people to use AI in more sophisticated ways. Machine teaching allows developers or subject matter experts with little AI expertise, such as lawyers, accountants, engineers, nurses or forklift operators, to impart important abstract concepts to an intelligent system, which then performs the machine learning mechanics in the background.

Microsoft researchers began exploring machine teaching principles nearly a decade ago, and those concepts are now working their way into products that help companies build everything from intelligent customer service bots to autonomous systems.

“Even the smartest AI will struggle by itself to learn how to do some of the deeply complex tasks that are common in the real world. So you need an approach like this, with people guiding AI systems to learn the things that we already know,” said Gurdeep Pall, Microsoft corporate vice president for Business AI. “Taking this turnkey AI and having non-experts use it to do much more complex tasks is really the sweet spot for machine teaching.”

Today, if we were trying to teach a machine learning algorithm what a table is, we could easily find a dataset with pictures of tables, chairs and lamps that have been meticulously labeled. After exposing the algorithm to countless labeled examples, it learns to recognize a table’s characteristics.

But if you had to teach a person how to recognize a table, you’d probably start by explaining that it has four legs and a flat top. If you saw the person also putting chairs in that category, you’d further explain that a chair has a back and a table doesn’t. These abstractions and feedback loops are key to how people learn, and they can also augment traditional approaches to machine learning.

“If you can teach something to another person, you should be able to teach it to a machine using language that is very close to how humans learn,” said Patrice Simard, a Microsoft distinguished engineer who pioneered the company’s machine teaching work at Microsoft Research. This month, his team moves to the Experiences and Devices group to continue this work and further integrate machine teaching with conversational AI offerings.

Microsoft researchers Patrice Simard, Alicia Edelman Pelton and Riham Mansour (left to right) are working to infuse machine teaching into Microsoft products. Photo by Dan DeLong for Microsoft.

Millions of potential AI users

Simard first started thinking about a new paradigm for building AI systems when he noticed that nearly all the papers at machine learning conferences focused on improving the performance of algorithms on carefully curated benchmarks. But in the real world, he realized, teaching is an equally or arguably more important component of learning, especially for simple tasks where limited data is available.

If you wanted to teach an AI system how to pick the best car but only had a few examples that were labeled “good” and “bad,” it might infer from that limited information that a defining characteristic of a good car is that the fourth number of its license plate is a “2.” But pointing the AI system to the same characteristics that you would tell your teenager to consider — gas mileage, safety ratings, crash test results, price — enables the algorithms to recognize good and bad cars correctly, despite the limited availability of labeled examples.
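The contrast can be made concrete in a few lines of Python. This is only an illustrative sketch, not Microsoft’s implementation: the cars, plates and thresholds below are invented. A rule keyed to an arbitrary attribute fits a tiny training set yet fails on a new car, while a rule built on teacher-chosen features generalizes.

```python
# Hypothetical tiny dataset: (license_plate, mpg, safety_stars, label).
cars = [
    ("AB12XY", 34, 5, "good"),
    ("CD32QZ", 31, 4, "good"),
    ("EF75LM", 14, 2, "bad"),
    ("GH93PQ", 12, 3, "bad"),
]

def spurious_rule(car):
    # A pattern that happens to hold on this training set:
    # every "good" car's 4th plate character is "2".
    return "good" if car[0][3] == "2" else "bad"

def taught_rule(car):
    # Teacher-selected features: decent mileage AND a good safety rating.
    _, mpg, stars, _ = car
    return "good" if mpg >= 25 and stars >= 4 else "bad"

unseen = ("JK42RS", 13, 2, "bad")   # 4th plate char is "2", but it's a bad car
print(spurious_rule(unseen))  # "good" -- the spurious rule misclassifies
print(taught_rule(unseen))    # "bad"  -- the taught features classify correctly
```

Both rules are perfect on the four training cars; only the taught one survives contact with new data.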

In supervised learning scenarios, machine teaching improves models by identifying these high-level meaningful features. As in programming, the art of machine teaching also involves the decomposition of tasks into simpler tasks. If the necessary features do not exist, they can be created using sub-models that use lower level features and are simple enough to be learned from a few examples. If the system consistently makes the same mistake, errors can be eliminated by adding features or examples.

One of the first Microsoft products to employ machine teaching concepts is Language Understanding, a tool in Azure Cognitive Services that identifies intent and key concepts from short text. It’s been used by companies ranging from UPS and Progressive Insurance to Telefonica to develop intelligent customer service bots.

“To know whether a customer has a question about billing or a service plan, you don’t have to give us every example of the question. You can provide four or five, along with the features and the keywords that are important in that domain, and Language Understanding takes care of the machinery in the background,” said Riham Mansour, principal software engineering manager responsible for Language Understanding.
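That workflow can be sketched in toy form. This is not the Language Understanding API; the intents, keywords and scoring below are invented, with a trivial keyword-overlap score standing in for the machinery the service runs in the background.

```python
# A teacher supplies a handful of domain keywords per intent instead of
# thousands of labeled examples (keywords are hypothetical).
KEYWORDS = {
    "billing": {"invoice", "charge", "bill", "payment", "refund"},
    "service_plan": {"plan", "upgrade", "data", "minutes", "contract"},
}

def classify(utterance):
    words = set(utterance.lower().split())
    # score each intent by how many of its taught keywords appear
    scores = {intent: len(words & kw) for intent, kw in KEYWORDS.items()}
    return max(scores, key=scores.get)

print(classify("I was double charged on my last bill"))  # billing
print(classify("can I upgrade my data plan"))            # service_plan
```

A real system would learn from the examples too; the point is that a few expert-supplied features can carry most of the signal in a narrow domain.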

Microsoft researchers are exploring how to apply machine teaching concepts to more complicated problems, like classifying longer documents, email and even images. They’re also working to make the teaching process more intuitive, such as suggesting to users which features might be important to solving the task.

Imagine a company wants to use AI to scan through all its documents and emails from the last year to find out how many quotes were sent out and how many of those resulted in a sale, said Alicia Edelman Pelton, principal program manager for the Microsoft Machine Teaching Group.

As a first step, the system has to know how to distinguish a quote from a contract or an invoice. Oftentimes, no labeled training data exists for that kind of task, particularly if each salesperson in the company handles it a little differently.

If the system were using traditional machine learning techniques, the company would need to outsource that process, sending thousands of sample documents and detailed instructions so an army of people could attempt to label them correctly — a process that can take months of back and forth to eliminate errors and find all the relevant examples. The company would also need a machine learning expert, who will be in high demand, to build the machine learning model. And if new salespeople started using formats the system wasn’t trained on, the model would get confused and stop working well.

By contrast, Pelton said, Microsoft’s machine teaching approach would use a person inside the company to identify the defining features and structures commonly found in a quote: something sent from a salesperson, an external customer’s name, words like “quotation” or “delivery date,” “product,” “quantity,” or “payment terms.”

It would translate that person’s expertise into language that a machine can understand and use a machine learning algorithm that’s been preselected to perform that task. That can help customers build customized AI solutions in a fraction of the time using the expertise that already exists within their organization, Pelton said.

Pelton noted that there are countless people in the world “who understand their businesses and can describe the important concepts — a lawyer who says, ‘oh, I know what a contract looks like and I know what a summons looks like and I can give you the clues to tell the difference.’”

Microsoft Corporate Vice President for Business AI Gurdeep Pall talks at a recent conference about autonomous systems solutions that employ machine teaching. Photo by Dan DeLong for Microsoft.

Making hard problems truly solvable

More than a decade ago, Hammond was working as a systems programmer in a Yale neuroscience lab and noticed how scientists used a step-by-step approach to train animals to perform tasks for their studies. He had a similar epiphany about borrowing those lessons to teach machines.

That ultimately led him to found Bonsai, which was acquired by Microsoft last year. It combines machine teaching with deep reinforcement learning and simulation to help companies develop “brains” that run autonomous systems in applications ranging from robotics and manufacturing to energy and building management. The platform uses a programming language called Inkling to help developers and even subject matter experts decompose problems and write AI programs.

Deep reinforcement learning, a branch of AI in which algorithms learn by trial and error based on a system of rewards, has successfully outperformed people in video games. But those models have struggled to master more complicated real-world industrial tasks, Hammond said.

Adding a machine teaching layer — or infusing an organization’s unique subject matter expertise directly into a deep reinforcement learning model — can dramatically reduce the time it takes to find solutions to these deeply complex real-world problems, Hammond said.

For instance, imagine a manufacturing company wants to train an AI agent to autonomously calibrate a critical piece of equipment that can be thrown out of whack as temperature or humidity fluctuates or after it’s been in use for some time. A person would use the Inkling language to create a “lesson plan” that outlines relevant information to perform the task and to monitor whether the system is performing well.

Armed with that information from its machine teaching component, the Bonsai system would select the best reinforcement learning model and create an AI “brain” to reduce expensive downtime by autonomously calibrating the equipment. It would test different actions in a simulated environment and be rewarded or penalized depending on how quickly and precisely it performs the calibration.

Telling that AI brain what’s important to focus on at the outset can short-circuit a lot of fruitless and time-consuming exploration as it tries to learn in simulation what does and doesn’t work, Hammond said.

“The reason machine teaching proves critical is because if you just use reinforcement learning naively and don’t give it any information on how to solve the problem, it’s going to explore randomly and will maybe hopefully — but frequently not ever — hit on a solution that works,” Hammond said. “It makes problems truly solvable whereas without machine teaching they aren’t.”
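A back-of-envelope calculation shows why. Suppose a task requires n correct actions in a row, each chosen from a possibilities, and reward arrives only at the very end. The numbers below are illustrative, not from Bonsai:

```python
# With a sparse reward, a purely random explorer expects on the order of a**n
# episodes before it first stumbles on the full rewarded sequence. A curriculum
# that teaches one step at a time needs only about a tries per step.
def episodes_random(a, n):
    return a ** n     # every length-n action sequence is equally likely at first

def episodes_curriculum(a, n):
    return a * n      # each step is learned against its own "lesson"

print(episodes_random(10, 6))      # 1000000
print(episodes_curriculum(10, 6))  # 60
```

The gap grows exponentially with task depth, which is why decomposition can turn a practically unsolvable exploration problem into a tractable one.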


Jennifer Langston writes about Microsoft research and innovation.

The post Machine teaching: How people’s expertise makes AI even more powerful appeared first on The AI Blog.

SpecAugment: A New Data Augmentation Method for Automatic Speech Recognition

Automatic Speech Recognition (ASR), the process of taking an audio input and transcribing it to text, has benefited greatly from the ongoing development of deep neural networks. As a result, ASR has become ubiquitous in many modern devices and products, such as Google Assistant, Google Home and YouTube. Nevertheless, there remain many important challenges in developing deep learning-based ASR systems. One such challenge is that ASR models, which have many parameters, tend to overfit the training data and have a hard time generalizing to unseen data when the training set is not extensive enough.

In the absence of an adequate volume of training data, it is possible to increase the effective size of existing data through the process of data augmentation, which has contributed to significantly improving the performance of deep networks in the domain of image classification. In the case of speech recognition, augmentation traditionally involves deforming the audio waveform used for training in some fashion (e.g., by speeding it up or slowing it down), or adding background noise. This has the effect of making the dataset effectively larger, as multiple augmented versions of a single input are fed into the network over the course of training, and also helps the network become robust by forcing it to learn relevant features. However, existing conventional methods of augmenting audio input introduce additional computational cost and sometimes require additional data.
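For concreteness, here is a minimal numpy sketch of the conventional waveform-level augmentations just described. It is illustrative only: real pipelines use proper resamplers and calibrated signal-to-noise ratios, and the rates and scales below are invented.

```python
import numpy as np

def speed_change(wave, rate):
    # Naive resampling by linear interpolation; rate > 1 speeds up (fewer samples).
    positions = np.arange(0, len(wave), rate)
    return np.interp(positions, np.arange(len(wave)), wave)

def add_noise(wave, scale=0.05, rng=None):
    # Mix in Gaussian background noise at a fixed (hypothetical) scale.
    rng = np.random.default_rng(0) if rng is None else rng
    return wave + scale * rng.standard_normal(len(wave))

wave = np.sin(np.linspace(0, 100, 16000))  # stand-in for one second of audio
fast = speed_change(wave, 1.1)             # roughly 10% fewer samples
noisy = add_noise(wave)
print(len(wave), len(fast))
```

Each augmented waveform must then be converted into a spectrogram all over again, which is exactly the recomputation cost the spectrogram-level approach avoids.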

In our recent paper, “SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition”, we take a new approach to augmenting audio data, treating it as a visual problem rather than an audio one. Instead of augmenting the input audio waveform as is traditionally done, SpecAugment applies an augmentation policy directly to the audio spectrogram (i.e., an image representation of the waveform). This method is simple, computationally cheap to apply, and does not require additional data. It is also surprisingly effective in improving the performance of ASR networks, demonstrating state-of-the-art performance on the ASR tasks LibriSpeech 960h and Switchboard 300h.

In traditional ASR, the audio waveform is typically encoded as a visual representation, such as a spectrogram, before being input as training data for the network. Augmentation of training data is normally applied to the waveform audio before it is converted into the spectrogram, such that after every iteration, new spectrograms must be generated. In our approach, we instead augment the spectrogram itself rather than the waveform data. Since the augmentation is applied directly to the input features of the network, it can be run online during training without significantly impacting training speed.

A waveform is typically converted into a visual representation (in our case, a log mel spectrogram; steps 1 through 3 of this article) before being fed into a network.

SpecAugment modifies the spectrogram by warping it in the time direction, masking blocks of consecutive frequency channels, and masking blocks of utterances in time. These augmentations were chosen to help the network be robust against deformations in the time direction, partial loss of frequency information, and partial loss of small segments of speech in the input. An example of such an augmentation policy is displayed below.

The log mel spectrogram is augmented by warping in the time direction, and masking (multiple) blocks of consecutive time steps (vertical masks) and mel frequency channels (horizontal masks). The masked portion of the spectrogram is displayed in purple for emphasis.
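The two masking operations are simple enough to sketch in a few lines of numpy. This is an illustrative sketch, not the paper’s exact policy: the mask-size parameters F and T are hypothetical, and time warping is omitted because it requires a sparse image warp.

```python
import numpy as np

def freq_mask(spec, F=8, num_masks=2, rng=np.random):
    """Zero out `num_masks` random bands of consecutive mel channels."""
    spec = spec.copy()
    n_mels = spec.shape[1]
    for _ in range(num_masks):
        f = rng.randint(0, F + 1)            # band width in channels (0..F)
        f0 = rng.randint(0, n_mels - f + 1)  # band start
        spec[:, f0:f0 + f] = 0.0
    return spec

def time_mask(spec, T=10, num_masks=2, rng=np.random):
    """Zero out `num_masks` random blocks of consecutive time steps."""
    spec = spec.copy()
    n_steps = spec.shape[0]
    for _ in range(num_masks):
        t = rng.randint(0, T + 1)            # block length in steps (0..T)
        t0 = rng.randint(0, n_steps - t + 1)
        spec[t0:t0 + t, :] = 0.0
    return spec

spec = np.random.RandomState(0).rand(100, 80)  # fake (time, mel) log spectrogram
aug = time_mask(freq_mask(spec))
print(aug.shape)  # (100, 80): masking never changes the input shape
```

Because these operations act directly on the network’s input features, they can run in the data pipeline during training, as noted above.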

To test SpecAugment, we performed some experiments with the LibriSpeech dataset, where we took three Listen Attend and Spell (LAS) networks, end-to-end networks commonly used for speech recognition, and compared the test performance between networks trained with and without augmentation. The performance of an ASR network is measured by the Word Error Rate (WER) of the transcript produced by the network against the target transcript. Here, all hyperparameters were kept the same, and only the data fed into the network was altered. We found that SpecAugment improves network performance without any additional adjustments to the network or training parameters.
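WER is simply the word-level edit distance (substitutions, insertions and deletions) between the hypothesis and the reference transcript, normalized by reference length. A minimal reference implementation:

```python
def wer(reference, hypothesis):
    """Word error rate via word-level Levenshtein distance."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edits needed to turn the first i ref words into first j hyp words
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i                     # i deletions
    for j in range(len(hyp) + 1):
        dp[0][j] = j                     # j insertions
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = dp[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            dp[i][j] = min(sub, dp[i - 1][j] + 1, dp[i][j - 1] + 1)
    return dp[len(ref)][len(hyp)] / len(ref)

print(wer("the cat sat on the mat", "the cat sat on mat"))  # 1 deletion / 6 words
```

Lower is better; a WER of 0 means a perfect transcript.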

Performance of networks on the test sets of LibriSpeech with and without augmentation. The LibriSpeech test set is divided into two portions, test-clean and test-other, the latter of which contains noisier audio data.

More importantly, SpecAugment prevents the network from overfitting by giving it deliberately corrupted data. As an example of this, below we show how the WER for the training set and the development (or dev) set evolves through training with and without augmentation. We see that without augmentation, the network achieves near-perfect performance on the training set, while grossly under-performing on both the clean and noisy dev set. On the other hand, with augmentation, the network struggles to perform as well on the training set, but actually shows better performance on the clean dev set, and shows comparable performance on the noisy dev set. This suggests that the network is no longer overfitting the training data, and that improving training performance would lead to better test performance.

Training, clean (dev-clean) and noisy (dev-other) development set performance with and without augmentation.

State-of-the-Art Results
We can now focus on improving training performance, which can be done by adding more capacity to the networks by making them larger. By doing this in conjunction with increasing training time, we were able to get state-of-the-art (SOTA) results on the tasks LibriSpeech 960h and Switchboard 300h.

Word error rates (%) for state-of-the-art results for the tasks LibriSpeech 960h and Switchboard 300h. The test sets for both tasks have a clean (clean/Switchboard) and a noisy (other/CallHome) subset. Previous SOTA results taken from Li et al. (2019), Yang et al. (2018) and Zeyer et al. (2018).

The simple augmentation scheme we have used is surprisingly powerful—we are able to improve the performance of the end-to-end LAS networks so much that they surpass classical ASR models, which traditionally did much better on smaller academic datasets such as LibriSpeech or Switchboard.

Performance of various classes of networks on LibriSpeech and Switchboard tasks. The performance of LAS models is compared to classical (e.g., HMM) and other end-to-end models (e.g., CTC/ASG) over time.

Language Models
Language models (LMs), which are trained on a bigger corpus of text-only data, have played a significant role in improving the performance of an ASR network by leveraging information learned from text. However, LMs typically need to be trained separately from the ASR network and can be very large, making them hard to fit on a small device such as a phone. An unexpected outcome of our research was that models trained with SpecAugment outperformed all prior methods even without the aid of a language model. While our networks still benefit from adding an LM, our results are encouraging in that they suggest the possibility of training networks that can be used for practical purposes without one.

Word error rates for LibriSpeech and Switchboard tasks with and without LMs. SpecAugment outperforms previous state-of-the-art even before the inclusion of a language model.

Most of the work on ASR in the past has been focused on looking for better networks to train. Our work demonstrates that looking for better ways to train networks is a promising alternative direction of research.

We would like to thank the co-authors of our paper Chung-Cheng Chiu, Ekin Dogus Cubuk, Quoc Le, Yu Zhang and Barret Zoph. We also thank Yuan Cao, Ciprian Chelba, Kazuki Irie, Ye Jia, Anjuli Kannan, Patrick Nguyen, Vijay Peddinti, Rohit Prabhavalkar, Yonghui Wu and Shuyuan Zhang for useful discussions.

Lean, Green, AI Machines: 5 Projects Using GPUs for a Better Planet

Earth Day is a good day for AI.

And the benefits are felt all year, all around the world, as deep learning and NVIDIA GPUs aid our understanding of ecosystems and climate patterns, help preserve plants and animals, and improve how we manage waste.

Here are five ways companies, researchers and scientists are using GPUs for a better planet:

Into the Woods AI Goes

Whether in a rainforest or urban green spaces, life on Earth relies heavily on trees. But manually monitoring forested areas to track potential risks to plant health is time consuming and costly.

A Portugal-based startup is using AI to monitor forests from satellite imagery in a fraction of the time currently required. It uses NVIDIA GPUs in-house and in the cloud to process some 100TB of new satellite data daily, helping clients analyze tree species, growth and productivity.

A Cloudy Picture of Climate Change

The Earth is warming, but at what rate? Climate models vary in their projections of global temperature rise in the coming years, from 1.5 degrees to more than 3 degrees by 2100. This variation is largely due to the difficulty of representing clouds in global climate models.

Neural networks can be used to address this cloud resolution challenge, researchers from Columbia University, UC Irvine and the University of Munich found. Developed using an assortment of NVIDIA GPUs, their deep learning model improved performance and provided better predictions for precipitation extremes than the original climate model. This detailed view can improve scientists’ ability to predict regional climate impact.

One Person’s Trash Is an AI’s Treasure

Trying to correctly sort the remains of a lunch into compost, recycling and landfill is a hard enough task for the average person. But if different types of waste are collected together and sent to recycling centers, the trash often can’t be sorted and it all ends up in a landfill. Only 29 percent of the municipal waste generated in Europe in 2017 was recycled.

Smart recycling startup Bin-e hopes to raise the recycling rate with deep learning. Using the NVIDIA Jetson TX1, the startup has created a smart recycling bin that automatically recognizes, sorts and compresses waste. Its AI, trained on NVIDIA TITAN Xp GPUs, takes an image of each piece of trash and determines whether it’s paper, aluminum, plastic or e-waste before depositing it into the correct bin.

Sequencing on Land and at Sea

DNA sequencing isn’t just for the human genome. Nanopore sequencing, a technique for DNA sequencing, can be used to analyze the genomes of plants and microorganisms. UK startup Oxford Nanopore Technologies is using recurrent neural networks to help scientists detect pathogens in cassava plant genomes.

It’s also analyzed the DNA of microbial sea life off the coast of Alaska, giving researchers a better understanding of ocean biodiversity and the effects of climate change on marine microorganisms.

Oxford Nanopore’s MinIT hand-held AI supercomputer is powered by NVIDIA AGX, enabling researchers to run sequence analysis in the field.

Whale, AI’ll Say

Due to centuries of whaling by humans, just 500 North Atlantic right whales still exist. Those left have been forced by climate change to adopt a new migration path — exposing them to a new threat: commercial shipping vessels that can accidentally strike whales as they pass through shipping lanes.

Autonomous drone company Planck Aerosystems is working with Transport Canada, the national transportation department, to identify whales from aerial drone imagery with AI and NVIDIA GPUs. The tool can help biologists narrow down thousands of images to identify the few containing whales, so ships can slow down and avoid the endangered creatures.

Learn more about how GPU technology is driving applications with social impact, including environmental projects.

The post Lean, Green, AI Machines: 5 Projects Using GPUs for a Better Planet appeared first on The Official NVIDIA Blog.

How to Train Your Robot: Robot Learning Guru Shares Advice for College Students

Whether you’re a robot or a college student, it helps to start with the fundamentals, says a leading robotics researcher.

While robots can do amazing things, compare even the most advanced robots to a three-year-old and they can come up short.

Pieter Abbeel, a professor at the University of California, Berkeley, and cofounder of an AI company, has pioneered the idea that deep learning could be the key to bridging that gap: creating robots that can learn how to move through the world more fluidly and naturally.

Or as Abbeel refers to it, building “brains for robots.”

Teaching robots new skills is similar to taking classes in college, Abbeel explained on the latest episode of the AI Podcast.

While college courses may not immediately qualify a student for a job, the classes are still important in helping students develop fundamental skills they can apply in all kinds of situations.

Abbeel uses the same approach in his robotics research. At last month’s GPU Technology Conference, he showed a robot learning to navigate a new building it’s never been in before. His talk will be available here starting May 1.

The robot was able to do that because it was applying principles it had learned by navigating other buildings. “What were [the robot’s] courses in its college curriculum were the many other buildings that it was also learning to navigate,” he said. “So it learned a generic skill of navigating new buildings.”

Similarly, college students should look for classes that can teach them skills they can apply broadly.

Getting Physical

For younger students interested in getting a head start in AI and deep learning, Abbeel encourages them to look into physics.

“When I think about the foundations, the things you would learn early on, that will help a lot — they’re essentially mathematics and computer science and physics,” said Abbeel.

“And the reason I say ‘physics,’ which might be slightly more unexpected in the lineup, is that physics is all about looking at the world and building abstractions of how the world works,” he said.

Abbeel also recommends getting involved in research.

“It’s a lot about taking initiative, trying things,” Abbeel said. “The research cycle is a lot about just trying things people haven’t tried before and trying them quickly and understanding how to simplify things.”

How to Tune into the AI Podcast

Our AI Podcast is available through iTunes, Castbox, DoggCatcher, Google Play Music, Overcast, PlayerFM, Podbay, PodBean, Pocket Casts, PodCruncher, PodKicker, Stitcher, Soundcloud and TuneIn.

If your favorite isn’t listed here, email us at aipodcast [at] nvidia [dot] com.

The post How to Train Your Robot: Robot Learning Guru Shares Advice for College Students appeared first on The Official NVIDIA Blog.

Going Against the Grain: How Lucidyne Is Revolutionizing Lumber Grading with Deep Learning

Talk about a knotty problem.

Lucidyne Technologies has been using AI since the late 1980s to detect defects in lumber products.

But no matter how much technology it’s employed, finding imperfections in wood boards — a process that’s critical to categorizing lumber and thus maximizing its value — has remained a challenge.

“This isn’t like being in a factory and scanning cogs. These are all like snowflakes,” said Patrick Freeman, CTO of the small, Corvallis, Oregon-based company. “There’s never been a knot that looks like another one.”

It’s a job tailor-made for AI, and Lucidyne has jumped in with both feet by building a cutting-edge scanning system for lumber mills that’s powered by GPU-enabled deep learning.

With lumber flying through at speeds of up to 35 mph, the company’s GradeScan system — which physically resembles a mashup of an assembly line station and an MRI machine — scans two boards a second. It detects and collects visual data on 70 different types of defects, such as knots, fire scars, pitch pockets and seams.

Lucidyne’s system detects numerous kinds of defects in lumber: orange = bark pocket; forest green = fire scar; red = live knots; bright green = dead knots; blue = pitch pockets; pink = minor seam.

It then applies a deep learning model trained on a combination of NVIDIA GPUs, with a dataset of hundreds of thousands of scanned boards across 16 tree species, all of which have been classified by a team of lumber-grading experts.

To generate the most revenue, the model’s underlying algorithm determines the optimal way to cut each board — navigating around defects measuring as little as 8/1,000th of an inch. Those instructions are then sent to the mill’s saws.

Each mill’s findings are fed back into Lucidyne’s dataset, continuously improving the accuracy and precision of its deep learning model. Thus, there’s no end to how much mills will be able to learn about the lumber they’re milling.

Unprecedented Accuracy and Precision

A typical scanning application might involve categorizing lumber into one of six grade types, with grade 1 being the most valuable, for example. After scanning a 20-foot board, Lucidyne’s system might determine that the best cut will remove a 2-foot defective section near the center, leaving two 8-foot grade 1 and 2 sections on either side, and an additional 2-foot section of trim, which might be sold to a sawdust manufacturer.
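One standard way to pose that optimization is a one-dimensional dynamic program over cut points. This is a hypothetical sketch, not Lucidyne’s actual algorithm, and the board and prices below are invented: value every possible segment, and let best[i] record the most valuable way to cut the first i feet.

```python
def best_cuts(quality, value):
    """quality: per-foot grades; value(segment) -> price for that segment.
    Returns the maximum total value over all ways to cut the board."""
    n = len(quality)
    best = [0.0] * (n + 1)  # best[i]: optimal value of the first i feet
    for i in range(1, n + 1):
        # last segment spans feet j..i; take the best split point j
        best[i] = max(best[j] + value(quality[j:i]) for j in range(i))
    return best[n]

def value(segment):
    # Invented pricing: defect-free wood sells at $2/ft; any segment
    # containing a defect is worth only sawdust price, $0.10/ft.
    return (0.1 if 0 in segment else 2.0) * len(segment)

board = [1, 1, 1, 0, 1, 1]      # 6-foot board with a defect at foot 4
print(best_cuts(board, value))  # 10.1: cut out the defect foot, keep the rest
```

The real system works at far finer resolution and against many defect classes, but the structure of the decision is the same: segment the board so that defects cost as little value as possible.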

This level of detail separates Lucidyne from the competition by enabling mills to drastically improve the precision of their lumber-grading efforts.

“Going to deep learning has allowed us to be a lot more accurate, and our customers produce packs that are 2 percent below or above grade,” said Dan Robin, Lucidyne’s software engineering manager. “No one else is coming even close to that.”

Lucidyne’s GradeScan system.

Lucidyne started deploying GradeScan systems, powered by its Perceptive Sight software, in 2017, with each unit performing inference on NVIDIA P4 GPUs. The company is now deploying systems with newer NVIDIA T4 GPUs.

Freeman said the new system is delivering 16x the data processing speed, and at a higher image resolution to boot.

The upshot is that Lucidyne’s decision to travel a deep learning path toward increasingly detailed identification of defects has paid off exactly as it hoped.

Raising the Bar

“We wanted to up our game,” said Freeman. “We sought to improve our accuracy on currently detected defects and to correctly classify defects we had never been able to call before, while at the same time delivering more timely solutions to a larger customer base.”

To that end, the company is working with NVIDIA to develop customized software that extends fine-grain inferencing capabilities using semantic segmentation.

In the meantime, Lucidyne is riding every wave of increased computing power to zoom in on smaller and more subtle defects. It has recently begun grading redwood, which is much harder to scan because of its color variations. It’s also looking to expand into hardwoods and eventually hopes to tackle other challenges faced by mills.

All of this innovation has Lucidyne’s technical leaders feeling that they’re onto something bigger. As a result, they have an eye on disrupting other sectors where inspection of organic materials is involved.

Said Freeman, “What we’re doing that we think is unique is taking industrial deep learning inspection to the next level.”

The post Going Against the Grain: How Lucidyne Is Revolutionizing Lumber Grading with Deep Learning appeared first on The Official NVIDIA Blog.

MorphNet: Towards Faster and Smaller Neural Networks

Deep neural networks (DNNs) have demonstrated remarkable effectiveness in solving hard problems of practical relevance such as image classification, text recognition and speech transcription. However, designing a suitable DNN architecture for a given problem continues to be a challenging task. Given the large search space of possible architectures, designing a network from scratch for your specific application can be prohibitively expensive in terms of computational resources and time. Approaches such as Neural Architecture Search and AdaNet use machine learning to search the design space in order to find improved architectures. An alternative is to take an existing architecture for a similar problem and, in one shot, optimize it for the task at hand.

Here we describe MorphNet, a sophisticated technique for neural network model refinement, which takes the latter approach. Originally presented in our paper, “MorphNet: Fast & Simple Resource-Constrained Structure Learning of Deep Networks”, MorphNet takes an existing neural network as input and produces a new neural network that is smaller, faster, and yields better performance tailored to a new problem. We’ve applied the technique to Google-scale problems to design production-serving networks that are both smaller and more accurate, and now we have open sourced the TensorFlow implementation of MorphNet to the community so that you can use it to make your models more efficient.

How it Works
MorphNet optimizes a neural network through a cycle of shrinking and expanding phases. In the shrinking phase, MorphNet identifies inefficient neurons and prunes them from the network by applying a sparsifying regularizer such that the total loss function of the network includes a cost for each neuron. However, rather than applying a uniform cost per neuron, MorphNet calculates a neuron cost with respect to the targeted resource. As training progresses, the optimizer is aware of the resource cost when calculating gradients, and thus learns which neurons are resource-efficient and which can be removed.
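The shrinking phase can be sketched as a group-lasso style penalty: one group per output neuron, with each group's norm scaled by that neuron's resource cost. This is a minimal NumPy sketch of the idea, not the actual MorphNet implementation; the cost values are stand-ins.

```python
import numpy as np

def sparsifying_penalty(W, resource_cost_per_neuron):
    """Group-lasso style penalty added to the task loss: one group per
    output neuron (row of W), weighted by that neuron's resource cost.
    Neurons whose entire row is driven to zero can then be pruned."""
    group_norms = np.sqrt((W ** 2).sum(axis=1))  # one norm per output neuron
    return float((resource_cost_per_neuron * group_norms).sum())

# A layer with 2 inputs and 3 output neurons; the third row is already
# all-zero, so it adds nothing to the penalty and is a pruning candidate.
W = np.array([[0.5, -0.2],
              [0.1,  0.3],
              [0.0,  0.0]])
costs = np.array([2.0, 2.0, 2.0])  # e.g., FLOPs attributable to each neuron
print(sparsifying_penalty(W, costs))
```

During training this penalty is added to the task loss, so the optimizer trades each neuron's contribution to accuracy against its resource cost.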

As an example, consider how MorphNet calculates the computation cost (e.g., FLOPs) of a neural network. For simplicity, let's think of a neural network layer represented as a matrix multiplication. In this case, the layer has 2 inputs (x_n), 6 weights (a, b, …, f), and 3 outputs (y_n; neurons). Using the standard textbook method of multiplying rows and columns, you can work out that evaluating this layer requires 6 multiplications.

Computation cost of neurons.

MorphNet calculates this as the product of input count and output count. Note that although the example on the left shows weight sparsity, where two of the weights are 0, we still need to perform all the multiplications to evaluate this layer. However, the middle example shows structured sparsity, where all the weights in the row for neuron y_n are 0. MorphNet recognizes that the new output count for this layer is 2, and the number of multiplications for this layer dropped from 6 to 4. Using this idea, MorphNet can determine the incremental cost of every neuron in the network to produce a more efficient model (right) where neuron y_3 has been removed.
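This counting rule is easy to verify directly. A hedged sketch of the cost calculation described above (the function name is ours, not MorphNet's API):

```python
import numpy as np

def multiplication_cost(W):
    """Count the multiplications needed to evaluate y = W @ x the way
    MorphNet does: (number of inputs) x (number of live outputs), where an
    output is live unless its entire weight row is zero. Scattered zero
    weights do not reduce the cost; only whole-row (structured) sparsity does."""
    n_outputs, n_inputs = W.shape
    live_outputs = int((np.abs(W).sum(axis=1) > 0).sum())
    return n_inputs * live_outputs

# 2 inputs, 3 outputs, as in the example above.
dense      = np.array([[1., 2.], [3., 4.], [5., 6.]])
scattered  = np.array([[0., 2.], [3., 0.], [5., 6.]])   # weight sparsity
structured = np.array([[1., 2.], [0., 0.], [5., 6.]])   # row for one neuron is zero

print(multiplication_cost(dense))       # 6
print(multiplication_cost(scattered))   # still 6
print(multiplication_cost(structured))  # 4
```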

In the expanding phase, we use a width multiplier to uniformly expand all layer sizes. For example, if we expand by 50%, then an inefficient layer that started with 100 neurons and shrank to 10 would only expand back to 15, while an important layer that only shrank to 80 neurons might expand to 120 and have more resources with which to work. The net effect is re-allocation of computational resources from less efficient parts of the network to parts of the network where they might be more useful.
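The expansion arithmetic from the example above can be written out directly (a sketch, assuming simple rounding of layer widths):

```python
def expand_widths(shrunk_widths, multiplier=1.5):
    """Uniformly expand every layer's width after the shrinking phase.
    Layers the regularizer judged inefficient stay small; important
    layers regain (and may exceed) their original size."""
    return [round(w * multiplier) for w in shrunk_widths]

# Two layers that both started at 100 neurons: an inefficient one that
# shrank to 10, and an important one that only shrank to 80.
print(expand_widths([10, 80]))  # [15, 120]
```

Note that the multiplier is uniform; the re-allocation effect comes entirely from how unevenly the shrinking phase pruned each layer.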

One could halt MorphNet after the shrinking phase to simply cut back the network to meet a tighter resource budget. This results in a more efficient network in terms of the targeted cost, but can sometimes yield a degradation in accuracy. Alternatively, the user could also complete the expansion phase, which would match the original target resource cost but with improved accuracy. We’ll cover an example of this full implementation later.

Why MorphNet?
There are four key value propositions offered by MorphNet:

  • Targeted Regularization: The approach that MorphNet takes towards regularization is more intentional than other sparsifying regularizers. In particular, the MorphNet approach to induce better sparsification is targeted at the reduction of a particular resource (such as FLOPs per inference or model size). This enables better control of the network structures induced by MorphNet, which can be markedly different depending on the application domain and associated constraints. For example, the left panel of the figure below presents a baseline network with the commonly used ResNet-101 architecture trained on JFT. The structures generated by MorphNet when targeting FLOPs (center, with 40% fewer FLOPs) or model size (right, with 43% fewer weights) are dramatically different. When optimizing for computation cost, higher-resolution neurons in the lower layers of the network tend to be pruned more than lower-resolution neurons in the upper layers. When targeting smaller model size, the pruning tradeoff is the opposite. 
  • Targeted Regularization by MorphNet. Rectangle width is proportional to the number of channels in the layer. The purple bar at the bottom is the input layer. Left: Baseline network used as input to MorphNet. Center: Output applying FLOP regularizer. Right: Output applying size regularizer.

    MorphNet stands out as one of the few solutions available that can target a particular parameter for optimization. This enables it to target parameters for a specific implementation. For example, one could target latency as a first-order optimization parameter in a principled manner by incorporating device-specific compute-time and memory-time.

  • Topology Morphing: As MorphNet learns the number of neurons per layer, the algorithm could encounter a special case of sparsifying all the neurons in a layer. When a layer has 0 neurons, this effectively changes the topology of the network by cutting the affected branch from the network. For example, in the case of a ResNet architecture, MorphNet might keep the skip-connection but remove the residual block as shown below (left). For Inception-style architectures, MorphNet might remove entire parallel towers as shown on the right.
  • Left: MorphNet can remove residual connections in ResNet-style networks. Right: It can also remove parallel towers in Inception-style networks.
  • Scalability: MorphNet learns the new structure in a single training run and is a great approach when your training budget is limited. MorphNet can also be applied directly to expensive networks and datasets. For example, in the comparison above, MorphNet was applied directly to ResNet-101, which was originally trained on JFT at a cost of 100s of GPU-months.
  • Portability: MorphNet produces networks that are “portable” in the sense that they are intended to be retrained from scratch and the weights are not tied to the architecture learning procedure. You don’t have to worry about copying checkpoints or following special training recipes. Simply train your new network as you normally would!

Morphing Networks
As a demonstration, we applied MorphNet to Inception V2 trained on ImageNet by targeting FLOPs (see below). The baseline approach is to use a width multiplier to trade off accuracy and FLOPs by uniformly scaling down the number of outputs for each convolution (red). The MorphNet approach targets FLOPs directly and produces a better trade-off curve when shrinking the model (blue). In this case, FLOP cost is reduced 11% to 15% with the same accuracy as compared to the baseline.

MorphNet applied to Inception V2 on ImageNet. Applying the flop regularizer alone (blue) improves the performance relative to baseline (red) by 11-15%. A full cycle, including both the regularizer and width multiplier, yields an increase in accuracy for the same cost (“x1”; purple), with continued improvement from a second cycle (“x2”; cyan).

At this point, you could choose one of the MorphNet networks to meet a smaller FLOP budget. Alternatively, you could complete the cycle by expanding the network back to the original FLOP cost to achieve better accuracy for the same cost (purple). Repeating the MorphNet shrink/expand cycle again results in another accuracy increase (cyan), leading to a total accuracy gain of 1.1%.

We’ve applied MorphNet to several production-scale image processing models at Google. Using MorphNet resulted in significant reduction in model-size/FLOPs with little to no loss in quality. We invite you to try MorphNet—the open source TensorFlow implementation can be found here, and you can also read the MorphNet paper for more details.

This project is a joint effort of the core team including: Elad Eban, Ariel Gordon, Max Moroz, Yair Movshovitz-Attias, and Andrew Poon. We also extend a special thanks to our collaborators, residents and interns: Shraman Ray Chaudhuri, Bo Chen, Edward Choi, Jesse Dodge, Yonatan Geifman, Hernan Moraldo, Ofir Nachum, Hao Wu, and Tien-Ju Yang for their contributions to this project.

How UnitedHealth Group Is Infusing Deep Learning Into Healthcare Services

In a massive healthcare organization, even a small improvement in workflow can translate to major gains in efficiency. That means lower costs for the healthcare provider and better, faster care for patients.

UnitedHealth Group, one of the largest healthcare companies in the U.S., is turning to GPU-powered AI for these kinds of enhancements. In a talk at the GPU Technology Conference last month, two of the organization’s AI developers shared how it’s adopting deep learning for a variety of applications — from prior authorization of medical procedures to directing phone calls.

“The datasets required to solve these problems are enormous,” said Dima Rekesh, senior distinguished engineer at Optum, the health services platform of UnitedHealth Group. “Deep learning is uniquely suited to solve some of these hard problems through its ability to parse large amounts of data.”

The key challenge for an AI to be usable is getting error rates low enough, Rekesh said. “When you develop a model, you need to cross a threshold of accuracy to the point where you can trust it — to the point where it’s a pleasant experience for someone, whether it’s a call center representative or a medical professional looking at a model’s predictions.”

Deep learning models can meet that high bar, he says.

“AI solutions actually impact not just the operational costs for our company, but also patient services,” said Julie Zhu, chief data scientist and distinguished engineer at Optum. “We could make decisions much earlier, with more accurate treatment recommendations and earlier detection of disease.”

Optum is using a number of NVIDIA GPUs, including a cluster of V100 GPUs and the NVIDIA DGX-1, to power its deep learning work.

This Procedure Is AI Approved

Healthcare providers often need prior authorization, or advance approval from a patient’s insurance plan, before moving forward with a procedure or filling out a prescription. Manually approving procedures currently costs Optum hundreds of labor hours and millions of dollars a year.

In addition to checking whether or not a patient’s insurance plan covers a treatment, the healthcare provider must gather information from several sources to confirm that it’s necessary for a given patient to have a procedure or take a particular medication. With deep learning models, much of this decision-making could eventually be done automatically.

Zhu and her colleagues are developing neural networks that can conduct prior authorization in real time. The AI is currently in production and is being benchmarked against the manual process.

The team found its deep learning model outperforms the traditional machine learning model by a significant margin against a high volume of cases.

“When you have a million cases per year, the impact is really big,” Zhu said. UnitedHealth Group serves 126 million individuals and 80 percent of U.S. hospitals. “Even a small percentage improvement in accuracy will have a huge impact.”

Deep Learning on the Other End of the Line

More than a million people dial UnitedHealth Group each day. As with any large organization, callers are greeted by an automatic voice response system — a phone tree interface with prompts like “Press 1 to reach the emergency department” or “Press 6 for radiology.”

This process can be streamlined with deep learning.

By implementing AI in its call system, UnitedHealth Group can use natural language processing models to understand what callers are looking for and answer automatically, or route them to the right department or service representative.
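At its core, such a system maps a caller's utterance to an intent and routes accordingly. The following is a hypothetical sketch, not UnitedHealth Group's actual system; `classify_intent` is a keyword-matching stand-in for a trained NLP model, and the route names are invented.

```python
# Hypothetical intent-based call routing. In a real system,
# classify_intent would be a trained NLP model, not keyword matching.
ROUTES = {
    "billing": "billing-department",
    "radiology": "radiology-department",
    "emergency": "emergency-department",
}

def classify_intent(utterance):
    """Stand-in for a trained intent classifier: naive keyword match."""
    for intent in ROUTES:
        if intent in utterance.lower():
            return intent
    return None

def route_call(utterance):
    intent = classify_intent(utterance)
    # Fall back to a human agent when the model is unsure.
    return ROUTES.get(intent, "live-agent")

print(route_call("I have a question about my billing statement"))  # billing-department
print(route_call("Can I speak to someone?"))                       # live-agent
```

The fallback path matters: low-confidence calls go to a representative rather than being misrouted, which is the accuracy threshold Rekesh describes.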

Rekesh is working on developing neural networks that can accomplish these tasks, with the goals of reducing call length and connecting patients and customers to answers more quickly. To do so, he’s using OpenSeq2Seq, an open-source toolkit for NLP and speech recognition developed by NVIDIA researchers.

“In NLP, deep learning is the only option,” he said. “Other solutions just aren’t accurate enough.”

Deep learning models can also be used to streamline the process of authenticating patients’ identities on the call. For customer representatives, an AI-powered interface can help them during the call by pulling up the patient’s records or providing recommendations on the agent’s computer screens.

Optum plans to deploy some of these deep learning models later this year. The organization is also working on neural network tools for multi-disease prediction and medical claim fraud detection.

Amazon Polly adds Arabic language support

On April 17, 2019, Amazon Polly launched Zeina, an Arabic female text-to-speech (TTS) voice. The voice is clear and natural-sounding; it masters tongue twisters and can whisper, just like all other Amazon Polly voices. Let's hear Zeina introduce herself:


Hello, my name is Zeina, I am the Arabic Amazon Polly voice. Very nice to meet you.

مَرْحَباً، اِسْمِي زينة، أَنا اَلْصَوْتُ اَلْعَرَبِيُّ فِي أمازون بولي، سَعِدْتُ بِلِقائِكُم.

And here’s a tongue twister to demonstrate Zeina’s strengths:


The prince of princes ordered to drill a well in the desert, how many R’s in this sentence?

أَمَرَ أَمِيرُ اَلْأُمَراءِ، بِحَفْرِ بِئْرٍ فِي اَلْصَحْراءِ. فَكَمْ راءً فِي ذٰلِكَ؟

Arabic is one of the most widely spoken languages in the world, but it's not really a single language at all. It consists of 30 dialects plus a universal form, Modern Standard Arabic (MSA). As a result, it's classified as a macrolanguage, estimated to have over 400 million speakers. Zeina follows MSA pronunciation, which is the common broadcasting standard across the region. MSA can sound formal because it differs from day-to-day speech, but it's the linguistic thread that links Arabic native speakers worldwide.

Arabic is written from right to left and includes 28 letters. Short vowels (diacritics) are not part of the Arabic alphabet. As a result, one written form might be pronounced in several different ways, with every option carrying its own meaning and representing a different part of speech. Vocalization can't be performed in isolation, because correct pronunciation depends heavily on the linguistic context of each word. In real life, Arabic readers supply the diacritics themselves as they read, disambiguating words and pronouncing them correctly. TTS voice development for Arabic therefore requires a diacritizer that predicts the diacritics. The Amazon Arabic TTS voice handles unvocalized Arabic content thanks to its built-in diacritizer. If a customer provides vocalized input, Zeina generates the corresponding audio as well.

Emirates NBD, one of the leading banks in the Middle East, is using Amazon Polly to develop new voice banking solutions to better serve its customers. Suvo Sarkar, Senior Executive Vice President and Group Head – Retail Banking & Wealth Management said, “Emirates NBD has been an early mover in the region in introducing an AI powered virtual assistant, helping customers calling the bank to converse in natural language and access required services quickly. We are now integrating Amazon Polly in English with our automated call center for its quality and lifelike voice and to further enhance customer interactions, and looking to integrate Amazon Polly in Arabic soon. Such technologies will also help us improve our internal efficiencies while delivering better customer experiences.”

“The launch of Arabic support for Amazon Polly comes at a great time as we are gearing up to launch Arabic as a new language on Duolingo. Zeina delivers accurate and natural sounding speech that is important for teaching a language, and matches the quality that we’ve become accustomed to using Amazon Polly for the other languages that we offer,” said Hope Wilson, Learning Scientist at Duolingo – a globally operating eLearning platform offering a portfolio of 84 language courses for more than 30 distinct languages.

“Amazon Polly’s Arabic voice Zeina is impressive,” said Andreas Dolinsek, CTO at iTranslate, a leading translation and dictionary app that offers text (or even object) translation as well as voice-to-voice conversations in over 100 languages. Andreas noted that “we’re taking it into production immediately to replace our current solution, as it will bring vast improvements to the text-to-speech Arabic service that we are offering.”

Amazon Polly is a cloud service that uses advanced deep learning technologies to offer a range of 59 voices in 29 languages to convert written content into human-like speech. The service supports companies in developing digital products that use speech synthesis for a variety of use cases, including automated contact centers, language learning platforms, translation apps, and reading of articles.


About the Author

Marta Smolarek is a Program Manager in the Amazon Text-to-Speech team. At work she connects the dots. In her spare time, she loves to go camping with her family.





Your guide to Amazon re:MARS: Jeff Bezos, Andrew Ng, Robert Downey Jr. and more…  

The inaugural Amazon re:MARS event pairs the best of what’s possible today with perspectives on the future of machine learning, automation, robotics, and space travel. Based on the exclusive MARS event founded by Jeff Bezos, Amazon re:MARS brings together the world of business and technology in a premier thought-leadership event. With more than 100 sessions, business leaders have the opportunity to hear best practices for implementing emerging technology for business value. For developers, re:MARS offers technical breakout sessions and hands-on workshops that dive deep into AI and robotics tools from AWS and Amazon Alexa.

You’ll also hear from leading experts across science, academia, and business. Speakers such as Jeff Bezos, founder and CEO of Amazon; Andrew Ng, founder and CEO of Landing AI; Robert Downey Jr., actor and producer; and Colin Angle, chairman, CEO and founder of iRobot, will share the latest research and scientific advancements, industry innovations, and their perspectives on how these domains will evolve.

Register today for Amazon re:MARS and visit the session catalog for the latest lineup! There’s a lot in the works. Here’s a taste of the breakout topics and technical content for beginners and advanced technical builders.

Cross-industry sessions for decision makers and technical builders

Precision medicine for healthier lives
Keith Bigelow, GM of Analytics, GE Healthcare

GE Healthcare has developed machine learning models using Amazon SageMaker to track and predict brain development in a growing fetus. Powered by these models, the new offering SonoCNS drives the placement of the probe to evaluate congenital and neurological issues in the fetus, for example, to accurately measure and understand brain volume growth. Using Amazon SageMaker for machine learning, GE Healthcare’s operators can quickly detect abnormalities, helping to save babies’ lives and give parents peace of mind. For hospitals, this also translates to improved productivity, efficiency, and accuracy.

Intelligent identity and access management
Paul Hurlocker, Vice President, Center for Machine Learning, Capital One

As one of the largest banks in the U.S., Capital One prioritizes a responsible and well-managed data environment and ecosystem. To meet these needs, Capital One has combined machine learning and native AWS graph capabilities to build a platform that proactively informs access levels for individual associates and teams. The platform results in a faster, enhanced on-boarding process, workflow, and productivity for associates, and helps mitigate risk through proactive management of privileges and licenses.

A hype-free and cutting-edge discussion on autonomous driving
Matthew Johnson-Roberson, Associate Professor, University of Michigan

How close are we to fully autonomous vehicles? What would happen if we put the current technology on the road today? What are the problems that still need to be solved? This session will cover the latest advances in self-driving cars without the marketing, providing a true picture of how far we are from never touching a steering wheel again.

How TED uses AI to spread ideas farther and faster
Jenny Zurawell, Director, & Helena Batt, Deputy Director, TED Translators

TED Talks are a powerful way to share ideas and spark dialogue. To make TED content accessible, volunteer translators need to subtitle more than 300,000 minutes of video this year alone. See how TED leverages Amazon Transcribe and Amazon Translate to speed up the creation of crowdsourced subtitles, expand the online reach of ideas, and transform subtitle production in media.

From seed to store: Using AI to optimize the indoor farms of the future
Irving Fain, Co-founder and CEO, Bowery Farms

For the last 10,000 years, large-scale agriculture has lived outdoors, optimized to withstand unpredictable environmental conditions and long supply chains. But what possibilities do you unlock when you can control every single environmental factor, from the light intensity to nutrient mix to air flow? In this talk, learn how Bowery Farms uses machine learning and computer vision to optimize indoor vertical farms and scale agricultural production to create higher yielding, better tasting, safer, and more sustainable locally-grown produce in cities around the world.

Futuring the farm to improve crop health
Peri Subrahmanya, IoT Product Manager, & Craig Williams, Principal Solution Architect, Bayer Crop Science

One third of all food produced globally is lost or wasted before it is consumed, according to the Food and Agriculture Organization of the United Nations (FAO). This equals a loss of $750 billion annually. With AWS IoT, Bayer Crop Science can prevent process loss in real time and use real-time data collection and analysis for its global seed business, collecting an average of one million traits per day during planting or harvest season.

AI, spatial understanding, robots, and the smart home
Chris Jones, Chief Technology Officer, iRobot

Consumers increasingly expect connected products in their home to deliver easy-to-use and personalized experiences tailored to their home and activity. To deliver such a personalized experience, the smart home needs to intelligently coordinate diverse connected devices located throughout the home. This talk will focus on how robots operating in homes today are ideally positioned to enable this intelligence by providing a constantly updated understanding of the physical layout of the home and the locations of each connected device within the space.

Predicting weather to save energy costs
Andrew Stypa, Lead AI/ML Business Analyst, & Richard Scott, Global Marketing Director, Kinect Energy Group

Learn how Kinect Energy Group uses advanced machine learning capabilities to predict electric spot prices for regional power markets using the Amazon SageMaker DeepAR time-series forecasting model, incorporating historical pricing and weather data to drive the machine learning models. Improved price predictions assist with increased trading volumes for forward pricing contracts.

Creating the intelligent asset: Fusing IoT, robotics, and AI
Jason Crusan, Vice President of Technology, Woodside Energy

What if you could learn more about your facility from your tablet than by walking around it yourself? Through 4D interactive virtual worlds, Woodside’s “Intelligent Asset” offers an immersive experience in which operators can explore their facility remotely in real time. Learn how Woodside, the leading energy provider in Australia, combined the latest AWS services including Amazon Kinesis Video Streams, Amazon SageMaker, AWS RoboMaker, and AWS IoT.

Distributed AI: Alexa living on the edge
Ariya Rastrow, Principal Applied Scientist, Alexa Speech

Distributed edge learning, which leverages on-device computation for training models and centrally aggregated anonymized updates, is a promising new approach capable of achieving customer-level personalization at scale while addressing privacy and trust concerns. Practitioners, employers, and users of AI should understand this new edge-first paradigm and how it will impact the discipline in the near future.

Applying space-based data and machine learning to the UN’s sustainability goals
Dr. Shay Har-Noy, Vice President, Maxar Technologies

The United Nations has outlined 17 Sustainable Development Goals that address global challenges such as poverty, hunger, and health. While the goals are focused on life on Earth, space-based data and machine learning are yielding insights. Learn how a combination of analytics and satellite imagery is helping solve pressing problems.

Enabling sustainable human life in space using AI
Dr. Natalie Rens, CEO & Founder, Astreia

Human settlement of space will pose one of the grandest challenges in history. We imagine entire communities living on the moon and beyond without depending on constant supervision or support from Earth. We’ll discuss the challenges for sustainable life in space, and our plan to use artificial intelligence to ensure the safety and wellbeing of our first space settlers.

From business intelligence to artificial intelligence
Elizabeth Gonzalez, Business Intelligence and Advanced Analytics Leader, INVISTA

INVISTA is a manufacturer of chemicals, polymers, fabrics, and fibers and delivers products and brands incorporated into your clothing, your car, and even your carpet. Join this session to learn about INVISTA’s transformative journey from BI to AI, where they will share their experience empowering data science by adjusting talent and processes and building a modern analytics platform. Experimentation with change management, project management, model maintenance, and development lifecycles has helped drive profitable innovations.

When SpongeBob met Alexa
Zach Johnson, Founder and CEO, Xandra, & Tim Adams, VP, Emerging Products Lab, Viacom

Viacom and Xandra are collaborating to push the limits of voice design with a focus on exceptional user experience. Nickelodeon’s SpongeBob Challenge is one of the highest-rated Alexa skills for kids. Learn how to build delight and fun into an Alexa skill through conversation design, rich soundscapes, advanced game mechanics, and analytics.

Amazon Go: A technology deep dive
Ameet Vaswani, Senior Manager, Software Development, & Gerard Medioni, Director, Research, Amazon Go

This technical session will outline the core technologies behind the custom-built Just Walk Out technology for Amazon Go. Learn about the algorithmic challenges in building a highly accurate customer-facing application using deep learning and computer vision, and the technical details of the high throughput services for Amazon Go that transfer gigabytes of video from stores to cloud systems.

Mitigate bias in machine learning models
Stefano Soatto, Director of Applied Science, AWS

Using real-world examples, this session will explore how to understand, measure, and systematically mitigate bias in machine learning models. Understanding these principles is an important part of building a machine learning strategy. This session will cover both the business and technical considerations.

Realizing nature’s secrets to make bug-like robots
Kaushik Jayaram, Postdoc Scholar, Harvard University

This session will take a closer look at the incredible bodies of cockroaches, geckos, and other small animals to examine what they can teach robotics engineers. The session will also outline the latest developments in the field of microrobotics, real-world applications of these robots, and hint at how close (or far) we are from realizing predictions from science fiction.

A chance encounter, sushi, robots, and the environment
Dr. Erika Angle, Co-Founder, Director of Education, Ixcela

Robots can help save our fragile planet. This session discusses the importance of leveraging robotic technology to save our oceans, detailing efforts underway to create an affordable, unmanned undersea robot designed to dive 1,000 feet deep and control the lionfish population. Intended for use by fishermen, tourists, and environmentalists, the RSE Guardian robot will address a serious environmental problem by creating an economically scalable solution for catching lionfish, establishing a new food source, and inspiring future generations in the process.

The open-source bionic leg: Constructing and controlling a prosthesis driven by AI
Elliott J. Rouse, Assistant Professor, Mechanical Engineering, University of Michigan

For decades, sci-fi movies have shown the promise of life with bionic limbs, but they are nowhere to be seen in today’s society. We have created an open-source bionic leg to help transform these robots from fiction to reality. This talk will focus on innovations in our design approach and showcase a leading AI-based control strategy for wearable robots. Finally, we’ll demo our open-source bionic leg in action with a participant on stage.

Solving Earth’s biggest problems with a cloud in space
Yvonne Hodge, Vice President of IT, Lockheed Martin Space

Can a cloud in space impact the world’s poverty? Are there ways to make agriculture more efficient? Can internet connectivity for the world change how the world lives? Join this interactive discussion as we consider new approaches to solving Earth’s problems, including how a cloud in space could positively impact our lives using space data.

Where will the road to space take you?
Patrick Zeitouni, Head of Advanced Development Programs, Blue Origin

This year, Blue Origin will send its first astronauts to space on its New Shepard rocket. Democratization of space is key to the company’s long-term mission to enable a future where millions of people are living and working in space, moving heavy industry off Earth to protect and preserve the planet for generations to come. To achieve this, the cost of access to space must be lowered, which is why Blue Origin is focusing on the development of operationally reusable rockets to send more humans to space than ever before. Join Patrick Zeitouni, the head of Advanced Development Programs for Blue Origin, on the journey to that future. Hear about operational reuse at work and the important part the Moon plays in humanity realizing this bright future.

AI and Robotics workshops for technical builders

Get started with machine learning using AWS DeepRacer
Ever wondered what it takes to create an autonomous race car? Join us for this half-day workshop and get hands-on experience with reinforcement learning. Developers with no prior machine learning experience will learn new skills and apply their knowledge in a fun and exciting way. You'll join a pit crew where you will build and train machine learning models that you can then try out with our AWS DeepRacer autonomous race cars! Please bring your laptop and start your engines; the race is on!
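In DeepRacer, the piece developers actually write is a reward function that scores the car's behavior at each step. As a rough sketch of the idea, the function below rewards staying near the center line; the input keys used (`all_wheels_on_track`, `track_width`, `distance_from_center`) are part of DeepRacer's documented parameter set, though the specific thresholds here are just illustrative choices.

```python
def reward_function(params):
    """Score one simulation step for a DeepRacer model.

    `params` is the dict DeepRacer passes on every step; this sketch
    only reads three of its documented keys.
    """
    if not params["all_wheels_on_track"]:
        return 1e-3  # near-zero reward for leaving the track

    track_width = params["track_width"]
    distance = params["distance_from_center"]

    # Graduated reward: the closer to the center line, the better.
    if distance <= 0.1 * track_width:
        return 1.0
    if distance <= 0.25 * track_width:
        return 0.5
    if distance <= 0.5 * track_width:
        return 0.1
    return 1e-3
```

The reinforcement learning algorithm then explores throttle and steering actions, keeping whatever behavior earns the most cumulative reward.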

Practical machine learning with Amazon SageMaker
Until recently, developing machine learning models took considerable time, effort, and expertise. In this workshop, you'll learn a simple end-to-end approach to machine learning: how to select the right algorithms and models for your business needs, how to prepare your data, and how to build, train, and deploy optimized models. Upon completion of this full-day workshop, you'll have learned the latest machine learning concepts such as reinforcement learning and deep learning by using Amazon SageMaker for predictive insights.

re:Vegas Blackjack
In this session, you’ll use computer vision and machine learning to help your team win the re:MARS Blackjack Challenge. During this half-day course, you’ll form teams to build and train a neural network for computer vision using Amazon SageMaker, and develop an algorithm to make decisions that give your team the best chance to win. The team with the highest simulated earnings will win the re:MARS Blackjack Challenge and a coveted patch commemorating their experience.

Get started with robotics and AI
Teach a robot how to find a needle in a haystack. In this workshop, you’ll learn how to develop a robot that can roam around a room and identify objects it encounters, searching for a specific type of item. You will get hands-on with AWS RoboMaker, and learn how to connect robots to a huge variety of other AWS services, like Amazon Rekognition and Amazon Kinesis. Upon completion, you’ll have trained a robot to find what you’re looking for in a pile of irrelevant data.
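The "find a specific item" step boils down to filtering detection results. As a sketch, the helper below searches a list shaped like the `Labels` field of an Amazon Rekognition `DetectLabels` response (each entry has `Name` and `Confidence`); the frame data is hypothetical.

```python
def find_target(labels, target, min_confidence=80.0):
    """Return True if `target` appears among detected labels.

    `labels` mirrors the shape of Rekognition's DetectLabels output:
    a list of {'Name': str, 'Confidence': float} dicts.
    """
    return any(
        label["Name"].lower() == target.lower()
        and label["Confidence"] >= min_confidence
        for label in labels
    )

# Hypothetical detection result for one camera frame.
frame_labels = [
    {"Name": "Hay", "Confidence": 97.2},
    {"Name": "Needle", "Confidence": 91.5},
]
```

In the workshop, a RoboMaker-managed robot would stream camera frames to the detection service and run a check like this on each response until the target turns up.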

Voice control for any “thing”
From microwaves to cars, we are headed towards a future surrounded by devices that can communicate with the world around them. In this hands-on session, you will learn how to add custom voice control to your connected devices with Alexa. Leave your laptop behind—bring your big ideas, and we’ll supply the hardware. You’ll create an Alexa built-in prototype, AWS IoT “thing,” and your own Alexa skill—all on a Raspberry Pi. You’ll walk out with your own voice-enabled prototype that interfaces with whatever inputs and outputs you can imagine.

Building the Starship Enterprise computer today
Nearly every science fiction story has shown us that voice interfaces are the future. This workshop will show you how to make that science fiction a reality. Bring your laptop, because this hands-on session will teach you the advanced topics required for creating compelling voice interfaces. You will learn how to build Alexa skills, how to design conversational experiences, and how to build your brand and monetize your best content.
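At the protocol level, an Alexa skill is a service that answers each request with a small JSON envelope. The sketch below builds a minimal version of that documented response shape; real skills typically use the ASK SDK rather than constructing it by hand.

```python
import json

def build_alexa_response(speech_text, end_session=True):
    """Minimal Alexa skill response envelope (plain-text speech only)."""
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": speech_text},
            "shouldEndSession": end_session,
        },
    }

# Serialize the envelope as a skill endpoint would return it.
payload = json.dumps(build_alexa_response("Welcome aboard the Enterprise."))
```

Everything else a conversational experience needs, such as session state, reprompts, and cards, layers onto this same structure.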


About the author

Cynthya Peranandam is a Principal Marketing Manager for AWS artificial intelligence solutions, helping customers use deep learning to provide business value. In her spare time she likes to run and listen to music.


