ROBEL: Robotics Benchmarks for Learning with Low-Cost Robots

Learning-based methods for solving robotic control problems have recently seen significant momentum, driven by the widening availability of simulated benchmarks (like dm_control or OpenAI-Gym) and advancements in flexible and scalable reinforcement learning techniques (DDPG, QT-Opt, or Soft Actor-Critic). While learning in simulation is effective, policies trained in these simulated environments are often difficult to deploy on real-world robots due to factors such as inaccurate modeling of physical phenomena and system delays. This motivates the need to develop robotic control solutions directly in the real world, on real physical hardware.

The majority of current robotics research on physical hardware is conducted on high-cost, industrial-quality robots (PR2, Kuka-arms, ShadowHand, Baxter, etc.) intended for precise, monitored operation in controlled environments. Furthermore, these robots are designed around traditional control methods that focus on precision, repeatability, and ease of characterization. This stands in sharp contrast with the learning-based methods that are robust to imperfect sensing and actuation, and demand (a) a high degree of resilience to allow real-world trial-and-error learning, (b) low cost and ease of maintenance to enable scalability through replication and (c) a reliable reset mechanism to alleviate strict human monitoring requirements.

In “ROBEL: Robotics Benchmarks for Learning with Low-Cost Robots”, to be presented at CoRL 2019, we introduce an open-source platform of cost-effective robots and curated benchmarks designed primarily to facilitate research and development on physical hardware in the real world. Analogous to an optical table in the field of optics, ROBEL serves as a rapid experimentation platform, supporting a wide range of experimental needs and the development of new reinforcement learning and control methods. ROBEL consists of D’Claw, a three-fingered hand robot that facilitates learning of dexterous manipulation tasks and D’Kitty, a four-legged robot that enables the learning of agile legged locomotion tasks. The robotic platforms are low-cost, modular, easy to maintain, and are robust enough to sustain on-hardware reinforcement learning from scratch.

Left: The 12 DoF D’Kitty; Middle: The 9 DoF D’Claw; Right: A functional D’Claw setup, D’Lantern.

In order to make the robots relatively inexpensive and easy to build, we based ROBEL’s designs on off-the-shelf components and commonly-available prototyping tools (3D-printed or laser cut). Designs are easy to assemble and require only a few hours to build. Detailed part lists (with CAD details), assembly instructions, and software instructions for getting started are available here.

ROBEL Benchmarks
We devised a set of tasks suitable for each platform, D’Claw and D’Kitty, which can be used for benchmarking real-world robotic learning. ROBEL’s task definitions include both dense and sparse task objectives, and introduce hardware-safety metrics in the task definition, which, for example, indicate whether joints are exceeding “safe” operating bounds or force thresholds. ROBEL also supports a simulator for all tasks to facilitate algorithmic development and rapid prototyping. D’Claw tasks are centered around three commonly observed manipulation behaviors: Pose, Turn, and Screw.

Left: Pose — Conform to the shape of the environment. Center: Turn — Turn the object to a specified angle. Right: Screw — Continuously rotate the object. (Click images for video.)
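To make the distinction between dense and sparse objectives and a hardware-safety metric concrete, here is a minimal sketch for a Turn-style task. It is not ROBEL's actual task code: the function names, the tolerance, and the symmetric joint limit are all assumptions chosen for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class TurnStepResult:
    dense_reward: float   # shaped signal: less negative as the angle error shrinks
    sparse_reward: float  # 1.0 only when the object is within tolerance of the target
    unsafe_joints: int    # count of joints outside the assumed safe bounds


def angle_error(current: float, target: float) -> float:
    """Smallest absolute difference between two angles, in radians."""
    diff = (current - target + math.pi) % (2 * math.pi) - math.pi
    return abs(diff)


def evaluate_turn_step(object_angle, target_angle, joint_positions,
                       joint_limit=1.0, tolerance=0.1):
    """Score one step of a hypothetical Turn task.

    joint_limit and tolerance are illustrative values, not ROBEL defaults.
    """
    err = angle_error(object_angle, target_angle)
    dense = -err
    sparse = 1.0 if err <= tolerance else 0.0
    unsafe = sum(1 for q in joint_positions if abs(q) > joint_limit)
    return TurnStepResult(dense, sparse, unsafe)


result = evaluate_turn_step(0.05, 0.0, [0.2, 1.5, -0.3])
print(result)
```

The dense reward gives a learning signal at every step, while the sparse reward only reports success; the safety count is reported separately so that it can be monitored without being folded into the objective.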

D’Kitty tasks are centered around three commonly observed locomotion behaviors — Stand, Orient, and Walk.

Left: Stand — Stand upright. Center: Orient — Align heading with the target. Right: Walk — Move to the target. (Click images for video.)

We evaluated several classes (on-policy, off-policy, demo-accelerated, supervised) of deep reinforcement learning methods on each of these benchmark tasks. The evaluation results and the final policies are included as baselines in the software package for comparison. Full task details and baseline performances are available in the technical report.

Reproducibility & Robustness
ROBEL platforms are robust enough to sustain direct hardware training, and have clocked over 14,000 hours of real-world experience to date. The platforms have matured significantly over the year. Owing to the modularity of the design, repairs are trivial and require minimal to no domain expertise, making the overall system easy to maintain.

To establish the replicability of the platforms and reproducibility of the benchmarks, ROBEL was studied in isolation by two different research labs. Only the software distribution and documentation were used in this study. No in-person visits were allowed. Using ROBEL’s design files and assembly instructions, both sites were able to replicate both hardware platforms. Benchmark tasks were trained on robots built at both sites. In the figure below, we see that two D’Claw robots built at two different sites not only exhibit similar training progress but also converge to the same final performance, establishing reproducibility of the ROBEL benchmarks.

SAC training performance of a task on two real D’Claw robots developed at different laboratory locations.

Results Gallery
ROBEL has been useful in a variety of reinforcement learning studies so far. Below we highlight a few of the key results, and you can find all our results in this comprehensive gallery. D’Claw platforms are completely autonomous and can sustain reliable experimentation for an extended period of time, and have facilitated experimentation with a wide variety of reinforcement learning paradigms and tasks using both rigid and flexible objects.

Left: Flexible Objects — On-hardware training with DAPG effectively learns to turn flexible objects. We observe manipulation targeting the center of the valve where there is more rigidity. D’Claw is robust to on-hardware training, facilitating successful outcomes on hard to simulate tasks. Center: Disturbance Rejection — A Sim2Real policy trained via Natural Policy Gradient on MuJoCo simulation with object perturbations (amongst others) being tested on hardware. We observe fingers working together to resist external disturbances. Right: Obstructed Finger — A Sim2Real policy trained via Natural Policy Gradient on MuJoCo simulation with external perturbations (amongst others) being tested on hardware. We observe that free fingers fill in for the missing finger.

Importantly, D’Claw platforms are modular and easy to replicate, which facilitates scalable experimentation. With our scaled setup, we find that multiple D’Claws can collectively learn tasks faster by sharing experience.

On-hardware training with a distributed version of SAC learning to turn multiple objects to arbitrary angles in conjunction by sharing experience. Five tasks need only twice the amount of experience of a single task, thanks to the multi-task formulation. In the video we observe five D’Claws turning different objects to 180 degrees (picked for visual effectiveness; the actual policy can turn to any angle).
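One way to picture experience sharing across several robots is a single replay buffer that every robot writes to and a learner samples from. The sketch below illustrates that general idea only; it is not the distributed SAC implementation used in these experiments, and the class name, the transition layout, and the task labels are hypothetical.

```python
import random
from collections import deque


class SharedReplayBuffer:
    """A single buffer that several robots append to (illustrative only)."""

    def __init__(self, capacity=10000, seed=0):
        # Oldest transitions are dropped once capacity is reached.
        self._storage = deque(maxlen=capacity)
        self._rng = random.Random(seed)

    def add(self, robot_id, task_id, transition):
        # Tag each transition with its source robot and task so a
        # multi-task learner can condition on the task.
        self._storage.append((robot_id, task_id, transition))

    def sample(self, batch_size):
        # Uniform sampling without replacement across all robots' experience.
        k = min(batch_size, len(self._storage))
        return self._rng.sample(list(self._storage), k)

    def __len__(self):
        return len(self._storage)


buffer = SharedReplayBuffer()
for robot_id in range(5):          # five robots, as in the video
    for step in range(3):
        # (observation, action, reward, next_observation) placeholders
        buffer.add(robot_id, f"turn_object_{robot_id}", (step, 0.0, 0.0, step + 1))

batch = buffer.sample(4)
print(len(buffer), len(batch))
```

Because every sampled batch can mix transitions from all robots, each update can draw on experience that any one robot would have needed much longer to collect alone.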

We have also been successful in deploying robust locomotion policies on the D’Kitty platform. Below we show a blind D’Kitty walking over indoor and outdoor terrains, exhibiting the robustness of its gait in the presence of unseen disturbances.

Left: Indoor – Walking in Clutter — A Sim2Real policy trained via Natural Policy Gradient on MuJoCo simulation with randomized perturbations learns to walk in clutter and step over objects. Center: Outdoor – Gravel and Branches — A Sim2Real policy trained via Natural Policy Gradient on MuJoCo simulation with randomized height field learns to walk outdoors over gravel and branches. Right: Outdoor – Slope and Grass — A Sim2Real policy trained via Natural Policy Gradient on MuJoCo simulation with randomized height field learns to handle moderate slopes.

When presented with information about its torso and objects present in the scene, D’Kitty can learn to interact with these objects exhibiting complex behaviors.

Left: Avoid Moving Obstacles — Policy trained via Hierarchical Sim2Real learns to avoid a moving block and reach the target (marked by the controller on the floor). Center: Push to Moving Goal — Policy trained via Hierarchical Sim2Real learns to push block towards a moving target (marked by the controller in the hand). Right: Co-ordinate — Policy trained via Hierarchical Sim2Real learns to coordinate two D’Kitties to push a heavy block towards a target (marked by two + signs on the floor).

In conclusion, ROBEL platforms are low cost, robust, reliable and are designed to accommodate the needs of the emerging learning-based paradigms that need scalability and resilience. We are proud to announce the release of ROBEL to the open source community and are excited to learn about the diversity of research and experimentation they will enable. For getting started on ROBEL platforms and ROBEL benchmarks refer to roboticsbenchmarks.org.

Acknowledgments
Google’s ROBEL D’Claw evolved from earlier designs Vikash Kumar developed at the Universities of Washington and Berkeley. Multiple people across organizations have contributed towards the ROBEL projects. We thank our co-authors Henry Zhu (UC Berkeley), Kristian Hartikainen (UC Berkeley), Abhishek Gupta (UC Berkeley) and Sergey Levine (Google and UC Berkeley) for their contributions and extensive feedback throughout the project. We would like to acknowledge Matt Neiss (Google) and Chad Richards (Google) for their significant contribution to the platform designs. We would also like to thank Aravind Rajeshwaran (U-Washington), Emo Todorov (U-Washington), and Vincent Vanhoucke (Google) for their helpful discussions and comments throughout the project.

Top Experts from Government, Industry Join to Take On Critical AI Issues at GTC DC

Influential leaders and industry experts will give an inside look at AI policy matters at GTC DC, the largest AI conference in Washington, from Nov. 4-6.

Key topics to be focused on include the national AI strategy, cybersecurity, healthcare, workforce training and diversity.

Can’t-miss AI policy panels taking place at GTC DC include:

AI in America

U.S. CTO Michael Kratsios will kick off a series of panels on AI policy with a keynote addressing how the federal government is supporting American leadership in AI.

Kratsios headed the development of the executive order on AI and leads the White House’s Select Committee on Artificial Intelligence. He’ll share updates from the administration on how the order is being implemented.

The next panel will focus on national AI strategy. Experts involved with the executive order will delve into the details of how it’s being applied, and how private citizens can bring AI to their businesses.

The panel, moderated by David Luebke, vice president of research at NVIDIA, will share firsthand knowledge of the state of federal AI adoption and the investments being made in R&D, and discuss policies that are accelerating the implementation of AI in businesses and government agencies.

Panelists include:

  • Jason Matheny, founding director at Georgetown’s Center for Security and Emerging Technology and Commissioner in the National Security AI Commission
  • Lynne Parker, assistant director for AI at the White House Office of Science and Technology
  • Elham Tabassi, chief of staff of the IT Lab at the National Institute of Standards and Technology
  • Robert Atkinson, president at the Information Technology and Innovation Foundation

Hindering the Hackers: AI and Cybersecurity

As technology improves, so do cyberattacks and massive data breaches. But cybersecurity experts will take part in a panel on how AI can help.

Moderated by Iain Cunningham, vice president of intellectual property and cybersecurity at NVIDIA, the panel features leaders in data security who will pinpoint how AI can prevent cyberattacks and how AI policy can safeguard data.

Panelists include:

  • Moira Bergin, subcommittee director, cybersecurity, infrastructure protection for the House Committee on Homeland Security
  • Coleman Mehta, senior director of U.S. policy at Palo Alto Networks
  • Daniel Kroese, associate director of the national risk management center at the Cybersecurity and Infrastructure Security Agency
  • Joshua Patterson, general manager of data science at NVIDIA

The Future Is AI

Healthcare experts will discuss how AI is changing the industry to provide better service and patient outcomes in a panel moderated by Kimberly Powell, vice president of healthcare at NVIDIA.

They’ll share examples of how they’ve built programs for AI in healthcare and present strategies for using AI to accelerate the improvement of healthcare quality, cost and access.

Panelists include:

  • Gil Alterovitz, director of AI at the U.S. Department of Veterans Affairs
  • Susan Gregurick, director of the biophysics, biomedical technology, and computational biosciences division at the National Institutes of Health
  • Jorge Cardoso, CTO at the London Medical Imaging and AI Centre

AI is also changing the future of the workforce, which business leaders will discuss in a panel moderated by Tonie Hansen, who heads corporate social responsibility at NVIDIA.

Panelists will focus on how sensible policies can help create opportunities for current and future generations of workers. They’ll share tangible advice on reskilling and upskilling employees into data science and IT roles, and preparing computer scientists for AI and machine learning, concentrating on how to do so across socioeconomic, racial and ethnic groups for a more diverse workforce.

Panelists include:

  • Laura Montoya, founder and managing partner at Accel AI
  • Charles Eaton, executive vice president of social innovation at CompTIA
  • Rhonda Foxx, former chief of staff for U.S. Representative Alma Adams of North Carolina

View descriptions of these AI policy panels in more detail on the GTC DC website and register for the conference. Media may request a complimentary pass here.

The post Top Experts from Government, Industry Join to Take On Critical AI Issues at GTC DC appeared first on The Official NVIDIA Blog.

Bird’s-AI View: How Deep Learning Helps Ornithologists Track Migration Patterns

Billions of birds in North America make the trek south each fall, migrating in pursuit of warmer winter temperatures. But at least a quarter of them don’t make it back to northern breeding grounds in the spring, falling victim to predators, weather or man-made hazards like oil pits and cell towers.

Many of these migratory birds fly under the cover of night, making it challenging for birdwatchers and ornithologists to observe them and track long-term trends. But the need to monitor avian population levels is critical.

Recent research estimates that the number of birds in North America has fallen by 3 billion in the past 50 years, impacted by climate change, habitat loss, hunting and pesticides. Spring migration has declined by 14 percent in the last decade.

To better understand how and why bird populations are changing over time, researchers at the University of Massachusetts, Amherst are using AI to analyze more than two decades of data from the national weather radar network. These insights can also improve forecasts of future bird migration and aid conservation efforts.

Two Birds with One Dataset 

A network of more than 100 weather radars has been online in the U.S. since the mid-’90s, scanning the atmosphere day and night, adding new measurements roughly every 10 minutes to a public data archive in the cloud.

While the radar network’s original purpose was to inform meteorologists, the instruments also capture flocks of birds (and even patches of insects) in flight, creating a vast trove of data for ornithologists.

Traditional methods for avian monitoring include observing and counting birds in the wild, weighing and measuring them, or tagging them with identification numbers or GPS trackers.

Radar, on the other hand, provides a detailed view of migration trends on a continental scale — giving ornithologists a way to track bird populations as they migrate thousands of miles year after year. But it’s hard to separate the signal from the noise.

When a radar image captures a flock of birds migrating across the skies, an untrained viewer may confuse the pattern for rain or snow. While both humans and AI can learn to tell the difference between birds and precipitation in radar images, using deep learning methods accelerates the process of analyzing an ever-growing dataset of more than 200 million images.

Flocking to AI 

Led by Daniel Sheldon, an associate professor of computer science, researchers at UMass Amherst used transfer learning and a dataset of 200,000 radar images from the National Weather Service to develop a neural network that could differentiate between migrating birds and precipitation.

Ph.D. student Tsung-Yu Lin (lead author on the paper) and assistant professor Subhransu Maji developed the model with support from the Cornell Lab of Ornithology.

The team used a cluster of four NVIDIA GPUs to train the deep learning model, which provides an estimate of how much biomass is present in a given radar image. From that figure, ornithologists can approximate the number of birds migrating. Named MistNet, the tool correctly identifies at least 96 percent of the birds within a test set of radar images, the researchers found.

MistNet can be run on every radar image in the public archive to summarize how much migration is occurring at different elevation levels, the direction of the birds and how fast they’re flying. Additional data sources like observations from birdwatchers or the geographic coordinates of the radar image can be used to determine which species of bird corresponds to a radar data trail.

Insights on the Horizon

The researchers have so far analyzed around 28 million scans and found that a large proportion of migration happens in a very concentrated time span. Just one night accounted for 10 percent of migration over Houston last spring.

Looking at these migration spikes over the two decades of available data could help scientists track how bird migration patterns are changing in response to climate change. The team discovered that as food becomes available earlier in the spring, bird migration dates are shifting earlier, particularly for flocks that settle in breeding grounds further north.

Since radar data is updated every few minutes, this work also can be used to project bird migration in the near term. Sheldon works with BirdCast, a collaboration among the Cornell Lab of Ornithology, UMass Amherst and Oregon State University that uses radar data to provide a real-time bird migration map, as well as three-day forecasts.

“These forecasts are exciting because they allow bird watchers to look out and see what’s going to happen, and get excited about big migration events,” he said. “But it also has significant uses in conservation.”

For example, to help birds as they fly through the night, cities could turn off distracting light sources when major migrations are forecast. Artificial lights from skyscrapers or radio towers can distract and disorient migrating birds, impairing their navigation strategies.

Main image by Frank Boston, licensed from Flickr by CC BY 2.0

The post Bird’s-AI View: How Deep Learning Helps Ornithologists Track Migration Patterns appeared first on The Official NVIDIA Blog.

Answering the Call: NVIDIA CEO to Detail How AI Will Revolutionize 5G, IoT

Highlighting the growing excitement at the intersection of AI, 5G and IoT, NVIDIA CEO Jensen Huang kicks off the Mobile World Congress Los Angeles 2019 Monday, Oct. 21.

The keynote, NVIDIA’s debut at the wireless industry’s highest-profile gathering in the U.S., will be the first of a slate of talks and training sessions from NVIDIA and its partners.

The AI revolution is spurring a wave of progress across the mobile technology industry that’s unleashing unprecedented capabilities and new opportunities.

NVIDIA is at the center of this, thanks to AI and accelerated computing capabilities that have been adopted by industries across the globe.

Jensen Huang to Deliver Agenda-Setting Keynote

Huang will detail how the latest AI and accelerated computing innovations will transform the wireless industry in a keynote that’s open to all on Monday, Oct. 21, at the Los Angeles Convention Center’s Petree Hall.

If you’re not registered for MWC-LA, RSVP for our keynote.

Get Trained with DLI

Our Deep Learning Institute — one of the largest training programs in the world for AI and accelerated computing — has partnered with the show’s sponsor, the GSMA.

Together, we’re offering hands-on training to the show’s attendees in the South Hall, booth 1743.

The training is on a first-come, first-served basis. No need to sign up in advance.

Get Inspired at NVIDIA Booth 1745

If you’re attending the event, our booth will serve as a hub for the innovations we’re bringing to the show.

At the booth, you’ll find NVIDIA Inception partners using our Metropolis platform to showcase a variety of real-world applications that demand GPUs at the edge.

Get Oriented at the NVIDIA Theater

Want to dig into the nitty-gritty of delivering services such as these? Stop by the NVIDIA Theater to hear speakers from NVIDIA, our partners and our customers.

Among the highlights, Saurabh Jain, director of products and strategic partnerships at NVIDIA, will detail how edge computing brings compute and storage closer to the point of action.

That’s critical for smart cities, and it’s opening up new business and service revenue opportunities for the telecom industry.

Visit NVIDIA booth 1745 at 1:30 pm on Oct. 23 to hear his talk, and stick around for others from key industry leaders.

The post Answering the Call: NVIDIA CEO to Detail How AI Will Revolutionize 5G, IoT appeared first on The Official NVIDIA Blog.

AI Space Odyssey: Deep Learning Aids Astronomers Study Galaxies

The Milky Way is on a collision course with the neighboring Andromeda galaxy. But no need to revise your will — the two star systems won’t meet for around 4 billion years.

“At some point in every galaxy’s life, it’ll undergo one of these mergers,” said William Pearson, Ph.D. student at the Netherlands Institute for Space Research and the University of Groningen, Netherlands. “It’s part of our understanding of how we think the universe works. These galaxies tend to find and crash into each other.”

Using convolutional neural networks developed on NVIDIA GPUs, Pearson is studying galaxy mergers based on both simulations and observational data from telescope images.

When two galaxies merge, the resulting fused galaxy mixes together all the gas, dust and other matter from the original star systems. Astronomers are interested in how the shape of galaxies changes as a result, how the process can cause stars to form at a higher rate, and how the moving matter interacts with the supermassive black holes lying at the center of large galaxies.

By using AI to identify and analyze galaxy mergers across the universe, scientists can better understand how this phenomenon could affect our corner of the universe in the future.

Hubble Up: Analyzing Galaxy Mergers with AI 

For the most part, it’s not rocket science to visually determine whether two galaxies are in the thick of a collision.

This image, taken by the Hubble Space Telescope, shows a collision between two spiral galaxies in the constellation of Hercules, around 450 million light-years from Earth. Image credit: NASA, ESA, the Hubble Heritage Team (STScI/AURA)-ESA/Hubble Collaboration and K. Noll (STScI). Licensed under CC BY 4.0.

Just looking at a telescope image, it’s easy to spot tidal tails, sweeping arcs of gas and dust being pulled from one galaxy to another by gravity.

The main challenge is classifying galaxies that are just starting to interact, or, on the other end of the spectrum, at the very final stages of a merge.

And then there’s the sheer volume of data.

Crowdsourced projects like Galaxy Zoo have relied on citizen scientists to classify a database of more than a million galaxy images from various ground-based and satellite telescopes. But that’s just a fraction of an estimated 100 billion galaxies in the universe.

And the available data is just getting larger. Projects like the under-construction Large Synoptic Survey Telescope are expected to capture images of billions of galaxies.

“There’s not enough people in the world to classify all these,” Pearson said. “As astronomers, we need another technique.”

While citizen scientist projects are a powerful tool, it still takes a long time for results to come through, he says. Deep learning models can help researchers keep pace with the many ground- and space-based telescopes busy collecting images of the universe, most of which are publicly available for analysis.

Using an NVIDIA GPU for inference, Pearson’s AI was able to categorize 300,000 galaxies in about 15 minutes. Even at an unheard-of rate of one classification per second, it would have taken an individual two working weeks to accomplish the task.
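The comparison above can be checked with a few lines of arithmetic. The 40-hour working week is an assumption made here for the calculation; the other figures come from the paragraph above.

```python
GALAXIES = 300_000
GPU_SECONDS = 15 * 60          # "about 15 minutes"
HUMAN_RATE_PER_SECOND = 1      # one classification per second
WORK_WEEK_HOURS = 40           # assumption: a 40-hour working week

gpu_rate = GALAXIES / GPU_SECONDS                      # galaxies per second on the GPU
human_hours = GALAXIES / HUMAN_RATE_PER_SECOND / 3600  # hours of nonstop human effort
human_weeks = human_hours / WORK_WEEK_HOURS

print(f"GPU: about {gpu_rate:.0f} galaxies per second")
print(f"Human: about {human_hours:.1f} hours, or {human_weeks:.1f} working weeks")
```

At one classification per second, a person would need roughly 83 hours, which is just over two 40-hour weeks, while the GPU handles roughly 333 galaxies per second.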

Trained using the TensorFlow deep learning framework and images from the Sloan Digital Sky Survey, the deep learning model identifies galaxies as merging or not merging with 92 percent accuracy. Pearson hopes for future versions of the CNN to look at more specific details, such as the size of the galaxies and how far along the merging process is.

From this data, researchers can make statistical assessments of broad trends in galaxy mergers — or take a closer look at specific galaxies of interest.

Main image shows two merging galaxies, nicknamed “The Mice,” located 300 million light-years away. Image credit: NASA, Holland Ford (JHU), the ACS Science Team and ESA. Licensed under CC BY 4.0.

The post AI Space Odyssey: Deep Learning Aids Astronomers Study Galaxies appeared first on The Official NVIDIA Blog.

Managing conversation flow with a fallback intent on Amazon Lex

Ever been stumped by a question? Imagine you’re in a business review going over weekly numbers and someone asks, “What about expenses?” Your response might be, “I don’t know. I wasn’t prepared to have that discussion right now.”

Bots aren’t fortunate enough to have the same comprehension capabilities, so how should they respond when they don’t have an answer? How can a bot recover when it doesn’t have the response? Asking you to repeat yourself could be quite frustrating if the bot still doesn’t understand. Perhaps it can pretend to understand what you said based on the last exchange? That might not always work and could also sound foolish. Maybe the bot can admit its limitations and tell you what it can do? That would be acceptable the first few times but can be suboptimal in the long run.

There is no single correct way. Conversation repair strategies vary by the kind of experience you’re trying to create. You can use error handling prompts. The bot would try to clarify by prompting “Sorry, can you please say that again?” a few times before hanging up with a message such as, “I am not able to assist you at this time.”  Building on the sample conversation above, let us first build a simple chatbot to answer questions related to revenue numbers. This bot answers questions such as “What’s the revenue in Q1?”, “What were our sales in western region?” The Lex bot contains only two intents: RegionDetails and QuarterDetails. With this bot definition, if someone were to discuss expenses (“How much did we spend last quarter?”), the bot would go through the clarification prompts and eventually hang up. You couldn’t intervene or execute business logic. The conversation would resemble the following:

Starting today, you can add fallback intent to help your bot recover gracefully in such situations. With a fallback intent, you can now control the bot’s recovery by providing additional information, managing dialog, or executing business logic. You can control the conversation better and manage the flow for an ideal outcome, such as the following:

Configuring the fallback intent

You can configure your fallback intent by completing the following steps.

  1. From the Amazon Lex console, choose Create intent.
  2. Search for AMAZON.Fallback in the existing intents.

See the following screenshot of the BusinessMetricsFallback page:

If you have any clarification prompts, the fallback intent is triggered only after the clarification prompts are executed. We recommend disabling the clarification prompts. Hang-up phrases are not used when a fallback intent is configured. See the following screenshot of the Error handling page:

  3. Add an intent ContactDetails to collect the email ID.

This is a simple intent with just the email address as a slot type. Please review the bot definition for intent details.

  4. Add an AWS Lambda function in the fulfillment code hook of the fallback intent.

This function performs two operations. First, it creates a task (for example, a ticket entry in a database) to record your request for an operator follow-up. Second, it switches the intent to elicit additional information, such as your email ID, so that a response goes out after an operator has processed the query. Please review the Lambda definition for code details.

With the preceding bot definition, you can now control the conversation. When you ask “How much did we spend last quarter,” the input does not match any of the configured intents, and triggers the fallback intent. The fulfillment code hook of the Lambda creates the ticket and switches the intent to ContactDetails to capture the email ID.
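The fulfillment hook described above might look roughly like the following sketch. It assumes the Amazon Lex (V1) Lambda event and response format, in which a response with a `dialogAction` of type `ElicitSlot` hands the conversation to another intent. The slot name `EmailAddress`, the `create_ticket` helper, and the in-memory `TICKETS` store are hypothetical stand-ins; the actual bot and Lambda definitions referenced in the post may differ.

```python
import uuid

# Hypothetical stand-in for a real ticket store (for example, a database table).
TICKETS = {}


def create_ticket(user_id, utterance):
    """Record the unanswered request so an operator can follow up."""
    ticket_id = str(uuid.uuid4())
    TICKETS[ticket_id] = {"userId": user_id, "utterance": utterance}
    return ticket_id


def lambda_handler(event, context):
    # Carry over any existing session attributes and remember the ticket.
    session_attributes = dict(event.get("sessionAttributes") or {})
    ticket_id = create_ticket(event.get("userId"), event.get("inputTranscript"))
    session_attributes["ticketId"] = ticket_id

    # Switch to the ContactDetails intent and ask for the email address.
    # "EmailAddress" is an assumed slot name for this sketch.
    return {
        "sessionAttributes": session_attributes,
        "dialogAction": {
            "type": "ElicitSlot",
            "intentName": "ContactDetails",
            "slots": {"EmailAddress": None},
            "slotToElicit": "EmailAddress",
            "message": {
                "contentType": "PlainText",
                "content": (
                    "I can't answer that yet, so I've logged a request for an "
                    "operator. What email address should we reply to?"
                ),
            },
        },
    }
```

Invoked with an event whose `inputTranscript` is "How much did we spend last quarter", the handler stores a ticket and returns a response that prompts for the email address under the ContactDetails intent.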

Summary

This post demonstrated how to have better control of the conversation flow with a fallback intent. You can switch intents, execute business logic, or provide custom responses. For more information about incorporating these techniques into real bots, see the Amazon Lex documentation.

About the Author

Kartik Rustagi works as a Software Development Manager in Amazon AI. He and his team focus on enhancing the conversation capability of chat bots powered by Amazon Lex. When not at work, he enjoys exploring the outdoors and savoring different cuisines.

Improving Quantum Computation with Classical Machine Learning

One of the primary challenges for the realization of near-term quantum computers has to do with their most basic constituent: the qubit. Qubits can interact with anything in close proximity that carries energy close to their own—stray photons (i.e., unwanted electromagnetic fields), phonons (mechanical oscillations of the quantum device), or quantum defects (irregularities in the substrate of the chip formed during manufacturing)—which can unpredictably change the state of the qubits themselves.

Further complicating matters, there are numerous challenges posed by the tools used to control qubits. Manipulating and reading out qubits is performed via classical controls: analog signals in the form of electromagnetic fields coupled to a physical substrate in which the qubit is embedded, e.g., superconducting circuits. Imperfections in these control electronics (giving rise to white noise), interference from external sources of radiation, and fluctuations in digital-to-analog converters, introduce even more stochastic errors that degrade the performance of quantum circuits. These practical issues impact the fidelity of the computation and thus limit the applications of near-term quantum devices.

To improve the computational capacity of quantum computers, and to pave the road towards large-scale quantum computation, it is necessary to first build physical models that accurately describe these experimental problems.

In “Universal Quantum Control through Deep Reinforcement Learning”, published in Nature Partner Journal (npj) Quantum Information, we present a new quantum control framework generated using deep reinforcement learning, where various practical concerns in quantum control optimization can be encapsulated by a single control cost function. Our framework provides a reduction in the average quantum logic gate error of up to two orders of magnitude over standard stochastic gradient descent solutions and a significant decrease in gate time relative to optimal gate synthesis counterparts. Our results open an avenue for wider applications in quantum simulation, quantum chemistry and quantum supremacy tests using near-term quantum devices.

The novelty of this new quantum control paradigm hinges upon the development of a quantum control cost function and an efficient optimization method based on deep reinforcement learning. To develop a comprehensive cost function, we first need to develop a physical model for the realistic quantum control process, one where we are able to reliably predict the amount of error. One of the most detrimental errors to the accuracy of quantum computation is leakage: the amount of quantum information lost during the computation. Such information leakage usually occurs when the quantum state of a qubit gets excited to a higher energy state, or decays to a lower energy state through spontaneous emission. Leakage errors not only lose useful quantum information, they also degrade the “quantumness” and eventually reduce the performance of a quantum computer to that of a classical one.
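As a rough illustration of leakage, the sketch below treats a qubit as the two lowest levels of a three-level system and measures how much population sits outside that computational subspace. The function name `leakage`, the three-level model, and the example amplitudes are assumptions made for illustration; they are not taken from the paper.

```python
import numpy as np

def leakage(state, computational_dim=2):
    """Population of a (normalized) state outside the computational subspace.

    Hypothetical helper: the first `computational_dim` amplitudes are taken
    to be the qubit levels, and any remaining levels count as leakage.
    """
    state = np.asarray(state, dtype=complex)
    state = state / np.linalg.norm(state)  # enforce unit norm
    in_subspace = np.sum(np.abs(state[:computational_dim]) ** 2)
    return float(1.0 - in_subspace)

# Levels |0> and |1> are computational; |2> is a higher-energy leakage level.
ideal = np.array([1.0, 1.0, 0.0]) / np.sqrt(2)                      # no leakage
leaky = np.array([np.sqrt(0.495), np.sqrt(0.495), np.sqrt(0.01)])   # 1% in |2>

print(leakage(ideal))
print(leakage(leaky))
```

In this toy picture, a control pulse that leaves 1% of the population in the third level has a leakage of about 0.01, even if the amplitudes within the qubit subspace look correct.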

A common practice to accurately evaluate the leaked information during the quantum computation is to simulate the whole computation first. However, this defeats the purpose of building large-scale quantum computers, since their advantage is that they are able to perform calculations infeasible for classical systems. With improved physical modeling, our generic cost function enables a joint optimization over the accumulated leakage errors, violations of control boundary conditions, total gate time, and gate fidelity.
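One simple way to picture such a joint optimization is a weighted sum of the four terms named above. The sketch below assumes that form; the weights, their default values, and the names `CostWeights` and `control_cost` are hypothetical, and the cost function in the paper may be constructed differently.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CostWeights:
    # Hypothetical relative weights; the source does not give values.
    infidelity: float = 1.0
    leakage: float = 1.0
    boundary: float = 1.0
    gate_time: float = 1.0

def control_cost(fidelity, accumulated_leakage, boundary_violation,
                 gate_time, weights=CostWeights()):
    """Scalar cost combining the four concerns (lower is better)."""
    return (weights.infidelity * (1.0 - fidelity)
            + weights.leakage * accumulated_leakage
            + weights.boundary * boundary_violation
            + weights.gate_time * gate_time)

# Example: a high-fidelity pulse with small leakage and no boundary violation.
cost = control_cost(fidelity=0.99, accumulated_leakage=0.001,
                    boundary_violation=0.0, gate_time=0.05)
print(cost)
```

Collapsing the terms into one scalar lets a single optimizer trade them off against each other, for example accepting a slightly longer gate in exchange for lower leakage.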

With the new quantum control cost function in hand, the next step is to apply an efficient optimization tool to minimize it. Existing optimization methods turn out to be unsatisfactory in finding high-fidelity solutions that are also robust to control fluctuations. Instead, we apply an on-policy deep reinforcement learning (RL) method, trust-region RL, since this method exhibits good performance in all benchmark problems, is inherently robust to sample noise, and has the capability to optimize hard control problems with hundreds of millions of control parameters. The salient difference between this on-policy RL and previously studied off-policy RL methods is that the control policy is represented independently from the control cost. Off-policy RL, such as Q-learning, on the other hand, uses a single neural network (NN) to represent both the control trajectory and the associated reward, where the control trajectory specifies the control signals to be coupled to qubits at different time steps, and the associated reward evaluates how good the current step of the quantum control is.

On-policy RL is well known for its ability to leverage non-local features in control trajectories, which becomes crucial when the control landscape is high-dimensional and packed with a combinatorially large number of non-global solutions, as is often the case for quantum systems.

We encode the control trajectory into a three-layer, fully connected NN (the policy NN) and the control cost function into a second NN (the value NN), which encodes the discounted future reward. Robust control solutions were obtained by reinforcement learning agents, which train both NNs under a stochastic environment that mimics a realistic noisy control actuation. We provide control solutions to a set of continuously parameterized two-qubit quantum gates that are important for quantum chemistry applications but are costly to implement using the conventional universal gate set.
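The two-network layout can be sketched with plain NumPy. This is a forward pass only, with no training loop, and everything beyond "three-layer, fully connected policy NN plus a separate value NN" is an assumption: the layer widths, the tanh activations, the Gaussian noise model, the depth of the value NN, and all function names are hypothetical.

```python
import numpy as np

def init_mlp(sizes, rng):
    """Random weights and zero biases for a fully connected network."""
    return [(rng.normal(0.0, 1.0 / np.sqrt(m), size=(m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def forward(params, x):
    """Apply each layer in turn; tanh on all but the last layer."""
    for i, (w, b) in enumerate(params):
        x = x @ w + b
        if i < len(params) - 1:
            x = np.tanh(x)
    return x

def noisy_actuation(control, rng, noise_std=0.01):
    """Hypothetical noise model: additive Gaussian noise on the controls."""
    return control + rng.normal(0.0, noise_std, size=control.shape)

def discounted_returns(rewards, gamma=0.99):
    """Discounted future reward at each step: the value NN's regression target."""
    returns = np.zeros(len(rewards))
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns

rng = np.random.default_rng(0)
STATE_DIM, HIDDEN, CONTROL_DIM = 8, 32, 3  # hypothetical sizes

policy_nn = init_mlp([STATE_DIM, HIDDEN, HIDDEN, CONTROL_DIM], rng)  # 3 weight layers
value_nn = init_mlp([STATE_DIM, HIDDEN, HIDDEN, 1], rng)             # separate network

state = rng.normal(size=STATE_DIM)
control = forward(policy_nn, state)       # control signals proposed by the policy
applied = noisy_actuation(control, rng)   # what the noisy hardware would receive
value = forward(value_nn, state)[0]       # estimate of discounted future reward

print(control, applied, value)
print(discounted_returns([1.0, 1.0, 1.0], gamma=0.5))
```

Keeping the policy and value networks separate mirrors the on-policy setup described above, where the control policy is represented independently from the control cost; injecting noise between the policy output and the applied control is one way an agent can be pushed toward solutions that tolerate actuation errors.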

Under this new framework, our numerical simulations show a 100x reduction in quantum gate errors and, for a family of continuously parameterized simulation gates, gate times reduced by an average of one order of magnitude relative to traditional approaches using a universal gate set.

This work highlights the importance of using novel machine learning techniques and near-term quantum algorithms that leverage the flexibility and additional computational capacity of a universal quantum control scheme. More experiments are needed to integrate machine learning techniques, such as the one developed in this work, into practical quantum computation procedures, so that the computational capacity of quantum devices can be fully improved through machine learning.