
Pod Squad: Descript Uses AI to Make Managing Podcasts Quicker, Easier

You can’t have an AI podcast and not interview someone using AI to make podcasts better.

That’s why we reached out to serial entrepreneur Andrew Mason to talk to him about what he’s doing now. His company, Descript Podcast Studio, uses AI, natural language processing and automatic speech synthesis to make podcast editing easier and more collaborative.

Mason, Descript’s CEO and perhaps best known as Groupon’s founder, spoke with AI Podcast host Noah Kravitz about his company and the newest beta service it offers, called Overdub.

 

Key Points From This Episode

  • Descript works like a collaborative word processor. Users record audio, which Descript converts to text. They can then edit and rearrange text, and the program will change the audio.
  • Overdub, created in collaboration with Descript’s AI research division, eliminates the need to re-record audio. Type in new text, and Overdub creates audio in the user’s voice.
  • Descript 3.0 launched in November, adding new features such as a detector that can identify and remove vocalized pauses like “um” and “uh” as well as silence.

Tweetables

“We’re trying to use AI to automate the technical heavy lifting components of learning to use editors — as opposed to automating the craft — and we leave space for the user to display and refine their craft” — Andrew Mason [07:10]

“What’s really unique to us is a kind of tonal or prosodic connecting of the dots, where we’ll analyze the audio before and after whatever you’re splicing in with Overdub, and make sure that it sounds continuous in a natural transition” — Andrew Mason [10:30]

You Might Also Like

The Next Hans Zimmer? How AI May Create Music for Video Games, Exercise Routines

Imagine Wolfgang Amadeus Mozart as an algorithm or the next Hans Zimmer as a computer. Pierre Barreau and his startup, Aiva Technologies, are using deep learning to compose music. Their algorithm can create a theme in four minutes flat.

How Deep Learning Can Translate American Sign Language

Rochester Institute of Technology computer engineering major Syed Ahmed, a research assistant at the National Technical Institute for the Deaf, uses AI to translate between American sign language and English. Ahmed trained his algorithm on 1,700 sign language videos.

Tune in to the AI Podcast

Get the AI Podcast through iTunes, Google Podcasts, Google Play, Castbox, DoggCatcher, Overcast, PlayerFM, Pocket Casts, Podbay, PodBean, PodCruncher, Soundcloud, Spotify, Stitcher and TuneIn.

  

Make Our Podcast Better

Have a few minutes to spare? Fill out this short listener survey. Your answers will help us make a better podcast.

The post Pod Squad: Descript Uses AI to Make Managing Podcasts Quicker, Easier appeared first on The Official NVIDIA Blog.

AWS announces the Machine Learning Embark program to help customers train their workforce in machine learning

Today at AWS re:Invent 2019, I’m excited to announce the AWS Machine Learning (ML) Embark program to help companies transform their development teams into machine learning practitioners. AWS ML Embark is based on Amazon’s own experience scaling the use of machine learning inside its own operations as well as the lessons learned through thousands of successful customer implementations. Elements of the program include guided instruction from AWS machine learning experts, a discovery workshop, hand-selected curriculum from the Machine Learning University, an AWS DeepRacer event, and co-development of a machine learning proof of concept at the culmination of the program.

Customers I talk to are eager to get started implementing machine learning in their organizations, but it can be difficult to know where to begin. And, once started, it can be challenging to gain meaningful adoption across the organization. More often, customers are not asking “why” machine learning, but “how.” It’s a cultural shift as much as a technical one. Success involves inspiring and motivating teams to get interested in machine learning, identifying the most impactful projects to tackle, and developing a workforce with the right skills. And, teams new to machine learning need guidance and expertise from more seasoned data scientists who are in short supply. As a result, organizations can often feel like turning the corner on machine learning adoption happens at a glacial pace.

The AWS ML Embark program is designed to help these customers overcome some common challenges in the machine learning journey. To kick off the program, participants will pair their business and technical staff with AWS machine learning experts to join a discovery day workshop to identify a business problem well suited for machine learning. Through this exercise, AWS machine learning experts will help the group work backwards from a problem and align on where machine learning can have meaningful impact.

Next, this cross-functional group will participate in instructor-led, on-site trainings with curriculum modeled after Amazon’s Machine Learning University, which has been refined over the last several years to help Amazon’s own developers become proficient in machine learning. Participants will benefit from hand-selected coursework focused on practical application relevant to their business use cases. At the completion of the training, the AWS ML Embark program offers the option to continue education online and take the AWS Certified Machine Learning – Specialty certification exam to validate their skills.

AWS ML Embark also includes a corporate AWS DeepRacer event to expose a broader group of employees to machine learning with friendly competition and hands-on experience through racing fully autonomous 1/18th scale race cars using reinforcement learning.

Finally, experts from the Amazon ML Solutions Lab mentor participants through the ideation, development, and launch of a proof of concept based on a use case identified in the discovery day workshop. Through the process, the team will gain insight into best practices, ways to avoid costly mistakes, and knowledge based on the overall experience of working with experts who have completed hundreds of machine learning implementations.

At the conclusion of the program, a customer is well prepared to begin scaling newly obtained machine learning capabilities throughout their organization to take on additional machine learning projects and solve new challenges across their business. We’re excited to help customers begin their machine learning journey and can’t wait to see what they’ll do after graduation. Nominations for the program are now being accepted.

About the Author

Michelle Lee is vice president of the Machine Learning Solutions Lab at AWS.

AWS Outposts Station a GPU Garrison in Your Data Center

All the goodness of GPU acceleration on Amazon Web Services can now also run inside your own data center.

AWS Outposts powered by NVIDIA T4 Tensor Core GPUs are generally available starting today. They bring cloud-based Amazon EC2 G4 instances inside your data center to meet user requirements for security and latency in a wide variety of AI and graphics applications.

With this new offering, AI is no longer a research project.

Most companies still keep their data inside their own walls because they see it as their core intellectual property. But for deep learning to transition from research into production, enterprises need the flexibility and ease of development the cloud offers — right beside their data. That’s a big part of what AWS Outposts with T4 GPUs now enables.

With this new offering, enterprises can install a fully managed rack-scale appliance next to the large data lakes stored securely in their data centers.

AI Acceleration Across the Enterprise

To train neural networks, every layer of software needs to be optimized, from NVIDIA drivers to container runtimes and application frameworks. AWS services like Amazon SageMaker, Elastic MapReduce and many others, built on custom Amazon Machine Images, let model development start with training on large datasets. With the introduction of NVIDIA-powered AWS Outposts, those services can now be run securely in enterprise data centers.

The GPUs in Outposts accelerate deep learning as well as high performance computing and other GPU applications. They all can access software in NGC, NVIDIA’s hub for GPU-optimized software, which is stocked with applications, frameworks, libraries and SDKs that include pre-trained models.

For AI inference, the NVIDIA EGX edge-computing platform also runs on AWS Outposts and works with the AWS Elastic Kubernetes Service. Backed by the power of NVIDIA T4 GPUs, these services are capable of processing orders of magnitude more information than CPUs alone. They can quickly derive insights from vast amounts of data streamed in real time from sensors in an Internet of Things deployment, whether it’s in manufacturing, healthcare, financial services, retail or any other industry.

On top of EGX, the NVIDIA Metropolis application framework provides building blocks for vision AI, geared for use in smart cities, retail, logistics and industrial inspection, as well as other AI and IoT use cases, now easily delivered on AWS Outposts.

Alternatively, the NVIDIA Clara application framework is tuned to bring AI to healthcare providers whether it’s for medical imaging, federated learning or AI-assisted data labeling.

The T4 GPU’s Turing architecture uses TensorRT to accelerate the industry’s widest set of AI models. Its Tensor Cores support multi-precision computing that delivers up to 40x more inference performance than CPUs.
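As a rough illustration of multi-precision inference (not NVIDIA's TensorRT pipeline itself, which adds layer fusion and INT8 calibration), here is a minimal PyTorch sketch that runs a model in FP16 on a CUDA GPU such as the T4; the ResNet-50 model and batch size are arbitrary placeholders:

```python
import torch
import torchvision.models as models

# Load a pretrained image model purely as an example workload.
model = models.resnet50(pretrained=True).eval().cuda()

# A dummy batch standing in for real input images.
batch = torch.randn(8, 3, 224, 224, device="cuda")

# Run inference in half precision; on T4-class GPUs the FP16 math maps onto
# Tensor Cores. TensorRT goes further by fusing layers and calibrating INT8,
# but the multi-precision idea is the same.
with torch.no_grad(), torch.cuda.amp.autocast():
    logits = model(batch)

print(logits.shape)  # torch.Size([8, 1000])
```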

Remote Graphics, Locally Hosted

Users of high-end graphics have choices, too. Remote designers, artists and technical professionals who need to access large datasets and models can now get both cloud convenience and GPU performance.

Graphics professionals can benefit from the same NVIDIA Quadro technology that powers most of the world’s professional workstations, not only on the public AWS cloud, but now also on their own internal cloud with AWS Outposts packing T4 GPUs.

Whether they’re working locally or in the cloud, Quadro users can access the same set of hundreds of graphics-intensive, GPU-accelerated third-party applications.

The Quadro Virtual Workstation AMI, available in AWS Marketplace, includes the same Quadro driver found on physical workstations. It supports hundreds of Quadro-certified applications such as Dassault Systèmes SOLIDWORKS and CATIA; Siemens NX; Autodesk AutoCAD and Maya; ESRI ArcGIS Pro; and ANSYS Fluent, Mechanical and Discovery Live.

Learn more about AWS and NVIDIA offerings and check out our booth 1237 and session talks at AWS re:Invent.

The post AWS Outposts Station a GPU Garrison in Your Data Center appeared first on The Official NVIDIA Blog.

Amazon Web Services achieves fastest training times for BERT and Mask R-CNN

Two of the most popular machine learning models used today are BERT, for natural language processing (NLP), and Mask R-CNN, for image recognition. Over the past several months, AWS has significantly improved the underlying infrastructure, network, machine learning (ML) framework, and model code to achieve the best training time for these two popular state-of-the-art models. Today, we are excited to share the world’s fastest model training times to date on the cloud on TensorFlow, MXNet, and PyTorch. You can now use these hardware and software optimizations to train your TensorFlow, MXNet, and PyTorch models with the same speed and efficiency.

Model training time directly impacts your ability to iterate and improve on the accuracy of your models quickly. The primary way to reduce training time is by distributing the training job across a large cluster of GPU instances, but this is hard to do efficiently. If you distribute a training job across a large number of workers, you often have rapidly diminishing returns because the overhead in communication between instances begins to cancel out the additional GPU computing power.
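To make "distributing the training job across a large cluster of GPU instances" concrete, here is a minimal PyTorch DistributedDataParallel sketch, a generic illustration rather than the actual AWS training code; it assumes one process per GPU launched via torchrun or a similar launcher:

```python
import os
import torch
import torch.distributed as dist
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    # One process per GPU; rank and world size come from the launcher's env vars.
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    # Toy model standing in for BERT or Mask R-CNN.
    model = nn.Linear(1024, 1024).cuda(local_rank)
    model = DDP(model, device_ids=[local_rank])
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

    for step in range(100):
        x = torch.randn(32, 1024, device=local_rank)
        loss = model(x).pow(2).mean()
        optimizer.zero_grad()
        loss.backward()      # gradients are AllReduced across all workers here
        optimizer.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```

The communication overhead the paragraph above describes happens inside that backward pass: every iteration, every worker must exchange a full set of gradients before the optimizer can step.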

BERT

BERT, or Bidirectional Encoder Representations from Transformers, is a popular NLP model that was state-of-the-art on several common NLP tasks at the time it was published.

On a single Amazon EC2 P3dn.24xlarge instance, which has 8 NVIDIA V100 GPUs, it takes approximately three days to train BERT from scratch with TensorFlow and PyTorch. We reduced training time from three days to slightly over 60 minutes by efficiently scaling out to more P3dn.24xlarge instances, using network improvements with Elastic Fabric Adapter (EFA), and optimizing how this complex model converges on larger clusters. As of this writing, this is the fastest time-to-train for BERT on the cloud while achieving state-of-the-art target accuracy (F1 score of 90.4 or higher on SQuAD v2 tasks after training on BooksCorpus and English Wikipedia).

With TensorFlow, we achieved unprecedented scale with 2,048 GPUs on 256 P3dn.24xlarge instances to train BERT in 62 minutes. With PyTorch, we reduced training time to 69 minutes by scaling out to 1,536 GPUs on 192 P3dn.24xlarge instances. With all our optimizations to the entire hardware and software stack for training BERT, we achieved an 85% scaling efficiency, which makes sure the frameworks can use most of the additional computation power from GPUs when scaling to more P3dn.24xlarge nodes. The following table summarizes these improvements.

P3dn.24xlarge nodes | NVIDIA GPUs | Time to train (PyTorch) | Time to train (TensorFlow)
1 | 8 | 3 days | 3 days
192 | 1,536 | 69 min | –
256 | 2,048 | – | 62 min
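The 85% scaling efficiency quoted above is typically computed as the throughput achieved at scale divided by perfect linear scaling from a single node. A minimal sketch of that calculation in Python, using hypothetical throughput numbers rather than the measured figures behind the table:

```python
def scaling_efficiency(throughput_1_node, throughput_n_nodes, n_nodes):
    """Fraction of ideal linear scaling actually achieved (1.0 = perfect)."""
    return throughput_n_nodes / (throughput_1_node * n_nodes)

# Hypothetical sequences-per-second numbers, for illustration only:
# 400 seq/s on one node, 87,040 seq/s aggregated across 256 nodes.
print(scaling_efficiency(400, 87_040, 256))  # -> 0.85
```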

Mask R-CNN

Mask R-CNN is a widely used instance segmentation model that is used for autonomous driving, motion capture, and other uses that require sophisticated object detection and segmentation capabilities.

It takes approximately 80 hours to train Mask R-CNN on a single P3dn.24xlarge instance (8 NVIDIA V100 GPUs) with MXNet, PyTorch, and TensorFlow. We reduced training time from 80 hours to approximately 25 minutes on MXNet, PyTorch, and TensorFlow. We scaled Mask R-CNN training on all three ML frameworks to 24 P3dn.24xlarge instances, which gave us 192 GPUs. You can now rapidly iterate and run several experiments daily instead of waiting several days for results. As of this writing, this is the fastest time-to-train for Mask R-CNN on the cloud, while achieving state-of-the-art target accuracy (0.377 Box min AP, 0.339 Mask min AP on COCO2017 dataset). The following table summarizes these improvements.

# of nodes | # of GPUs | Time to train (MXNet) | Time to train (PyTorch) | Time to train (TensorFlow)
1 | 8 | ~80 hrs | ~80 hrs | ~80 hrs
24 | 192 | 25 min | 26 min | 27 min

Technology stack

Achieving these results required optimizations to the underlying hardware, networking, and software stack. When training large models such as BERT, communication among the many GPUs in use becomes a bottleneck.

In distributed computing (large-scale training being one instance of it), AllReduce is an operation that reduces arrays (parameters of a neural network in this case) from different workers (GPUs) and returns the resultant array to all workers (GPUs). GPUs collectively perform an AllReduce operation after every iteration. Each iteration consists of one forward and backward pass through the network.

The most common approach to perform AllReduce on GPUs is to use NVIDIA Collective Communications Library (NCCL) or MPI libraries such as OpenMPI or Intel MPI Library. These libraries are designed for homogeneous clusters. AllReduce happens on the same instances that train the network. The AllReduce algorithm on homogeneous clusters involves each worker sending and receiving data approximately twice the size of the model for each AllReduce operation. For example, the AllReduce operation for BERT (which has 340 million parameters) involves sending approximately 650 MB of half-precision data twice and receiving the same amount of data twice. This communication needs to happen after every iteration and quickly becomes a bottleneck when training most models.
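The ~650 MB figure follows directly from the parameter count and FP16 precision; a quick back-of-the-envelope check in Python:

```python
params = 340_000_000          # approximate BERT-Large parameter count
bytes_per_param = 2           # FP16 / half precision

payload_bytes = params * bytes_per_param
print(payload_bytes / 1e6)    # ~680 MB in decimal megabytes
print(payload_bytes / 2**20)  # ~648 MiB, i.e. the ~650 MB quoted above

# A conventional AllReduce on a homogeneous cluster moves roughly twice this
# amount per worker per iteration (sent once and received once).
```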

The choice of AllReduce algorithm usually depends on the network architecture. For example, Ring-AllReduce is a good choice for a network in which each node is connected to two neighbors, which forms a ring. Torus AllReduce algorithm is a good choice for a network in which each node is connected to four neighbors, which forms a 2D rectangular lattice. AWS uses a much more flexible interconnect, in which any node can communicate with any other node at full bandwidth. For example, in a cluster with 128 P3dn instances, any instance can communicate with any other instance at 100 Gbps.
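As a concrete reference for the Ring-AllReduce idea (a generic teaching sketch, not NCCL's implementation and not the custom AWS algorithm described below), here is a small pure-Python simulation of the reduce-scatter phase across N workers, with the final gather simplified:

```python
import numpy as np

def ring_allreduce(worker_grads):
    """Toy simulation of ring AllReduce: every worker ends with the elementwise sum."""
    n = len(worker_grads)
    # Each worker splits its gradient vector into n chunks.
    chunks = [list(np.array_split(g.astype(float), n)) for g in worker_grads]

    # Reduce-scatter: n-1 steps; at step s, worker i sends chunk (i - s) % n to
    # its right-hand neighbour, which adds it into its own copy of that chunk.
    for step in range(n - 1):
        sends = [chunks[i][(i - step) % n].copy() for i in range(n)]
        for i in range(n):
            src = (i - 1) % n
            chunks[i][(src - step) % n] += sends[src]

    # After reduce-scatter, worker i owns the fully reduced chunk (i + 1) % n.
    # A real ring implementation circulates these chunks in n-1 all-gather
    # steps; here we simply collect them to keep the sketch short.
    full = np.concatenate([chunks[(k - 1) % n][k] for k in range(n)])
    return [full.copy() for _ in range(n)]

# Example: 4 workers, each holding its own gradient vector of length 8.
grads = [np.arange(8.0) * (i + 1) for i in range(4)]
out = ring_allreduce(grads)
assert np.allclose(out[0], sum(grads))   # every worker sees the elementwise sum
print(out[0])
```

In a ring each worker sends and receives roughly 2(n-1)/n times the model size per AllReduce, which is where the "approximately twice the size of the model" traffic in the previous paragraph comes from.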

Also, the 100 Gbps interconnect is not limited to P3dn instances. You can add CPU-optimized C5n instances to the cluster and still retain the 100 Gbps interconnect between any pair of nodes.

This high flexibility of the AWS interconnect begs for an AllReduce algorithm that makes full use of the unique capabilities of the AWS interconnect. We therefore developed a custom AllReduce algorithm optimized for the AWS network. The custom AllReduce algorithm exploits the 100 Gbps interconnect between any pair of nodes in a heterogeneous cluster and reduces the amount of data sent and received by each worker by half. The compute phase of the AllReduce algorithm is offloaded onto compute-optimized C5 instances, freeing up the GPUs to compute gradients faster. Because GPU instances don’t perform the reduction operation, sending gradients and receiving AllReduced gradients can happen in parallel. The number of hops required to AllReduce gradients is reduced to just two compared to homogeneous AllReduce algorithms, in which the number of network hops increases with the number of nodes. The total cost is also reduced because training completes much faster compared to training with only P3dn nodes.

Conclusion

When tested with BERT and Mask R-CNN, the results yielded significant improvements over single-node executions. Throughput scaled almost linearly as the number of P3dn nodes scaled from 1 to 16, 32, 64, 128, 192, and 256 instances, which ultimately helped to reduce model training times by scaling to additional P3dn.24xlarge instances without increasing cost. With these optimizations, AWS can now offer you the fastest model training times on the cloud for state-of-the-art computer vision and NLP models.

Get started with TensorFlow, MXNet, and PyTorch today on Amazon SageMaker.
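For reference, here is a hedged sketch of launching a multi-node PyTorch training job with the SageMaker Python SDK; the entry-point script, IAM role, S3 path, and version strings are placeholders, and parameter names can differ between SDK versions, so check the current SageMaker documentation:

```python
import sagemaker
from sagemaker.pytorch import PyTorch

session = sagemaker.Session()
role = "arn:aws:iam::123456789012:role/SageMakerExecutionRole"  # placeholder

estimator = PyTorch(
    entry_point="train.py",           # your distributed training script (placeholder)
    role=role,
    instance_count=2,                 # scale out across multiple GPU nodes
    instance_type="ml.p3dn.24xlarge",
    framework_version="1.3.1",        # pick a version supported by your SDK
    py_version="py3",
    hyperparameters={"epochs": 2, "batch-size": 32},
)

# Placeholder S3 URI for the training data channel.
estimator.fit({"training": "s3://my-bucket/bert-pretraining-data/"})
```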


About the authors

Aditya Bindal is a Senior Product Manager for AWS Deep Learning. He works on products that make it easier for customers to use deep learning engines. In his spare time, he enjoys playing tennis, reading historical fiction, and traveling.

Kevin Haas is the engineering leader for the AWS Deep Learning team, focusing on providing performance and usability improvements for AWS machine learning customers. Kevin is very passionate about lowering the friction for customers to adopt machine learning in their software applications. Outside of work, he can be found dabbling with open source software and volunteering for the Boy Scouts.

Indu Thangakrishnan is a Software Development Engineer at AWS. He works on training deep neural networks faster. In his spare time, he enjoys binge-listening Audible and playing table tennis.

Developing Deep Learning Models for Chest X-rays with Adjudicated Image Labels

With millions of diagnostic examinations performed annually, chest X-rays are an important and accessible clinical imaging tool for the detection of many diseases. However, their usefulness can be limited by challenges in interpretation, which requires rapid and thorough evaluation of a two-dimensional image depicting complex, three-dimensional organs and disease processes. Indeed, early-stage lung cancers or pneumothoraces (collapsed lungs) can be missed on chest X-rays, leading to serious adverse outcomes for patients.

Advances in machine learning (ML) present an exciting opportunity to create new tools to help experts interpret medical images. Recent efforts have shown promise in improving lung cancer detection in radiology, prostate cancer grading in pathology, and differential diagnoses in dermatology. For chest X-ray images in particular, large, de-identified public image sets are available to researchers across disciplines, and have facilitated several valuable efforts to develop deep learning models for X-ray interpretation. However, obtaining accurate clinical labels for the very large image sets needed for deep learning can be difficult. Most efforts have either applied rule-based natural language processing (NLP) to radiology reports or relied on image review by individual readers, both of which may introduce inconsistencies or errors that can be especially problematic during model evaluation. Another challenge involves assembling datasets that represent an adequately diverse spectrum of cases (i.e., ensuring inclusion of both “hard” cases and “easy” cases that represent the full spectrum of disease presentation). Finally, some chest X-ray findings are non-specific and depend on clinical information about the patient to fully understand their significance. As such, establishing labels that are clinically meaningful and have consistent definitions can be a challenging component of developing machine learning models that use only the image as input. Without standardized and clinically meaningful datasets as well as rigorous reference standard methods, successful application of ML to interpretation of chest X-rays will be hindered.

To help address these issues, we recently published “Chest Radiograph Interpretation with Deep Learning Models: Assessment with Radiologist-adjudicated Reference Standards and Population-adjusted Evaluation” in the journal Radiology. In this study we developed deep learning models to classify four clinically important findings on chest X-rays — pneumothorax, nodules and masses, fractures, and airspace opacities. These target findings were selected in consultation with radiologists and clinical colleagues, so as to focus on conditions that are both critical for patient care and for which chest X-ray images alone are an important and accessible first-line imaging study. Selection of these findings also allowed model evaluation using only de-identified images without additional clinical data.

Models were evaluated using thousands of held-out images from each dataset for which we collected high-quality labels using a panel-based adjudication process among board-certified radiologists. Four separate radiologists also independently reviewed the held-out images in order to compare radiologist accuracy to that of the deep learning models (using the panel-based image labels as the reference standard). For all four findings and across both datasets, the deep learning models demonstrated radiologist-level performance. We are sharing the adjudicated labels for the publicly available data here to facilitate additional research.

Data Overview
This work leveraged over 600,000 images sourced from two de-identified datasets. The first dataset was developed in collaboration with co-authors at the Apollo Hospitals, and consists of a diverse set of chest X-rays obtained over several years from multiple locations across the Apollo Hospitals network. The second dataset is the publicly available ChestX-ray14 image set released by the National Institutes of Health (NIH). This second dataset has served as an important resource for many machine learning efforts, yet has limitations stemming from issues with the accuracy and clinical interpretation of the currently available labels.

Chest X-ray depicting an upper left lobe pneumothorax identified by the model and the adjudication panel, but missed by the individual radiologist readers. Left: The original image. Right: The same image with the most important regions for the model prediction highlighted in orange.

Training Set Labels Using Deep Learning and Visual Image Review
For very large datasets consisting of hundreds of thousands of images, such as those needed to train highly accurate deep learning models, it is impractical to manually assign image labels. As such, we developed a separate, text-based deep learning model to extract image labels using the de-identified radiology reports associated with each X-ray. This NLP model was then applied to provide labels for over 560,000 images from the Apollo Hospitals dataset used for training the computer vision models.
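The study's label-extraction model is a text-based deep learning model trained on de-identified radiology reports; as a much simpler stand-in that illustrates the same idea of turning report text into training labels, here is a hedged scikit-learn sketch with toy, made-up report snippets:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy, invented report snippets and a single binary finding (pneumothorax).
reports = [
    "small right apical pneumothorax, no mediastinal shift",
    "lungs are clear, no pneumothorax or effusion",
    "large left pneumothorax with partial lung collapse",
    "no acute cardiopulmonary abnormality",
]
labels = [1, 0, 1, 0]

# TF-IDF plus logistic regression as a minimal report classifier; the study
# used a deep learning NLP model, but the role is the same: map report text
# to labels that then supervise the image models.
clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
clf.fit(reports, labels)

print(clf.predict(["tiny apical pneumothorax is again seen"]))
```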

To reduce noise from any errors introduced by the text-based label extraction and also to provide the relevant labels for a substantial number of the ChestX-ray14 images, approximately 37,000 images across the two datasets were visually reviewed by radiologists. These were separate from the NLP-based labels and helped to ensure high quality labels across such a large, diverse set of training images.

Creating and Sharing Improved Reference Standard Labels
To generate high-quality reference standard labels for model evaluation, we utilized a panel-based adjudication process, whereby three radiologists reviewed all final tune and test set images and resolved disagreements through discussion. This often allowed difficult findings that were initially only detected by a single radiologist to be identified and documented appropriately. To reduce the risk of bias based on any individual radiologist’s personality or seniority, the discussions took place anonymously via an online discussion and adjudication system.

Because the lack of available adjudicated labels was a significant initial barrier to our work, we are sharing with the research community all of the adjudicated labels for the publicly available ChestX-ray14 dataset, including 2,412 training/validation set images and 1,962 test set images (4,374 images in total). We hope that these labels will facilitate future machine learning efforts and enable better apples-to-apples comparisons between machine learning models for chest X-ray interpretation.

Future Outlook
This work presents several contributions: (1) releasing adjudicated labels for images from a publicly available dataset; (2) a method to scale accurate labeling of training data using a text-based deep learning model; (3) evaluation using a diverse set of images with expert-adjudicated reference standard labels; and ultimately (4) radiologist-level performance of deep learning models for clinically important findings on chest X-rays.

However, with regard to model performance, achieving expert-level accuracy on average is just a part of the story. Even though overall accuracy for the deep learning models was consistently similar to that of radiologists for any given finding, performance for both varied across datasets. For example, the sensitivity for detecting pneumothorax among radiologists was approximately 79% for the ChestX-ray14 images, but was only 52% for the same radiologists on the other dataset, suggesting a more difficult collection of cases in the latter. This highlights the importance of validating deep learning tools on multiple, diverse datasets and eventually across the patient populations and clinical settings in which any model is intended to be used.
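For clarity, sensitivity here is recall on positive cases, that is, true positives divided by all actual positives; a tiny sketch of the kind of per-dataset comparison described above, using made-up counts:

```python
def sensitivity(true_positives, false_negatives):
    """Recall on positive cases: TP / (TP + FN)."""
    return true_positives / (true_positives + false_negatives)

# Made-up counts purely to illustrate the dataset-to-dataset gap described above.
print(sensitivity(79, 21))   # ~0.79, easier dataset
print(sensitivity(52, 48))   # ~0.52, harder dataset
```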

The performance differences between datasets also emphasize the need for standardized evaluation image sets with accurate reference standards in order to allow comparison across studies. For example, if two different models for the same finding were evaluated using different datasets, comparing performance would be of minimal value without knowing additional details such as the case mix, model error modes, or radiologist performance on the same cases.

Finally, the model often identified findings that were consistently missed by radiologists, and vice versa. As such, strategies that combine the unique “skills” of both the deep learning systems and human experts are likely to hold the most promise for realizing the potential of AI applications in medical image interpretation.

Acknowledgements
Key contributors to this project at Google include Sid Mittal, Gavin Duggan, Anna Majkowska, Scott McKinney, Andrew Sellergren, David Steiner, Krish Eswaran, Po-Hsuan Cameron Chen, Yun Liu, Shravya Shetty, and Daniel Tse. Significant contributions and input were also made by radiologist collaborators Joshua Reicher, Alexander Ding, and Sreenivasa Raju Kalidindi. The authors would also like to acknowledge many members of the Google Health radiology team including Jonny Wong, Diego Ardila, Zvika Ben-Haim, Rory Sayres, Shahar Jamshy, Shabir Adeel, Mikhail Fomitchev, Akinori Mitani, Quang Duong, William Chen and Sahar Kazemzadeh. Sincere appreciation also goes to the many radiologists who enabled this work through their expert image interpretation efforts throughout the project.

Introducing medical speech-to-text with Amazon Transcribe Medical

We are excited to announce Amazon Transcribe Medical, a new HIPAA-eligible, machine learning automatic speech recognition (ASR) service that allows developers to add medical speech-to-text capabilities to their applications. Transcribe Medical provides accurate and affordable medical transcription, enabling healthcare providers, IT vendors, insurers, and pharmaceutical companies to build services that help physicians, nurses, researchers, and claims agents improve the efficiency of medical note-taking. With this new service, you can complete clinical documentation faster, more accurately, securely, and with less effort, whether in a clinician’s office, research lab, or on the phone with an insurance claim agent.

The challenge in healthcare and life sciences

Following the implementation of the HITECH Act (Health Information Technology for Economic and Clinical Health), physicians are required to conduct detailed data entry into electronic health record (EHR) systems. However, clinicians can spend an average of up to six additional hours per day, on top of existing medical tasks, just writing notes for EHR data entry. Not only is the process time-consuming and exhausting for physicians, it is a leading factor in the workplace burnout and stress that distract physicians from engaging patients attentively, resulting in poorer patient care and rushed visits. While medical scribes have been employed to assist with manual note-taking, the solution is expensive, difficult to scale across thousands of medical facilities, and some patients find the presence of a scribe uncomfortable, leading to less candid discussions about symptoms. Existing front-end dictation software also requires physicians to take training or speak unnaturally, such as explicitly calling out punctuation, which is disruptive and inefficient. To help alleviate the burden of note-taking, many healthcare providers send physicians’ voice notes to manual transcription services, which can have turnaround times of one to three business days. Additionally, as the life sciences industry moves toward precision medicine, pharmaceutical companies increasingly need to collect real-world evidence (RWE) of their medications’ efficacy and potential side effects. However, RWE is often acquired during phone calls that then need to be transcribed.

Transcribe Medical helps address these challenges by enabling an accurate, affordable, and easy-to-use medical speech recognition service.

Improving medical transcription with machine learning

Transcribe Medical offers an easy-to-use streaming API that integrates into any voice-enabled application and works with virtually any device that has a microphone. The service is designed to transcribe medical speech for primary care and can be deployed at scale across thousands of healthcare providers to provide affordable, consistent, secure, and accurate note-taking for clinical staff and facilities. Features like automatic and intelligent punctuation enable you to speak naturally, without having to vocalize awkward and explicit punctuation commands as part of the discussion, such as “add comma” or “exclamation point.” Moreover, the service supports both medical dictation and conversational transcription.

Transcribe Medical is a fully managed ASR service and therefore requires no provisioning or management of servers. You need only to call the public API and start passing an audio stream to the service over a secure WebSocket connection. Transcribe Medical sends back a stream of text in real time. Experience in machine learning isn’t required to use Transcribe Medical, and the service is covered under AWS’s HIPAA eligibility and BAA. Any customer that enters into a BAA with AWS can use AWS HIPAA-eligible services to process and store PHI. AWS does not use PHI to develop or improve AWS services, and the AWS standard BAA language also requires BAA customers to encrypt all PHI at rest and in transit when using AWS services.
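The streaming interface described above sends audio over a secure WebSocket and receives text events back. As a simpler, hedged illustration of calling the service, here is a sketch of a batch medical transcription job via boto3; the job name, bucket, and file URI are placeholders, and parameter details should be checked against the current boto3 documentation:

```python
import boto3

transcribe = boto3.client("transcribe", region_name="us-east-1")

# Kick off a batch medical transcription job for a recorded dictation.
transcribe.start_medical_transcription_job(
    MedicalTranscriptionJobName="example-clinical-note",                # placeholder
    LanguageCode="en-US",
    MediaFormat="wav",
    Media={"MediaFileUri": "s3://my-bucket/dictations/visit-001.wav"},  # placeholder
    OutputBucketName="my-transcripts-bucket",                           # placeholder
    Specialty="PRIMARYCARE",
    Type="DICTATION",                                                   # or "CONVERSATION"
)

# Poll for the result (simplified; production code would use a waiter or backoff).
job = transcribe.get_medical_transcription_job(
    MedicalTranscriptionJobName="example-clinical-note"
)
print(job["MedicalTranscriptionJob"]["TranscriptionJobStatus"])
```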

Bringing the focus back to patients

Cerner is a healthcare IT leader whose EHR systems are deployed in thousands of medical centers across the world. Using Amazon Transcribe Medical, Cerner’s voice scribe application can automatically and securely transcribe physicians’ conversations with patients. They can analyze and summarize the transcripts for important health information, and physicians can enter the information into the EHR system. The result is increased note-taking efficiency, lower physician burnout, and ultimately improved patient care and satisfaction.

“Extreme accuracy in clinical documentation is critical to workflows and overall caregiver satisfaction. By leveraging Amazon Transcribe Medical’s transcription API, Cerner is in initial development of a digital voice scribe that automatically listens to clinician-patient interactions and unobtrusively captures the dialogue in text form. From there, our solution will intelligently translate the concepts for entry into the codified component in the Cerner EHR system,” said Jacob Geers, solutions strategist at Cerner Corporation.

We’ve also heard from other customers using Transcribe Medical. For example, George Seegan, senior data scientist at Amgen, said, “In pharmacovigilance, we want to accurately review recorded calls from patients or Health Care Providers, in order to identify any reported potential side effects associated with medicinal products. Amazon Transcribe Medical produces text transcripts from recorded calls that allow us to extract meaningful insights about medicines and any reported side effects. In this way, we can quickly detect, collect, assess, report, and monitor adverse effects to the benefit of patients globally.”

Vadim Khazan, president of technology at SoundLines, said, “SoundLines supports HealthChannels’ skilled labor force at the point of care by incorporating meaningful and impact focused technology such as Amazon’s medically accurate speech-to-text solution, Amazon Transcribe Medical, into the workflows of our network of Care Team Assistants, Medical Scribes, and Navigators. This allows the clinicians we serve to focus on patients instead of spending extra cycles on manual note-taking. Amazon Transcribe Medical easily integrates into the HealthChannels platform, feeding its transcripts directly into downstream analytics. For the 3,500 healthcare partners relying on our care team optimization strategies for the past 15 years, we’ve significantly decreased the time and effort required to get to insightful data.”

Improving patient care and satisfaction is a passion that AWS shares with our healthcare and life science customers. We’re incredibly excited about the role that Transcribe Medical can play in supporting that mission.

About the author

Vasi Philomin is the GM for Machine Learning & AI at AWS, responsible for Amazon Lex, Polly, Transcribe, Translate and Comprehend.

Healthcare Regulators Open the Tap for AI

Approvals for AI-based healthcare products are streaming in from regulators around the globe, with medical imaging leading the way.

It’s just the start of what’s expected to become a steady flow as submissions rise and the technology becomes better understood.

More than 90 medical imaging products using AI are now cleared for clinical use, thanks to approvals from at least one global regulator, according to Signify Research Ltd., a U.K. consulting firm in healthcare technology.

Regulators in Europe and the U.S. are leading the pace. Each has issued about 60 approvals to date. Asia is making its mark with South Korea and Japan issuing their first approvals recently.

Entrepreneurs are at the forefront of the trend to apply AI to healthcare.

At least 17 companies in NVIDIA’s Inception program, which accelerates startups, have received regulatory approvals. They include some of the first companies in Israel, Japan, South Korea and the U.S. to get regulatory clearance for AI-based medical products. Inception members get access to NVIDIA’s experts, technologies and marketing channels.

“Radiology AI is now ready for purchase,” said Sanjay Parekh, a senior market analyst at Signify Research.

The pipeline promises significant growth over the next few years.

“A year or two ago this technology was still in the research and validation phase. Today, many of the 200+ algorithm developers we track have either submitted or are close to submitting for regulatory approval,” said Parekh.

Startups Lead the Way

Trends in clearances for AI-based products will be a hot topic at the gathering this week of the Radiological Society of North America, Dec. 1-6 in Chicago. The latest approvals span products from startups around the globe that will address afflictions of the brain, heart and bones.

In mid-October, Inception partner LPIXEL Inc. won one of the first two approvals for an AI-based product from the Pharmaceuticals and Medical Devices Agency in Japan. LPIXEL’s product, called EIRL aneurysm, uses deep learning to identify suspected aneurysms using a brain MRI. The startup employs more than 30 NVIDIA GPUs, delivering more accurate results faster than traditional approaches.

In November, Inception partner ImageBiopsy Lab (Vienna) became the first company in Austria to receive 510(k) clearance for an AI product from the U.S. Food and Drug Administration. The Knee Osteoarthritis Labelling Assistant (KOALA) uses deep learning to process within seconds radiological data on knee osteoarthritis, a malady that afflicts 70 million patients worldwide.

In late October, HeartVista (Los Gatos, Calif.) won FDA 510(k) clearance for its One Click MRI acquisition software. The Inception partner’s AI product enables adoption for many patients of non-invasive cardiac MRIs, replacing an existing invasive process.

Regulators in South Korea cleared products from two Inception startups — Lunit and Vuno. They were among the first four companies to get approval to sell AI-based medical products in the country.

In China, a handful of Inception startups are in the pipeline to receive the country’s first class-three approvals needed to let hospitals pay for a product or service. They include companies such as 12Sigma and Shukun that already have class-two clearances.

Healthcare giants are fully participating in the trend, too.

Earlier this month, GE Healthcare won clearance for its Deep Learning Image Reconstruction engine, which uses AI to improve reading confidence for head, whole body and cardiovascular images. It’s one of several medical imaging apps on GE’s Edison system, powered by NVIDIA GPUs.

Coming to Grips with Big Data

Zebra Medical Vision, in Israel, is among the most experienced AI startups in dealing with global regulators. European regulators approved more than a half dozen of its products, and the FDA has approved three with two more submissions pending.

AI creates new challenges regulators are still working through. “The best way for regulators to understand the quality of the AI software is to understand the quality of the data, so that’s where we put a lot of effort in our submissions,” said Eyal Toledano, co-founder and CTO at Zebra.

The shift to evaluating data has its pitfalls. “Sometimes regulators talk about data used for training, but that’s a distraction,” said Toledano.

“They may get distracted by looking at the test data, sometimes it is difficult to realise the idea that you can train your model on noisy data in large quantities but still generalize well. I really think they should focus on evaluation and test data,” he said.

In addition, it can be hard to make fair comparisons between new products that use deep learning and legacy products that don’t. That’s because, until recently, products only published performance metrics and were allowed to keep their data sets hidden as trade secrets. Companies submitting new AI products that would like to measure against one another, or against other AI algorithms, therefore cannot compare apples to apples the way public challenges do.

Zebra participated in feedback programs the FDA created to get a better understanding of the issues in AI. The company currently focuses on approvals in the U.S. and Europe because their agencies are seen as leaders with robust processes that other countries are likely to follow.

A Tour of Global Regulators

Breaking new ground, the FDA published in June a 20-page proposal for guidelines on AI-based medical products. It opens the door for the first time to products that improve as they learn.

It suggested products “follow pre-specified performance objectives and change control plans, use a validation process … and include real-world monitoring of performance once the device is on the market,” said FDA Commissioner Scott Gottlieb in an April statement.

AI has “the potential to fundamentally transform the delivery of health care … [with] earlier disease detection, more accurate diagnosis, more targeted therapies and significant improvements in personalized medicine,” he added.

For its part, the European Medicines Agency, Europe’s equivalent of the FDA, released in October 2018 a report on its goals through 2025. It includes plans to set up a dedicated AI test lab to gain insight into ways to support data-driven decisions. The agency is holding a November workshop on the report.

China’s National Medical Products Administration also issued in June technical guidelines for AI-based software products. It set up in April a special unit to set standards for approving the products.

Parekh, of Signify, recommends companies use data sets that are as large as possible for AI products and train algorithms for different types of patients around the world. “An algorithm used in China may not be applicable in the U.S. due to different population demographics,” he said.

Overall, automating medical processes with AI is a dual challenge.

“Quality needs to be not only as good as what a human can do, but in many cases it must be much better,” said Toledano, of Zebra. In addition, “to deliver(ing) value, you can’t just build an algorithm that detects something, it needs to deliver actionable results and many insights for many stakeholders such as both general practitioners and specialists,” he added.

You can see six approved AI healthcare products from Inception startups — including CureMetrix, Subtle Medical and Viz.ai — as well as NVIDIA’s technologies at our booth at the RSNA event.

The post Healthcare Regulators Open the Tap for AI appeared first on The Official NVIDIA Blog.