Category: Global

Take Your Best Selfie Automatically, with Photobooth on Pixel 3

Taking a good group selfie can be tricky—you need to hover your finger above the shutter, keep everyone’s faces in the frame, look at the camera, make good expressions, try not to shake the camera and hope no one blinks when you finally press the shutter! After building the technology behind automatic photography with Google Clips, we asked ourselves: can we bring some of the magic of this automatic picture experience to the Pixel phone?

With Photobooth, a new shutter-free mode in the Pixel 3 Camera app, it’s now easier to shoot selfies—solo, couples, or even groups—that capture you at your best. Once you enter Photobooth mode and click the shutter button, it will automatically take a photo when the camera is steady and it sees that the subjects have good expressions with their eyes open. And in the newest release of Pixel Camera, we’ve added kiss detection to Photobooth! Kiss a loved one, and the camera will automatically capture it.

Photobooth automatically captures group shots, when everyone in the photo looks their best.

Photobooth joins Top Shot and Portrait mode in a suite of exciting Pixel camera features that enable you to take the best pictures possible. However, unlike Portrait mode, which takes advantage of specialized hardware in the back-facing camera to provide its most accurate results, Photobooth is optimized for the front-facing camera. To build Photobooth, we had to solve for three challenges: how to identify good content for a wide range of user groups; how to time the shutter to capture the best moment; and how to animate a visual element that helps users understand what Photobooth sees and captures.

Models for Understanding Good Content
In developing Photobooth, a main challenge was to determine when there was good content in either a typical selfie, in which the subjects are all looking at the camera, or in a shot that includes people kissing and not necessarily facing the camera. To accomplish this, Photobooth relies on two distinct models to capture good selfies—a model for facial expressions and a model to detect when people kiss.

We worked with photographers to identify five key expressions that should trigger capture: smiles, tongue-out, kissy/duck face, puffy-cheeks, and surprise. We then trained a neural network to classify these expressions. The kiss detection model used by Photobooth is a variation of the Image Content Model (ICM) trained for Google Clips, fine tuned specifically to focus on kissing. Both of these models use MobileNets in order to run efficiently on-device while continuously processing the images at high frame rate. The outputs of the models are used to evaluate the quality of each frame for the shutter control algorithm.

Shutter Control
Once you click the shutter button in Photobooth mode, a basic quality assessment based on the content score from the models above is performed. This first stage is used as a filter that avoids moments that either contain closed eyes, talking, or motion blur, or fail to detect the facial expressions or kissing actions learned by the models. Photobooth temporally analyzes the expression confidence values to detect their presence in the photo, making it robust to variations in the output of machine learning (ML) models. Once the first stage is successfully passed, each frame is subjected to a more fine-grained analysis, which outputs an overall frame score.

The frame score considers both facial expression quality and the kiss score. As the kiss detection model operates on the entire frame, its output can be used directly as a full-frame score value for kissing. The face expressions model outputs a score for each identified expression. Since a variable number of faces may be present in each frame, Photobooth applies an attention model using the detected expressions to iteratively compute an expression quality representation and weight for each face. The weighting is important, for example, to emphasize the expressions in the foreground, rather than the background. The model then calculates a single, global score for the quality of expressions in the frame.
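
As a rough illustration of pooling a variable number of faces into one score, the sketch below uses a single-pass softmax-weighted average. It is not the Photobooth model: the post describes an iterative attention model, and the `salience` input (standing in for something like face size or foreground prominence) and the function name are assumptions made for this example.

```python
import math


def pooled_expression_score(faces):
    """Combine per-face expression scores into one frame-level score.

    faces: list of (expression_score, salience) pairs, one per detected face.
    `salience` is a hypothetical stand-in for how prominent the face is;
    larger values receive more weight.
    Returns a softmax-weighted average, or 0.0 when no faces are present.
    """
    if not faces:
        return 0.0
    # Subtract the max salience before exponentiating for numerical stability.
    max_salience = max(salience for _, salience in faces)
    weights = [math.exp(salience - max_salience) for _, salience in faces]
    total = sum(weights)
    weighted = sum(w * score for w, (score, _) in zip(weights, faces))
    return weighted / total
```

With this weighting, a smiling face in the foreground dominates the frame score even if a small background face has a neutral expression.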

The final image quality score used for triggering the shutter is computed by a weighted combination of the attention based facial expression score and the kiss score. In order to detect the peak quality, the shutter control algorithm maintains a short buffer of observed frames and only saves a shot if its frame score is higher than the frames that come after it in the buffer. The length of the buffer is short enough to give users a sense of real time feedback.
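
The buffering logic described above can be sketched as follows. This is a simplified illustration, not the shipped algorithm: the equal weights, the `threshold`, the `lookahead` length, and the extra check against the preceding frame (added so that a falling score does not trigger repeated saves) are all assumptions.

```python
from collections import deque


def frame_score(expression_score, kiss_score, w_expression=0.5, w_kiss=0.5):
    """Weighted combination of the two model outputs (weights are assumed)."""
    return w_expression * expression_score + w_kiss * kiss_score


def select_peak_frames(scores, lookahead=3, threshold=0.6):
    """Return indices of frames that would be saved.

    A frame is saved only once `lookahead` later frames have been observed,
    its score is at least `threshold`, it beats every later frame in the
    buffer, and it is not lower than the frame just before it.
    """
    buffer = deque()
    saved = []
    previous_score = float("-inf")  # score of the frame before the candidate
    for index, score in enumerate(scores):
        buffer.append((index, score))
        if len(buffer) <= lookahead:
            continue  # not enough future frames yet to judge the oldest one
        candidate_index, candidate_score = buffer.popleft()
        later_scores = [s for _, s in buffer]
        is_peak = candidate_score >= previous_score and all(
            candidate_score > s for s in later_scores
        )
        if candidate_score >= threshold and is_peak:
            saved.append(candidate_index)
        previous_score = candidate_score
    return saved
```

A short `lookahead` keeps the delay between the best moment and the capture small, which matches the post's point about real-time feedback; the trade-off is that a brief dip can be mistaken for the end of a peak.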

Intelligence Indicator
Since Photobooth uses the front-facing camera, the user can see and interact with the display while taking a photo. Photobooth mode includes a visual indicator, a bar at the top of the screen that grows in size when photo quality scores increase, to help users understand what the ML algorithms see and capture. The length of the bar is divided into four distinct ranges: (1) no faces clearly seen, (2) faces seen but not paying attention to the camera, (3) faces paying attention but not making key expressions, and (4) faces paying attention with key expressions.

In order to make this indicator more interpretable, we forced the bar into these ranges, which prevented the bar scaling from being too rapid. This resulted in smooth variability of the bar length as the quality score changes and improved the utility. When the indicator bar reaches a length representative of a high quality score, the screen flashes to signify that a photo was captured and saved.

Using ML outputs directly as intelligence feedback results in rapid variation (left), whereas specifying explicit ranges creates a smooth signal (right).
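
One possible reading of "forcing the bar into ranges" is sketched below: the coarse state picks one of four bands, and the quality score only moves the bar within that band. The equal-width bands, the boolean inputs, and the function names are assumptions for illustration, not details from the post.

```python
# Four equal-width bands of the bar, one per state listed in the text.
RANGES = [
    (0.00, 0.25),  # (1) no faces clearly seen
    (0.25, 0.50),  # (2) faces seen but not paying attention
    (0.50, 0.75),  # (3) paying attention, no key expressions
    (0.75, 1.00),  # (4) paying attention with key expressions
]


def indicator_range(faces_seen, paying_attention, key_expression):
    """Map the coarse state of the scene to a band index from 0 to 3."""
    if not faces_seen:
        return 0
    if not paying_attention:
        return 1
    if not key_expression:
        return 2
    return 3


def bar_length(faces_seen, paying_attention, key_expression, quality):
    """Return the bar length in [0, 1], confined to the band for this state."""
    band = indicator_range(faces_seen, paying_attention, key_expression)
    low, high = RANGES[band]
    clamped = min(max(quality, 0.0), 1.0)  # keep noisy scores inside [0, 1]
    return low + (high - low) * clamped
```

Because a noisy score can move the bar by at most one band width, the indicator cannot jump from empty to full unless the underlying state actually changes.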

Conclusion
We’re excited by the possibilities of automatic photography on camera phones. As computer vision continues to improve, in the future we may generally trust smart cameras to select a great moment to capture. Photobooth is an example of how we can carve out a useful corner of this space—selfies and group selfies of smiles, funny faces, and kisses—and deliver a fun and useful experience.

Acknowledgments
Photobooth was a collaboration of several teams at Google. Key contributors to the project include: Kojo Acquah, Chris Breithaupt, Chun-Te Chu, Geoff Clark, Laura Culp, Aaron Donsbach, Relja Ivanovic, Pooja Jhunjhunwala, Xuhui Jia, Ting Liu, Arjun Narayanan, Eric Penner, Arushan Raj, Divya Tyam, Raviteja Vemulapalli, Julian Walker, Jun Xie, Li Zhang, Andrey Zhmoginov, Yukun Zhu.

Dig In: Startup Tills Satellite Data to Harvest Farm AI

The farm-to-fork movement is getting a taste of AI.

Startup OneSoil cultivates AI to help farmers boost their bounty. The company offers a GPU-enabled platform that turns satellite data into farm analytics for soil and crop conditions.

The Belarus-based company interprets satellite feeds to show how plants reflect different light waves, and it rates the state of plant growth based off this information for plots of land.

OneSoil’s free platform displays how areas of land measure up on the standard known as the NDVI (Normalized Difference Vegetation Index). Farmers can use this vegetation score to spot unhealthy crop areas that need inspection and to plan watering needs and the application of fertilizers.
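
For reference, NDVI is conventionally computed per pixel from near-infrared (NIR) and red reflectance as (NIR - Red) / (NIR + Red), giving values between -1 and 1, with dense healthy vegetation toward the high end. The sketch below shows that arithmetic; the `flag_low_vegetation` helper and its 0.3 threshold are hypothetical and are not taken from OneSoil's platform.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index for one pixel.

    nir and red are reflectance values for the near-infrared and red bands.
    Returns 0.0 when both are zero to avoid dividing by zero.
    """
    denominator = nir + red
    if denominator == 0:
        return 0.0
    return (nir - red) / denominator


def flag_low_vegetation(pixels, threshold=0.3):
    """Return indices of (nir, red) pixels whose NDVI falls below threshold.

    The threshold is an illustrative assumption, not an agronomic standard.
    """
    return [i for i, (nir, red) in enumerate(pixels) if ndvi(nir, red) < threshold]
```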

The field monitoring platform is available as an Android app and on the web.

OneSoil has developed its platform to cover North America, most of western Europe and some of central Europe.  It aims to have coverage of the entire world by year’s end. The satellite data visualizations are updated every three to five days.

Satellite to Sprouts

OneSoil taps into free satellite data from the European Union’s Copernicus Earth observation program. The company manually marked out boundaries on nearly 400,000 fields to create training data for its convolutional neural networks. Its algorithms can now automatically create field boundaries from the satellite data.

It processed about 50 terabytes of Sentinel 2 satellite data using NVIDIA GPUs in Microsoft Azure to build out its boundaries of land for the map spanning much of the world.

“With Sentinel images, we need a lot of processing power to analyze those,” said Clement Matyuhov, director of business development at OneSoil.

OneSoil can automatically detect more than 20 different crop types.

Dig the Sensors

OneSoil has developed sensors to work on its platform. Customers can dig a hole and stick in one of its battery-powered sensors, which packs a SIM card, to start sending data.

The sensors measure air humidity, soil moisture, the temperature of air and soil, and the level of light intensity for the nearby area.

The company has also developed a modem that can transfer data between agricultural equipment and the OneSoil platform over a mobile network.

OneSoil users can enter data, as well. They can make such entries as date of harvest, crop type, average yield, field boundaries and files documenting field work. They can use the app, which tracks location and provides field data, to go examine areas.

Prescriptive Agriculture

On the analytics side, OneSoil Maps makes it easy for farmers to make adjustments on their land. The maps provide a productivity rating of low, medium or high for different areas of the land.

“We can say there is a low productivity zone there, so go check it out. Within one field, the productivity can vary dramatically,” said Matyuhov.

Farmers can use the maps for the vegetation on their land to create prescription maps for fertilizer. These prescription maps, downloadable as a file from app.onesoil.ai, can be uploaded into compatible tractors from John Deere and steering systems from Trimble, allowing tractors to go to the specific GPS coordinates and treat the area as prescribed.

“It’s really an expert assessment for the farmer. The results for the yield can be substantial,” he said.

Image and credit: Corn harvest with an IHC International combine harvester, Jones County, Iowa, U.S., by Bill Whittaker under Creative Commons license.

The post Dig In: Startup Tills Satellite Data to Harvest Farm AI appeared first on The Official NVIDIA Blog.

Developer at the AWS DeepRacer League Singapore race sets new world record lap time

The AWS DeepRacer League, the world’s first autonomous racing league open to developers of all skill levels, held a race in Singapore this week (April 10-11). This was the third of twenty races on the worldwide Summit Circuit. Following the first two races in Santa Clara, California and Paris, France, excitement was building to see what the Singapore developer community could deliver. And they sure delivered, with the Singapore Champion Juv Chan setting a new world record lap time of 9.090 seconds. In fact, the top seven lap times on the Singapore Summit leaderboard all beat the prior leaderboard top spot (which was 10.43 seconds from Chris Miller in the Santa Clara race). Nice work Singapore!

Juv Chan’s AWS DeepRacer experience started back in November 2018. “I heard about AWS DeepRacer when it was launched at re:Invent 2018 and thought that this is a very interesting way to learn RL,” he said. The moment the Singapore Summit doors opened, Juv was the first racer on the track, setting the pace with a 12.930 second lap using one of the AWS-provided sample reinforcement learning (RL) models.

Getting that hands-on experience at the tracks fueled Juv’s desire to learn more, so he headed to the AWS DeepRacer workshop to dive into how to build his own custom RL model. This marked the beginning of a 24-hour learning and racing extravaganza for him! “I work as an AI developer for my job, but this is my first time exposed to RL. It’s really engaging and addictive,” said Juv.

Juv went home that night determined. He wanted to learn all he could about how to optimize his model further, so he took the AWS DeepRacer: Driven by Reinforcement Learning online training, where he found more tips and tricks on how to climb the leaderboard. Next, Juv put his new knowledge to the test by tweaking hyperparameters and tuning his model, then he trained it for 12 hours to get race-ready.

The AWS DeepRacer Singapore Speedway

The competition heated up on the second day, when the rubber really hit the road for Juv and his DeepRacer model. He was first on the track again and immediately took the top spot with a 10.88 second lap. But he made no assumption that this was enough to win and headed back to his laptop to continue optimizing his model's performance. He was soon knocked off the top spot as more developers came with their custom models, and lap times in the 9-10 second range were recorded. At one point in the race Juv dropped down to 10th place on the leaderboard. Juv shared the philosophy behind his approach: “Fail fast, learn from mistakes and keep trying.” With that in mind, he came back to race two more times to secure the win. And secure the win he did: with 10 minutes of race time left, he threw caution to the wind with the throttle and was victorious with a winning lap time of 9.090 seconds. Congratulations Juv!

Juv won a trip to compete in the AWS DeepRacer League finals at re:Invent 2019 in Las Vegas. I wonder if his 9.090 lap will still be the world record holder then? Developers, this is the time to beat!

The Singapore Summit Winners Podium

Tshiamo Rakgowa, a robotics enthusiast, was the first runner-up, with a lap time of 9.420 seconds. He was followed closely by Wang Teng Lee, a software engineer, with a 9.590 second lap (0.17 seconds back). Both of these gentlemen also tuned and raced their models multiple times, experimenting their way to top spots on the leaderboard. The similarities don’t end there. By coincidence, it turns out that all three of the leading racers are connected to a town called Kepong in Malaysia. In fact, it’s Juv and Wang’s childhood home town (they live in Singapore now), and it’s where Tshiamo calls home right now. Congratulations to Tshiamo and Wang, it was a very close race! Don’t forget we still have 17 races to go with 4 in Asia, including Seoul on April 17, Tokyo and Taipei both on June 12, and Hong Kong on June 26.

The Singapore Summit Winners: Juv Chan (center) Singapore Summit Champion, Tshiamo Rakgowa (left) First Runner Up, Wang Teng Lee (right) Second Runner Up

One day, three countries, three live races – Amsterdam, Dubai, and Seoul on April 17

On April 17, the AWS DeepRacer League will hold three AWS Summit races, on three different continents, all on one day. The Summits offer the opportunity to get hands-on with AWS DeepRacer. There will be multiple workshops and hours of live racing. You can register to attend now, and follow the action live on the day at www.deepracerleague.com. Coming soon is the AWS DeepRacer Virtual League. Get your first model ready today by taking the digital training course for reinforcement learning and AWS DeepRacer.

Developers, start your engines! Your journey to becoming a machine learning developer begins with the AWS DeepRacer League.

 


About the Author

Sally Revell is a Principal Product Marketing Manager for AWS DeepLens. She loves to work on innovative products that have the potential to impact people’s lives in a positive way. In her spare time, she loves doing yoga, horseback riding, and being outdoors in the beauty of the Pacific Northwest.

 

 

Protagonist adopts Amazon Translate to expand analytics to multilingual content

This is a guest blog post by Bryan Pelley, COO of Protagonist. Protagonist, in their own words “helps organizations communicate more effectively through a data-driven understanding of public discourse.”

Protagonist is a pioneer of the art and science of understanding narratives. We define narratives as the beliefs that an audience holds that are  composed of an interrelated set of concepts, themes, images, and ideas that coalesce into a story. Narratives matter because they reflect the deeply held needs, wants, and desires that weigh heavily, both consciously and unconsciously, on human decision-making. Using Amazon Translate, Protagonist can analyze narratives in languages other than English, which enables us to win global customers.

The Protagonist Narrative Analytics platform uses natural language processing (NLP) and machine learning (ML), guided by human expertise, to surface, measure, and track the narratives that matter to our customers across traditional, social, and other types of online media. The following diagram illustrates our Narrative Analytics solution.

Protagonist has been limited, with a few exceptions, to analyzing English-only content, which we’ve seen as a limitation on the long-term growth of our business. Numerous customers and prospective customers have expressed serious interest in projects involving international narratives.  To create these narratives we would need to work with native language content.

In the past, we were able to do a small number of projects in foreign languages, primarily French and Spanish, thanks to fluent speakers on staff. In these cases, our team would either run the analysis on the content without translation, which limited the range of NLP tools we were able to use, or manually translate a sample set of the overall corpus of content and run our full suite of tools on the translated set. Sometimes we used a combination of both processes. However, this staff-based manual solution didn’t scale, and it was not efficient. Manually translating a sample of 1,000 media articles took about two weeks. This was a significant delay in providing timely narrative analysis to our customers.

Amazon Translate has changed that for us, enabling us to quickly and effectively translate multilingual content into English for analysis on our narrative platform. We tried a few other machine translation services in the past, but were unhappy with the performance, cost, and, in some cases, the requirement to commit to a long-term contract. Amazon Translate gives us the right combination of speed, accuracy of translation, cost effectiveness, and on-demand flexibility to meet our needs. What used to take two weeks or more to translate now can be done in minutes using Amazon Translate.

We piloted the Amazon Translate service in 2018 on a project for one of our customers, Omidyar Network (ON). One of ON’s major focus areas is property rights. They want to address the fact that a large percentage of the world’s population has limited or nonexistent protections for their property and resources. Naturally, to address a global issue like this, ON wants to understand the narratives that local populations around the world have about their rights, or lack of rights, to land and other property. Using international English language media sources, we were able to help ON gain an understanding of the narratives at play. As the following illustration indicates, analysis of English-only content showed that property rights narratives differed significantly by region, which prompted a desire for a deeper analysis of content in local native languages. For this reason, we saw ON’s property rights work as an ideal place to test Amazon Translate.

Peter Rabley, Venture Partner at Omidyar Network, describes their property rights efforts and the role of Protagonist:

“More than one billion people around the world lack legal rights to their land and property. However, it’s an issue that not enough people pay attention to because it seems too complicated, too complex to wrap your head around. We believe that by simplifying language and telling human interest stories, those in the field can raise greater awareness of the need—sparking innovative solutions, more financing and greater overall engagement. We needed a way to see what the initial conversations around property rights looked like globally in more than one language, and understand how better storytelling may have impacted those conversations over time. This is what Protagonist’s Narrative Analytics allows us to do, helping underscore the value of our investments and unlocking valuable insights for all of us working to advance property rights around the world. Importantly, Protagonist has been able to provide its Narrative Analytics in multiple languages including Spanish.”

As Peter notes, we initially chose to work with Amazon Translate on Spanish language content. We had experience working with Spanish content in the past and access to fluent Spanish speakers, so we could double-check the Amazon Translate outputs and easily identify and troubleshoot issues as they arose. Ultimately, that was not needed because the accuracy of the translations produced by the Amazon Translate service was high.

The performance of the Amazon Translate service met or exceeded our expectations. Initially, the API’s rate limit caused some concurrency issues for us because we kept unknowingly exceeding the limit. Since our pilot of the Amazon Translate service, AWS has added a metrics dashboard to the AWS Management Console that makes it easy for us to know if we’re exceeding the rate limit and make adjustments as necessary.

We noticed that AWS has been very thoughtful in keeping Amazon Translate API parameters very flexible, so that when languages are added we can easily integrate the newly supported languages in our data workflows. Specifically, AWS keeps the Python package Boto3 very stable, which allows us to update to the latest version of Boto3 without the worry of breaking existing functionality.
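
To illustrate the kind of rate-limit handling discussed above, here is a minimal sketch that wraps the Boto3 `translate_text` call in exponential backoff. It is not Protagonist's code. The set of retryable error codes, the retry limits, and the function name are assumptions, and the client is passed in so the logic can be exercised without AWS credentials.

```python
import time

# Error codes treated as "slow down and retry"; this set is an assumption.
RETRYABLE_CODES = {"ThrottlingException", "TooManyRequestsException"}


def translate_with_backoff(client, text, source="es", target="en",
                           max_attempts=5, base_delay=0.5, sleep=time.sleep):
    """Translate text, retrying with exponential backoff when rate limited.

    client is expected to behave like boto3.client("translate").
    """
    for attempt in range(max_attempts):
        try:
            response = client.translate_text(
                Text=text,
                SourceLanguageCode=source,
                TargetLanguageCode=target,
            )
            return response["TranslatedText"]
        except Exception as error:
            # botocore's ClientError carries the error code in error.response.
            code = getattr(error, "response", {}).get("Error", {}).get("Code")
            if code not in RETRYABLE_CODES or attempt == max_attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)  # 0.5 s, 1 s, 2 s, ...


# Example usage (requires AWS credentials and the boto3 package):
#   import boto3
#   client = boto3.client("translate")
#   print(translate_with_backoff(client, "Hola, mundo"))
```

Injecting `client` and `sleep` keeps the retry logic testable offline; Boto3 also has built-in retry configuration that may make a wrapper like this unnecessary.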

Overall, using Amazon Translate provided several advantages over our previous human-based translation solution. We were able to eliminate the need for time-consuming manual translation. Amazon Translate was able to complete in a matter of minutes translation tasks that would have taken us 60 hours or more in the past. This meant we could expand the amount of content we analyzed with our full suite of tools from a sample of a few hundred articles to tens or hundreds of thousands of articles. We were able to effectively leverage our NLP tools that were trained with only English language corpuses, such as Narrative Richness, cluster analysis, sentiment scoring, and topic modeling. The ability to accurately analyze large amounts of foreign language content using our English-language-trained NLP tools on the translated materials represents a significant cost and time savings for us.

But perhaps most importantly, Amazon Translate provides cost-effective access to a range of languages that we haven’t been able to work with before, including Arabic, Chinese, and Russian. This opens up a wide range of customers and opportunities that we couldn’t have supported before. We’re in active discussions with several large customers on global narrative projects that would make extensive use of the Amazon Translate capabilities. We’re excited to continue working with Amazon Translate and exploring the new opportunities that the service brings.

 

Get started with the AWS Live Streaming with Automated Multi-Language Subtitling solution

Live Streaming with Automated Multi-Language Subtitling is a solution that automatically generates multi-language subtitles for live streaming video content in real time. You can use this solution as-is, customize the solution to meet your specific use case, or work with AWS Partner Network (APN) partners to implement an end-to-end subtitling workflow.

Based on the Live Streaming on AWS solution, the implementation adds machine learning services Amazon Transcribe and Amazon Translate into the mix. The solution enables the last-mile addition of automatically generated subtitles to live over the top (OTT) channels without having to hire a dedicated transcriptionist, which could be too costly to make subtitles available in general. The solution is available as open source for anyone who wants to expand the basic architecture, adding custom features to fit the solution into their workflow. The GitHub repository can be found here.

Additional AWS Solutions offerings are available on the AWS Solutions webpage, where customers can browse solutions by product category or industry to find AWS-vetted, automated, and turnkey reference implementations that address specific business needs.

Note: The solution described in this blog post uses Amazon Transcribe Streaming, AWS Elemental MediaLive, and AWS Elemental MediaPackage, which are currently available only in specific AWS Regions. Therefore, you must launch this solution in an AWS Region where all of these services are available. For the most current AWS service availability by Region, see AWS service offerings by region.

Step 1: Deploy the Live Streaming with Automated Multi-Language Subtitling solution

Sign into the AWS Management Console and then head over to the Live Streaming with Automated Multi-Language Subtitling Solution page. Choose Launch solution in the AWS Console.

Step 2: Launch the AWS CloudFormation template

You can also launch the stack by choosing Launch Solution in the documentation guide.

Step 3: On the Select Template page, choose Next

Step 4: Input information on the Specify Details page

  1. Choose a name for your stack.
  2. Choose what input format you want to use.
  3. If you are using HLS pull, enter your input URLs. Example: https://s3.amazonaws.com/yourbucketname/index.m3u8
  4. Choose the languages you want as subtitles. For example, if you want English, Spanish, and German, you would enter: en, es, de.

The supported output subtitle languages are listed here. For information on the inputs see the documentation guide.

Step 5: On the Options page, choose Next

Choose the Next button on the options page.

Then, select the check box acknowledging that AWS CloudFormation will create IAM resources, and choose Create.

Note that this CloudFormation stack takes about 20 minutes to deploy.

Step 6: Solution should show deployed now

You should see CREATE_COMPLETE in the status area.
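
If you prefer to check the status from a script rather than the console, a minimal polling sketch might look like the following. The function name, polling interval, and limits are assumptions for illustration; the client is expected to behave like `boto3.client("cloudformation")`, whose `describe_stacks` call returns each stack's `StackStatus`.

```python
import time


def wait_for_stack(cloudformation, stack_name, poll_seconds=30,
                   max_polls=60, sleep=time.sleep):
    """Poll a CloudFormation stack until it reports CREATE_COMPLETE.

    Raises RuntimeError if the stack leaves CREATE_IN_PROGRESS for any
    other state, and TimeoutError if max_polls is exhausted.
    """
    for _ in range(max_polls):
        stacks = cloudformation.describe_stacks(StackName=stack_name)["Stacks"]
        status = stacks[0]["StackStatus"]
        if status == "CREATE_COMPLETE":
            return status
        if status != "CREATE_IN_PROGRESS":
            raise RuntimeError(f"Stack {stack_name} ended in state {status}")
        sleep(poll_seconds)
    raise TimeoutError(f"Stack {stack_name} did not finish in time")


# Example usage (requires AWS credentials and the boto3 package;
# the stack name is a placeholder):
#   import boto3
#   wait_for_stack(boto3.client("cloudformation"), "my-subtitling-stack")
```

The defaults allow roughly 30 minutes, which leaves headroom over the 20-minute deployment time mentioned above. Boto3 also provides a built-in `stack_create_complete` waiter that covers the same need.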

After waiting a minute for the AWS Elemental MediaLive channel to start, you can copy and paste the HLSEndpoint URL ending in m3u8 into Safari or an online test player, such as Video.JS.

I took the HLS stream output ending in m3u8 and pasted it into my Safari browser search bar. The subtitle selector on the bottom right allows a user to select different languages for the subtitles.

Conclusion

We have shown you how easy it is to set up your Live Stream with automatically generated subtitles from Amazon Transcribe. For more information about AWS Media Services or this solution follow these links:


About the Author

Eddie Goynes is a Technical Marketing Engineer for AWS Elemental. He is an AWS Cloud and Live Video Streaming technical expert.

 

 

 

Medical Imaging Startup Uses AI to Classify Conditions from Sinus and Brain Scans

Radiologists are tasked with diagnosing some of the most serious medical conditions — but their workloads are becoming increasingly demanding as the volume of imaging studies such as CT and MRI has steadily gone up.

Houston-based InformAI is stepping in to help reduce fatigue and stress for radiologists by building deep learning tools that can help them analyze medical scans faster.

“We wanted to build diagnostic-assist tools for clinicians to speed up information workflow and decision-making at the point of care to benefit patients,” said InformAI CEO Jim Havelka.

InformAI trains its deep learning image classifiers and patient outcome predictors on NVIDIA V100 GPUs through the Microsoft Azure cloud platform and with an onsite NVIDIA DGX Station. The startup worked with data science consulting firm SFL Scientific to develop a convolutional neural network-based deep learning technology stack using top technology resources.

In less than 30 seconds, InformAI’s image classifier scans for 20 sinus conditions and flags which ones might be present in a patient’s 3D CT scan. This AI tool has also formed the basis for other image classification applications that analyze 3D scans of soft tissue — including detecting common brain cancers from MRI scans.

AI Spots Sinus Conditions

Figuring out the structure of an individual’s sinuses is harder than it sounds. Each person’s sinus cavities look different, making it challenging for AI to determine if an infection or abnormal mass is present in the eight major sinus cavities and passageways that connect them.

Doctors perform around 700,000 sinus procedures each year in the United States. Using AI to speed up the diagnostics workflow can save on healthcare costs and shorten the time it takes to begin treatment.

InformAI and its healthcare partners built a training dataset consisting of approximately 6 million images from 20,000 patient studies. The scans were labeled by a team of radiologists and medical residents who worked with the company on the project.

Radiologists using the startup’s platform can examine and analyze 3D sinus CT scans while the predictor neural network is running. In under a minute, the AI results pop up for 20 sinus medical conditions, which the doctors can then use to assist in their diagnosis and treatment planning process.

InformAI is deploying the sinus classifier this spring at a hospital and several clinics to test its effectiveness as an assist tool for radiologists and ear, nose and throat physicians. The team is also going through the regulatory process required for the AI to be certified as a direct diagnostic tool.

A Neural Network for Neurological Disorders

In general terms, the sinus classification neural network extracts 3D segments from a CT scan to analyze whether a particular disease or set of diseases is present in those image segments, Havelka said. Since the network was trained on such a large medical dataset, it can be repurposed using transfer learning to solve image classification problems for a broad range of soft tissue medical applications.

The startup is doing just that. Using transfer learning, the team trained a neural network to detect disease from another kind of soft tissue: the brain.

When a tumor or lesion is identified in the brain, “it can be life-and-death for patients,” said Havelka. “Early detection and classification are critical in providing the best treatment options and outcome for patients.”

But different brain tumors and lesions can look alike, and can also resemble other neurological disorders with different treatments. As a result of this classification complexity, a patient’s treatment plan can evolve over time.

When radiologists are unable to make a conclusive diagnosis from a brain MRI scan, physicians turn to invasive brain biopsies to obtain additional information. An AI tool that can assist radiologists in making an earlier and more certain diagnosis could reduce the number of required biopsies.

Using a 3D CNN, InformAI is developing a tool that analyzes brain MRI scans to detect whether a tumor or lesion is present, and can classify an abnormal scan as one of four conditions: glioblastoma, metastatic brain tumor, multiple sclerosis or lymphoma.

The deep learning model for brain cancer detection, which is still under development, was initially trained on around 100,000 image scans from 1,000 patient studies.

Founded in 2017, InformAI is a member of the NVIDIA Inception virtual accelerator program. To learn more about the company’s work, read this recent white paper.

The post Medical Imaging Startup Uses AI to Classify Conditions from Sinus and Brain Scans appeared first on The Official NVIDIA Blog.

How AI Is Transforming Healthcare

Healthcare is a multitrillion-dollar global industry, growing each year as average life expectancy rises — and with nearly unlimited facets and sub-specialties.

For medical professionals, new technologies can change the way they work, enable more accurate diagnoses and improve care. For patients, healthcare innovations lessen suffering and save lives.

Deep learning can be implemented at every stage of healthcare, creating tools that doctors and patients can take advantage of to raise the standard of care and quality of life.

How AI Is Changing Patient Care

Providing patient care is a series of critical choices, from decisions made on a 911 call to the recommendations a primary physician makes at an annual physical. The challenge is getting the right treatments to patients as fast and efficiently as possible.

Nearly half the countries and territories in the world have fewer than one physician per 1,000 people, a third of the threshold needed to deliver quality healthcare, according to a 2018 study in The Lancet. Meanwhile, as healthcare data goes digital, the amount of information medical providers collect and refer to is growing.

In intensive care units, these factors come together in a perfect storm — patients who need round-the-clock attention; large, continuous data feeds to interpret; and a crucial need for fast, accurate decisions.

Researchers at MIT’s Computer Science and Artificial Intelligence Lab developed a deep learning tool called ICU Intervene, which uses hourly vital sign measurements to predict eight hours in advance whether patients will need treatments to help them breathe, require blood transfusions or need interventions to improve heart function.
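One way to picture "predicting eight hours in advance" from hourly measurements is as a windowing step over a patient's time series. The sketch below is an illustration only, not the MIT team's pipeline: the function name, the six-hour history length and the data layout are all assumptions. It pairs each window of past readings with a label saying whether an intervention began eight hours after the last reading in that window.

```python
def make_training_pairs(hourly_vitals, intervention_hours, history=6, horizon=8):
    """Pair each window of past readings with a label `horizon` hours later.

    hourly_vitals: list with one reading per hour.
    intervention_hours: set of hour indices at which an intervention began.
    history: hours of readings given to the model (hypothetical choice).
    horizon: how far ahead to predict, in hours.
    """
    pairs = []
    for end in range(history, len(hourly_vitals) - horizon + 1):
        window = hourly_vitals[end - history:end]
        target_hour = end - 1 + horizon  # `horizon` hours after the last reading
        label = target_hour in intervention_hours
        pairs.append((window, label))
    return pairs


# Example: 24 hours of placeholder readings, with an intervention at hour 20.
vitals = list(range(24))
pairs = make_training_pairs(vitals, intervention_hours={20})
positive_windows = [window for window, label in pairs if label]
```

A model trained on such pairs sees only readings from before the prediction point, which is what allows it to raise a flag ahead of time rather than after the fact.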

Corti, a Denmark-based startup, is stepping in at another time-sensitive interaction: phone calls with emergency services. The company is using an NVIDIA Jetson TX2 module to analyze emergency call audio and help dispatchers identify cardiac arrest cases in under a minute.

LexiconAI, a member of the NVIDIA Inception program, is helping doctors spend more time with their patients every day. The startup built a mobile app that uses speech recognition to capture medical information from doctor-patient conversations — making it possible to automatically fill in electronic health records.

How AI Is Changing Pathology

Just as millions of medical scans are taken each year, so too are hundreds of millions of tissue biopsies. While pathologists have long used physical slides to analyze specimens and make diagnoses, these slides are increasingly being scanned to create digital pathology datasets.

Inception startup Proscia uses deep learning to analyze these digital slides, scoring over 99 percent accuracy for classifying three common skin pathologies. Using AI can help standardize diagnoses, which matters because, depending on the type and stage of disease, two pathologists looking at the same tissue may disagree on a diagnosis more than half the time.

SigTuple, another Inception startup, developed an AI microscope to analyze blood and bodily fluids. The microscope scans physical slides under a lens and uses GPU-accelerated deep learning to analyze the digital images either on SigTuple’s AI platform in the cloud or on the microscope itself.

Compared to scanners that automatically convert glass slides to digital images and interpret the results, SigTuple’s microscope does this at a fraction of the cost. The company hopes its tool will address the global pathologist shortage, a crucial problem in many countries.

How AI Is Changing Predictive Health

A host of AI tools are being developed to detect risk factors for diseases months before symptoms appear. These will help doctors make earlier diagnoses, conduct longevity studies or take preventative action. Taking advantage of the ability of deep learning models to spot patterns in large datasets, these tools may extract insights from electronic health records, physical features or genetic information.

One mobile app, Face2Gene, uses facial recognition and AI to identify about 50 known genetic syndromes from photos of patients’ faces. It’s used by around 70 percent of geneticists worldwide and could help cut down the time it takes to get an accurate diagnosis.

Another deep learning tool, developed by researchers at NYU, analyzes lab tests, X-rays and doctors’ notes to predict ailments like heart failure, severe kidney disease and liver problems three months faster than traditional methods.

Using AI and a wide range of electronic health records helped the researchers draw new connections among hundreds of health measurements that could predict diseases like diabetes.

How AI Is Enabling Healthcare Apps

Healthcare doesn’t start and end at the doctor’s office. And with wearables, smartphones and IoT devices, there’s no shortage of devices to monitor health from anywhere.

A service called SpiroCall, for example, makes it possible for patients to check lung function by breathing into a smartphone, either by dialing a toll-free number or recording a sound file on an app. The data is sent to a central server, which uses a deep learning model to assess lung health.

For athletes at risk of suffering concussions on the playing field, an AI-powered app is using a smartphone camera to analyze how an athlete’s pupils respond to light, a metric medical professionals use to diagnose brain injury.

And in the realm of mental health, Canadian startup Aifred Health is using GPU-accelerated deep learning to better tailor depression treatments to individual patients. Using data on a patient’s symptoms, demographics and medical test results, the neural network helps doctors as they prescribe treatments.

How AI Is Enabling Devices for People with Disabilities

A billion people around the world experience some form of disability. AI-powered technology can provide some of them with a greater level of independence, making it easier to perform daily tasks or get around.

Aira, a member of the Inception program, has created an AI platform that connects to smart glasses, helping people with impaired vision with tasks like reading labels on medication bottles. And a professor at Ohio State University is using GPUs and deep learning to create a hearing aid that can bump the volume of speech while filtering out background noise.

Researchers at OSU and Battelle, a nonprofit research organization, are developing a brain-computer interface powered by neural networks that can read thoughts and restore movement to paralyzed limbs.

And a team at Georgia Tech developed an AI prosthetic hand that helped jazz musician Jason Barnes play piano for the first time in five years. The prosthesis uses electromyogram sensors to recognize muscle movement and allows for individual finger control.

See the NVIDIA healthcare page for more.

Main image licensed from iStock.

The post How AI Is Transforming Healthcare appeared first on The Official NVIDIA Blog.

NVIDIA CEO Ties AI-Driven Medical Advances to Data-Driven Leaps in Every Industry

Radiology. Autonomous vehicles. Supercomputing. The changes sweeping through all these fields are closely related. Just ask NVIDIA CEO Jensen Huang.

Speaking in Boston at the World Medical Innovation Forum to more than 1,800 of the world’s top medical professionals, Huang tied Monday’s news — that NVIDIA is collaborating with the American College of Radiology to bring AI to thousands of hospitals and imaging centers — to the changes sweeping through fields as diverse as autonomous vehicles and scientific research.

In a conversation with Keith Dreyer, vice chairman of radiology at Massachusetts General Hospital, Huang asserted that data science, driven by a torrent of data, new algorithms and advances in computing power, is becoming a fourth pillar of scientific discovery, alongside theoretical work, experimentation and simulation.

Putting data science to work, however, will require enterprises of all kinds to learn how to handle data in new ways. In the case of radiology, the privacy of the data is too important, and the expertise is local, Huang told the audience. “You want to put computing at the edge,” he said.

As a result, the collaboration between NVIDIA and the American College of Radiology promises to enable thousands of radiologists nationwide to use AI for diagnostic radiology in their own facilities, using their own data, to meet their own clinical needs.

Huang began the conversation by noting that the Turing Award, “the Nobel Prize of computing,” had just been given to the three researchers who kicked off today’s AI boom: Yoshua Bengio, Geoffrey Hinton and Yann LeCun.

“The takeaway from that is that this is probably not a fad, that deep learning and this data-driven approach where software and the computer is writing software by itself, that this form of AI is going to have a profound impact,” Huang said.

Huang drew parallels between radiology and other industries putting AI to work, such as automotive, where Huang sees an enormous need for computing power in autonomous vehicles that can put multiple intelligences to work, in real time, as they travel through the world.

Similarly, in medicine, putting one — or more — AI models to work will only enhance the capabilities of the humans guiding these models.

These models can also guide those doing cutting-edge work at the frontiers of science, Huang said, citing Monday’s announcement that the Accelerating Therapeutics for Opportunities in Medicine, or ATOM, consortium will collaborate with NVIDIA to scale ATOM’s AI-driven drug discovery program.

The big idea: to pair data science with more traditional scientific methods, using neural networks to help “filter” through the large combination of possible molecules to decide which ones to simulate to find candidates for in vitro testing, Huang explained.

Software Is Automation, AI Is the Automation of Automation

Huang sees such techniques being used in all fields of human endeavor — from science to front-line healthcare and even to running a technology company. As part of that process, NVIDIA has built one of the world’s largest supercomputers, SATURNV, to support its own efforts to train AI models with a broad array of capabilities. “We use this for designing chips, for improving our systems, for computer graphics,” Huang said.

Such techniques promise to revolutionize every field of human endeavor, Huang said, asserting that AI is “software that writes software,” and that software’s “fundamental purpose is automation.”

“AI therefore is the automation of automation,” Huang said. “And if we can harness the automation of automation, imagine what good we could do.”

The post NVIDIA CEO Ties AI-Driven Medical Advances to Data-Driven Leaps in Every Industry appeared first on The Official NVIDIA Blog.