Have you ever wondered what it takes to produce the complex imagery in films like Star Wars or Transformers? The man behind the magic, Colie Wertz, is here to explain.
Wertz is a conceptual artist and modeler who works in film, television and video games. He sat down with AI Podcast host Noah Kravitz to explain his specialty, hard-surface modeling, in which he produces digital models of objects with hard surfaces like vehicles, robots and computers.
To make these images, Wertz has taken to using AI art tools such as GauGAN, a real-time painting web app that allows users to create realistic landscapes using generative adversarial networks.
Rather than use GauGAN in the traditional manner, Wertz makes the tools “trick themselves” by putting a mountain in the sky, or snow falling at the bottom of the page, to create a unique image. Then he incorporates his signature spaceships into the scene.
Artist Colie Wertz uses the GauGAN landscape to inspire some of his ship designs.
Wertz appreciates how easily GauGAN builds a background. He says, “Coming from the hard surface world, that’s the kind of stuff that’s kind of always been a curveball for me, like matte painting and background composition.” Now, Wertz is able to focus on the ship and how to “integrate it into a background.”
For some of his creations, Wertz uses the GauGAN landscape to inspire his ship designs. He views AI art as a “creative partner” rather than a replacement for more traditional forms of art.
Wertz’s artistic career took off after he left an architectural design firm in South Carolina and moved to Los Angeles to develop his digital art skills. There, he entered one of his spaceship models, created with Photoshop, into a contest put on by visual effects production company Electric Image.
Colie Wertz views AI art as a “creative partner” rather than a replacement for more traditional forms of art.
The judges were impressed, and Wertz ended up with a job at Industrial Light & Magic, a visual effects company founded by George Lucas. Wertz’s first job was working on the rerelease of Return of the Jedi, building digital models for matte painters.
Listeners curious about Wertz’s current work can look at his portfolio, visit his website or follow him on Instagram.
Help Make the AI Podcast Better
Have a few minutes to spare? Fill out this short listener survey. Your answers will help us make a better podcast.
Some teens might feel the pressure of having an older sibling as accomplished as 19-year-old Kavya Kopparapu, a Harvard sophomore who last year was named a U.S. Presidential Scholar and one of TIME’s 25 Most Influential Teens.
But high school senior Neeyanth Kopparapu, 17, is holding his own. He’s making his own mark with PDGAN, a deep learning model to help medical professionals diagnose Parkinson’s disease from MRI scans.
Together, the D.C.-area siblings have collaborated on a deep learning tool to diagnose diabetes-induced blindness in regions with limited healthcare access. In 2015, they also founded GirlsComputingLeague, a nonprofit working to improve diversity in computer science.
A GAN on a Mission
Parkinson’s disease — a neurodegenerative disorder causing tremors, stiffness and problems with movement and balance — affects more than 10 million people worldwide. When Kopparapu’s grandfather was diagnosed with it two years ago, his family was dismayed to learn it was already too late for many existing symptom treatments to help.
“Like most Parkinson’s patients, he was diagnosed at a stage where a lot of the treatments out there become ineffective,” he said. “Originally, we thought it was a fluke that he was diagnosed later. But upon further research into treatments, we realized it’s not a fluke, it’s a problem with the system.”
It was a problem Kopparapu thought AI could help solve. Using an annotated dataset of around 1,000 brain MRI scans from the University of Southern California, he began training a neural network to spot signs of Parkinson’s. Due to the limited size of the dataset, the trained model’s accuracy hovered at around 90 percent.
That’s already a significant improvement over the current clinical accuracy of diagnosing Parkinson’s from brain scans. But Kopparapu wasn’t settling for an A-minus.
“The only way that I was going to be able to improve the model’s performance was by increasing the number of data points that I had,” he said. “I heard about GANs at the time and thought — what if I was able to use this tool to synthetically augment the dataset?”
Using generative adversarial networks, or GANs, helped Kopparapu boost the AI model’s accuracy to 96.5 percent, with an accuracy of around 98 percent on scans from later-stage patients. The deep learning networks were trained using an NVIDIA Tesla GPU on the Amazon Web Services cloud platform.
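The augmentation idea can be sketched with a toy GAN. Everything below is an illustrative assumption rather than PDGAN itself: a one-dimensional stand-in dataset, a two-parameter linear generator and a logistic discriminator, trained adversarially before the generator’s samples are appended to the real data.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# "Real" data stands in for features extracted from the limited MRI dataset.
real = rng.normal(4.0, 1.25, size=2000)

# Generator G(z) = a*z + b and discriminator D(x) = sigmoid(w*x + c).
a, b = 1.0, 0.0          # generator parameters
w, c = 0.0, 0.0          # discriminator parameters
lr = 0.05

for step in range(2000):
    # Discriminator ascent step: push D(real) toward 1, D(fake) toward 0.
    x = rng.choice(real, 64)
    z = rng.normal(size=64)
    g = a * z + b
    d_real, d_fake = sigmoid(w * x + c), sigmoid(w * g + c)
    w += lr * (np.mean((1 - d_real) * x) - np.mean(d_fake * g))
    c += lr * (np.mean(1 - d_real) - np.mean(d_fake))

    # Generator ascent step (non-saturating loss): push D(fake) toward 1.
    z = rng.normal(size=64)
    g = a * z + b
    d_fake = sigmoid(w * g + c)
    a += lr * np.mean((1 - d_fake) * w * z)
    b += lr * np.mean((1 - d_fake) * w)

# Synthetic samples augment the small real dataset before classifier training.
synthetic = a * rng.normal(size=1000) + b
augmented = np.concatenate([real, synthetic])
print(augmented.shape)
```

In practice both players are deep convolutional networks and the data are MRI scans, but the alternating update structure is the same.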
Parkinson’s is typically diagnosed when a patient starts showing physical symptoms, with scans taken as just one part of the diagnostic process. Kopparapu hopes that, once clinically validated, tools like PDGAN could be used to help confirm patient diagnoses earlier, giving them more options for treatments.
Learning Computer Science By Heart
Like many software engineers, Kopparapu can trace his passion for computer science back to an early love of video games. He once hoped to create his own version of the popular Pokémon series.
But gaming, he says, is “something I don’t have time for anymore.”
Instead, Kopparapu is focused on his passion for computer science and math. He started learning to code in middle school using online resources, and later took an AI class as a freshman at Thomas Jefferson High School for Science and Technology.
Kopparapu is most interested in the math that underlies AI (he took multivariable calculus and linear algebra as a sophomore) and is considering a college major in applied math or computer science.
While he’s so far worked with NVIDIA GPUs in the cloud and in a server at his high school, Kopparapu set his sights on his dream AI system when he watched NVIDIA founder and CEO Jensen Huang unveil it live at GTC.
“If there was a DGX-2 system I could tap into,” he said, “that’d be the coolest thing in the world.”
Like Sister, Like Brother
Kopparapu owes his interest in AI-driven healthcare applications to having his older sister as a research and entrepreneurial partner.
“She was interested in biology and healthcare first, and picked up computer science when we started working together,” he said. “With me it was the other way around — I picked up healthcare from her.”
The siblings’ first AI tool, a smartphone app dubbed Eyeagnosis, has been tested as a screening tool for diabetic retinopathy by the Aditya Jyot Eye Hospital in Mumbai, India. And the nonprofit they started has grown to around two dozen student volunteers organizing events and workshops for fellow high schoolers who come from underrepresented communities.
Although the pair have a number of successful ventures under their belts already, school comes first.
“A lot of people have suggested we do a sibling-founded startup,” Kopparapu said, “but I really want to at least finish my undergrad before I look at that possibility.”
In the wake of Amazon’s boffo Alexa voice debut, a University of Michigan team published pioneering research on building conversational AI, attracting a wave of customer interest.
Jason Mars, a professor advising them, suggested they form a startup. And with that, Ann Arbor, Michigan-based conversational AI startup Clinc was born.
Clinc’s conversational AI platform enables customers to build voice applications — like in-car voice features, fast-food restaurant order services or personal banking assistants.
“What really tipped the scales to start something even bigger was that the industry was reaching out saying we want to commercialize it,” said Johann Hauswald, chief product officer and co-founder at Clinc, who recounted the company’s start five years ago.
Fast forward to today, and that’s turned into a big opportunity: Clinc has attracted a flood of customers and revenue.
The startup’s financial customers include Barclays, US Bank, S&P Global, and Turkey’s Isbank, which taps Clinc to offer a personal finance assistant, dubbed Maxi, to 6 million users.
Large financial institutions are well-aware of Clinc, which has been “dominating the space,” said Mars, Clinc’s CEO, speaking on stage last year at TechCrunch Disrupt.
The company’s roster of customers doesn’t stop at finance. Clinc’s AI platform — built to power voice assistants for everything from early-stage startups to Fortune 500 companies — can provide services for call centers, drive-thru restaurants, in-car systems, gaming and healthcare applications.
Breakthrough performance from NVIDIA’s AI platform has helped enable Clinc to push the boundaries on conversational AI to “deliver revolutionary services,” according to Mars.
Conversational AI Boom
To be sure, Clinc’s application-focused research stands out. It’s a mix of academic AI and how-to information for solving specific industry problems, which has attracted interest from some of the customers it’s landed to date.
Clinc raised a $52 million Series B round of funding earlier this year to help scale up to meet its customer demand.
Research firm Gartner forecasts that 15 percent of all customer service interactions will be handled by AI in 2021, a 400 percent jump from 2017.
Clinc: Talking Model Research
Academic discoveries are common launch pads for startups. But Clinc’s team at the University of Michigan went further: it built working models, provided the details companies needed to develop their own voice models, and spelled out the data center requirements to deliver the compute resources.
Clinc offers research, outlined in a published paper, on its Sirius voice personal assistant, as well as on an in-car assistant developed with Ford and aimed at applications for automakers.
Today it offers conversational AI in 80 languages and has production deployments on three continents.
Hardware to the Core
The Clinc team several years ago ran a cost-benefit analysis, finding that NVIDIA GPUs were the right choice for accelerated computing in the data center.
“GPUs were a big story in our research lab at the university,” said Hauswald.
Complex applications often require a multitude of complicated algorithms and optimizations to achieve the best possible performance, which also makes them the most compute-intensive, he said.
“We want to be able to train our models in a way that doesn’t take days to train or then our customers are unable to iterate on the quality of them,” said Hauswald.
Leading into TwitchCon — the world’s top gathering of livestreamers — we’re announcing the RTX Broadcast Engine, a new set of RTX-accelerated software development kits that use the AI capabilities of GeForce RTX GPUs to transform livestreams.
Powered by dedicated AI processors called Tensor Cores on RTX GPUs, the new SDKs enable virtual greenscreens, style filters and augmented reality effects — the kind of techniques used by major broadcast networks — all using AI and without the need for special equipment.
Livestreaming of video games has become a cultural phenomenon. Over 750 million people around the world tune in to watch people play video games. TwitchCon is where this global movement comes together. More than 50,000 streamers and fans will converge in San Diego this weekend to meet their favorite gamers and learn about the future of livestreaming.
RTX Brings AI to Livestreaming
NVIDIA GPUs are already the most popular choice to power the PC games played by streamers. They’re also used to encode and stream video to platforms such as Twitch, YouTube, Mixer, Huya and Douyu.
With the RTX Broadcast Engine’s AI-powered capabilities, NVIDIA is announcing a new way that RTX GPUs can enable more immersive livestreams — all without special cameras or physical props.
The new SDKs include:
RTX Greenscreen, to deliver real-time background removal of a webcam feed, so only your face and body show up on the livestream. The RTX Greenscreen AI model understands which part of an image is human and which is background, so gamers get the benefits of a greenscreen without needing to buy one.
RTX AR, which can detect faces, track facial features such as eyes and mouth, and even model the surface of a face, enabling real-time augmented reality effects using a standard web camera. Developers can use it to create fun, engaging AR effects, such as overlaying 3D content on a face or allowing a person to control 3D characters with their face.
RTX Style Filters, which use an AI technique called style transfer to transform the look and feel of a webcam feed based on the style of another image. With the press of a hotkey, you can style your video feed with your favorite painting or game art.
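Downstream of the AI model, a virtual greenscreen reduces to a compositing step driven by a per-pixel person mask. The sketch below assumes such a mask is already available (in the SDK it would come from the RTX Greenscreen model); the function name and threshold are illustrative.

```python
import numpy as np

def virtual_greenscreen(frame, person_mask, background, threshold=0.5):
    """Composite the person in `frame` onto `background`.

    frame, background: (H, W, 3) uint8 images.
    person_mask: (H, W) float array of per-pixel person probabilities,
                 e.g. the output of a segmentation network.
    """
    # Binarize the probabilities into a hard alpha matte.
    alpha = (person_mask > threshold).astype(np.float32)[..., None]
    out = alpha * frame + (1.0 - alpha) * background
    return out.astype(np.uint8)

# Toy 2x2 frame: left column is "person," right column is background.
frame = np.full((2, 2, 3), 200, dtype=np.uint8)
background = np.zeros((2, 2, 3), dtype=np.uint8)
mask = np.array([[0.9, 0.1],
                 [0.8, 0.2]])
print(virtual_greenscreen(frame, mask, background))
```

A production pipeline would soften the matte edges rather than threshold hard, but the compositing math is the same.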
NVIDIA and OBS Bringing RTX Greenscreen to Gamers
In addition, we’re working with OBS, one of the leading livestreaming applications, to integrate RTX Greenscreen. With it, livestreamers will be able to remove their background environment or instantly teleport themselves anywhere — in this world or in virtual ones. This feature will be showcased at TwitchCon for the first time and available in the coming months.
“NVIDIA has been at the top of my list when it comes to streaming and recording equipment. I’m continually impressed with what they’re doing,” said Hugh Bailey, author of OBS. “And their technology is impressive with RTX features like RTX Greenscreen.”
The RTX Broadcast Engine will enable streaming applications throughout the ecosystem to create immersive tools and effects for broadcasters to engage audiences and drive viewership.
“The new RTX Broadcast Engine is an exciting advancement that will allow developers in our app store to create powerful new tools for streamers with NVIDIA RTX GPUs,” said Ali Moiz, CEO of Streamlabs. “We’re thrilled to continue working with NVIDIA as they introduce new features to the Streamlabs developer community, and look forward to implementing this new technology.”
“We have collaborated with NVIDIA over the years on many projects and the introduction of the NVIDIA RTX Broadcast Engine is by far the most exciting,” said Miguel Molina, director of developer relations at XSplit. “For the XSplit team, we are excited to integrate these new tools into our suite of apps, enabling our users to create better content by maximizing the potential of NVIDIA GeForce RTX.”
In addition to the RTX Broadcast Engine, leading applications such as OBS, XSplit, Huya, Douyu and Streamlabs have deployed the NVIDIA Video Codec SDK for fast, high-quality streaming. Three new integrations made their debut this month:
Twitch Studio, a new, easy-to-use application for new livestreamers currently in beta, has integrated the Video Codec SDK to enable high-quality livestreaming.
Discord, the world’s leading gaming chat application, just released a new group broadcasting feature called “Go Live,” which uses NVIDIA GPUs and the Video Codec SDK to accelerate broadcasting games in Discord.
Elgato is one of the world’s leading manufacturers of video capture cards for gaming. It recently integrated the Video Codec SDK into the software of its new 4K60 Pro MK.2 capture card for recording 4K at 60fps video in High Dynamic Range.
Developers can learn more about the RTX Broadcast Engine and apply for early access at developer.nvidia.com/broadcastengine. Or stop by the OBS booth at TwitchCon, booth 1823, where we’ll be showing off RTX Greenscreen in OBS, new RTX Studio laptops and upcoming RTX games.
In the future imagined by Pinar Yanardag, a postdoctoral research associate at MIT Media Lab, AI will collaborate with humans, not replace them.
This is the concept behind her project, “How to Generate (Almost) Anything,” which she created with other students from the MIT Media Lab and professionals in the Boston area.
Yanardag sat down with AI Podcast host Noah Kravitz to talk about this project, along with her other new creations.
How to Generate (Almost) Anything tackles weekly projects that integrate human and AI creativity. “So these are artists and artisans from all walks of life. Sometimes, these people have no experience in AI, sometimes they’re a bit up to date,” Yanardag says.
Mystic PizzAI — Reinventing Gourmet Food with GPUs
The team starts with choosing something to generate — one of their first projects was pizza. Then they train a network using data they’ve collected. For their pizza project, they fed it a multitude of recipes. AI then generates its own content.
Yanardag and her colleagues find a human collaborator who evaluates the AI-generated idea and tweaks it. Their system produced a recipe for shrimp and jam pizza, a seemingly alarming combination.
But their collaborator, the chef of Crush Pizza in Boston, augmented the recipe with arugula. The result was so delicious that he’s considering adding it to his regular menu.
She’s proving that humans should be excited about, rather than fearful of, job automation. “These are the tasks we shouldn’t have to do in the first place,” Yanardag says. Humans can now “focus on more important skills — our emotions, our creativity, our empathy.”
That sentiment also helped Yanardag start the world’s first AI fashion brand, Coven.ai. She and cofounder Emily Salvador, also from the MIT Media Lab, create dresses based on AI-generated designs.
The AI component invents outfits humans might not think of — in Coven.ai’s reimagining of the classic Little Black Dress, one arm of the dress is a bell sleeve, and the other is straight.
Yanardag and Salvador are releasing new dresses on their site, but they’re also designing a platform in which the public can interact with their AI system.
Caption: Success with a dress: Coven.ai shows how AI can generate appealing fashion.
“The idea is, you can just generate new designs on your own using our tool, and you can also finetune some of the details in the dress, like different colors or different styles or different textures,” Yanardag says. Users could send that design to a tailor, who would make the dress for them.
For Yanardag, the next step is the democratization of AI. She points out that right now, a powerful GPU is required to create these inventions. But by lowering the entry barrier, we can “empower people to create beautiful things.”
Help Make the AI Podcast Better
Have a few minutes to spare? Fill out this short listener survey. Your answers will help us make a better podcast.
Despite being treatable, tuberculosis kills 1.6 million people every year.
This is because TB treatment is time- and cost-intensive, requiring extensive patient monitoring.
In developing countries, where the disease is most deadly, monitoring involves a form of testing that has been used for hundreds of years. Clinicians study samples of lung fluid (called sputum) under a microscope and manually count the number of TB bacteria present, which sometimes reach into the hundreds.
This method may be cheaper than other available tests, but it’s only accurate 50 percent of the time.
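The counting step itself is simple to state in code, which is part of what makes it a natural target for automation. The sketch below is a hypothetical stand-in: it counts connected blobs in an already-thresholded image with a flood fill, whereas a system like BacillAi uses a trained network to find and classify cells.

```python
import numpy as np

def count_cells(binary_image):
    """Count connected foreground blobs in a thresholded microscope image."""
    img = binary_image.astype(bool).copy()
    h, w = img.shape
    count = 0
    for i in range(h):
        for j in range(w):
            if img[i, j]:
                count += 1
                stack = [(i, j)]
                while stack:  # flood-fill to erase this whole blob
                    y, x = stack.pop()
                    if 0 <= y < h and 0 <= x < w and img[y, x]:
                        img[y, x] = False
                        stack += [(y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)]
    return count

# Toy thresholded sample with three separate blobs.
sample = np.array([
    [1, 1, 0, 0, 1],
    [0, 1, 0, 0, 0],
    [0, 0, 0, 1, 1],
])
print(count_cells(sample))  # 3
```

The hard part in practice is not the counting but the thresholding: as described above, stain strength and focus vary between samples, which is exactly where a learned model helps.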
Cambridge Consultants, a U.K.-based consultancy, has set out to investigate whether an AI-powered monitoring system could provide a feasible alternative for keeping tabs on this killer.
The result is BacillAi, a system that uses an AI-powered smartphone app and a standard-grade microscope to capture and analyze samples of sputum.
“With BacillAi, we wanted to tackle two main questions,” explained Richard Hammond, technology director of the Medical Technologies Division at Cambridge Consultants. “Can AI improve a labor-intensive, difficult process in healthcare diagnostics? And how could you go about making it available to those who need it most, even in the most remote and low-resource areas?”
Putting Manual Processes Under the Microscope
The current process for monitoring TB patients is inefficient and ineffective. Medical professionals review sample after sample each day, identifying and counting every single cell. This can take up to 40 minutes per case.
And the difficulty doesn’t stop there. Stains used to distinguish cells in the lung fluid can vary in strength between samples, and adjusting a microscope’s optical focus can alter colors.
Clinicians monitoring TB under these conditions face both mental and physical strain. With such a high risk of human error, patients often receive poor-quality results that arrive too late for them to start vital treatment.
To tackle this conundrum, Cambridge Consultants trained a deep learning system using data gathered from cultured surrogate bacteria and artificial sputum.
The final BacillAi concept consists of a standard low-cost microscope, modified with a mount for a smartphone, and an app with a convolutional neural network (CNN) at its heart.
A product like BacillAi could help clinicians determine the state of a patient’s health faster and more consistently than is currently possible. Patients would also have improved chances of fighting the disease.
Solving Challenges at Scale
A multidisciplinary team worked on developing BacillAi in Cambridge Consultants’ purpose-built deep learning research facility, which is powered by ONTAP AI. The space is designed specifically for discovering, developing and testing machine learning approaches in a secure environment.
The same research facility also developed Aficionado, an AI music classifier; Vincent, which turns your squiggles into art; and SharpWave, a tool that creates clear, undistorted views of the real world from a damaged or obscured moving image.
Discover Cambridge Consultants’ innovative approaches for yourself at The AI Summit, in San Francisco, Sept. 25-26.
Bananas are the world’s favorite fruit. If you don’t count tomatoes as fruit. (And really, who does?)
But banana crops around the globe are afflicted by diseases and pests that threaten the livelihoods of small-scale farmers, most of whom rely on just one or two cash crops and lack the resources commercial farms use to monitor the health of their plants.
A new AI app aims to help resource-poor farmers more accurately identify and treat banana diseases, improving their crop yields. Called Tumaini, meaning “hope” in Swahili, the app could also give nonprofits and governments more information and tools to control disease outbreaks in bananas and other crops.
Trained using NVIDIA GPU technology, the convolutional neural networks behind Tumaini achieve around 90 percent accuracy in detecting five common banana diseases and one pest.
It’s easy for farmers diagnosing problems in their banana crop to confuse the symptoms of fungal, bacterial and viral diseases. Many cause similar patterns of yellow leaf spots and decay. Misinterpreting the signs can waste precious time and resources.
“Especially in developing countries, smallholder farmers have minimal resources to spend on fertilizer and treatments,” said Michael Selvaraj, who led the project at the International Center for Tropical Agriculture. “If you’re spraying fungicide over plants with a bacterial disease, you’re wasting your money.”
Based in Cali, Colombia, the nonprofit organization is a research center of the international agricultural innovation network CGIAR.
Scientists from the nonprofit Bioversity International helped the team hand-label a dataset of 20,000 banana plant images collected from banana farms in southern India, Uganda, Burundi, Benin and the Democratic Republic of Congo. The team used field images for training to improve the AI’s ability to read low-quality images with background elements such as neighboring plants or leaves.
Bananas are a challenging crop to analyze for disease, because symptoms can appear in several different parts of the plant — from the fruit down to the trunk, known as the pseudostem.
AI Goes Bananas: The Tumaini app analyzes different areas of the banana plant to diagnose crop disease. Image courtesy of the International Center for Tropical Agriculture.
“It may be that the leaf looks very healthy,” Selvaraj said, “but when you cut open the pseudostem you can find the disease.”
The dataset was used to train six different neural networks, each analyzing images from a different part of the banana plant. This way, farmers using the Tumaini app can take pictures of multiple parts of a diseased crop, like the leaf and the fruit, to double-check the results of the AI model.
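The article doesn’t specify how Tumaini reconciles the six models’ outputs, so the rule below, averaging per-class probabilities across part-specific models, is only one plausible sketch of such double-checking; the class names are illustrative too.

```python
import numpy as np

CLASSES = ["healthy", "fusarium_wilt", "black_sigatoka", "bunchy_top"]

def fuse_predictions(part_probs):
    """Average per-class probabilities from several part-specific models.

    part_probs: dict mapping plant part -> probability vector over CLASSES.
    The averaging rule is an assumption, not Tumaini's documented method.
    """
    stacked = np.stack(list(part_probs.values()))
    mean = stacked.mean(axis=0)  # consensus distribution across plant parts
    return CLASSES[int(mean.argmax())], mean

# One prediction per photographed plant part, as a farmer might capture.
preds = {
    "leaf":       np.array([0.10, 0.70, 0.15, 0.05]),
    "fruit":      np.array([0.20, 0.55, 0.20, 0.05]),
    "pseudostem": np.array([0.05, 0.80, 0.10, 0.05]),
}
label, confidence = fuse_predictions(preds)
print(label)  # fusarium_wilt
```

Averaging makes a confident wrong answer from one plant part less likely to dominate, which matches the double-checking goal described above.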
After identifying the banana disease, Tumaini provides users with treatment guidance. To better serve farmers worldwide, the interface comes in five languages: English, French, Spanish, Swahili and Tamil — with translations in the works for two additional Indian languages, Hindi and Malayalam.
Spotting Banana Disease Early
Left unchecked, crop diseases can spread rapidly through infected tools, soil, water and insects. Some, like the major fungal disease Fusarium wilt, can survive for decades in soil.
Fusarium wilt has been affecting banana crops in Colombia for the last couple of years — but at first, local farmers misidentified the disease as viral. The misdiagnosis meant pathologists and government agencies were slow to spot the problem, which has since spread widely in the region.
“Monitoring and early detection is very important,” Selvaraj said. The app encourages farmers to geotag their pictures, so that researchers can flag when a disease shows up for the first time in a new area of the world. “If we had the app then, we would have gone earlier and taken some samples to confirm and avoid the outbreak.”
Pictures uploaded to Tumaini are sent to the researchers’ GPU system for inference, which takes just a few seconds depending on the user’s wireless connection. They’re also added to a database so the researchers can track global trends of banana disease.
Selvaraj and his team also plan to collect and analyze aerial images of banana crops captured by drones and the European Space Agency’s SENTINEL satellite program. By combining this remote data with GPS-tagged ground images from farmers using the app, the researchers can develop crop surveillance tools that monitor the global health of banana plants and alert local farming communities about outbreaks.
Deploying the AI tool in a smartphone app allows farmers to diagnose crop diseases in real-time in the field. Image courtesy of the International Center for Tropical Agriculture.
To broaden the scope of Tumaini, the scientists hope to add detection for additional banana diseases as well as other staple crops, like kidney beans. They’re also interested in adding resources and help lines to the app, so farmers can alert local governments about new crop diseases, contact pesticide and fungicide vendors, and learn about sustainable alternatives like biological pest control.
The team is additionally working to make the app available offline, so farmers can analyze crop images in the field, even without an internet connection.
Selvaraj says offline access and a multilingual, user-friendly interface are essential to make the app a viable solution for smallholder banana farmers. He expects demand for the app will grow further as smartphone adoption increases in Africa and India, two of the largest regions for banana production.
“AI in agriculture is still in an infant state,” he said. “We’re working today for an impact over the next 30 years.”
Main image by Wilfredo Rodríguez, licensed from Wikimedia Commons under CC BY-SA 3.0.
University of Waterloo researcher Alexander Wong didn’t have enough processing power for his computer vision startup, so he developed a workaround. That workaround is now the company’s product.
Waterloo, Ontario-based DarwinAI, founded by a team from the university, provides a platform for developers to generate slimmed-down models from neural networks. This offers a quicker way for developers to spin out multiple networks with smaller data footprints.
The company’s lean models are aimed at businesses developing AI-based edge computing networks to process mountains of sensor data from embedded systems and mobile devices.
Industries of all stripes — autonomous vehicles, manufacturing, aerospace, retail, healthcare and consumer electronics — are developing next-generation businesses with AI computing at the edge of their GPU-powered networks.
It’s estimated that by 2025 some 150 billion machine sensors and IoT devices will stream continuous data for processing.
Yet many companies find that the talent and computing resources required to build these models come at a steep cost.
DarwinAI’s position is that companies can reduce development time and costs — as DarwinAI did for itself — by using its platform to spin out compact models from full-sized ones.
“We can enable AI at the edge for mobile devices and clients who need to put powerful neural networks into cars, watches, airplanes and other areas,” said Sheldon Fernandez, CEO and co-founder at DarwinAI.
Generative Synthesis: Hello, World
DarwinAI’s platform, dubbed GenSynth, is the result of pioneering research on what’s called generative synthesis. There’s an easy way to think of generative synthesis: It’s AI to create AI.
Late last year, the startup’s founders released a research paper on generative synthesis, then fused that work with proprietary research to launch the company’s offering.
DarwinAI’s platform relies on machine learning to probe and understand the architecture of neural networks for customers. Then its AI generates a new family of neural networks that are functionally equivalent to the original but smaller and faster, according to the company.
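Generative synthesis itself is proprietary, so as an analogy only, the sketch below uses a classical compression technique, magnitude pruning, to show what “functionally similar but smaller” can mean: zero out the smallest weights and compare the two networks’ outputs.

```python
import numpy as np

rng = np.random.default_rng(42)

def prune(weights, sparsity=0.5):
    """Zero out the smallest-magnitude fraction of weights.

    Classical magnitude pruning, shown only as an illustration;
    DarwinAI's generative synthesis is a different, proprietary approach.
    """
    threshold = np.quantile(np.abs(weights), sparsity)
    return np.where(np.abs(weights) >= threshold, weights, 0.0)

# A toy one-layer "network": y = relu(W @ x).
W = rng.normal(0, 1, size=(64, 128))
x = rng.normal(size=128)

W_small = prune(W, sparsity=0.5)
y_full = np.maximum(W @ x, 0.0)
y_small = np.maximum(W_small @ x, 0.0)

kept = np.count_nonzero(W_small) / W.size
print(f"kept {kept:.0%} of weights")
```

The smaller weight matrix stores and multiplies fewer nonzero values, which is the same payoff, reduced footprint and faster inference at the edge, that DarwinAI targets with its generated networks.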
The company is a member of the NVIDIA Inception program that helps startups move to market faster.
Audi Rides DarwinAI Networks
The startup’s research has attracted interest from consumer electronics companies, aerospace and automakers, including Audi.
Audi’s case study with DarwinAI used the GenSynth platform to accelerate design of custom, optimized deep neural networks for object detection in autonomous driving.
The GenSynth platform helped Audi developers train models 4x faster and slash GPU processing time by three-fourths.
“They worked with two terabytes of data, and we really reduced the testing time,” said Fernandez. “There’s real savings for their GPU training time and real benefits for the developers.”
GPUs for GenSynth
DarwinAI developed GenSynth to reduce its own development time, tapping into NVIDIA GPUs on AWS and Microsoft Azure and local instances on premises to boost its coding cycles.
Many of DarwinAI’s early customers are now using the platform to speed their development. It also helps reduce the data processed on customers’ systems running NVIDIA Jetson modules on site and NVIDIA V100 Tensor Core GPUs in the cloud for training and inference.
“Deep learning is so complex that you need to collaborate with AI enabled by GPUs to do it properly — it will free up your time to do the creative work,” said Fernandez.
ThirdEye Labs, a London-based company and member of Inception, NVIDIA’s startup incubator, is combining off-the-shelf CCTV cameras with state-of-the-art AI algorithms to detect fraudulent activities in stores.
Caught AI Handed
Every year, U.S. retailers lose up to $32.25 billion due to theft.
In addition to those pocketing items straight from the shelves, it’s estimated that one percent of all customers who visit self-service checkouts steal. Sometimes it’s accidental — an item doesn’t scan through properly or the wrong type of pastry is selected from the bakery menu.
But some supermarket stealers are more slick — following schemes such as “the banana trick” (steaks scanned as potatoes) or “the switcheroo” (scanning the barcodes of cheaper items, instead of a pricier purchase).
To date, retailers’ attempts to deter thieves have had little effect. Hiring more security personnel is expensive and creates unpleasant shopping experiences, while security alarms are evaded and self-service counters continue to be deceived.
ThirdEye Labs’ AI algorithms help security staff work more effectively and efficiently. Trained on NVIDIA GPUs, the company’s deep learning networks can detect specific indicators of fraudulent behavior from CCTV footage and then alert staff, who can take appropriate action on the spot.
“We chose to train our algorithms on NVIDIA GPUs as they are fast, reliable and effective,” said Raz Ghafoor, CEO and co-founder at ThirdEye Labs. “Without the power of these GPUs, our development time would have doubled.”
ThirdEye Labs’ AI software can be used with existing security infrastructure — no additional hardware or software is needed. None of the video footage used is recorded or stored anywhere and the system doesn’t perform any facial recognition, meaning the system is GDPR compliant.
At stores where ThirdEye Labs’ system has been introduced at self-service checkouts, the AI technology analyzes every scan to detect non-scans, non-payments, substitute scanning and fraudulent refunds. Over the course of a month, two stores caught 27 thieves in action, up from basically zero, by implementing ThirdEye Labs’ point-of-sale system.
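The core idea — cross-checking each point-of-sale scan against what a vision model sees on the scanner bed — can be sketched roughly as below. This is a minimal illustration, not ThirdEye Labs’ proprietary pipeline; the function name, event format and confidence threshold are all hypothetical.

```python
def flag_suspect_scans(scan_events, vision_labels, min_conf=0.8):
    """Flag scans where the barcode disagrees with a confident visual prediction.

    scan_events:   list of (timestamp, scanned_sku) from the register
    vision_labels: list of (timestamp, predicted_sku, confidence) from a
                   hypothetical item classifier watching the scanner bed
    Returns (timestamp, scanned_sku, predicted_sku) tuples for mismatches,
    e.g. steak scanned as potatoes ("the banana trick").
    """
    alerts = []
    for (t_scan, sku), (_, pred_sku, conf) in zip(scan_events, vision_labels):
        # Only trust the vision model when it is confident in its label.
        if conf >= min_conf and pred_sku != sku:
            alerts.append((t_scan, sku, pred_sku))
    return alerts

scans = [(0, "potatoes"), (1, "bread")]
preds = [(0, "steak", 0.95), (1, "bread", 0.90)]
print(flag_suspect_scans(scans, preds))  # only the mismatched scan is flagged
```

In a real deployment the same comparison would extend to non-scans (an item seen by the camera but never scanned) and fraudulent refunds, with alerts pushed to staff in real time.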
In the aisles, too, fraudulent behavior hasn’t gone unnoticed. ThirdEye Labs’ “In-Aisle Theft Detector” sends security guards push notifications every time someone picks up high-risk items, like champagne bottles or fresh meat. They can then decide whether or not to take action, helping them work more efficiently and effectively.
The service has saved stores tens of thousands of dollars in losses by helping security guards have their eyes on the right person, at the right time.
The Future of Convenient Shopping
ThirdEye Labs plans to expand its technology further to improve customer shopping experiences.
Its “Queue Detector” will predict when lots of customers are about to get in checkout lines. By alerting staff, tills can be manned before the rush.
Its “Stock-out Detector” will help stores monitor their shelves and identify when stock is low. Empty shelves cost retailers an estimated three percent of their total revenue each year, so optimizing stock replenishment has big benefits for sellers as well as those looking to purchase.