Making Waves at CVPR: Inception Startups Exhibit GPU-Powered Work in Long Beach

Computer vision technology that can identify items in a shopping bag. Deep learning tools that inspect train tracks for defects. An AI model that automatically labels street-view imagery.

These are just a few of the AI breakthroughs being showcased this week by the dozens of NVIDIA Inception startups at the annual Computer Vision and Pattern Recognition conference, one of the world’s top AI research events.

The NVIDIA Inception virtual accelerator program supports startups harnessing GPUs for AI and data science applications. Since its launch in 2016, the program has expanded over tenfold in size, to over 4,000 companies. More than 50 of them can be found in the CVPR expo hall — exhibiting GPU-powered work spanning retail, robotics, healthcare and beyond.

Malong Technologies: Giving Retailers an Edge with AI

From self-serve weighing stations that automatically identify fresh produce items in a plastic shopping bag, to smart vending machines that can recognize when a shopper takes a beverage out of a cooler — product recognition AI developed by Malong Technologies is enabling frictionless shopping experiences.

Malong’s computer vision solutions are transforming traditional retail equipment into smarter devices, enabling machines to see the products within them to improve operational efficiency, security and the customer experience.

Using the NVIDIA Metropolis platform for smart cities, the company is building product recognition AI models that enable highly accurate, real-time decisions at the edge. Malong develops powerful, scalable intelligent video analytics tools that can accurately recognize hundreds of thousands of retail products in real time. The company researches weakly-supervised learning to significantly reduce the effort to retrain their models as product packaging and store environments change.

Malong was able to speed its inferencing by more than 40x compared to CPU when using DeepStream and TensorRT software libraries on the NVIDIA T4 GPU. The company uses NVIDIA V100 GPUs in the cloud for training, and the Jetson TX2 supercomputer on a module to bring true AI computing to the edge.

At CVPR, the company is at booth 1316 on the show floor and is presenting research that achieves a new gold standard for image retrieval, outperforming prior methods by a significant margin. Malong is also co-hosting the Fine-Grained Visual Categorization Workshop and organized the first ever retail product recognition challenge at CVPR.

ABEJA: Keeping Singapore’s Metros on Track

Manually inspecting railway tracks is a dangerous task, often done by workers at night when trains aren’t running. But with high-speed cameras, transportation companies can instead capture images of the tracks and use AI to automatically detect defects for railway maintenance.

ABEJA, based in Japan, is developing deep learning models that detect anomalies on tracks with more than 90 percent accuracy, a significant improvement over other automated inspection methods. The startup works with SMRT, Singapore’s leading public transport operator, to examine rail defects.

Founded in 2012, ABEJA builds deep learning tools for multiple industries, including retail, manufacturing and infrastructure. Other use cases include an AI to measure efficiency in car factories and a natural language processing model to provide insights for call centers.

The company uses NVIDIA GPUs on premises and in the cloud for training its AI models. For inference, ABEJA has used GPUs for real-time data processing and high-performance image segmentation projects. It has also deployed projects using NVIDIA Jetson TX2 for AI inference at the edge.

The startup is showing a demo of the ABEJA annotation model in its CVPR booth.

Mapillary: AI in the Streets

Sweden-based Mapillary uses computer vision to automate mapping. Its AI models break down and classify street-level images, segmenting and labeling elements like roads, lane markings, street lights and sidewalks. The company has to date processed hundreds of millions of images submitted by individual contributors, nonprofit organizations, companies and governments worldwide.

These labeled datasets can be used for various purposes, including to create useful maps for local governments, train self-driving cars, or build tools for people with disabilities.

Mapillary is presenting four papers at CVPR this year, including one titled Seamless Scene Segmentation. The model described in the research — a new approach that joins two AI models into one, setting a new state-of-the-art for performance — was trained on eight NVIDIA V100 GPUs.

The segmentation models featured in Mapillary’s CVPR booth were also trained using V100 GPUs. By adopting the NVIDIA TensorRT inference software stack in 2017, Mapillary was able to speed up its segmentation algorithms by up to 27x when running on the Amazon Web Services cloud.

Companies interested in the NVIDIA Inception virtual accelerator can visit the program website and apply to join. Inception members are eligible for a 20 percent discount on up to six NVIDIA TITAN RTX GPUs until Oct. 26.

Startups based in the following countries can request a discount code by emailing Australia, Austria, Belgium, Canada, Czech Republic, Denmark, Finland, France, Germany, Ireland, Italy, Luxembourg, the Netherlands, Norway, Poland, Spain, Sweden, United Kingdom, United States.

The post Making Waves at CVPR: Inception Startups Exhibit GPU-Powered Work in Long Beach appeared first on The Official NVIDIA Blog.

[N] Hindsight Experience Replay (HER) with SAC/DDPG/DQN support + Evolution Strategy bridge | Stable Baselines v2.6.0

Stable Baselines 2.6.0 was just released. It comes with a bunch of new features and improvements:

– a performance tested Hindsight Experience Replay (HER) re-implementation with SAC, DDPG and DQN support included (only custom DDPG was supported in the original OpenAI Baselines)

– you can now mix Reinforcement Learning (RL) and Evolution Strategies (ES) in a few lines of code, thanks to the new get/load parameters method (see example below with A2C + CMAES)

– a guide was added in the documentation to deal with NaNs and Infs:

Gist (for an example of mixing ES and RL):
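
A minimal sketch of the general idea, not the linked gist: treat the policy's parameters as a flat vector, let an evolution strategy propose perturbed copies, and load the best candidate back into the model. The `TinyPolicy` class, its `get_params`/`load_params` methods and the toy `episode_return` fitness are hypothetical stand-ins for a real RL model and environment, and a simple (1+lambda) strategy is swapped in for CMA-ES to keep the example dependency-free.

```python
import random


class TinyPolicy:
    """Hypothetical stand-in for an RL model that can export and import its parameters."""

    def __init__(self, n_params):
        self.params = [0.0] * n_params

    def get_params(self):
        return list(self.params)

    def load_params(self, params):
        self.params = list(params)


def episode_return(policy):
    """Toy fitness (assumption): higher is better, maximal when every parameter is 1."""
    return -sum((p - 1.0) ** 2 for p in policy.params)


def evolve(policy, generations=50, population=20, sigma=0.1, seed=0):
    """(1+lambda) evolution strategy over the policy's parameter vector."""
    rng = random.Random(seed)
    best = policy.get_params()
    best_fit = episode_return(policy)
    for _ in range(generations):
        parent = best  # all candidates of a generation perturb the same parent
        for _ in range(population):
            candidate = [p + rng.gauss(0.0, sigma) for p in parent]
            policy.load_params(candidate)
            fit = episode_return(policy)
            if fit > best_fit:
                best, best_fit = candidate, fit
    policy.load_params(best)  # leave the model holding the best parameters found
    return best_fit
```

In a real setup the RL algorithm would typically train first, and the evolution strategy would then refine the exported parameters (or the other way round); the loop above only shows the get, perturb, evaluate, load cycle.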

Colab Notebook (for testing HER):


Full changelog:

submitted by /u/araffin2

Japan’s Fastest Supercomputer Adopts NGC, Enabling Easy Access to Deep Learning Frameworks

From discovering drugs, to locating black holes, to finding safer nuclear energy sources, high performance computing systems around the world have enabled breakthroughs across all scientific domains.

Japan’s fastest supercomputer, ABCI, powered by NVIDIA Tensor Core GPUs, enables similar breakthroughs by taking advantage of AI. The system is the world’s first large-scale, open AI infrastructure serving researchers, engineers and industrial users to advance their science.

The software used to drive these advances is as critical as the servers the software runs on. However, installing an application on an HPC cluster is complex and time consuming. Researchers and engineers are unproductive as they wait to access the software, and their requests to have applications installed distract system admins from completing mission-critical tasks.

Containers — packages that contain software and relevant dependencies — allow users to pull and run the software on a system without actually installing the software. They’re a win-win for users and system admins.

NGC: Driving Ease of Use of AI, Machine Learning and HPC Software

NGC offers over 50 GPU-optimized containers for deep learning frameworks, machine learning algorithms and HPC applications that run on both Docker and Singularity.

The HPC applications provide scalable performance on GPUs within and across nodes. NVIDIA continuously optimizes key deep learning frameworks and libraries, with updates released monthly. This provides users access to top performance for training and inference for all their AI projects.

ABCI Runs NGC Containers

Researchers and industrial users are taking advantage of ABCI to run AI-powered scientific workloads across domains, from nuclear physics to manufacturing. Others are taking advantage of the system’s distributed computing to push the limits on speeding AI training.

To achieve this, the right set of software and hardware tools must be in place, which is why ABCI has adopted NGC.

“Installing deep learning frameworks from the source is complicated and upgrading the software to keep up with the frequent releases is a resource drain,” said Hirotaka Ogawa, team leader of the Artificial Intelligence Research Center at AIST. “NGC allows us to support our users with the latest AI frameworks and the users enjoy the best performance they can achieve on NVIDIA GPUs.”

ABCI has turned to containers to address another user need — portability.

“Most of our users are from industrial segments who are looking for portability between their on-prem systems and ABCI,” said Ogawa. “Thanks to NGC and Singularity, the users can develop, test, and deploy at scale across different platforms. Our sampling data showed that NGC containers were used by 80 percent of the over 100,000 jobs that ran on Singularity.”

NGC Container Replicator Simplifies Ease of Use for System Admins and Users

System admins managing HPC systems at supercomputing centers and universities can now download and save NGC containers on their clusters. This gives users faster access to the software, alleviates their network traffic, and saves storage space.

NVIDIA offers NGC Container Replicator, which automatically checks and downloads the latest versions of NGC containers.

[Figure: NGC Container Replicator chart]

Without lifting a finger, system admins can ensure that their users benefit from the superior performance and newest features from the latest software.

More Than Application Containers

In addition to deep learning containers, NGC hosts 60 pre-trained models and 17 model scripts for popular use cases like object detection, natural language processing and text to speech.

It’s much faster to tune a pre-trained model for a use case than to start from scratch. The pre-trained models allow researchers to quickly fine-tune a neural network or build on top of an already optimized network for specific use-case needs.

The model training scripts follow best practices, have state-of-the-art accuracy and deliver superior performance. They’re ideal for researchers and data scientists planning to build a network from scratch and customize it to their liking.

The models and scripts take advantage of mixed precision powered by NVIDIA Tensor Core GPUs to deliver up to 3x deep learning performance speedups over previous generations.

Take NGC for a Spin

NGC containers are built and tested to run on-prem and in the cloud. They also support hybrid as well as multi-cloud deployments. Visit, pull your application container on any GPU-powered system or major cloud instance, and see how easy it is to get up and running for your next scientific research.

The post Japan’s Fastest Supercomputer Adopts NGC, Enabling Easy Access to Deep Learning Frameworks appeared first on The Official NVIDIA Blog.

[R] How can I improve my material segmentations?

I am trying to perform material segmentation (essentially semantic segmentation with respect to materials) on street-view imagery. My dataset only has ground truth for select regions, so not all pixels have a label, and I calculate loss and metrics only within these ground truth regions. I use Semantic FPN (with the ResNet-50 backbone pre-trained on ImageNet), a learning rate of 0.001, momentum of 0.8, and the learning rate is divided by 4 if there is no validation loss improvement after three epochs. My loss function is a per-pixel multiclass cross-entropy loss.
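
For readers unfamiliar with computing a loss only within ground truth regions, here is a minimal sketch in plain Python. It is not the poster's code: the `IGNORE` marker value and the flat list-of-pixels layout are assumptions made for illustration.

```python
import math

IGNORE = -1  # hypothetical marker for pixels that have no ground truth label


def masked_cross_entropy(logits, labels, ignore_index=IGNORE):
    """Mean per-pixel cross-entropy, computed over labeled pixels only.

    logits: one list of class scores per pixel
    labels: one class id per pixel, or ignore_index for unlabeled pixels
    """
    total, count = 0.0, 0
    for scores, label in zip(logits, labels):
        if label == ignore_index:
            continue  # unlabeled pixels contribute neither loss nor gradient
        m = max(scores)  # subtract the max for a numerically stable log-sum-exp
        log_sum_exp = m + math.log(sum(math.exp(s - m) for s in scores))
        total += log_sum_exp - scores[label]
        count += 1
    return total / count if count else 0.0
```

Because unlabeled pixels are skipped entirely, whatever the network predicts there has no effect on the loss, which is one reason predictions outside the annotated regions can look messy.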

My dataset is extremely limited. Not only are not all pixels classified, I also only have 700 images and a severe class imbalance. I tried tackling this imbalance through loss class weighting (based on the number of ground truth pixels for each respective class, i.e. their area sizes), but it barely helps. I also possess, for every image, a depth map, which I (can) supply as a fourth channel to the input layer.
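
As an illustration of class weighting based on ground truth pixel counts, the sketch below derives inverse-frequency weights. The post does not state the exact formula used, so the "balanced" heuristic here (total labeled pixels divided by number of classes times the class's pixel count) is an assumption, as are the function name and the `ignore_index` convention.

```python
from collections import Counter


def inverse_frequency_weights(label_maps, num_classes, ignore_index=-1):
    """Per-class loss weights from labeled pixel counts (assumed 'balanced' heuristic)."""
    counts = Counter()
    for labels in label_maps:
        counts.update(label for label in labels if label != ignore_index)
    total = sum(counts.values())
    weights = []
    for c in range(num_classes):
        if counts[c] == 0:
            weights.append(0.0)  # class never appears in the ground truth
        else:
            weights.append(total / (num_classes * counts[c]))
    return weights
```

Each weight would multiply the per-pixel loss of pixels whose ground truth is that class, so rare classes such as plastic or gravel count for more per pixel than large-area classes.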

A table of results

Visualizations of images trained only on RGB

Visualizations of images trained on RGBD

Visualizations of images trained only on RGB, but with class loss weighting

Visualizations of images trained on RGBD, and with class loss weighting

Performance is pretty crappy. What’s more, there is very little difference between the results of my four experiments. Why is this? I would expect the addition of depth information (which encodes surface normals and perhaps texture information; pretty discriminative information) to help. Besides the overall metrics being rather low, the predictions are very messy, and the networks rarely, if ever, predict “small” classes (in terms of area size), e.g. plastic or gravel. This is to be expected with such a small amount of data, but I was wondering if there are any “performance hacks” that can boost my network, or if I am missing any obvious stuff? Or is data likely the only bottleneck here? Any suggestions are greatly appreciated!

PS. I also tried a simple ResNet-50 FCN (I simply upsample ResNet’s output until I have the same resolution; there aren’t even skip connections), and the results are worse, but at least they are smooth. Why are these more smooth?

submitted by /u/EmielBoss

[D] Do (or have) you ever work on a project for months, to almost abandon it and finally get decent results in the end?

As the title says: have you ever worked on a project for months straight, almost abandoned it because e.g. the models did not converge, and then finally got “lucky” and ended up with decent results?

The above just happened to me: I have been working full-time on something for the past 4 months without getting decent results, and I had completely run out of options, almost thinking the whole project had failed. In the end, due to some ‘luck’, the model turned out quite well, with results I’m happy with.

How often does this happen in the field and what are your experiences with such projects?

Is there any advice you would give out to others who are stuck on such a problem?

submitted by /u/xHipster

[Project] Fixed input and variable output neural network

Hey everyone,

I need some advice on choosing a neural network type which is suitable for the application described below.

I have a data set with 39600 samples/entries, each sample has an image and a corresponding vector of variable length.

I want to create a neural network capable of predicting the vector associated with an image based solely on the image.

So, I need a neural network which accepts a fixed length input (the image) and outputs a vector of variable length.

How can this be achieved?

Thank you.

submitted by /u/Pilo290

[R] Research roadblock. Help with Extreme Multi-Label Classification

Hey guys!

I have been working on the BioASQ challenge, Task A which is the large scale semantic indexing of PubMed abstracts. It is supposed to be my Master’s thesis but I have hit a roadblock.

The current state-of-the-art result, if we concern ourselves with just the micro-f score, is 68.8%, while I can’t seem to get past the 60% mark. I am currently using pre-trained bio-medical FastText word vectors with a bidirectional GRU, the output of which branches out into two parts. The first part computes a document vector using an attention mechanism, while the second part applies a CNN and then k-max pooling to get yet another document representation. Both vectors are merged along with some additional hand-crafted features, which are then finally fed to the output layer, which is of size 28,472 (the total number of labels) with sigmoid activation and binary cross entropy loss. Upon training this architecture on 3 million abstracts, I am getting a micro-f score of 58.2%.
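
For reference, a minimal sketch of how a micro-averaged F score is computed for multi-label predictions: true positives, false positives and false negatives are pooled over all documents and labels before the score is taken. The function name and the set-based input format are assumptions for illustration, not the challenge's official evaluation code.

```python
def micro_f1(true_sets, pred_sets):
    """Micro-averaged F1 over documents, each given as a set of label ids."""
    tp = fp = fn = 0
    for truth, pred in zip(true_sets, pred_sets):
        tp += len(truth & pred)   # labels predicted and correct
        fp += len(pred - truth)   # labels predicted but wrong
        fn += len(truth - pred)   # labels missed
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0
```

Since counts are pooled globally, frequent labels dominate the micro score, so the threshold applied to the sigmoid outputs (a single global one or one per label) can move it noticeably.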

I have tried a number of other methods and architectures but none are working. It is extremely frustrating since I have made absolutely no progress for the entirety of this month and I am growing anxious with every passing day as my deliverable deadline keeps coming closer. It would be of immense help if anyone could point me in the right direction on how to proceed further. What to read, what to change, etc. I did read about Label wise attention networks but cannot understand how to implement that in Keras. A small hint or some pseudocode would be of great help.

submitted by /u/atif_hassan



Toronto AI is a social and collaborative hub to unite AI innovators of Toronto and surrounding areas. We explore AI technologies in digital art and music, healthcare, marketing, fintech, vr, robotics and more. Toronto AI was founded by Dave MacDonald and Patrick O'Mara.