[D] ML Inference optimization, runtimes, compilers

I’m doing a study on inference latency. What are the different ways of optimizing a model for this? Let’s say the goal is to get inference latency as low as possible. I’ve heard of ONNX Runtime (apparently used by Microsoft in production) and compilers such as Intel nGraph, TVM, Intel OpenVINO and so on. Are these kinds of tools used in production, or do most companies just use PyTorch and TF inference mode? If anyone here has experience from unique deployments I’d love to hear about it!

submitted by /u/dilledalle

[D] Are small transformers better than small LSTMs?

Transformers are currently beating the state of the art on different NLP tasks.

Some examples are:

  • Machine translation: Transformer Big + BT
  • Named entity recognition: BERT large
  • Natural language inference: RoBERTa

Something I noticed is that in all of the papers, the models are massive, with maybe 20 layers and hundreds of millions of parameters.

Of course, using larger models is a general trend in NLP, but it raises the question of whether small transformers are any good. I recently had to train a sequence-to-sequence model from scratch and I was unable to get better results with a transformer than with LSTMs.

I am wondering if someone here has had similar experiences or knows of any papers on this topic.

submitted by /u/djridu

[D] Looking for suggestions for biomedical datasets similar to the Wisconsin Breast cancer database

I am looking for biomedical databases similar to the Wisconsin breast cancer database (available at ). This database has 9 features (each taking integer values from 1 to 10) and two classes: benign and malignant. The defining characteristic of this dataset is that higher feature values generally indicate a higher chance of abnormality (malignancy). I am looking for other biomedical datasets whose features have this property (not necessarily integer valued, they can also be real valued; preferably with a low number of features too, fewer than 30 or so).

submitted by /u/daffodils123

[D] What is the current state of the art in unsupervised document/information retrieval for NLP tasks?

Hello everybody,

Are there any good unsupervised methods for retrieving the top-k documents from a corpus based on a rather short query?

I did a bit of googling but couldn’t find anything that isn’t tf-idf based.

Maybe it would be possible to somehow retrieve similarities between docs and query by utilising contextual embeddings (such as from BERT) and use some sort of scoring function to evaluate it.
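A minimal sketch of that idea, assuming the query and each document have already been mapped to fixed-length vectors (for example by pooling contextual embeddings). The vectors below are made-up placeholders rather than real BERT outputs, cosine similarity is just one possible scoring function, and the function names are illustrative.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, doc_vecs, k=2):
    """Return the indices of the k documents most similar to the query, best first."""
    scored = [(cosine_similarity(query_vec, vec), idx) for idx, vec in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [idx for _, idx in scored[:k]]


# Placeholder "embeddings": three documents and one short query.
doc_vecs = [
    [1.0, 0.0, 0.0],
    [0.7, 0.7, 0.0],
    [0.0, 0.0, 1.0],
]
query_vec = [0.9, 0.1, 0.0]

ranking = top_k(query_vec, doc_vecs, k=2)
print(ranking)
```

No labels are needed at any point, so the approach stays unsupervised; how well it works depends entirely on how the vectors are produced.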

Anyway, thank you in advance for your answers.

submitted by /u/Slowai

AWS supports the Deepfake Detection Challenge with competition data and AWS credits

Today AWS is pleased to announce that it is working with Facebook, Microsoft, and the Partnership on AI on the first Deepfakes Detection Challenge.  The competition, to which we are contributing up to $1 million in AWS credits to researchers and academics over the next two years, is designed to produce technology that can be deployed to better detect when artificial intelligence has been used to alter a video in order to mislead the viewer. We plan to host the full competition dataset when it is made available later this year, and are offering the support of Amazon machine learning experts to help teams get started. We want to ensure access to this data for a diverse set of participants with varied perspectives to help develop the best possible solutions to combat the growing problem of “deepfakes.”

The same technology that has given us delightfully realistic animation effects in movies and video games has also been used by bad actors to blur the distinction between reality and fiction. “Deepfake” videos manipulate audio and video using artificial intelligence to make it appear as though someone did or said something they didn’t. These techniques can be packaged into something as simple as a cell phone app, and are already being used to deliberately mislead audiences by spreading fake viral videos through social media. The fear is that deepfakes may become so realistic that they will be used to damage reputations and sway popular opinion, and could in time make any piece of information suspect.

The Deepfakes Detection Challenge invites participants to build new approaches that can detect deepfake audio, video, and other tampered media. The challenge will kick off in December at the NeurIPS Conference with the release of a new dataset generated by Facebook, which comprises tens of thousands of example videos, both real and fake. Competitors will use this dataset to design novel algorithms that can tell whether a video is real or fake, and the algorithms will be evaluated against a secret test dataset (which will not be made available, to ensure a standard, scientific evaluation of entries).

Building deepfake detectors will require novel algorithms which can process this vast library of data (more than 4 petabytes). AWS will work with DFDC partners to explore options for hosting the data set, including the use of Amazon S3, and we will make $1 million in AWS credits available to develop and test these sophisticated new algorithms. All participants will be able to request a minimum of $1,000 in AWS credits to get started, with additional awards granted in quantities of up to $10,000 as entries demonstrate viability or success in detecting deepfakes. Participants can visit to learn more and request AWS credits.

The Deepfakes Detection Challenge steering committee is sharing the first 5,000 videos of the dataset with researchers working in this field. The group will collect feedback and host a targeted technical working session at the International Conference on Computer Vision (ICCV) in Seoul beginning on October 27, 2019. Following this due diligence, the full data set release and the launch of the Deepfakes Detection Challenge will coincide with the Conference on Neural Information Processing Systems (NeurIPS) this December.

To support participants in this endeavor, AWS will also provide access to Amazon ML Solutions Lab experts and solutions architects, who will offer technical support and guidance to help teams get started in the challenge. The Amazon ML Solutions Lab is a dedicated service offering for AWS customers that provides access to the same talent that built many of Amazon’s machine learning-powered products and services. These Amazon experts help AWS customers use machine learning technology to build intelligent solutions that address some of the world’s toughest challenges, like predicting famine, identifying cancer faster, and expediting assistance to areas hard hit by natural disasters. Amazon ML Solutions Lab experts will be paired with Challenge participants to provide assistance throughout the competition.

In addition to serving as a founding member of the Partnership on AI, AWS is also joining the non-profit’s Steering Committee on AI and Media Integrity. The goal, as with sponsorship of the Deepfakes Detection Challenge, is to coordinate the activities of media, tech companies, governments, and academia to promote technologies and policies that strengthen trust in media and help audiences differentiate fact from fiction.

To learn more about the Deepfakes Detection Challenge and receive updates on how to register and participate, visit Stay tuned for more updates as we get closer to kick-off!


About the Author

Michelle Lee is vice president of the Machine Learning Solutions Lab at AWS.



[D] Retrain your models, the Adam optimizer in PyTorch was fixed in version 1.3

I have noticed a small discrepancy between theory and the implementation of AdamW and in general Adam. The epsilon in the denominator of the following Adam update should not be scaled by the bias correction (Algorithm 2, L9-12). Only the running average of the gradient (m) and squared gradients (v) should be scaled by their corresponding bias corrections.

In the current implementation, the epsilon is scaled by the square root of bias_correction2. I have plotted this ratio as a function of step given beta2 = 0.999 and eps = 1e-8. In the early steps of optimization, this ratio slightly deviates from theory (denoted by the horizontal red line).
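A small sketch of the discrepancy, written with plain Python scalars rather than PyTorch tensors. It assumes the pre-1.3 update folds sqrt(bias_correction2) into the step size and adds eps to the uncorrected sqrt(v), which makes the effective epsilon eps / sqrt(bias_correction2). The function names and the lr and beta1 defaults are illustrative choices, not taken from the post.

```python
import math


def effective_eps_ratio(step, beta2=0.999):
    """Effective epsilon divided by the nominal eps when eps is added
    before the second-moment bias correction (assumed pre-1.3 behaviour)."""
    bias_correction2 = 1.0 - beta2 ** step
    return 1.0 / math.sqrt(bias_correction2)


def adam_step_theory(m, v, step, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """Update magnitude with eps added after bias-correcting m and v."""
    m_hat = m / (1.0 - beta1 ** step)
    v_hat = v / (1.0 - beta2 ** step)
    return lr * m_hat / (math.sqrt(v_hat) + eps)


def adam_step_scaled_eps(m, v, step, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """Update magnitude with eps added to the uncorrected sqrt(v) and the
    bias corrections folded into the step size (assumed pre-1.3 form)."""
    bias_correction1 = 1.0 - beta1 ** step
    bias_correction2 = 1.0 - beta2 ** step
    step_size = lr * math.sqrt(bias_correction2) / bias_correction1
    return step_size * m / (math.sqrt(v) + eps)


# The ratio starts well above 1 and decays towards 1 as training proceeds.
for step in (1, 10, 100, 1000, 10000):
    print(step, effective_eps_ratio(step))
```

Under these assumptions the two forms agree closely whenever sqrt(v) is much larger than eps, and they differ noticeably only when the squared-gradient average is tiny during the early steps.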

See more here:

submitted by /u/Deepblue129

5G Meets AI: NVIDIA CEO Details ‘Smart Everything Revolution,’ EGX for Edge AI, Partnerships with Leading Companies

The smartphone revolution that’s swept the globe over the past decade is just the start, NVIDIA CEO Jensen Huang declared Monday.

Next up: the “smart everything revolution,” Huang told a crowd of hundreds from telcos, device manufacturers, developers, and press at his keynote ahead of the Mobile World Congress gathering in Los Angeles this week.

“The smartphone revolution is the first of what people will realize someday is the IoT revolution, where everything is intelligent, where everything is smart,” Huang said. He squarely positioned NVIDIA to power AI at the edge of enterprise networks and in the virtual radio access networks – or vRANs – powering next-generation 5G wireless services.

Among the dozens of leading companies joining NVIDIA as customers and partners cited during Huang’s 90-minute address are Walmart (which is already building NVIDIA’s latest technologies into its showcase Intelligent Retail Lab), BMW, Ericsson, Microsoft, NTT, Procter & Gamble, Red Hat, and Samsung Electronics.

Anchoring NVIDIA’s story: the NVIDIA EGX edge supercomputing platform, a high-performance cloud-native edge computing platform optimized to take advantage of three key revolutions – AI, IoT and 5G – providing the world’s leading companies the ability to build next-generation services.

“The smartphone moment for edge computing is here and a new type of computer has to be created to provision these applications,” said Huang speaking at the LA Convention Center. He noted that if the global economy can be made just a little more efficient with such pervasive technology, the opportunity can be measured in “trillions of dollars per year.”

Ericsson Exec Joins on Stage Marking Collaboration

Ericsson’s Fredrik Jejdling, executive vice president and head of business area networks, joined NVIDIA CEO Jensen Huang on stage to announce Ericsson and NVIDIA’s collaboration on 5G radio.

A key highlight: a new collaboration on 5G with Ericsson to build high-performance software-defined radio access networks.

Joining Jensen on stage was Ericsson’s Fredrik Jejdling, executive vice president and head of business area networks. The company is a leader in the radio access network industry, one of the key building blocks for high-speed wireless networks.

“As an industry we’ve, in all honesty, been struggling to find alternatives that are better and higher performance than our current bespoke environment,” Jejdling said. “Our collaboration is figuring out an efficient way of providing that, combining your GPUs with our heritage.”

The collaboration brings Ericsson’s expertise in radio access network technology together with NVIDIA’s leadership in high-performance computing to fully virtualize the 5G Radio, giving telcos unprecedented flexibility.

Together NVIDIA and Ericsson are innovating to fuse 5G, supercomputing and AI for a revolutionary communications platform that will someday support trillions of always-on devices.

Red Hat, NVIDIA to Create Carrier-Grade Telecommunications Infrastructure


Huang also announced a new collaboration with Red Hat to build carrier-grade cloud-native telecom infrastructure with EGX for AI, 5G RAN and other workloads. The enterprise software provider already serves 120 telcos around the world, powering every member of the Fortune 500.

Together, NVIDIA and Red Hat will bring carrier-grade Kubernetes, which automates the deployment, scaling, and management of applications, to telcos so they can orchestrate and manage 5G RANs in a truly software-defined mobile edge.

“Red Hat is joining us to integrate everything we’re working on and make it a carrier grade stack,” Huang said. “The rest of the industry has joined us as well, every single data center computer maker, the world’s leading enterprise software makers, have all joined us to take this platform to market.”


NVIDIA Aerial to Accelerate 5G

For carriers, Huang also announced NVIDIA Aerial, a CUDA-X software developer kit running on top of EGX.

Aerial allows telecommunications companies to build completely virtualized 5G radio access networks that are highly programmable, scalable and energy efficient — enabling telcos to offer new AI services such as smart cities, smart factories, AR/VR and cloud gaming.

Technology for the Enterprise Edge

In addition to telcos, enterprises will also increasingly need high performance edge servers to make decisions from large amounts of data in real-time using AI.

EGX combines NVIDIA CUDA-X software, a collection of NVIDIA libraries that provide a flexible and high-performance programming language to developers, with NVIDIA-certified GPU servers and devices.

The result enables companies to harness rapidly streaming data — from factory floors to manufacturing inspection lines to city streets — delivering AI and other next-generation services.

Microsoft, NVIDIA Technology Collaboration

To offer customers an end-to-end solution from edge to cloud, Microsoft and NVIDIA are working together in a new collaboration to more closely integrate Microsoft Azure with EGX. In addition, NVIDIA T4 GPUs are featured in a new form factor of Microsoft’s Azure Data Box edge appliance.

Other top technology companies collaborating with NVIDIA on the EGX platform include Cisco, Dell Technologies, Hewlett Packard Enterprise, Mellanox and VMware.

Walmart Adopts EGX to Create Store of the Future

Huang cited Walmart as an example of EGX’s power.

The retail giant is deploying it in its Levittown, New York, Intelligent Retail Lab. It’s a unique, fully operating grocery store where the retail giant explores the ways AI can further improve in-store shopping experiences.


Using EGX’s advanced AI and edge capabilities, Walmart can process in real time the more than 1.6 terabytes of data generated per second. This helps it automatically alert associates to restock shelves, open up new checkout lanes, retrieve shopping carts and ensure product freshness in meat and produce departments.

Squeezing out just half a percent of efficiency across the $30 trillion retail sector represents an enormous opportunity, Huang noted. “The opportunity for using automation to improve efficiency in retail is extraordinary,” Huang said.

BMW, Procter & Gamble, Samsung, Among Leaders Adopting EGX

That power is already being harnessed for a dizzying array of real-world applications across the world:

  • Korea’s Samsung Electronics, in another early EGX deployment, is using AI at the edge for highly complex semiconductor design and manufacturing processes.
  • Germany’s BMW is using intelligent video analytics and EGX edge servers in its South Carolina manufacturing facility to automate inspection.
  • Japan’s NTT East uses EGX in its data centers to develop new AI-powered services in remote areas through its broadband access network.
  • The U.S.’s Procter & Gamble, the world’s top consumer goods company, is working with NVIDIA to develop AI-enabled applications on top of the EGX platform for the inspection of products and packaging.

Cities, too, are grasping the opportunity. Las Vegas uses EGX to capture vehicle and pedestrian data to ensure safer streets and expand economic opportunity.  And San Francisco’s prime shopping area, the Union Square Business Improvement District, uses EGX to capture real-time pedestrian counts for local retailers.

Stunning New Possibilities

To demonstrate the possibilities, Huang punctuated his keynote with demos showing what AI can unleash in the world around us.

In a flourish that stunned the crowd, Huang made a red McLaren Senna prototype, which carries a price of a hair under $1 million, materialize on stage in augmented reality. It could be viewed from any angle, including from the inside, on a smartphone streaming data over Verizon’s 5G network from a Verizon data center in Los Angeles.

The technology behind the demo: Autodesk VRED running in a virtual machine on a Quadro RTX 8000 server. On the phone: a 5G client built with NVIDIA’s CloudXR client application software development kit for mobile devices and head-mounted displays.

And, in a video, Huang showed how the Jarvis multi-modal AI was able to follow queries from two different speakers conversing on different topics, the weather and restaurants, as they drove down the road, reacting to what the computer sees as well as what is said.

In another video, Jarvis guided a shopper through a purchase in a real-world store.

“In the future these kind of multi-modal AIs will make the conversation and the engagement you have with the AI much much better,” Huang said.

Cloud Gaming Goes Global

Huang also detailed how NVIDIA is expanding its cloud gaming network through partnerships with global telecommunications companies.

GeForce NOW, NVIDIA’s cloud gaming service, transforms underpowered or incompatible devices into a powerful GeForce gaming PC with access to popular online game stores.

Taiwan Mobile joins industry leaders rolling out GeForce NOW, including Korea’s LG U+, Japan’s Softbank, and Russia’s Rostelecom in partnership with GFN.RU. Additionally, Telefonica will kick off a cloud gaming proof-of-concept in Spain.

Huang showed what’s now possible with a real-time demo of a gamer playing Assetto Corsa Competizione on GeForce NOW, as a cameraman watched over his shoulder, on a smartphone over a 5G network. The gamer navigated through the demanding racing game’s action with no noticeable lag.

The mobile version of GeForce NOW for Android devices is available in Korea and will be available widely later this year, with a preview on display at Mobile World Congress Los Angeles.

“These servers are going to be the same servers that run intelligent agriculture and intelligent retail,” Huang said. “The future is software defined and these low latency services that need to be deployed at the edge can now be provisioned at the edge with these servers.”

A Trillion New Devices

The opportunities for AI, IoT, cloud gaming, augmented reality and 5G network acceleration are huge — with a trillion new IoT devices to be produced between now and 2035, according to industry estimates.

And GPUs are up to the challenge, with GPU computing power growing 300,000x from 2013, driving down the cost per teraflop of computing power, even as gains in CPU performance level off, Huang said.

NVIDIA is well positioned to help telcos and enterprises make the most of this by helping customers combine AI algorithms, powerful GPUs, smart NICs (network interface cards), cloud-native technologies, the NVIDIA EGX accelerated edge computing platform, and 5G high-speed wireless networks.

Huang compared all these elements to the powerful “infinity stones” featured in Marvel’s movies and comic books.

“What you’re looking at are the six miracles that will make it possible to put 5G at the edge, to virtualize the 5G data center and create a world of smart everything,” Huang said. That, in turn, will add intelligence to everything in the world around us.

“This will be a pillar, a foundation for the smart everything revolution,” Huang said.

The post 5G Meets AI: NVIDIA CEO Details ‘Smart Everything Revolution,’ EGX for Edge AI, Partnerships with Leading Companies appeared first on The Official NVIDIA Blog.


Toronto AI is a social and collaborative hub to unite AI innovators of Toronto and surrounding areas. We explore AI technologies in digital art and music, healthcare, marketing, fintech, vr, robotics and more. Toronto AI was founded by Dave MacDonald and Patrick O'Mara.