Author: torontoai

Expanding Universe for HPC, NVIDIA CEO Brings GPU Acceleration to Arm

Broadening support for GPU-accelerated supercomputing to a fast-growing new platform, NVIDIA founder and CEO Jensen Huang Monday introduced a reference design for building GPU-accelerated Arm servers, with wide industry backing.

Huang, speaking Monday at the SC19 supercomputing show in Denver, also announced that Microsoft has built NDv2, a “supersized instance” that’s the world’s largest GPU-accelerated cloud-based supercomputer, on its Azure cloud-computing platform.

He additionally unveiled NVIDIA Magnum IO, a suite of GPU-accelerated I/O and storage software to eliminate data transfer bottlenecks for AI, data science and HPC workloads.

In a two-hour talk, Huang wove together these announcements with an update on developments from around the industry, setting out a sweeping vision of how high performance computing is expanding out in all directions.

HPC Universe Expanding in All Directions

“The HPC universe is expanding in every single direction at the same time,” Huang told a standing-room only crowd of some 1,400 researchers and technologists at the start of the world’s biggest supercomputing event. “HPC is literally everywhere today. It’s in supercomputing centers, in the cloud, at the edge.”

Driving that expansion are factors such as streaming HPC from massive sensor arrays; using edge computing to do more sophisticated filtering; running HPC in the cloud; and using AI to accelerate HPC.

“All of these are undergoing tremendous change,” Huang said.

Putting an exclamation mark on his talk, Huang debuted the world’s largest interactive volume visualization: An effort with NASA to simulate a Mars landing in which a craft the size of a two-story condominium traveling at 12,000 miles an hour screeches safely to a halt in just seven minutes. And it sticks the landing.

Huang said the simulation enables 150 terabytes of data, equivalent to 125,000 DVDs, to be flown through at random access. “To do that, we’ll have a supercomputing analytics instrument that sits next to a supercomputer.”

Expanding the Universe for HPC

Kicking off his talk, Huang detailed how accelerated computing powers the work of today’s computational scientists, whom he calls the da Vincis of our time.

The first AI supercomputers already power scientific research into phenomena as diverse as fusion energy and gravitational waves, Huang explained.

Accelerated computing, meanwhile, powers exascale systems tackling some of the world’s most challenging problems.

They include efforts to identify extreme weather patterns at Lawrence Berkeley National Lab … Research into the genomics of opioid addiction at Oak Ridge National Laboratory … Nuclear waste remediation efforts led by LBNL, the Pacific Northwest National Lab and Brown University at the Hanford site … And cancer-detection research led by Oak Ridge National Laboratory and the State University of New York at Stony Brook.

At the same time, AI is being put to work across an ever-broader array of industries. Earlier this month, the U.S. Post Office, the world’s largest delivery service — which processes nearly 500 million pieces of mail a day — announced it’s adopting end-to-end AI technology from NVIDIA.

“It’s the perfect application for a streaming AI computer,” Huang said.

And last month, in partnership with Ericsson, Microsoft, Red Hat and others, Huang revealed that NVIDIA is powering AI at the edge of enterprise and 5G telco networks with the NVIDIA EGX Edge Supercomputing platform.

Next up for HPC: harnessing vast numbers of software-defined sensors to relay data to programmable edge computers, which in turn pass on the most interesting data to supercomputers able to wring insights out of oceans of real-time data.

Arm in Arm: GPU-Acceleration Speeds Emerging HPC Architecture

Monday’s news marks a milestone for the Arm community. The processor architecture — ubiquitous in smartphones and IoT devices — has long been the world’s most popular. Arm has more than 100 billion computing devices and will cross the trillion mark in the coming years, Huang predicted.

NVIDIA’s moving fast to bring HPC tools of all kinds to this thriving ecosystem.

“We’ve been working with the industry, all of you, and the industry has really been fantastic, everybody is jumping on,” Huang said, adding that 30 applications are already up and running. “This is going to be a great ecosystem — basically everything that runs in HPC should run on any CPU as well.”

World-leading supercomputing centers have already begun testing GPU-accelerated Arm-based computing systems, Huang said. This includes Oak Ridge and Sandia National Laboratories, in the United States; the University of Bristol, in the United Kingdom; and Riken, in Japan.

NVIDIA’s reference design for GPU-accelerated Arm servers — comprising both hardware and software building blocks — has already won support from key players in HPC and Arm ecosystems, Huang said.

In the Arm ecosystem, NVIDIA is teaming with Arm, Ampere, Fujitsu and Marvell. NVIDIA is also working with Cray, a Hewlett Packard Enterprise company, and HPE. A wide range of HPC software companies are already using NVIDIA CUDA-X libraries to bring their GPU-enabled management and monitoring tools to the Arm ecosystem.

The reference platform’s debut follows NVIDIA’s announcement earlier this year that it will bring its CUDA-X software platform to Arm. Fulfilling this promise, NVIDIA is previewing its Arm-compatible software developer kit — available for download now — consisting of NVIDIA CUDA-X libraries and development tools for accelerated computing.

Microsoft Brings GPU-Powered Supercomputer to Azure

“This puts a supercomputer in the hands of every scientist in the world,” Huang said as he announced NDv2, a GPU-powered supercomputer now available on Microsoft Azure.

Giving HPC researchers and others instant access to unprecedented amounts of GPU computing power, Huang announced NDv2, a GPU-powered supercomputer now available on Microsoft Azure that ranks among the world’s fastest.

“Now you can open up an instance, you grab one of the stacks … in the container, you launch it, on Azure, and you’re doing science,” Huang said. “It’s really quite fantastic.”

Built to handle the most demanding AI and HPC applications, the Azure NDv2 instance can scale up to 800 NVIDIA V100 Tensor Core GPUs interconnected with Mellanox InfiniBand.

For the first time, researchers and others can rent an entire AI supercomputer on demand, matching the capabilities of large-scale, on-premise supercomputers that can take months to deploy.

AI researchers needing fast solutions can quickly spin up multiple Azure NDv2 instances and train complex conversational AI models in just hours, Huang explained.

For example, Microsoft and NVIDIA engineers used 64 NDv2 instances on a pre-release version of the cluster to train BERT, a popular conversational AI model, in roughly three hours.

Magnum IO Software

Helping AI researchers and data scientists move data in minutes, rather than hours, Huang introduced the NVIDIA Magnum IO software suite.

Delivering up to 20x faster data processing for multi-server, multi-GPU computing nodes, Magnum IO eliminates a key bottleneck faced by those carrying out complex financial analysis, climate modeling and other high-performance workloads.

“This is an area that is going to be rich with innovation, and we are going to be putting a lot of energy into helping you move information in and out of the system,” Huang said.

A key feature of Magnum IO is NVIDIA GPUDirect Storage, which provides a direct data path between GPU memory and storage, enabling data to bypass CPUs and travel unencumbered on “open highways” offered by GPUs, storage and networking devices.

NVIDIA developed Magnum IO in close collaboration with industry leaders in networking and storage, including DataDirect Networks, Excelero, IBM, Mellanox and WekaIO.

The post Expanding Universe for HPC, NVIDIA CEO Brings GPU Acceleration to Arm appeared first on The Official NVIDIA Blog.

NVIDIA and Partners Bring AI Supercomputing to Enterprises

Academia, hyperscalers and scientific researchers have been big beneficiaries of high performance computing and AI infrastructure. Yet businesses have largely been on the outside looking in.

No longer. NVIDIA DGX SuperPOD provides businesses a proven design formula for building and running enterprise-grade AI infrastructure with extreme scale. The reference architecture gives businesses a prescription to follow to avoid exhaustive, protracted design and deployment cycles and capital budget overruns.

Today, at SC19, we’re taking DGX SuperPOD a step further. It’s available as a consumable solution that now integrates with the leading names in data center IT — including DDN, IBM, Mellanox and NetApp — and is fulfilled through a network of qualified resellers. We’re also working with ScaleMatrix to bring self-contained data centers in a cabinet to the enterprise.

The Rise of the Supercomputing Enterprise

AI is an accelerant for gaining competitive advantage. It can open new markets and even address a business’s existential threats. Formerly untrainable models for use cases like natural language processing become solvable with massive infrastructure scale.

But leading-edge AI demands leadership-class infrastructure — and DGX SuperPOD offers extreme-scale multi-node training of even the most complex models, like BERT for conversational AI.

It consolidates often siloed pockets of AI and machine learning development into a centralized shared infrastructure, bringing together data science talent so projects can quickly go from concept to production at scale.

And it maximizes resource efficiency, avoiding stranded, underutilized assets and increasing a business’s return on its infrastructure investments.

Data Center Leaders Support NVIDIA DGX SuperPOD 

Several of our partners have completed the testing and validation of DGX SuperPOD in combination with their high-performance storage offerings and the Mellanox InfiniBand and Ethernet terabit-speed network fabric.

DGX SuperPOD with IBM Spectrum Storage

“Deploying faster with confidence is only one way our clients are realizing the benefits of the DGX SuperPOD reference architecture with IBM Storage,” said Douglas O’Flaherty, director of IBM Storage Product Marketing. “With comprehensive data pipeline support, they can start with an all-NVMe ESS 3000 flash solution and adapt quickly. With the software-defined flexibility of IBM Spectrum Scale, the DGX SuperPOD design easily scales, extends to public cloud, or integrates IBM Cloud Object Storage and IBM Spectrum Discover. Supported by the expertise of our business partners, we enhance data science productivity and organizational adoption of AI.”

DGX SuperPOD with DDN Storage

“Meeting the massive demands of emerging large-scale AI initiatives requires compute, networking and storage infrastructure that exceeds architectures historically available to most commercial organizations,” said James Coomer, senior vice president of products at DDN. “Through DDN’s extensive development work and testing with NVIDIA and their DGX SuperPOD, we have demonstrated that it is now possible to shorten supercomputing-like deployments from weeks to days and deliver infrastructure and capabilities that are also rock solid and easy to manage, monitor and support. When combined with DDN’s A3I data management solutions, NVIDIA DGX SuperPOD creates a real competitive advantage for customers looking to deploy AI at scale.”

DGX SuperPOD with NetApp

“Industries are gaining competitive advantage with high performance computing and AI infrastructure, but many are still hesitant to take the leap due to the time and cost of deployment,” said Robin Huber, vice president of E-Series at NetApp. “With the proven NVIDIA DGX SuperPOD design built on top of the award-winning NetApp EF600 all-flash array, customers can move past their hesitation and will be able to accelerate their time to value and insight while controlling their deployment costs.”

NVIDIA has built a global network of partners who’ve been qualified to sell and deploy DGX SuperPOD infrastructure:

  • In North America: Worldwide Technologies
  • In Europe, the Middle East and Africa: ATOS
  • In Asia: LTK and Azwell
  • In Japan: GDEP

To get started, read our solution brief and then reach out to your preferred DGX SuperPOD partner.

Scaling Supercomputing Infrastructure — Without a Data Center

Many organizations that need to scale supercomputing simply don’t have access to a data center that’s optimized for the unique demands of AI and HPC infrastructure. We’re partnering with ScaleMatrix, a DGX Ready Data Center Program partner, to bring self-contained data centers in a rack to the enterprise.

In addition to colocation services for DGX infrastructure, ScaleMatrix offers its Dynamic Density Control cabinet technology, which enables businesses to bypass the constraints of data center facilities. This lets enterprises deploy DGX POD and SuperPOD environments almost anywhere while delivering the power and technology of a state-of-the-art data center.

With self-contained cooling, fire suppression, various security options, shock mounting, extreme environment support and more, the DDC solution offered through our partner Microway removes the dependency on having a traditional data center for AI infrastructure.

Learn more about this offering here.

The post NVIDIA and Partners Bring AI Supercomputing to Enterprises appeared first on The Official NVIDIA Blog.

[P] MLP output of first layer is zero after one epoch

I’ve been running into an issue lately trying to train a simple MLP.

I’m basically trying to get a network to map the XYZ position and RPY orientation of the end-effector of a robot arm (6-dimensional input) to the angle of every joint of the robot arm to reach that position (6-dimensional output), so this is a regression problem.

I’ve generated a dataset using the angles to compute the current position, and generated datasets with 5k, 500k and 5M sets of values.

My issue is the MLP I’m using doesn’t learn anything at all. Using Tensorboard (I’m using Keras), I’ve realized that the output of my very first layer is always zero (see Image1 ), no matter what I try.

Basically, my input is a shape (6,) vector and the output is also a shape (6,) vector.

Here is what I’ve tried so far, without success:

  • I’ve tried MLPs with 2 layers of size 12, 24; 2 layers of size 48, 48; 4 layers of size 12, 24, 24, 48.
  • Adam, SGD, RMSprop optimizers
  • Learning rates ranging from 0.15 to 0.001, with and without decay
  • Both Mean Squared Error (MSE) and Mean Absolute Error (MAE) as the loss function
  • Normalizing the input data, and not normalizing it (the first 3 values are between -3 and +3, the last 3 are between -pi and pi)
  • Batch sizes of 1, 10, 32
  • Tested the MLP on all 3 datasets of 5k values, 500k values and 5M values.
  • Tested with number of epochs ranging from 10 to 1000
  • Tested multiple initializers for the bias and kernel.
  • Tested both the Sequential model and the Keras functional API (to make sure the issue wasn’t how I called the model)
  • All 3 of sigmoid, relu and tanh activation functions for the hidden layers (the last layer is a linear activation because it’s a regression)

Additionally, I’ve tried the very same MLP architecture on the basic Boston housing price regression dataset by Keras, and the net was definitely learning something, which leads me to believe that there may be some kind of issue with my data. However, I’m at a complete loss as to what it may be: the system in its current state does not learn anything at all, and the loss just stalls starting on the 1st epoch.
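
One way to probe the "issue with my data" hypothesis is to inspect every input and target column before training. The following is a minimal sketch using only the Python standard library; the function names `column_stats` and `standardize` and the two-row sample are hypothetical and not taken from the poster's code. It reports each column's range, counts non-finite values, and rescales columns to zero mean and unit variance.

```python
import math

def column_stats(rows):
    """Return per-column min, max, mean, std and the count of non-finite values.
    Assumes every column holds at least one finite value."""
    stats = []
    for j in range(len(rows[0])):
        col = [row[j] for row in rows]
        finite = [v for v in col if math.isfinite(v)]
        mean = sum(finite) / len(finite)
        var = sum((v - mean) ** 2 for v in finite) / len(finite)
        stats.append({
            "min": min(finite),
            "max": max(finite),
            "mean": mean,
            "std": math.sqrt(var),
            "non_finite": len(col) - len(finite),
        })
    return stats

def standardize(rows, stats):
    """Rescale each column to zero mean and unit variance.
    Columns with zero spread are only centred, to avoid dividing by zero."""
    return [
        [(v - s["mean"]) / s["std"] if s["std"] > 0 else v - s["mean"]
         for v, s in zip(row, stats)]
        for row in rows
    ]

# Hypothetical sample: XYZ roughly in [-3, 3], RPY in [-pi, pi]
sample = [
    [1.0, -2.0, 0.5, 0.0, math.pi, -math.pi],
    [3.0, 2.0, -0.5, 1.0, -math.pi, math.pi],
]
stats = column_stats(sample)
scaled = standardize(sample, stats)
```

A column with a non-zero `non_finite` count, or a target column whose `std` is close to zero, would be worth checking before changing the network again.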

Any help or lead would be appreciated, and I will gladly provide code or data if needed!

Thank you

submitted by /u/deathlymonkey

[P] Update: DepthAI hardware: Demo video MobileNetSSD (20class) running at 25FPS

u/Luxonis-Brandon put together a video demonstrating the real-time speed of the DepthAI.

The device is something we’ve been working on that combines disparity depth and AI via Intel’s Myriad X VPU. We’ve developed a SoM that’s not much bigger than a US quarter which takes direct image inputs from 3 cameras (2x OV9282, 1x IMX378), processes it, and spits the result back to the host via USB3.1.

Our ultimate goal is to develop a rear-facing AI vision system that will alert cyclists of potential danger from distracted drivers, so we needed disparity + AI to get object localization outputs – an understanding of where and what objects are. This needs to happen fast and with as little latency as possible… and at the edge… and at low power!

There are some Myriad X solutions on the market already, but most use PCIe, so the data pipeline isn’t as direct as Sensor->Myriad->Host, and the existing solutions also don’t offer a three-camera solution for RGBd. So, we built it!

If anyone has any questions or comments, we’d love to hear them!

Shameless plugs for our hackaday and crowdsupply 🙂

submitted by /u/Luxonis-Brian

Exploring images on social media using Amazon Rekognition and Amazon Athena

If you’re like most companies, you wish to better understand your customers and your brand image. You’d like to track the success of your marketing campaigns, and the topics of interest—or frustration—for your customers. Social media promises to be a rich source of this kind of information, and many companies are beginning to collect, aggregate, and analyze the information from platforms like Twitter.

However, more and more social media conversations center around images and video; on one recent project, approximately 30% of all tweets collected included one or more images. These images contain relevant information that is not readily accessible without analysis.

About this blog post

  • Time to complete: 1 hour
  • Cost to complete: ~$5 (at publication time, depending on terms used)
  • Learning level: Intermediate (200)
  • AWS services: Amazon Rekognition, Amazon Athena, Amazon Kinesis Data Firehose, Amazon S3, AWS Lambda

Overview of solution

The following diagram shows the solution components and how the images and extracted data flow through them.

These components are available through an AWS CloudFormation template.

  1. Twitter Search API collects Tweets.
  2. Amazon Kinesis Data Firehose dispatches the tweets to store in an Amazon S3 bucket.
  3. The creation of an S3 object in the designated bucket folder triggers a Lambda function.
  4. The Lambda sends each tweet text to Amazon Comprehend to detect sentiment (positive or negative) and entities (real-world objects such as people, places, and commercial items, as well as precise references to measures such as dates and quantities). For more information, see DetectSentiment and DetectEntities in the Amazon Comprehend Developer Guide.
  5. The Lambda checks each tweet for media of type ‘photo’ in the tweet’s extended_entities field. If the photo has either a .JPG or .PNG extension, the Lambda calls the following Amazon Rekognition APIs for each image:
    • detect_labels, to identify objects such as Person, Pedestrian, Vehicle, and Car in the image.
    • detect_moderation_labels, to determine if an image or stored video contains unsafe content, such as explicit adult content or violent content.
    • If the detect_labels API returns a Text label, detect_text extracts lines, words, or letters found in the image.
    • If the detect_labels API returns a Person label, the Lambda calls the following:
      • detect_faces, to detect faces and analyze them for features such as sunglasses, beards, and mustaches.
      • recognize_celebrities, to detect as many celebrities as possible in different settings, cosmetic makeup, and other conditions.

      The results from all calls for a single image are combined into a single JSON record. For more information about these APIs, see Actions in the Amazon Rekognition Developer Guide.

  6. The results of the Lambda go to Kinesis Data Firehose. Kinesis Data Firehose batches the records and writes them to a designated S3 bucket and folder.
  7. You can use Amazon Athena to build tables and views over the S3 datasets, then catalogue these definitions in the AWS Glue Data Catalog. The table and view definitions make it much easier to query the complex JSON objects contained in these S3 datasets.
  8. After the processed tweets land in S3, you can query the data with Athena.
  9. You can also use Amazon QuickSight to visualize the data, or Amazon SageMaker or Amazon EMR to process the data further. For more information, see Build a social media dashboard using machine learning and BI services. This post uses Athena.
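
The branching in step 5 can be sketched in Python as follows. This is an illustration under stated assumptions, not the Lambda code shipped with the template: `photo_urls`, `analyze_image`, and the lowercase keys of the combined record are hypothetical names, and `client` is assumed to behave like a boto3 Rekognition client (for example, `boto3.client("rekognition")`).

```python
def photo_urls(tweet):
    """Return media URLs of type 'photo' that end in .jpg or .png."""
    media = tweet.get("extended_entities", {}).get("media", [])
    return [
        m["media_url"]
        for m in media
        if m.get("type") == "photo"
        and m.get("media_url", "").lower().endswith((".jpg", ".png"))
    ]

def analyze_image(client, image_bytes):
    """Call the Rekognition APIs from step 5 and merge the results into one record."""
    image = {"Bytes": image_bytes}
    labels = client.detect_labels(Image=image)["Labels"]
    record = {
        "labels": labels,
        "moderationlabels": client.detect_moderation_labels(Image=image)["ModerationLabels"],
    }
    names = {label["Name"] for label in labels}
    # Text extraction only runs when detect_labels reports a Text label.
    if "Text" in names:
        record["textdetections"] = client.detect_text(Image=image)["TextDetections"]
    # Face and celebrity analysis only run when a Person label is present.
    if "Person" in names:
        record["facedetails"] = client.detect_faces(Image=image, Attributes=["ALL"])["FaceDetails"]
        record["celebrityfaces"] = client.recognize_celebrities(Image=image)["CelebrityFaces"]
    return record
```

Passing the client in as a parameter keeps the function easy to exercise with a stub, so the call pattern can be checked without incurring Rekognition charges.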


This walkthrough has the following prerequisites:

  • An AWS account.
  • An app on Twitter. To create an app, see the Apps section of the Twitter Development website.
    • Create a consumer key (API key), consumer secret key (API secret), access token, and access token secret. The solution uses them as parameters in the AWS CloudFormation stack.


This post walks you through the following steps:

  • Launching the provided AWS CloudFormation template and collecting tweets.
  • Checking that the stack created datasets on S3.
  • Creating views over the datasets using Athena.
  • Exploring the data.

S3 stores the raw tweets and the Amazon Comprehend and Amazon Rekognition outputs in JSON format. You can use Athena table and view definitions to flatten the complex JSON produced and extract your desired fields. This approach makes the data easier to access and understand.

Launching the AWS CloudFormation template

This post provides an AWS CloudFormation template that creates all the ingestion components that appear in the previous diagram, except for the S3 notification for Lambda (the dotted blue line in the diagram).

  1. In the AWS Management Console, launch the AWS CloudFormation Template.

    This launches the AWS CloudFormation stack automatically into the us-east-1 Region.

  2. In the post Build a social media dashboard using machine learning and BI services, in the section “Build this architecture yourself,” follow the steps outlined, with the following changes:
    • Use the Launch Stack link from this post.
    • If the AWS Glue database socialanalyticsblog already exists (for example, if you completed the walkthrough from the previous post), change the name of the database when launching the AWS CloudFormation stack, and use the new database name for the rest of this solution.
    • For Twitter Languages, use ‘en’ (English) only. This post removed the Amazon Comprehend Translate capability for simplicity and to reduce cost.
    • Skip the section “Setting up S3 Notification – Call Amazon Translate/Comprehend from new Tweets.” This occurs automatically when launching the AWS CloudFormation stack by the “Add Trigger” Lambda function.
    • Stop at the section “Create the Athena Tables” and complete the following instructions in this post instead.

You can modify which terms to pull from the Twitter streaming API to be those relevant for your company and your customers. This post used several Amazon-related terms.

This implementation makes two Amazon Comprehend calls and up to five Amazon Rekognition calls per tweet. The cost of running this implementation is directly proportional to the number of tweets you collect. If you’d like to modify the terms to something that may retrieve tens or hundreds of tweets a second, for efficiency and for cost management, consider performing batch calls or using AWS Glue with triggers to perform batch processing versus stream processing.
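
To get a feel for how call volume scales, the per-tweet figures above can be turned into a rough upper bound. The sketch below is illustrative only: `max_api_calls` is a hypothetical helper, it follows the per-tweet counts stated in the text, and it deliberately leaves out pricing, which varies by service and over time.

```python
def max_api_calls(tweets, tweets_with_photos):
    """Upper bound on API calls: 2 Amazon Comprehend calls per tweet and
    up to 5 Amazon Rekognition calls per tweet that carries a photo."""
    return {
        "comprehend": 2 * tweets,
        "rekognition_max": 5 * tweets_with_photos,
    }

# Hypothetical volumes, not measurements from this post
calls = max_api_calls(tweets=1000, tweets_with_photos=50)
```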

Checking the S3 files

After the stack has been running for approximately five minutes, datasets start appearing in the S3 bucket (rTweetsBucket) that the AWS CloudFormation template created. Each dataset is represented as the following files sitting in a separate directory in S3:

  • Raw – The raw tweets as received from Twitter.
  • Media – The output from calling the Amazon Rekognition APIs.
  • Entities – The results of Amazon Comprehend entity analysis.
  • Sentiment – The results of Amazon Comprehend sentiment analysis.

See the following screenshot of the directory:

For the entity and sentiment tables, see Build a social media dashboard using machine learning and BI services.

When you have enough data to explore (which depends on how popular your selected terms are and how frequently they have images), you can stop the Twitter stream producer, and stop or terminate the Amazon EC2 instance. This stops your charges from Amazon Comprehend, Amazon Rekognition, and EC2.

Creating the Athena views

The next step is manually creating the Athena database and tables. For more information, see Getting Started in the Athena User Guide.

This is a great place to use AWS Glue crawling features in your data lake architectures. The crawlers automatically discover the data format and data types of your different datasets that live in S3 (as well as relational databases and data warehouses). For more information, see Defining Crawlers.

  1. In the Athena console, in Query Editor, access the file sql. The AWS CloudFormation stack created the database and tables for you automatically.
  2. Load the view create statements into the Athena query editor one by one, and execute. This step creates the views over the tables.

Compared to the prior post, the media_rekognition table and the views are new. The tweets table has a new extended_entities column for images and video metadata. The definitions of the other tables remain the same.

Your Athena database should look similar to the following screenshot. There are four tables, one for each of the datasets on S3. There are also three views, combining and exposing details from the media_rekognition table:

  • celeb_view focuses on the results of the recognize_celebrities API
  • media_image_labels_query focuses on the results from the detect_labels API
  • media_image_labels_face_query focuses on the results from the detect_faces API

Explore the table and view definitions. The JSON objects are complex, and these definitions show a variety of uses for querying nested objects and arrays with complex types. Now many of the queries can be relatively simple, thanks to the underlying table and view definitions encapsulating the complexity of the underlying JSON. For more information, see Querying Arrays with Complex Types and Nested Structures.
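
Several queries later in this post use CROSS JOIN UNNEST to turn a nested array into rows. As a rough mental model, the following Python sketch does the same thing to a list of tweet-like dictionaries. The function name `unnest_urls` and the sample records are hypothetical; only the `entities.urls` and `expanded_url` field names come from the queries in this post.

```python
def unnest_urls(tweets):
    """Yield one flat row per (tweet, url) pair, similar to
    CROSS JOIN UNNEST(entities.urls). Tweets without URLs yield no rows."""
    for tweet in tweets:
        for url in tweet.get("entities", {}).get("urls", []):
            yield {"id": tweet["id"], "expanded_url": url["expanded_url"]}

# Hypothetical sample records
tweets = [
    {"id": 1, "entities": {"urls": [{"expanded_url": "https://example.com/a"},
                                     {"expanded_url": "https://example.org/b"}]}},
    {"id": 2, "entities": {"urls": []}},
]
rows = list(unnest_urls(tweets))
```

The tweet with an empty `urls` array disappears from the output, which is also what a cross join against an empty unnested array does.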

Exploring the results

This section describes four use cases for this data and provides SQL to extract similar data. Because your search terms and timeframe are different from those in this post, your results will differ. This post used a set of Amazon-related terms. The tweet collector ran for approximately six weeks and collected approximately 9.5M tweets. From the tweets, there were approximately 0.5M photos, about 5% of the tweets. This number is low compared to some other sets of business-related search terms, where approximately 30% of tweets contained photos.

This post reviews four image use cases:

  1. Buzz
  2. Labels and faces
  3. Suspect content
  4. Exploring celebrities


Buzz

Major topic areas represented by the links associated with the tweets often provide a good complement to the tweet language content topics surfaced via natural language processing. For more information, see Build a social media dashboard using machine learning and BI services.

The first query is which websites the tweets linked to. The following code shows the top domain names linked from the tweets:

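-- Count tweets per linked host name and keep the ten most frequent
SELECT lower(url_extract_host(url.expanded_url)) AS domain,
         count(*) AS count
FROM
    (SELECT *
    FROM "tweets"
    CROSS JOIN UNNEST (entities.urls) t (url))
GROUP BY 1
ORDER BY 2 DESC
LIMIT 10;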
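
For readers who want to check the idea outside Athena, the same count of tweets per linked host name can be sketched in Python. This is an illustrative equivalent, not part of the solution: `top_domains` and the sample URLs are hypothetical.

```python
from collections import Counter
from urllib.parse import urlparse

def top_domains(expanded_urls, n=10):
    """Count host names (urlparse lower-cases them) and return the n most common."""
    hosts = (urlparse(u).hostname for u in expanded_urls)
    return Counter(h for h in hosts if h).most_common(n)

# Hypothetical sample URLs
urls = [
    "https://Example.com/a",
    "https://example.com/b",
    "https://example.org/c",
]
result = top_domains(urls)
```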

The following screenshot shows the top 10 domains returned:

Links to Amazon websites are frequent, and several different properties are named, such as,, and

Further exploration shows that many of these links are to product pages on the Amazon website. It’s easy to recognize these links because they have /dp/ (for detail page) in the link. You can get a list of those links, the images they contain, and the first line of text in the image (if there is any), with the following query:

-- Join tweet URLs to the Rekognition output and keep product (/dp/) links
SELECT tweetid,
         user_name,
         media_url,
         element_at(textdetections,1).detectedtext AS first_line,
         expanded_url
FROM
    (SELECT id,
         "user".name AS user_name,
         url.expanded_url as expanded_url
    FROM tweets
    CROSS JOIN UNNEST (entities.urls) t (url)) tweet_urls
JOIN
    (SELECT media_url,
         tweetid,
         image_labels.textdetections AS textdetections
    FROM media_rekognition) rk
    ON rk.tweetid = tweet_urls.id
WHERE lower(url_extract_host(expanded_url)) IN ('', '', '', '')
        AND NOT position('/dp/' IN url_extract_path(expanded_url)) = 0 -- url links to a product
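
The product-page test in the WHERE clause looks for /dp/ in the URL path only. A minimal Python sketch of that check follows; `is_product_link` and the example URLs are hypothetical.

```python
from urllib.parse import urlparse

def is_product_link(expanded_url):
    """True when the URL path (not the query string) contains '/dp/'."""
    return "/dp/" in urlparse(expanded_url).path

# Hypothetical URLs
product = is_product_link("https://example.com/some-item/dp/B000000000")
not_product = is_product_link("https://example.com/help?ref=/dp/")
```

Restricting the test to the path mirrors the use of url_extract_path in the query and avoids matching /dp/ that only appears in a query string.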

The following screenshot shows some of the records returned by this query. The first_line column shows the results returned by the detect_text API for the image URL in the media_url column.

Many of the images do contain text. You can also identify the products the tweet linked to; many of the tweets are product advertisements by sellers, using images that relate directly to their product.

Labels and faces

You can also get a sense of the visual content of the images by looking at the results of calling the Amazon Rekognition detect_labels API. The following query finds the most common objects found in the photos:

SELECT label_name,
         COUNT(*) AS count
FROM media_image_labels_query
GROUP BY  label_name
ORDER BY  2 DESC -- most common labels first
LIMIT 50;

The following screenshot shows the results of that request. The most popular label by far is Human or Person, with Text, Advertisement, and Poster coming soon after. Novel is further down the list. This result reflects the most popular product being tweeted about on the Amazon website—books.

You can explore the faces further by looking at the results of the detect_faces API. That API returns details for each face in the image, including the gender, age range, face position, whether the person is wearing sunglasses or has a mustache, and the expression(s) on their face. Each of these features also has a confidence level associated with it. For more information, see DetectFaces in the Amazon Rekognition Developer Guide.

The view media_image_labels_face_query unnests many of these features from the complex JSON object returned by the API call, making the fields easy to access.

You can explore the view definition for media_image_labels_face_query. It applies the reduce operator to the array of (emotion, confidence) pairs that Amazon Rekognition returned, keeps the expression category with the highest confidence score, and exposes it under the name top_emotion. See the following code:

reduce(facedetails.emotions, element_at(facedetails.emotions, 1), (s_emotion, emotion) -> IF((emotion.confidence > s_emotion.confidence), emotion, s_emotion), (s) -> s) top_emotion
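
The same fold can be written in Python, which may make the lambda easier to read. This is only an illustrative equivalent: `top_emotion` as a function name and the sample values are hypothetical.

```python
from functools import reduce

def top_emotion(emotions):
    """Return the entry with the highest confidence, or None for an empty list.
    Starts from the first element, as element_at(facedetails.emotions, 1) does."""
    if not emotions:
        return None
    return reduce(
        lambda best, e: e if e["confidence"] > best["confidence"] else best,
        emotions,
        emotions[0],
    )

# Hypothetical sample values
emotions = [
    {"type": "HAPPY", "confidence": 12.5},
    {"type": "CALM", "confidence": 80.1},
    {"type": "CONFUSED", "confidence": 3.2},
]
best = top_emotion(emotions)
```

Because the comparison is strict, the earliest entry wins when two emotions tie on confidence.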

You can then use the exposed field, top_emotion. See the following code:

SELECT top_emotion.type AS emotion ,
         top_emotion.confidence AS emotion_confidence ,
         milfq.* ,   
         "user".id AS user_id ,
         "user".screen_name ,
         "user".name AS user_name ,
        url.expanded_url AS url
FROM media_image_labels_face_query milfq
INNER JOIN tweets
    ON tweets.id = tweetid, UNNEST(entities.urls) t (url)
WHERE position('.amazon.' IN url.expanded_url) > 0;

The following screenshot shows columns from the middle of this extensive query, including glasses, age range, and where the edges of this face are positioned. This last detail is useful when multiple faces are present in a single image, to distinguish between the faces.

You can look at the top expressions found on these faces with the following code:

SELECT top_emotion.type AS emotion,
         COUNT(*) AS "count"
FROM media_image_labels_face_query milfq
WHERE top_emotion.confidence > 50
GROUP BY top_emotion.type
ORDER BY 2 desc; 

The following screenshot of the query results shows that CALM is the clear winner, followed by HAPPY. Oddly, there are far fewer confused than disgusted expressions.

Suspect content

A topic of frequent concern is whether there is content in the tweets, or the associated images, that should be moderated. One of the Amazon Rekognition APIs that the Lambda function calls for each image is detect_moderation_labels, which returns labels denoting the category of content found, if any. For more information, see Detecting Unsafe Content.

The following code finds tweets with suspect images. Twitter also provides a possibly_sensitive flag based solely on the tweet text.

SELECT tweetid,
    transform(image_labels.moderationlabels, ml -> ml.name) AS moderationlabels,
    "mediaid", "media_url",
    "url"."expanded_url" AS url,
    (CASE WHEN ("substr"("tweets"."text", 1, 2) = 'RT') THEN true
    ELSE false END) "isretweet"
FROM media_rekognition
JOIN tweets   -- assumed: the tweets table referenced by "tweets"."id" below
    ON ("tweets"."id" = "tweetid"), UNNEST("entities"."urls") t (url)
WHERE cardinality(image_labels.moderationlabels) > 0
        OR possibly_sensitive = True;

The following screenshot shows the first few results. For many of these entries, the tweet text or the image may contain sensitive content, but not necessarily both. Including both criteria provides additional safety.

Note the use of the transform construct in the preceding query to map over the JSON array of moderation labels that Amazon Rekognition returned. This construct lets you transform the original content of the moderationlabels object (in the following array) into a list containing only the name field:

[{confidence=52.257442474365234, name=Revealing Clothes, parentname=Suggestive}, {confidence=52.257442474365234, name=Suggestive, parentname=}]  
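For readers more familiar with Python, the same mapping can be sketched as a list comprehension. The data below is the array shown above rewritten as Python dictionaries; the variable names are illustrative.

```python
# The moderation labels array from above, as Python dictionaries.
moderation_labels = [
    {"confidence": 52.257442474365234, "name": "Revealing Clothes", "parentname": "Suggestive"},
    {"confidence": 52.257442474365234, "name": "Suggestive", "parentname": ""},
]

# Keep only the name field of each label.
names = [ml["name"] for ml in moderation_labels]
print(names)  # ['Revealing Clothes', 'Suggestive']
```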

You can narrow this query to specific types of unsafe content by filtering on particular moderation labels. For more information, see Detecting Unsafe Content.
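One way to picture such a filter is the Python sketch below, which checks whether any label, or its parent category, is in a set of categories you care about. The helper name matches_category and the 50.0 confidence floor are assumptions made for illustration, not part of the original pipeline.

```python
def matches_category(labels, wanted, min_confidence=50.0):
    """Return True if any label (or its parent) is in `wanted`.

    `labels` is a list of dicts with confidence, name and parentname keys,
    like the moderationlabels array. `min_confidence` is a hypothetical floor.
    """
    wanted = {w.lower() for w in wanted}
    return any(
        ml["confidence"] >= min_confidence
        and (ml["name"].lower() in wanted or ml["parentname"].lower() in wanted)
        for ml in labels
    )


# Hypothetical sample data in the shape of the moderationlabels array.
labels = [
    {"confidence": 52.26, "name": "Revealing Clothes", "parentname": "Suggestive"},
    {"confidence": 52.26, "name": "Suggestive", "parentname": ""},
]
print(matches_category(labels, {"Suggestive"}))
print(matches_category(labels, {"Violence"}))
```

Matching on the parent name as well as the label name means a single top-level category catches all of its more specific labels.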

Many of these tweets contain links to products. URLs on the Amazon website follow a pattern: any URL with /dp/ in it is a link to a product page. You could use that pattern to identify the products that may have explicit content associated with them.
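The /dp/ pattern can be checked with a few lines of Python. This is a minimal sketch: the function name is_product_url is hypothetical, the '.amazon.' host check mirrors the WHERE clause used in the earlier query, and the URLs are placeholders rather than real product links.

```python
from urllib.parse import urlparse


def is_product_url(url):
    """Return True if the URL is on an Amazon domain and its path contains /dp/."""
    parsed = urlparse(url)
    return ".amazon." in parsed.netloc and "/dp/" in parsed.path


# Placeholder URLs for illustration only.
print(is_product_url("https://www.amazon.com/dp/B000000000"))
print(is_product_url("https://www.amazon.com/gp/help/customer/display.html"))
```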

Exploring celebrities

One of the Amazon Rekognition APIs that the Lambda function called for each image was recognize_celebrities. For more information, see Recognizing Celebrities in an Image.

The following code helps determine which celebrities appear most frequently in the dataset:

SELECT name as celebrity,
         COUNT (*) as count
FROM celeb_view
GROUP BY  name
ORDER BY  COUNT (*) desc;

The result counts instances of celebrity recognitions, and counts an image with multiple celebrities multiple times.
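The counting behavior can be illustrated with a small Python sketch. The rows below are hypothetical (image, celebrity) pairs standing in for rows of celeb_view; JaneRoe is an invented placeholder name alongside the JohnDoe label used in this post.

```python
from collections import Counter

# Hypothetical (image, celebrity) rows; img1 contains two recognized celebrities.
recognitions = [
    ("img1", "JohnDoe"),
    ("img1", "JaneRoe"),
    ("img2", "JohnDoe"),
]

# One count per recognition, so img1 contributes two rows to the totals.
counts = Counter(name for _, name in recognitions)
distinct_images = len({image for image, _ in recognitions})

print(counts.most_common())
print(sum(counts.values()), "recognitions across", distinct_images, "images")
```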

For example, assume there is a celebrity with the label JohnDoe. To explore their images further, use the following query. This query finds the images associated with tweets in which JohnDoe appeared in the text or the image.

SELECT cv.media_url,
         COUNT (*) AS count ,
         detectedtext
FROM celeb_view cv
LEFT JOIN      -- left join to catch cases with no text 
    (SELECT tweetid,
         mediaid,
         textdetection.detectedtext AS detectedtext
    FROM media_rekognition , UNNEST(image_labels.textdetections) t (textdetection)
    WHERE (textdetection.type = 'LINE'
            AND textdetection.id = 0) -- get the first line of text
    ) mr
    ON ( cv.mediaid = mr.mediaid
        AND cv.tweetid = mr.tweetid )
WHERE ( ( NOT position('johndoe' IN lower(tweettext)) = 0 ) -- JohnDoe IN text
        OR ( (NOT position('johndoe' IN lower(name)) = 0) -- JohnDoe IN image
AND matchconfidence > 75) )  -- with pretty good confidence
GROUP BY  cv.media_url, detectedtext;

The recognize_celebrities API matches each face in an image to the closest-appearing celebrity. It returns that celebrity’s name and related information, along with a confidence score. At times, the result can be misleading; for example, a person whose face is turned away, or who is wearing sunglasses, can be difficult to identify correctly. In other instances, the API may match a model pictured in the image because of their resemblance to a celebrity. It may be beneficial to combine this query with logic using the face_details response, to check for glasses or for face position.
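As one possible shape for that extra logic, the Python sketch below accepts a celebrity match only if the face is not wearing sunglasses and is not turned too far to the side. It assumes you have already paired each celebrity match with the corresponding DetectFaces face detail; the helper name is_reliable_match and the 45-degree yaw limit are hypothetical, while the 75 confidence floor mirrors the query above.

```python
def is_reliable_match(match_confidence, face_detail, min_confidence=75.0, max_yaw=45.0):
    """Return True if a celebrity match looks trustworthy.

    `face_detail` is a dict shaped like one DetectFaces FaceDetails entry,
    with Sunglasses and Pose keys. `max_yaw` is an illustrative threshold.
    """
    wearing_sunglasses = face_detail["Sunglasses"]["Value"]
    turned_away = abs(face_detail["Pose"]["Yaw"]) > max_yaw
    return (
        match_confidence > min_confidence
        and not wearing_sunglasses
        and not turned_away
    )


# Hypothetical face details for illustration.
frontal = {"Sunglasses": {"Value": False, "Confidence": 99.0},
           "Pose": {"Roll": 1.0, "Yaw": -5.0, "Pitch": 2.0}}
shaded = {"Sunglasses": {"Value": True, "Confidence": 97.0},
          "Pose": {"Roll": 0.0, "Yaw": 3.0, "Pitch": 0.0}}

print(is_reliable_match(90.0, frontal))
print(is_reliable_match(90.0, shaded))
```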

Cleaning up

To avoid incurring future charges, delete the AWS CloudFormation stack and the contents of the S3 bucket that was created.


This post showed how to start exploring what your customers are saying about you on social media using images. The queries in this post are just the beginning of what’s possible. To better understand the totality of the conversations your customers are having, you can combine the capabilities from this post with the results of running natural language processing against the tweets.

This entire processing, analytics, and machine learning pipeline—starting with Kinesis Data Firehose, using Amazon Comprehend to perform sentiment analysis, Amazon Rekognition to analyze photographs, and Athena to query the data—is possible without spinning up any servers.

This post added advanced machine learning (ML) services to the Twitter collection pipeline, through some simple calls within Lambda. The solution also saved all the data to S3 and demonstrated how to query the complex JSON objects using some elegant SQL constructs. You could do further analytics on the data using Amazon EMR, Amazon SageMaker, Amazon ES, or other AWS services. You are limited only by your imagination.

About the authors

Dr. Veronika Megler is Principal Consultant, Data Science, Big Data & Analytics, for AWS Professional Services. She holds a PhD in Computer Science, with a focus on scientific data search. She specializes in technology adoption, helping companies use new technologies to solve new problems and to solve old problems more efficiently and effectively.

Chris Ghyzel is a Data Engineer for AWS Professional Services. Currently, he is working with customers to integrate machine learning solutions on AWS into their production pipelines. 

[P] Open source library to perform entity embeddings on categorical variables using Convolutional Neural Networks [+ Unit Tests, Code Coverage and Continuous Integration]

Over the past 2 years I have been working as a Machine Learning developer, mostly with tabular data, and I’ve developed a tool to perform entity embeddings on categorical variables using CNNs with Keras. I tried to make it easy to use and flexible enough for most existing scenarios (regression, binary and multi-class classification), but if you find any other need or an issue to be fixed, do not hesitate to ask.

I tried to add some cool stuff to the project, such as unit tests, code coverage with Codacy, continuous integration with Travis CI, auto deployment to PyPI, and auto-generated documentation with Sphinx and ReadTheDocs, so if any of you are interested in how to set up your project to have these features, feel free to use it as a base project.

Looking forward to any reviews of the source code. Any tips to improve the readability or even the performance are really welcome and much appreciated.



Code coverage (nowadays reaching 97%):

Thanks and I hope it can help somebody out there 🙂

submitted by /u/CrazyCapivara

[P] Cortex: Deploy models from any framework as production APIs

Cortex just released V 0.10, which includes their new Predictor Interface for serving models. It lets you take models from any framework and implement them in simple Python, before deploying them with a single terminal command. V 0.10 also still includes out-of-the-box support for TensorFlow Serving and ONNX Runtime.

Repo link:


Deploying Hugging Face’s DistilGPT-2 – PyTorch

Deploying a sentiment analyzer with BERT – TensorFlow

The classic iris classifier – XGBoost/ONNX

submitted by /u/killthecloud

[N] Pre-trained knowledge graph embedding models are available in GraphVite!

In the recent update of GraphVite, we release a new large-scale knowledge graph dataset, along with new benchmarks of knowledge graph embedding methods. The dataset, Wikidata5m, contains 5 million entities and 21 million facts constructed from Wikidata and Wikipedia. Most of the entities come from the general domain or the scientific domain, such as celebrities, events, concepts and things.

To facilitate the usage of knowledge graph representations in semantic tasks, we provide a bunch of pre-trained embeddings from popular models, including TransE, DistMult, ComplEx, SimplE and RotatE. You can directly access these embeddings by natural language index, such as “machine learning”, “united states” or even abbreviations like “m.i.t.”. Check out these models here.

Here are the benchmarks of these models on Wikidata5m.

Model     MR      MRR    HITS@1  HITS@3  HITS@10
TransE    109370  0.253  0.170   0.311   0.392
DistMult  211030  0.253  0.209   0.278   0.334
ComplEx   244540  0.281  0.228   0.310   0.373
SimplE    112754  0.296  0.252   0.317   0.377
RotatE    89459   0.290  0.234   0.322   0.390

submitted by /u/kiddozhu

Toronto AI is a social and collaborative hub to unite AI innovators of Toronto and surrounding areas. We explore AI technologies in digital art and music, healthcare, marketing, fintech, vr, robotics and more. Toronto AI was founded by Dave MacDonald and Patrick O'Mara.