Author: torontoai

Real-time music recommendations for new users with Amazon SageMaker

Written on November 20, 2019. Posted in Amazon.

This is a guest post from Matt Fielder and Jordan Rosenblum at iHeartRadio. In their own words, “iHeartRadio is a streaming audio service that reaches tens of millions of users every month and registers many tens of thousands more every day.”

Personalization is an important part of the user experience, and we aspire to give useful recommendations as early in the user lifecycle as possible. Music suggestions that are surfaced directly after registration let our users know that we can quickly adapt to their tastes and reduce the likelihood of churn. But how do we personalize content to a user that doesn’t yet have any listening history?

This post describes how we leverage the information a user provides at registration to create a personalized experience in real-time. While a new user does not have any listening history, they do typically select a handful of genre preferences and indicate some of their demographic information during the onboarding process. We first show an analysis of these attributes that reveals useful patterns we use for personalization. Next, we describe a model that uses this data to predict the best music for each new user. Finally, we demonstrate how we serve these predictions as recommendations in real-time using Amazon SageMaker immediately after registration, which leads to a significant improvement in user engagement in an A/B test.

New user listening patterns

Before building our model, we wanted to determine if there were any interesting patterns in the data that might indicate that there is something to learn.

Our first hypothesis was that users of different demographic backgrounds would tend to prefer different types of music. For example, perhaps a 50-year-old male is more likely to listen to Classic Rock than a 25-year-old female, all else being equal. If there is any truth to this on average, we may not need to wait for a user to accrue listening history in order to generate useful recommendations — we could simply use the genre preferences and demographic information the user provided at registration.

To perform the analysis, we focused on listening behavior two months after a user registered and compared it with the information given by the user during registration. This two-month gap ensures we focus on active users who have explored our offerings. We should have a pretty good idea of what the user likes by this point in time. It also ensures that most of the noise from initial onboarding and marketing has subsided.

The following diagram shows the timeline of a user’s listening behavior from onboarding until two months after registration.

We then compared distributions of listening across genres of our new male users vs. our new female users. The results confirm our hypothesis that there are patterns in music preferences that correlate with demographic information. For example, you’ll notice that Sports and News & Talk are more popular with males. Using this data is likely to improve our recommendations, especially for users that don’t yet have listening history.

The following graph summarizes user gender as it relates to preferred genres.

Our second hypothesis was that users with similar tastes might express what genres they’re looking for differently. Moreover, iHeartRadio might define a genre slightly differently from how our users perceive that genre. This indeed seemed to be the case for certain genres. For example, we noticed that many users told us they like R&B music when in fact they listened to what we classify internally as Hip Hop. This likely reflects the fact that genres are somewhat subjective: different users define the same genre differently.

Predicting genres

Now that we had some initial analytical evidence that demographics and genre preferences are useful in predicting new user behavior, we set out to build and test a model. We hoped that a model could systematically learn how demographic background and genre preferences relate to listening behavior. If successful, we could use the model to surface the correct genre-based content when a new user onboards to our platform.

As in the analysis phase, we defined a successful prediction as the ability to surface content the user would have naturally engaged with two months after signing up. As a result, users that go into the training data for our model are active listeners that have had the time to explore the offerings in our app. Thus, the target variable is the top genre a user listens to two months after registration, and the features are the user’s demographic attributes and combination of genres selected during registration.

As in most modeling exercises, we started with the most basic modeling technique, which in this case was multi-class logistic regression, because the target is a single top genre per user. We analyzed a sampling of the feature coefficients from the trained model and their relationship with subsequent listening in the following heat map. The non-demographic model features are the multi-hot encoding of genres that the user selected during onboarding. The brighter the square (i.e. the larger the weight), the more correlated a model feature is with the genre the user listens to in the second month after registration.
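To make the feature layout concrete, here is a minimal sketch of a multi-hot encoding of genre selections combined with demographic attributes. The genre vocabulary, the function names, and the choice of age and a gender flag as the demographic inputs are assumptions for illustration; the post does not give the exact feature set.

```python
# Hypothetical genre vocabulary; the real list used in production is not given in the post.
GENRES = ["Alternative", "Classic Rock", "Hip Hop", "R&B", "Top 40 & Pop"]


def multi_hot(selected, vocabulary=GENRES):
    """Return a 0/1 vector with a 1 at the position of every selected genre."""
    chosen = set(selected)
    unknown = chosen - set(vocabulary)
    if unknown:
        raise ValueError(f"unknown genres: {sorted(unknown)}")
    return [1 if genre in chosen else 0 for genre in vocabulary]


def build_features(age, is_male, selected):
    """Concatenate (assumed) demographic attributes with the multi-hot genre vector."""
    return [float(age), float(is_male)] + [float(x) for x in multi_hot(selected)]


# A user who picked two genres gets two 1s, regardless of the order they were picked in.
example_row = build_features(age=25, is_male=0, selected=["R&B", "Hip Hop"])
```

Unlike a one-hot vector, a multi-hot vector can contain several 1s, which matches the fact that a user may select more than one genre at registration.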

Sure enough, we were able to identify some initial patterns. First, we found that on the whole, when a user selects only one genre, they end up listening to that genre. However, for users who select certain genres such as Kids & Family, Mix & Variety, or R&B, the trend is more muted. Second, it’s interesting to note that when looking at age, our model learns that younger users tend to prefer Top 40 & Pop and Alternative whereas older users prefer International, Jazz, News & Talk, Oldies, and Public Radio. Third, we were fascinated by the fact that the model could learn that users who select classical music also tend to listen to World, Public Radio, and International genres.

Although useful for exploring how our features relate to listening behavior, logistic regression has several drawbacks. Perhaps most importantly, it does not naturally handle the case in which users select more than one genre, because a linear model combines feature effects additively. In other words, it can’t weigh the interactions across genre selections appropriately. For us, this is a major issue because users that do reveal their genre preferences typically select more than one; on average users select around four genres.
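The additivity limitation can be illustrated with a small sketch. The bias and weights below are made-up numbers, not coefficients from our model. Without explicit interaction features, the log-odds for a user who selects two genres is fully determined by the two single-genre scores, so the model cannot learn that a particular combination means something different.

```python
def log_odds(bias, weights, features):
    """Linear score of a logistic regression for one target class."""
    return bias + sum(w * x for w, x in zip(weights, features))


# Hypothetical parameters for one target genre; feature order is [R&B, Hip Hop].
BIAS = -0.5
WEIGHTS = [0.8, 1.1]

only_rnb = log_odds(BIAS, WEIGHTS, [1, 0])
only_hip_hop = log_odds(BIAS, WEIGHTS, [0, 1])
both = log_odds(BIAS, WEIGHTS, [1, 1])

# The combined score is just the sum of the individual contributions:
# there is no term that depends on the two genres being selected together.
additive_prediction = only_rnb + only_hip_hop - BIAS
```

Tree-based models sidestep this because a split on one genre can be followed by a split on another, which lets the prediction depend on the combination.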

We explored a few more advanced techniques such as tree-based models and feed-forward neural networks that would make up for the shortcomings of logistic regression. We found that tree-based methods gave us the best results while also having limited complexity as compared to the neural networks we built. They also gave us meaningful lifts as compared to logistic regression and were less prone to overfitting the training set. In the end, we decided on using LightGBM given its speed, ability to prevent overfitting, and superior performance.

We were excited to see that the offline metrics of our model were significantly better than our simple baseline. The baseline recommendation for a user is the most popular genre that they selected, regardless of their demographic membership, which is how our live content carousels have worked in the app historically. We found that sending new users three genre-based model recommendations captures their actual preferred genre 77% of the time, based on historical offline data. This corresponds to a 15% lift as compared to the baseline.
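One way to compute an offline metric of this kind is a top-k hit rate: the share of users whose actual top genre appears among the first k recommendations. The sketch below is an assumption about how such a metric could be implemented, and the toy data is invented purely to show the mechanics.

```python
def top_k_hit_rate(ranked_predictions, actual_top_genres, k=3):
    """Fraction of users whose actual top genre is among their first k predictions."""
    if not actual_top_genres:
        raise ValueError("need at least one user to evaluate")
    if len(ranked_predictions) != len(actual_top_genres):
        raise ValueError("predictions and actuals must have the same length")
    hits = sum(
        1
        for ranked, actual in zip(ranked_predictions, actual_top_genres)
        if actual in ranked[:k]
    )
    return hits / len(actual_top_genres)


# Invented toy data: one ranked list of genres per user, best guess first.
toy_predictions = [
    ["Hip Hop", "R&B", "Top 40 & Pop"],
    ["Jazz", "Oldies", "Country"],
    ["Country", "Classic Rock", "Alternative"],
    ["News & Talk", "Sports", "Oldies"],
]
toy_actuals = ["R&B", "Classic Rock", "Alternative", "Sports"]

toy_rate = top_k_hit_rate(toy_predictions, toy_actuals, k=3)
```

The same function can score the baseline by passing in the baseline's recommendations, which makes the two hit rates directly comparable.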

Surfacing predictions in real-time

Now that we have a model that seems to work, how do we surface these predictions in real-time? Historically at iHeartRadio, most of our models had been trained and scored in batch (e.g. daily or weekly) using Airflow and served from a key-value database like Amazon DynamoDB. In this case, however, our new user recommendations only provide value if we score and serve them in real-time. Immediately after the user registers, we have to be ready to serve appropriate genre-based predictions to the user based on registration information that of course we don’t know in advance. If we wait until the next day to serve these recommendations, it’s too late. That’s where Amazon SageMaker comes in.

Amazon SageMaker allows us to host real-time model endpoints that can surface predictions for users immediately after registration. It also offers convenient model training functionality. It offers several options for deploying models: using an existing built-in algorithm container (such as random forest or XGBoost), using pre-built container images, extending a pre-built container image, or building a custom container image. We decided to go with the last option of packaging our own algorithm into a custom image. This gave us the most flexibility because, as of this writing, a built-in algorithm container for LightGBM does not exist. Therefore, we packaged our own custom scoring code and built a Docker image that was pushed to Amazon Elastic Container Registry (Amazon ECR) for use in model scoring.

We masked the Amazon SageMaker endpoint behind an Amazon API Gateway so external clients could ping it for recommendations, while leaving the Amazon SageMaker backend secure in a private network. The API Gateway passes the parameter values to an AWS Lambda function, which in turn parses the values and sends them to the Amazon SageMaker endpoint for a model response. Amazon SageMaker also allows for automatic scaling of model scoring instances based on the volume of traffic. All we need to define is the desired number of requests per second for each instance and a maximum number of instances to scale up to. This makes it easy to roll out our endpoint to a variety of use cases throughout iHeartRadio. In the 10 days we ran the test, our endpoint had 0 invocation errors and an average model latency of around 5 milliseconds.
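The Lambda step in this flow can be sketched roughly as follows. This is a minimal illustration rather than our production code: the endpoint name, the query parameter names, and the JSON request and response shapes are all assumptions. The only AWS call used is `invoke_endpoint` on the boto3 `sagemaker-runtime` client.

```python
import json

# Hypothetical endpoint name; the real one is not given in the post.
ENDPOINT_NAME = "new-user-genre-model"


def build_payload(params):
    """Turn API Gateway query-string parameters into the JSON body sent to the model.

    The parameter names (age, gender, genres) are assumed for illustration.
    """
    genres = [g for g in params.get("genres", "").split(",") if g]
    return json.dumps(
        {"age": int(params["age"]), "gender": params["gender"], "genres": genres}
    )


def lambda_handler(event, context, runtime=None):
    """Parse the request, call the SageMaker endpoint, and return its predictions."""
    if runtime is None:
        # Imported lazily so the handler can be exercised locally with a stub client.
        import boto3

        runtime = boto3.client("sagemaker-runtime")

    params = event.get("queryStringParameters") or {}
    response = runtime.invoke_endpoint(
        EndpointName=ENDPOINT_NAME,
        ContentType="application/json",
        Body=build_payload(params),
    )
    # The response body is a stream; we assume the container returns JSON.
    predictions = json.loads(response["Body"].read())
    return {"statusCode": 200, "body": json.dumps(predictions)}
```

Lambda calls the handler with only `event` and `context`, so the optional `runtime` argument is there purely to allow a stub client in local tests.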

For more information about Amazon SageMaker, see Using Your Own Algorithms or Models with Amazon SageMaker, Amazon SageMaker Bring Your Own Algorithm Example, and Call an Amazon SageMaker model endpoint using Amazon API Gateway and AWS Lambda.

Online results

We showed above that our model performed well in offline tests, but we also had to put it to the test in our production app. We tested it by using our model hosted on Amazon SageMaker to recommend a relevant radio station to our new users in the form of an in-app-message directly after registration. We compared this model to business rules that would simply recommend the most popular radio station that was classified into one of the user-selected genres. We ran the A/B test for 10 days with an even split between the groups. The group of users hit with our model predictions had an 8.7% higher click-through rate to the radio station! And of the users who did click, radio listening time was just as strong.
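For readers who want to reproduce this kind of comparison, the sketch below shows how a relative lift in click-through rate can be computed from raw counts. It assumes the reported figure is a relative lift, which the post does not state explicitly, and the counts are invented for illustration and are not our experiment data.

```python
def click_through_rate(clicks, impressions):
    """Share of in-app messages that were clicked."""
    if impressions <= 0:
        raise ValueError("impressions must be positive")
    return clicks / impressions


def relative_lift(treatment_rate, control_rate):
    """Relative improvement of the treatment group over the control group."""
    if control_rate <= 0:
        raise ValueError("control rate must be positive")
    return (treatment_rate - control_rate) / control_rate


# Invented counts for two equally sized groups.
control_ctr = click_through_rate(clicks=500, impressions=10_000)
treatment_ctr = click_through_rate(clicks=550, impressions=10_000)
example_lift = relative_lift(treatment_ctr, control_ctr)
```

A lift computed this way says nothing about statistical significance on its own; that would require a separate test on the underlying counts.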

The following diagram shows the real-time predictions result in an 8.7% lift in CTR over the baseline and an example of what the A/B testing groups would have looked like.

Next steps and future work

We’ve shown that new users respond to the relevant content served by our genre prediction model hosted on an Amazon SageMaker endpoint. In our initial proof-of-concept, we introduced the treatment to only a portion of our new registrants. Next steps include expanding this test to a larger user-base and surfacing these recommendations by default in our content carousels for new users with little to no listening history. We also hope to expand the use of these types of models and real-time predictions to other personalization use-cases, such as the ordering of various content carousels and tiles throughout our app. Lastly, we are continuing to explore technologies that allow for seamlessly serving model predictions in real-time including Amazon SageMaker as described in this post as well as others such as FastAPI.

Thanks go out to the Data Science and Data Engineering teams for their support throughout testing the Amazon SageMaker POC and helpful feedback on the post, especially Brett Vintch and Ravi Theja Mudunuru. This post is also available from iHeartMedia on Medium.

About the authors

Matt Fielder is the EVP Engineering at iHeartRadio

Jordan Rosenblum is a Senior Data Scientist at iHeartRadio Digital

[D] What APIs/Libraries are available for Offline Handwriting OCR?

Written on November 20, 2019. Posted in Reddit MachineLearning.

I want to implement offline handwriting OCR within an application. It should work on photographs of handwriting (i.e., not scanned images). I tried Tesseract and Google Cloud Vision. Neither seems to work well with handwriting.

Are there any handwriting-specialized APIs/libraries with high accuracy? I would like to use something finished – the focus of the project should not be on building/training a model. What’s the state of the art in that specific area?

I’ve searched around a little bit but couldn’t find anything suitable.

Thankful for every hint I get.

submitted by /u/TheM0zart

Machine Learning/Artificial Intelligence Data Scientist Intern – AECOM – Mississauga, ON

Written on November 20, 2019. Posted in Toronto Job Postings.

+ Focus on machine learning, object detection and artificial intelligence.
+ Extensive knowledge in machine learning, artificial intelligence, and optimization…
From AECOM – Thu, 21 Nov 2019 18:52:59 GMT

[D] Why does hierarchical Bayesian regression work well on imbalanced data?

Written on November 20, 2019. Posted in Reddit MachineLearning.

I have a dataset of electrical outages and it is extremely imbalanced, <2% of all of the data are positive classes. I am using weather station data to try to predict the probability of an outage occurring near the weather stations.

When I try any other model I have to rebalance the data to get any good results. However I have recently tried hierarchical Bayesian logistic regression and it performs just fine without resampling. In my methodology every individual weather station has a unique intercept and coefficients, but they are each drawn from a parent distribution.

What I would like to discuss is why does the hierarchical approach perform so much better on the imbalanced dataset?

submitted by /u/paulie007

[D] Combining non-text features with text classifier

Written on November 20, 2019. Posted in Reddit MachineLearning.

Hi! So I’m building a classifier which primarily looks at text, but I also want to include other features, which are non-text, and I was wondering what is the best way to do it? I feel like just adding another dimension to the vector which represents the text might cause these features to get ‘lost’, but maybe that’s not true. Is there some sort of agreed-upon way of including these additional non-text features? By non-text I mean just information which is not part of the body of the text, like some other metadata.

Thanks!

submitted by /u/saint—-

[D] Must read papers on application of NNs to 3D data, most importantly point clouds

Written on November 20, 2019. Posted in Reddit MachineLearning.

Could you please list the most important or interesting publications covering the application of neural networks and deep learning to 3D data, especially point clouds? I’m mostly interested in the application of NNs to tasks like classification/segmentation of 3D objects, but any other references are highly appreciated.

Thanks in advance.

submitted by /u/Unpigged

Speaking the Same Language: How Oracle’s Conversational AI Serves Customers

Written on November 20, 2019. Posted in NVIDIA.

At Oracle, customer service chatbots use conversational AI to respond to consumers with more speed and complexity.

Suhas Uliyar, vice president for product management for digital assistance and AI at Oracle, stopped by to talk to AI Podcast host Noah Kravitz about how the newest wave of conversational AI can keep up with the nuances of human conversation.

Many chatbots frustrate consumers because of their static nature. Asking a question or using the wrong keyword confuses the bot and prompts it to start over or make the wrong selection.

Uliyar says that Oracle’s digital assistant uses a sequence-to-sequence algorithm to understand the intricacies of human speech, and react to unexpected responses.

Their chatbots can “switch the context, keep the memory, give you the response and then you can carry on with the conversation that you had. That makes it natural, because we as humans fire off on different tangents at any given moment.”

Key Points From This Episode:

The contextual questions that often occur in normal conversation stump single-intent systems, but the most recent iteration is capable of answering simple questions quickly and remembering customers.
The next stage in conversational AI, Uliyar believes, will allow bots to learn about users in order to give them recommendations or take action for them.
Learn more about Oracle’s digital assistant for enterprise applications and visit Uliyar’s Twitter.

Tweetable

“If machine learning is the rocket that’s going to take us to the next level, then data is the rocket fuel.” — Suhas Uliyar [15:59]

Charter Boosts Customer Service with AI

Jared Ritter, the senior director of wireless engineering at Charter Communications, describes their innovative approach to data collection on customer feedback. Rather than retroactively accessing the data to fix problems, Charter uses AI to evaluate data constantly to predict issues and address them as early as possible.

Using Deep Learning to Improve the Hands-Free, Voice Experience

What would the future of intelligent devices look like if we could bounce from using Amazon’s Alexa to order a new book to Google Assistant to schedule our next appointment, all in one conversation? Xuchen Yao, the founder of AI startup KITT.AI, discusses the toolkit that his company has created to achieve a “hands-free” experience.

AI-Based Virtualitics Demystifies Data Science with VR

Aakash Indurkha, head of machine learning projects at AI-based analytics platform Virtualitics, explains how the company is bringing creativity to data science using immersive visualization. Their software bridges the gap created by a lack of formal training to help inexperienced users identify anomalies on their own, and gives experts the technology to demonstrate their complex calculations.

Make Our Podcast Better

Have a few minutes to spare? Fill out this short listener survey. Your answers will help us make a better podcast.

The post Speaking the Same Language: How Oracle’s Conversational AI Serves Customers appeared first on The Official NVIDIA Blog.

[R] Video Analysis: MuZero – Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model

Written on November 20, 2019. Posted in Reddit MachineLearning.

https://youtu.be/We20YSAJZSE

MuZero harnesses the power of AlphaZero, but without relying on an accurate environment model. This opens up planning-based reinforcement learning to entirely new domains, where such environment models aren’t available. The difference from previous work is that, instead of learning a model predicting future observations, MuZero predicts the future observations’ latent representations, and thus learns to only represent things that matter to the task!

Abstract:

Constructing agents with planning capabilities has long been one of the main challenges in the pursuit of artificial intelligence. Tree-based planning methods have enjoyed huge success in challenging domains, such as chess and Go, where a perfect simulator is available. However, in real-world problems the dynamics governing the environment are often complex and unknown. In this work we present the MuZero algorithm which, by combining a tree-based search with a learned model, achieves superhuman performance in a range of challenging and visually complex domains, without any knowledge of their underlying dynamics. MuZero learns a model that, when applied iteratively, predicts the quantities most directly relevant to planning: the reward, the action-selection policy, and the value function. When evaluated on 57 different Atari games – the canonical video game environment for testing AI techniques, in which model-based planning approaches have historically struggled – our new algorithm achieved a new state of the art. When evaluated on Go, chess and shogi, without any knowledge of the game rules, MuZero matched the superhuman performance of the AlphaZero algorithm that was supplied with the game rules.

Authors: Julian Schrittwieser, Ioannis Antonoglou, Thomas Hubert, Karen Simonyan, Laurent Sifre, Simon Schmitt, Arthur Guez, Edward Lockhart, Demis Hassabis, Thore Graepel, Timothy Lillicrap, David Silver

submitted by /u/ykilcher

[R] Scalable graph machine learning: a mountain we can climb?

Written on November 20, 2019. Posted in Reddit MachineLearning.

Graph machine learning is still a relatively new and developing area of research and brings with it a bucket load of complexities and challenges. One such challenge that both fascinates and infuriates those of us working with graph algorithms is — scalability.

I learned first-hand that when trying to apply graph machine learning techniques to identify fraudulent behaviour in the bitcoin blockchain data, scalability was the biggest roadblock. The bitcoin blockchain graph I used has millions of wallets (nodes) and billions of transactions (edges) which makes most graph machine learning methods infeasible.

An algorithm called GraphSAGE (based on the method of neighbour-sampling) offered some solid breakthroughs, but there are still mountains to climb to make scalable graph machine learning more practical.

https://medium.com/stellargraph/scalable-graph-machine-learning-a-mountain-we-can-climb-753dccc572f

submitted by /u/StellarGraphLibrary

[D] Does EfficientNet really help in real projects?

Written on November 20, 2019. Posted in Reddit MachineLearning.

There are a large number of papers showing that EfficientNet improves some CV tasks, e.g. EfficientDet: Scalable and Efficient Object Detection.

But does it help much in real projects? Do you guys have any experience with that?

One more thing – ImageNet or COCO datasets are far away from what we have to deal with in real projects. Usually we have only a small number of images/classes, so improvements for COCO/ImageNet != improvements for real projects. What do you think?

submitted by /u/___mlm___
