Cloud Solution Architect (Identity) – Microsoft – Toronto, ON
From Microsoft – Tue, 24 Dec 2019 20:32:10 GMT – View all Toronto, ON jobs
Spell, founded by Serkan Piantino, is making machine learning as easy as ABC.
Piantino, CEO of the New York-based startup and former director of engineering for Facebook AI Research, explained to AI Podcast host Noah Kravitz how he’s bringing compute power to those that don’t have easy access to GPU clusters.
Spell provides access to hardware as well as a software interface that accelerates execution. Piantino reported that a wide variety of industries has shown interest in Spell, from healthcare to retail, as well as researchers and academia.
“You know there’s some upfront cost to running an experiment, but if you get that cost down low enough, it disappears mentally” — Serkan Piantino [11:52]
“Providing access to hardware and making things easier — giving everybody the same sort of beautiful compute cluster that giant research organizations work on — was a really powerful idea” — Serkan Piantino [18:36]
Deep learning icon and NVIDIA Chief Scientist Bill Dally reflects on his career in AI and offers insight into the AI revolution made possible by GPU-driven deep learning. He shares his predictions on where AI is going next: more powerful algorithms for inference, and neural networks that can train on less data.
Across industries, employees spend valuable time processing mountains of paperwork. Evolution AI, a U.K. startup and NVIDIA Inception member, has developed an AI platform that extracts and understands information rapidly. Evolution AI Chief Scientist Martin Goodson explains the variety of problems that the company can solve.
Health insurance company Anthem helps patients personalize and better understand their healthcare information through AI. Rajeev Ronanki, senior vice president and chief digital officer at Anthem, explains how the company gives users the opportunity to schedule video consultations and book doctor’s appointments virtually.
Get the AI Podcast through iTunes, Google Podcasts, Google Play, Castbox, DoggCatcher, Overcast, PlayerFM, Pocket Casts, Podbay, PodBean, PodCruncher, PodKicker, Soundcloud, Spotify, Stitcher and TuneIn.
The post Saved by the Spell: Serkan Piantino’s Company Makes AI for Everyone appeared first on The Official NVIDIA Blog.
Amazon Comprehend is a natural language processing (NLP) service that uses machine learning (ML) to find insights and relationships in texts. Amazon Comprehend identifies the language of the text; extracts key phrases, places, people, brands, or events; and understands how positive or negative the text is. For more information about everything Amazon Comprehend can do, see Amazon Comprehend Features.
You may need out-of-the-box NLP capabilities tied to your needs without having to lead a research phase. This would allow you to recognize entity types and perform document classifications that are unique to your business, such as recognizing industry-specific terms and triaging customer feedback into different categories.
Amazon Comprehend is a perfect match for these use cases. In November 2018, Amazon Comprehend added the ability for you to train it to recognize custom entities and perform custom classification. For more information, see Build Your Own Natural Language Models on AWS (no ML experience required).
This post demonstrates how to build a custom text classifier that can assign a specific label to a given text. No prior ML knowledge is required.
| About this blog post | |
| Time to complete | 1 hour for the reduced dataset; 2 hours for the full dataset |
| Cost to complete | ~ $50 for the reduced dataset; ~ $150 for the full dataset. These include training, inference, and model management; see Amazon Comprehend pricing for more details. |
| Learning level | Advanced (300) |
| AWS services | Amazon Comprehend, Amazon S3, AWS Cloud9 |
To complete this walkthrough, you need an AWS account and access to create resources in AWS IAM, Amazon S3, Amazon Comprehend, and AWS Cloud9 within that account.
This post uses the Yahoo answers corpus cited in the paper Text Understanding from Scratch by Xiang Zhang and Yann LeCun. This dataset is available on the AWS Open Data Registry.
You can also use your own dataset. It is recommended that you train your model with up to 1,000 training documents for each label, and that you choose labels that are clear and don’t overlap in meaning. For more information, see Training a Custom Classifier.
The walkthrough includes the following steps:
For more information about how to build a custom entity recognizer to extract information such as people and organization names, locations, time expressions, numerical values from a document, see Build a custom entity recognizer using Amazon Comprehend.
In this post, you use the AWS CLI as much as possible to speed up the experiment.
AWS Cloud9 is a cloud-based integrated development environment (IDE) that lets you write, run, and debug your code with a browser. It includes a code editor, debugger, and terminal. AWS Cloud9 comes pre-packaged with essential tools for popular programming languages and the AWS CLI pre-installed, so you don’t need to install files or configure your laptop for this workshop.
Your AWS Cloud9 environment has access to the same AWS resources as the user with which you logged in to the AWS Management Console.
To prepare your environment, complete the following steps:
CustomClassifier.
It can take up to a few minutes for your environment to be provisioned and prepared. When the environment is ready, your IDE opens to a welcome screen, which contains a terminal prompt.
You can run AWS CLI commands in this prompt the same as you would on your local computer.
You get the following output which indicates your account and user information:
Keep your AWS Cloud9 IDE opened in a tab throughout this walkthrough.
Use the account ID from the previous step to create a globally unique bucket name, such as 123456789012-customclassifier. Enter the following command in your AWS Cloud9 terminal prompt:
The output shows the name of the bucket you created:
To authorize Amazon Comprehend to perform bucket reads and writes during the training or during the inference, you must grant Amazon Comprehend access to the S3 bucket that you created. You are creating a data access role in your account to trust the Amazon Comprehend service principal.
To set up IAM, complete the following steps:
The Policy named ComprehendDataAccessRolePolicy is automatically attached.
ComprehendBucketAccessRole.
You use this ARN when you launch the training of your custom classifier.
In this step, you download the corpus and prepare the data to match Amazon Comprehend’s expected formats for both training and inference. This post provides a script to help you achieve the data preparation for your dataset.
Alternatively, for even more convenience, you can download the prepared data by entering the following two command lines:
If you follow the preceding step, skip the next steps and go directly to the upload part at the end of this section.
If you want to go through the dataset preparation for this walkthrough, or if you are using your own data, follow the next steps:
Enter the following command in your AWS Cloud9 terminal prompt to download it from the AWS Open Data registry:
You see a progress bar and then the following output:
Uncompress it with the following command:
You should delete the archive because you are limited in available space in your AWS Cloud9 environment. Use the following command:
You get a folder yahoo_answers_csv, which contains the following four files:
The files train.csv and test.csv contain the training samples as comma-separated values. Each has four columns: class index (1 to 10), question title, question content, and best answer. The text fields are enclosed in double quotes ("), and any internal double quote is escaped by two double quotes (""). New lines are escaped as a backslash followed by an “n” character, that is, "\n".
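The escaping rules described above can be handled with Python's standard `csv` module. The following is a minimal sketch, not the script used in this post; the sample row is hypothetical and only mimics the described format.

```python
import csv
import io

# Hypothetical sample row in the format described above (not taken from the corpus).
# The "\\n" below is a literal backslash followed by "n", as stored in the files.
sample = '"5","How do I print ""hello""?","I am new to Python.\\nAny tips?","Use print(""hello"")."\n'


def read_rows(handle):
    """Yield (class_index, title, content, answer) with the escapes resolved."""
    for row in csv.reader(handle, quotechar='"', doublequote=True):
        index, title, content, answer = row
        # Turn the literal backslash-n sequences back into real line breaks.
        fields = [field.replace("\\n", "\n") for field in (title, content, answer)]
        yield (int(index), *fields)


rows = list(read_rows(io.StringIO(sample)))
print(rows[0])
```

For the real files, you would pass an open file handle (opened with `newline=""`) instead of the `io.StringIO` wrapper.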
The following code is the overview of file content:
The file classes.txt contains the available labels.
The train.csv file contains 1,400,000 lines and test.csv contains 60,000 lines. Amazon Comprehend uses between 10% and 20% of the documents submitted for training to test the custom classifier.
The following command indicates that the data is evenly distributed:
You should train your model with up to 1,000 training documents for each label and no more than 1,000,000 documents.
With 20% of 1,000,000 used for testing, that is still plenty of data to train your custom classifier.
Use a shortened version of train.csv to train your custom Amazon Comprehend model, and use test.csv to perform your validation and see how well your custom model performs.
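One way to shorten the training file while keeping the classes balanced is to cap the number of rows kept per class. The sketch below uses hypothetical toy rows and a hypothetical helper name, `take_per_class`; it is an illustration of the idea, not the preparation script this post provides.

```python
from collections import Counter


def take_per_class(rows, per_class):
    """Keep at most `per_class` rows for each class index, preserving order."""
    kept, seen = [], Counter()
    for row in rows:
        class_index = row[0]
        if seen[class_index] < per_class:
            seen[class_index] += 1
            kept.append(row)
    return kept


# Toy rows of (class index, text); hypothetical data standing in for train.csv.
rows = [(1 + i % 3, f"question {i}") for i in range(12)]

subset = take_per_class(rows, per_class=2)
counts = Counter(row[0] for row in subset)
print(counts)  # every class index appears the same number of times
```

Because the cap is applied per class, the subset stays evenly distributed whenever every class has at least `per_class` rows.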
For training, the file format must conform to the following requirements:
Labels must be uppercase. They can be multi-token, contain white space, consist of multiple words connected by underscores or hyphens, and may even contain a comma, as long as it is correctly escaped.
The following table contains the formatted labels proposed for the training.
| Index | Original | For training |
| 1 | Society & Culture | SOCIETY_AND_CULTURE |
| 2 | Science & Mathematics | SCIENCE_AND_MATHEMATICS |
| 3 | Health | HEALTH |
| 4 | Education & Reference | EDUCATION_AND_REFERENCE |
| 5 | Computers & Internet | COMPUTERS_AND_INTERNET |
| 6 | Sports | SPORTS |
| 7 | Business & Finance | BUSINESS_AND_FINANCE |
| 8 | Entertainment & Music | ENTERTAINMENT_AND_MUSIC |
| 9 | Family & Relationships | FAMILY_AND_RELATIONSHIPS |
| 10 | Politics & Government | POLITICS_AND_GOVERNMENT |
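The mapping in the table above can be reproduced with a small helper. This is a sketch under the assumption that replacing "&" with "and", joining words with underscores, and uppercasing is all that is needed; the function name is hypothetical.

```python
import re


def to_training_label(original: str) -> str:
    """Convert a class name such as 'Society & Culture' to 'SOCIETY_AND_CULTURE'."""
    label = original.replace("&", "and")
    # Collapse any run of white space into a single underscore.
    label = re.sub(r"\s+", "_", label.strip())
    return label.upper()


originals = ["Society & Culture", "Health", "Computers & Internet"]
for index, name in enumerate(originals, start=1):
    print(index, name, "->", to_training_label(name))
```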
When you want your custom Amazon Comprehend model to determine which label corresponds to a given text in an asynchronous way, the file format must conform to the following requirements:
This post includes a script to speed up the data preparation. Enter the following command to copy the script to your local AWS Cloud9 environment:
To launch data preparation, enter the following commands:
This script is tied to the Yahoo corpus and uses the pandas library to format the training and testing datasets to match your Amazon Comprehend expectations. You may adapt it to your own dataset or change the number of items in the training dataset and validation dataset.
When the script is finished (it should run for approximately 11 minutes on a t2.large instance for the full dataset, and in under 5 minutes for the reduced dataset), you have the following new files in your environment:
Upload the prepared data (either the one you downloaded or the one you prepared) to the bucket you created with the following commands:
You are ready to launch the custom text classifier training. Enter the following command, and replace the role ARN and bucket name with your own:
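If you prefer Python over the AWS CLI, the same training request can be sketched with boto3's `create_document_classifier` call. The account ID, the `train.csv` object key, and the region below are placeholders, and the helper names are hypothetical; replace the values with your own.

```python
def classifier_request(name, role_arn, bucket, train_key="train.csv", language="en"):
    """Build the keyword arguments for Comprehend's create_document_classifier."""
    return {
        "DocumentClassifierName": name,
        "DataAccessRoleArn": role_arn,
        # train_key is a placeholder object key, not necessarily the one used in this post.
        "InputDataConfig": {"S3Uri": f"s3://{bucket}/{train_key}"},
        "LanguageCode": language,
    }


def start_training(request, region="us-east-1"):
    """Submit the request; requires boto3 and AWS credentials (not run here)."""
    import boto3  # imported lazily so the request builder works without boto3

    client = boto3.client("comprehend", region_name=region)
    response = client.create_document_classifier(**request)
    return response["DocumentClassifierArn"]


request = classifier_request(
    name="yahoo-answers",
    role_arn="arn:aws:iam::123456789012:role/ComprehendBucketAccessRole",  # placeholder account ID
    bucket="123456789012-customclassifier",
)
print(request)
```

Keeping the request as a plain dictionary makes it easy to inspect before anything is submitted; `start_training(request)` would then launch the asynchronous job.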
You get the following output that names the custom classifier ARN:
It is an asynchronous call. You can then track the training progress with the following command:
You get the following output:
When the training is finished, you get the following output:
The training duration may vary; in this case, the training took approximately one hour for the full dataset (20 minutes for the reduced dataset).
The output for the training on the full dataset shows that your model has a recall of 0.72—in other words, it correctly identifies 72% of given labels.
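To make the recall figure concrete, here is a minimal sketch of how recall is computed per label and then averaged. The labels and predictions are hypothetical toy data, and the averaging shown is a plain macro average, which may differ from how the service aggregates its own metric.

```python
from collections import Counter


def recall_per_label(true_labels, predicted_labels):
    """Recall for each label: correctly predicted items / all items with that true label."""
    correct, total = Counter(), Counter()
    for truth, guess in zip(true_labels, predicted_labels):
        total[truth] += 1
        if truth == guess:
            correct[truth] += 1
    return {label: correct[label] / total[label] for label in total}


# Hypothetical toy data, not results from this walkthrough.
truth = ["SPORTS", "SPORTS", "HEALTH", "HEALTH", "HEALTH"]
guess = ["SPORTS", "HEALTH", "HEALTH", "HEALTH", "SPORTS"]

scores = recall_per_label(truth, guess)
macro_recall = sum(scores.values()) / len(scores)
print(scores, round(macro_recall, 3))
```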
The following screenshot shows the view from the console (Comprehend > Custom Classification > yahoo-answers).

You can now launch an inference job to test how the classifier performs. Enter the following commands:
You get the following output:
Just as you did for the training progress tracking, because the inference is asynchronous, you can check the progress of the newly launched job with the following command:
You get the following output:
When the job is finished, JobStatus changes to COMPLETED. This takes a few minutes.
Download the results from the OutputDataConfig.S3Uri path with the following command:
When you uncompress the output (tar xvzf output.tar.gz), you get a .jsonl file. Each line represents the result of the requested classification for the corresponding line of the document you submitted.
For example, the following code is one line from the predictions:
This means that your custom model predicted with a 96.8% confidence score that the following text was related to the “Entertainment and music” label.
Each line of results also provides the second and third possible labels. You might use these scores in your application, for example by applying every label whose score is above 40%, or by changing the model if no single score is above 70%.
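The thresholding idea above can be sketched as follows. The JSON layout (a `Classes` list of `Name`/`Score` pairs) and all values are assumptions made for illustration; check them against your own .jsonl output before relying on this.

```python
import json

# Hypothetical result line; the field names and scores are assumptions for illustration.
line = (
    '{"File": "test.csv", "Line": "0", "Classes": ['
    '{"Name": "ENTERTAINMENT_AND_MUSIC", "Score": 0.968}, '
    '{"Name": "SOCIETY_AND_CULTURE", "Score": 0.02}, '
    '{"Name": "SPORTS", "Score": 0.005}]}'
)


def pick_labels(json_line, apply_above=0.40, trust_above=0.70):
    """Return the labels scoring above `apply_above`, and whether the top score is trusted."""
    record = json.loads(json_line)
    classes = sorted(record["Classes"], key=lambda c: c["Score"], reverse=True)
    labels = [c["Name"] for c in classes if c["Score"] > apply_above]
    confident = bool(classes) and classes[0]["Score"] > trust_above
    return labels, confident


labels, confident = pick_labels(line)
print(labels, confident)
```

A `confident` value of `False` would be the signal to review the document manually or revisit the model.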
With the full dataset for training and validation, in less than two hours, you used Amazon Comprehend to learn 10 custom categories—and achieved a 72% recall on the test—and to apply that custom model to 60,000 documents.
Try custom categories now from the Amazon Comprehend console. For more information, see Custom Classification. You can discover other Amazon Comprehend features and get inspiration from other AWS blog posts about how to use Amazon Comprehend beyond classification.
Amazon Comprehend can help you power your application with NLP capabilities in almost no time. Happy experimentation!
Hervé Nivon is a Solutions Architect who helps startup customers grow their business on AWS. Before joining AWS, Hervé was the CTO of a company generating business insights for enterprises from commercial unmanned aerial vehicle imagery. Hervé has also served as a consultant at Accenture.
I always find myself wanting to make the decoder side of an autoencoder as symmetric as possible with respect to the encoder side, because it feels like an “elegant” design decision. But I suspect that it’s not optimal. And I’m not finding any direct discussions of this topic via Google.
In most of mathematics, complex functions tend to have even more complex inverses. With respect to CNNs, convolutions are not strictly invertible, so it seems like the Conv2DTranspose operations could benefit from a higher complexity and parameter count to approximate the inverse better. I’m curious if anyone has direct experience studying this, or if there are conventions for “optimizing” the decoder side of an autoencoder (or maybe it’s the encoder side that needs more parameters…?).
My first inclination is to just double some numbers on the decoder side to give it twice as many parameters. But maybe including extra layers is better, since it more significantly increases the complexity of functions it can approximate. Or maybe none of this is theoretically necessary/relevant…?
Here’s an almost perfectly symmetric reference network. Obviously I could experiment with it to come up with ideas, but I’m more interested in the general theory and whether there are any established ideas on the topic (and not just for CNNs, but all types of autoencoders).
    Layer (type)                 Output Shape              Param #
    =================================================================
    input_1 (InputLayer)         [(None, 48, 48, 3)]       0
    conv2d_1 (Conv2D)            (None, 24, 24, 32)        2432
    conv2d_2 (Conv2D)            (None, 12, 12, 64)        32832
    conv2d_3 (Conv2D)            (None, 6, 6, 128)         73856
    flatten_3 (Flatten)          (None, 4608)              0
    dense_1 (Dense)              (None, 256)               1179904
    dense_2 (Dense)              (None, 64)                16448
    =================================================================
    Total params: 1,305,472

    Layer (type)                 Output Shape              Param #
    =================================================================
    input_2 (InputLayer)         [(None, 64)]              0
    dense_3 (Dense)              (None, 256)               16640
    dense_4 (Dense)              (None, 4608)              1184256
    reshape_1 (Reshape)          (None, 6, 6, 128)         0
    conv2d_transpose_3 (Conv2DTr (None, 12, 12, 64)        73792
    conv2d_transpose_4 (Conv2DTr (None, 24, 24, 32)        32800
    conv2d_transpose_5 (Conv2DTr (None, 48, 48, 3)         2403
    =================================================================
    Total params: 1,309,891
For reference, the above computation graph was produced with the following code fragment:
    # Imports assumed from context: tf.keras, with the layers module aliased as L
    from tensorflow import keras
    from tensorflow.keras import layers as L

    # Encoder: three stride-2 convolutions, then two dense layers down to a 64-d code
    enc_input = L.Input(shape=(48, 48, 3))
    enc0 = L.Conv2D(filters=32, kernel_size=5, strides=2, padding='same', activation='relu')(enc_input)
    enc1 = L.Conv2D(filters=64, kernel_size=4, strides=2, padding='same', activation='relu')(enc0)
    enc2 = L.Conv2D(filters=128, kernel_size=3, strides=2, padding='same', activation='relu')(enc1)
    enc_flat = L.Flatten()(enc2)
    enc_dense = L.Dense(256, activation='tanh')(enc_flat)
    enc_out = L.Dense(64, activation='linear')(enc_dense)
    encoder = keras.Model(inputs=enc_input, outputs=enc_out, name='Encoder')

    # Decoder: mirrors the encoder with dense layers and transposed convolutions
    dec_input = L.Input(shape=(64,))
    dec_dense1 = L.Dense(256, activation='tanh')(dec_input)
    dec_dense2 = L.Dense(6*6*128, activation='relu')(dec_dense1)
    dec_reshape = L.Reshape((6, 6, 128))(dec_dense2)
    dec2 = L.Conv2DTranspose(filters=64, kernel_size=3, strides=2, padding='same', activation='relu')(dec_reshape)
    dec1 = L.Conv2DTranspose(filters=32, kernel_size=4, strides=2, padding='same', activation='relu')(dec2)
    dec0 = L.Conv2DTranspose(filters=3, kernel_size=5, strides=2, padding='same', activation='linear')(dec1)
    decoder = keras.Model(inputs=dec_input, outputs=dec0, name='Decoder')

    encoder.summary()
    decoder.summary()
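As a sanity check on how symmetric the two halves really are, the parameter counts in the summaries can be reproduced by hand. This is a small sketch in plain Python (no Keras needed); the helper names are made up for illustration.

```python
def conv_params(kernel, in_channels, filters):
    """Weights plus one bias per filter; the same count applies to Conv2D and Conv2DTranspose."""
    return kernel * kernel * in_channels * filters + filters


def dense_params(inputs, units):
    """Weight matrix plus one bias per output unit."""
    return inputs * units + units


encoder_counts = [
    conv_params(5, 3, 32),           # conv2d_1
    conv_params(4, 32, 64),          # conv2d_2
    conv_params(3, 64, 128),         # conv2d_3
    dense_params(6 * 6 * 128, 256),  # dense_1
    dense_params(256, 64),           # dense_2
]
decoder_counts = [
    dense_params(64, 256),           # dense_3
    dense_params(256, 6 * 6 * 128),  # dense_4
    conv_params(3, 128, 64),         # conv2d_transpose_3
    conv_params(4, 64, 32),          # conv2d_transpose_4
    conv_params(5, 32, 3),           # conv2d_transpose_5
]

print(sum(encoder_counts), sum(decoder_counts))  # 1305472 1309891
```

The weight counts are identical on both sides; the 4,419-parameter difference comes entirely from the biases, because each layer has one bias per output unit and the decoder's outputs are the wider ones.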
submitted by /u/etotheipi_
I have a dataset with 4 categorical features (cholesterol, systolic blood pressure, diastolic blood pressure, and smoking rate). I use a decision tree classifier to find the probability of stroke. I am trying to verify my understanding of the splitting procedure done by Python’s sklearn.

Since it is a binary tree, there are three possible ways to split the first feature: group categories {0 and 1 in one leaf, 2 in another leaf}, {0 and 2, 1}, or {0, 1 and 2}. What I know (please correct me here) is that the chosen split is the one with the highest information gain. I have calculated the information gain for each of the three grouping scenarios:

{0 + 1, 2} –> 0.17
{0 + 2, 1} –> 0.18
{1 + 2, 0} –> 0.004

However, sklearn’s decision tree chose the first scenario instead of the third (please check the picture). Can anyone please help clarify the reason for selecting the first scenario? Is there a priority for splits that result in pure nodes, so that such a scenario is selected although it has less information gain?

submitted by /u/elmsha
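One point that may bear on the question: scikit-learn's `DecisionTreeClassifier` treats every feature as numeric and only considers threshold splits of the form `feature <= t` (and its default criterion is Gini impurity, not entropy). With integer-coded categories 0, 1, 2, a grouping such as {0, 2} versus {1} cannot be expressed by a single threshold. The sketch below uses hypothetical toy data, not the poster's dataset, to show a case where the best grouping overall is not reachable by a threshold.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy in bits; an empty or pure list has zero entropy."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(values, labels, left_group):
    """Gain from sending categories in `left_group` left and all others right."""
    left = [y for x, y in zip(values, labels) if x in left_group]
    right = [y for x, y in zip(values, labels) if x not in left_group]
    weighted = (len(left) * entropy(left) + len(right) * entropy(right)) / len(labels)
    return entropy(labels) - weighted


# Hypothetical toy data: one categorical feature coded 0/1/2 and a binary target.
feature = [0, 0, 0, 1, 1, 1, 2, 2, 2]
stroke = [0, 0, 1, 1, 1, 1, 0, 0, 0]

gains = {
    "{0,1} | {2}  (x <= 1.5)": information_gain(feature, stroke, {0, 1}),
    "{0} | {1,2}  (x <= 0.5)": information_gain(feature, stroke, {0}),
    "{0,2} | {1}  (no single threshold)": information_gain(feature, stroke, {0, 2}),
}
for split, gain in gains.items():
    print(f"{split}: {gain:.3f}")
```

In this toy case the {0,2} | {1} grouping has the highest gain, yet a threshold-based tree can only choose between the first two splits. One-hot encoding the feature is a common way to make every single-category split available.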
I would like to ask for the community’s feedback and thoughts on the AI Debate 2019: Yoshua Bengio vs Gary Marcus. What do you think about this debate?
submitted by /u/meldiwin
Hi guys,
I am new to machine learning, and after trying out TensorFlow’s tutorial on how to create a classifier based on IMDb reviews, I want to create my own classifier to do a binary classification (malicious/benign) of maybe .exe or .apk files.
I was wondering if I can proceed the same way as TensorFlow’s IMDb tutorial did, i.e., train on a set of texts and give each text a label (pos/neg).
So in the context of classifying malware, those texts are actually system API calls, i.e.:
Set 1 [ func1() func2() func3() func4() func5() func6()…etc] Label -> Malicious
Set 2 [func1() func3() func4() func5()] Label -> benign
The sequence of the API calls matters btw, and I heard that to capture it I will need to use an RNN (LSTM).
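To treat API-call traces like the review texts in the IMDb tutorial, each call name can be mapped to an integer ID and each trace padded to a fixed length, which keeps the order intact. This is a minimal sketch in plain Python; the call names come from the example sets above, while the helper names and the `<pad>`/`<unk>` tokens are assumptions.

```python
def build_vocab(sequences):
    """Assign an integer ID to every distinct API call; 0 and 1 are reserved."""
    vocab = {"<pad>": 0, "<unk>": 1}
    for sequence in sequences:
        for call in sequence:
            vocab.setdefault(call, len(vocab))
    return vocab


def encode(sequence, vocab, length):
    """Map calls to IDs in order, then truncate or pad to a fixed length."""
    ids = [vocab.get(call, vocab["<unk>"]) for call in sequence][:length]
    return ids + [vocab["<pad>"]] * (length - len(ids))


malicious = ["func1", "func2", "func3", "func4", "func5", "func6"]
benign = ["func1", "func3", "func4", "func5"]

vocab = build_vocab([malicious, benign])
encoded = [encode(trace, vocab, length=6) for trace in (malicious, benign)]
labels = [1, 0]  # 1 = malicious, 0 = benign
print(encoded, labels)
```

Integer sequences of this shape are what an embedding layer followed by an LSTM would typically consume, so the rest of the pipeline could stay close to the text-classification tutorial.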
I would love to hear from you guys if this is the correct way to do things…would most likely target Android applications…
submitted by /u/yourspeaker317
Hi!
You might remember me from the blog I posted a few days ago link.
I received an absolute onslaught of emails (close to 30 emails!!!). The main question I got was “Wow computational narratology seems pretty cool! Where do I get started? I’ve only seen paper XYZ”
As such, I decided that rather than answering every email independently, I would create a curated list of papers!
https://github.com/LouisCastricato/Narratology-Papers
Feel free to contribute (PRs are welcome!) I’ll be working on this for the next few hours, so it should be a couple dozen papers by tomorrow 🙂
submitted by /u/FerretDude
This is fucking sick..
People based in India, the Philippines, and other countries who do not have the resources to go after Siraj legally are those who need the money the most. $200 could be a month’s worth of salary, or several months’. And the types of people who get caught up in the scams are those who are genuinely looking to improve their financial situation and work hard for it. This is fucking cruel.
I’m having a hard time believing Siraj’s followers are that brainwashed. Most likely alt accounts controlled by Siraj.
submitted by /u/RelevantMarketing