Author: torontoai

[D] I just found the earliest description of the GAN idea – in the context of genetic algorithms

Written on September 4, 2019. Posted in Reddit MachineLearning.

Danny Hillis, “Co-evolving parasites improve simulated evolution as an optimization procedure”, 1990.

https://archive.org/details/06Kahle001316/page/n3

…there are two independent gene pools, each evolving according to the selection/mutation/recombination sequence outlined above. One population, the “hosts”, represents sorting networks, while the other population, the “parasites”, represents test cases. (These two populations might also be considered as “prey” and “predator”, since their evolution rates are comparable.) Both populations evolve on the same grid, and their interaction is through their fitness functions. The sorting networks are scored according to the test cases provided by the parasites at the same grid location. The parasites are scored according to how well they find flaws in sorting networks. Specifically, the phenotype of each parasite is a group of 10 to 20 test cases, and its score is the number of these tests that the corresponding sorting network fails to pass. The fitness functions of the host sorting networks and the parasitic sets of test patterns are complementary in the sense that a success of the sorting network represents a failure of the test pattern and vice versa.
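The complementary scoring described in the excerpt can be sketched in a few lines of Python. This is only an illustration, not Hillis's implementation: the comparator-list encoding of a sorting network, the function names, and the three-case parasite (the paper uses 10 to 20 test cases) are all assumptions made for the example, and the grid, selection, mutation, and recombination steps are left out.

```python
def apply_network(network, values):
    """Run a sorting network, given as a list of (i, j) comparators with i < j."""
    out = list(values)
    for i, j in network:
        if out[i] > out[j]:
            out[i], out[j] = out[j], out[i]
    return out


def passes(network, test_case):
    """True if the network sorts this test case correctly."""
    return apply_network(network, test_case) == sorted(test_case)


def host_fitness(network, parasite):
    """Host score: number of the parasite's test cases the network sorts."""
    return sum(passes(network, case) for case in parasite)


def parasite_fitness(parasite, network):
    """Parasite score: number of its test cases the network fails."""
    return sum(not passes(network, case) for case in parasite)


# Hypothetical toy data: a flawed and a complete 3-input network,
# and a parasite made of three test cases.
flawed_network = [(0, 1), (1, 2)]
complete_network = [(0, 1), (1, 2), (0, 1)]
parasite = [(3, 2, 1), (1, 2, 3), (2, 3, 1)]

print(host_fitness(flawed_network, parasite))      # tests passed by the host
print(parasite_fitness(parasite, flawed_network))  # tests failed by the host
```

For any host and parasite pair the two scores add up to the number of test cases, which is the sense in which a success for the network is a failure for the test pattern.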

submitted by /u/p1esk

Announcement of the 2019 Fellowship Awardees and Highlights from the Google PhD Fellowship Summit

Written on September 4, 2019. Posted in Google.

Posted by Susie Kim, Program Manager, University Relations

In 2009, Google created the PhD Fellowship Program to recognize and support outstanding graduate students who are doing exceptional research in Computer Science and related fields and who seek to influence the future of technology. Now in its eleventh year, the program has helped support 450 graduate students across North America, Europe, Australia, Asia, Africa and India.

Every year, recipients of the Fellowship are invited to a global summit at our Mountain View campus, where they can learn more about Google’s state-of-the-art research, and network with Google’s research community as well as other PhD Fellows from around the world. Below we share some highlights from our most recent summit, and also announce the latest class of Google PhD Fellows.

Summit Highlights
At this year’s summit event, active Google Fellowship recipients were joined by special guests, FLIP (Diversifying Future Leadership in the Professoriate) Alliance Fellows. Research Director Peter Norvig opened the event with a keynote on the fundamental practice of machine learning, followed by a number of talks by prestigious researchers. Among the list of speakers were Research Scientist Peggy Chi, who spoke about crowdsourcing geographically diverse images for use in training data, Senior Google Fellow and SVP of Google Research and Health Jeff Dean, who discussed using deep learning to solve a variety of challenging research problems at Google, and Research Scientist Vinodkumar Prabhakaran, who presented the ethical implications of machine learning, especially around questions of fairness and accountability. See the complete list of insightful talks delivered by all speakers here.

Google and FLIP Alliance Fellows attending the 2019 PhD Fellowship Summit

Google Fellows had the opportunity to present their work in lightning talks to small groups with common research interests. In addition, Google and FLIP Alliance Fellows came together to share their work with Google researchers and each other during a poster session.

Poster session in full swing

2019 Google PhD Fellows
The Google PhD Fellows represent some of the best and brightest young computer science researchers from around the globe, and it is our ongoing goal to support them as they make their mark on the world. Congratulations to all of this year’s awardees! The complete list of recipients is:

Algorithms, Optimizations and Markets
Aidasadat Mousavifar, EPFL Ecole Polytechnique Fédérale de Lausanne
Peilin Zhong, Columbia University
Siddharth Bhandari, Tata Institute of Fundamental Research
Soheil Behnezhad, University of Maryland at College Park
Zhe Feng, Harvard University

Computational Neuroscience
Caroline Haimerl, New York University
Mai Gamal, Nile University

Human Computer Interaction
Catalin Voss, Stanford University
Hua Hua, Australian National University
Zhanna Sarsenbayeva, University of Melbourne

Machine Learning
Abdulsalam Ometere Latifat, African University of Science and Technology Abuja
Adji Bousso Dieng, Columbia University
Blake Woodworth, Toyota Technological Institute at Chicago
Diana Cai, Princeton University
Francesco Locatello, ETH Zurich
Ihsane Gryech, International University Of Rabat, Morocco
Jaemin Yoo, Seoul National University
Maruan Al-Shedivat, Carnegie Mellon University
Ousseynou Mbaye, Alioune Diop University of Bambey
Redani Mbuvha, University of Johannesburg
Shibani Santurkar, Massachusetts Institute of Technology
Takashi Ishida, University of Tokyo

Machine Perception, Speech Technology and Computer Vision
Anshul Mittal, IIT Delhi
Chenxi Liu, Johns Hopkins University
Kayode Kolawole Olaleye, Stellenbosch University
Ruohan Gao, The University of Texas at Austin
Tiancheng Sun, University of California San Diego
Xuanyi Dong, University of Technology Sydney
Yu Liu, Chinese University of Hong Kong
Zhi Tian, University of Adelaide

Mobile Computing
Naoki Kimura, University of Tokyo

Natural Language Processing
Abigail See, Stanford University
Ananya Sai B, IIT Madras
Byeongchang Kim, Seoul National University
Daniel Patrick Fried, UC Berkeley
Hao Peng, University of Washington
Reinald Kim Amplayo, University of Edinburgh
Sungjoon Park, Korea Advanced Institute of Science and Technology

Privacy and Security
Ajith Suresh, Indian Institute of Science
Itsaka Rakotonirina, Inria Nancy
Milad Nasr, University of Massachusetts Amherst
Sarah Ann Scheffler, Boston University

Programming Technology and Software Engineering
Caroline Lemieux, UC Berkeley
Conrad Watt, University of Cambridge
Umang Mathur, University of Illinois at Urbana-Champaign

Quantum Computing
Amy Greene, Massachusetts Institute of Technology
Leonard Wossnig, University College London
Yuan Su, University of Maryland at College Park

Structured Data and Database Management
Amir Gilad, Tel Aviv University
Nofar Carmeli, Technion
Zhuoyue Zhao, University of Utah

Systems and Networking
Chinmay Kulkarni, University of Utah
Nicolai Oswald, University of Edinburgh
Saksham Agarwal, Cornell University

[R] STEGASURAS: STEGanography via Arithmetic coding and Strong neURAl modelS

Written on September 4, 2019. Posted in Reddit MachineLearning.

Online demo

arXiv link

Code

We recently released our demo for our EMNLP paper “Neural Linguistic Steganography”, hiding secret messages in natural language via arithmetic coding and GPT-2. Arithmetic coding is a powerful entropy coding technique that is optimal for random sequences. Using arithmetic coding in reverse enables extremely efficient steganography, and when combined with modern language models like GPT-2 it allows for convincing cover text generations that encode information.
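To make "arithmetic coding in reverse" concrete, here is a toy sketch. It is not the code from the paper or the demo: the fixed four-token distribution stands in for GPT-2, exact fractions replace the finite-precision arithmetic a real coder would use, and the names `TOY_MODEL`, `hide`, and `reveal` are hypothetical. The secret bits pick a point in [0, 1), and each emitted token is the one whose probability interval contains that point.

```python
from fractions import Fraction

# Assumption: a fixed next-token distribution used at every step,
# standing in for a real language model such as GPT-2.
TOY_MODEL = [
    ("the", Fraction(1, 2)),
    ("a", Fraction(1, 4)),
    ("cat", Fraction(1, 8)),
    ("dog", Fraction(1, 8)),
]


def narrow(low, high, choose):
    """Split [low, high) by token probability and return the chosen sub-interval."""
    width = high - low
    start = low
    for token, prob in TOY_MODEL:
        end = start + prob * width
        if choose(token, start, end):
            return token, start, end
        start = end
    raise ValueError("no token matched")


def hide(bits):
    """Turn a non-empty string of '0'/'1' into a list of cover tokens."""
    n, k = len(bits), int(bits, 2)
    # Midpoint of the dyadic cell [k / 2^n, (k + 1) / 2^n) that encodes the bits.
    target = Fraction(2 * k + 1, 2 ** (n + 1))
    cell_low, cell_high = Fraction(k, 2 ** n), Fraction(k + 1, 2 ** n)
    low, high, tokens = Fraction(0), Fraction(1), []
    # Emit tokens until the interval lies entirely inside the cell.
    while not (cell_low <= low and high <= cell_high):
        token, low, high = narrow(low, high, lambda t, s, e: s <= target < e)
        tokens.append(token)
    return tokens


def reveal(tokens, n):
    """Recover the n hidden bits by replaying the same interval narrowing."""
    low, high = Fraction(0), Fraction(1)
    for seen in tokens:
        _, low, high = narrow(low, high, lambda t, s, e: t == seen)
    return format(int(low * 2 ** n), f"0{n}b")


secret = "1011001"
cover = hide(secret)
print(" ".join(cover))
print(reveal(cover, len(secret)))
```

In this sketch both sides must share the model and the message length. With a real language model the distribution changes at every step, which is what makes the cover text read naturally.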

submitted by /u/kcazyz

The story behind the world’s first AI-created whisky

Written on September 4, 2019. Posted in Microsoft.

The post The story behind the world’s first AI-created whisky appeared first on The AI Blog.

How Microsoft got in on the creation of the world’s first whisky formulated with AI

Written on September 4, 2019. Posted in Microsoft.

The post How Microsoft got in on the creation of the world’s first whisky formulated with AI appeared first on The AI Blog.

Senior Broker / Sales Agent – Merchant Broker – Toronto, ON

Written on September 4, 2019. Posted in Toronto Job Postings.

Merchant Broker offers a state-of-the-art sales and CRM platform that is considered industry-leading in North America, and takes advantage of deep machine…
From Indeed – Thu, 05 Sep 2019 16:35:36 GMT

[P] Comparing 11 Speech-to-Text models using Tensorflow

Written on September 4, 2019. Posted in Reddit MachineLearning.

Here I compare 11 speech-to-text models implemented in TensorFlow, all in Jupyter notebooks and kept simple. Accuracy is based on character position.

80% of the dataset is used for training and 20% for testing.
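One plausible reading of "accuracy based on character position" is sketched below. The function name and the choice to divide by the longer of the two strings are assumptions made for this illustration; the notebooks in the repository may compute the metric differently, for example on padded sequences.

```python
def char_position_accuracy(predicted, reference):
    """Fraction of positions where the predicted character equals the reference.

    Assumption: extra or missing characters count as errors, so the
    denominator is the length of the longer string.
    """
    longest = max(len(predicted), len(reference))
    if longest == 0:
        return 1.0  # two empty strings agree trivially
    matches = sum(p == r for p, r in zip(predicted, reference))
    return matches / longest


print(char_position_accuracy("hallo world", "hello world"))
```

A position-based score like this is strict: a single inserted or dropped character shifts every later position, so it can understate quality compared with an edit-distance metric.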

Tacotron, test accuracy 77.09%
BiRNN LSTM, test accuracy 84.66%
BiRNN Seq2Seq + Luong Attention + Cross Entropy, test accuracy 87.86%
BiRNN Seq2Seq + Bahdanau Attention + Cross Entropy, test accuracy 89.28%
BiRNN Seq2Seq + Bahdanau Attention + CTC, test accuracy 86.35%
BiRNN Seq2Seq + Luong Attention + CTC, test accuracy 80.30%
CNN RNN + Bahdanau Attention, test accuracy 80.23%
Dilated CNN RNN, test accuracy 31.60%
Wavenet, test accuracy 75.11%
Deep Speech 2, test accuracy 81.40%
Wav2Vec Transfer learning BiRNN LSTM, test accuracy 83.24%

Link to repository, https://github.com/huseinzol05/NLP-Models-Tensorflow#speech-to-text

Link to dataset: https://tspace.library.utoronto.ca/handle/1807/24487. I also included a notebook showing how to download the dataset.

Discussion

The dataset is not really that big, only 286 MB.
Transfer-learning accuracy with Wav2Vec is not that high; it may need more data.
I used my own hyperparameters for Wav2Vec. The original hyperparameters caused a GPU sync problem for me, as the sequence is too long.
I need to use a bigger dataset.

submitted by /u/huseinzol05

Tracking the throughput of your private labeling team through Amazon SageMaker Ground Truth

Written on September 4, 2019. Posted in Amazon.

Launched at AWS re:Invent 2018, Amazon SageMaker Ground Truth helps you quickly build highly accurate training datasets for your machine learning models. Amazon SageMaker Ground Truth offers easy access to public and private human labelers, and provides them with built-in workflows and interfaces for common labeling tasks. Additionally, Amazon SageMaker Ground Truth can lower your labeling costs by up to 70% using automatic labeling, which works by training Ground Truth from data labeled by humans so that the service learns to label data independently.

When using your own private workers to perform data labeling, you want to measure and track their throughput and efficiency. Amazon SageMaker Ground Truth now logs worker events (for example, when a labeler starts and submits a task) to Amazon CloudWatch. You can also use the built-in metrics feature of CloudWatch to measure and track throughput across a work team or for individual workers. In this blog post, we cover how to use the raw worker event logs and built-in metrics in your AWS account.

How to use worker activity logs

Once you set up a private team of workers and run a labeling job with Amazon SageMaker Ground Truth, worker activity logs are automatically emitted to CloudWatch. To learn how to set up a private team and kick off your first labeling job, see this getting started blog post. Note: If you have previously created a private work team, you need to create a new private work team to set up the trust permissions between work teams and CloudWatch. You do not have to use that new private work team; creating it is simply a one-time setup step.

To view the logs, visit the CloudWatch console and click on Logs in the left-hand panel. Here, you should see a log group named /aws/sagemaker/groundtruth/WorkerActivity.

This log group contains a log entry for each task a worker accepts during an Amazon SageMaker Ground Truth labeling job, and we have included an example log below. You see the worker’s Amazon Cognito sub ID in the “cognito_sub_id” field. We will demonstrate how to tie this back to the worker’s identity through Amazon Cognito. In addition, you see the Amazon Resource Name (ARN) for the Amazon SageMaker Ground Truth labeling job in the “labeling_job_arn” field. This log also contains timestamps for when the worker begins the task (“task_accepted_time”) and when the worker either returns or submits the task (“task_returned_time” or “task_submitted_time”).

{ 
     "worker_id": "cd449a289e129409", 
     "cognito_user_pool_id": "us-east-2_IpicJXXXX", 
     "cognito_sub_id": "d6947aeb-0650-447a-ab5d-894db61017fd", 
     "task_accepted_time": "Wed Aug 14 16:00:59 UTC 2019", 
     "task_submitted_time": "Wed Aug 14 16:01:04 UTC 2019", 
     "task_returned_time": "", 
     "workteam_arn": "arn:aws:sagemaker:us-east-2:############:workteam/private-crowd/Sample-labeling-team",
     "labeling_job_arn": "arn:aws:sagemaker:us-east-2:############:labeling-job/metrics-demo",
     "work_requester_account_id": "############", 
     "job_reference_code": "############",
     "job_type": "Private", 
     "event_type": "TasksSubmitted", 
     "event_timestamp": "1565798464" 
}
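As a rough sketch of what can be done with such an event, the Python below computes how long the worker spent on the task. It assumes the event has already been retrieved from CloudWatch as JSON text and that timestamps always follow the format shown in the example; the helper names are hypothetical, and only the time fields from the sample log are used.

```python
import json
from datetime import datetime, timezone

# Assumption: all timestamps look like "Wed Aug 14 16:00:59 UTC 2019".
TIME_FORMAT = "%a %b %d %H:%M:%S UTC %Y"


def parse_time(text):
    """Parse a worker-event timestamp; an empty string means 'not set'."""
    if not text:
        return None
    return datetime.strptime(text, TIME_FORMAT).replace(tzinfo=timezone.utc)


def task_duration_seconds(event):
    """Seconds between accepting a task and submitting (or returning) it."""
    start = parse_time(event.get("task_accepted_time", ""))
    end = parse_time(event.get("task_submitted_time", "")) or parse_time(
        event.get("task_returned_time", "")
    )
    if start is None or end is None:
        return None
    return (end - start).total_seconds()


# Time fields copied from the example log entry above.
SAMPLE_EVENT = json.loads(
    """
    {
        "task_accepted_time": "Wed Aug 14 16:00:59 UTC 2019",
        "task_submitted_time": "Wed Aug 14 16:01:04 UTC 2019",
        "task_returned_time": "",
        "event_type": "TasksSubmitted"
    }
    """
)

print(task_duration_seconds(SAMPLE_EVENT))
```

Summing these durations per “cognito_sub_id” would give a per-worker view of time spent on tasks.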

Learn more about using CloudWatch Logs from the developer documentation.

How to use worker activity metrics

You can also use the CloudWatch metrics capability to generate your own interesting statistics or graphs about the throughput of your private workers. You can begin by navigating to the Metrics tab and then the AWS/SageMaker/Workteam namespace.

Say you want to find the average amount of time workers spent on tasks for a specific labeling job. You would select the LabelingJob, Workteam option.

From here, you can calculate your own statistics. In the example below, we calculate the average time spent per submitted task for a specific labeling job. There were 14 tasks submitted that took a total of 2.28 minutes or, on average, 9.78 seconds per task.
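The averaging step is plain arithmetic and can be checked in a couple of lines. The function name is hypothetical and the inputs are the rounded figures quoted above, so the result differs slightly from the 9.78 seconds reported, presumably because the console works from the unrounded total.

```python
def average_seconds_per_task(total_minutes, task_count):
    """Average time per task in seconds, from a total given in minutes."""
    return total_minutes * 60 / task_count


# Figures quoted in the text: 14 submitted tasks, 2.28 minutes in total.
average = average_seconds_per_task(2.28, 14)
print(round(average, 2))
```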

Learn more about using CloudWatch metrics from the developer documentation.

How to link Amazon Cognito sub ID to worker information

You can link the outputted Amazon Cognito sub ID to identifiable worker information, such as user name. To do so, you can write a quick script using the Amazon Cognito ListUsers API. Alternatively, you can use the Amazon Cognito console by following these steps:

Navigate to Manage User Pools in the AWS Region where you are running your labeling jobs.
Select the sagemaker-ground-userpool (if you integrated your own Amazon Cognito user pool with Amazon SageMaker Ground Truth, select that user pool).
From the left-hand panel, click Users and groups to see all of the users in your user pool.
Click any user to see that user’s sub ID.
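As an alternative to the console steps above, the ListUsers route could look roughly like the sketch below. It assumes the boto3 "cognito-idp" client, whose list_users call accepts a filter on the sub attribute; the function name, the Region, and the placeholder pool ID in the usage comment are hypothetical, and the sub ID is the one from the sample log.

```python
def find_worker(client, user_pool_id, sub_id):
    """Look up a Cognito user by sub ID.

    `client` is expected to behave like boto3.client("cognito-idp").
    Returns (username, attributes) or None if no user matches.
    """
    response = client.list_users(
        UserPoolId=user_pool_id,
        Filter=f'sub = "{sub_id}"',  # exact match on the sub attribute
        Limit=1,
    )
    users = response.get("Users", [])
    if not users:
        return None
    user = users[0]
    attributes = {a["Name"]: a["Value"] for a in user.get("Attributes", [])}
    return user["Username"], attributes


# Usage sketch (needs boto3 and AWS credentials; values are placeholders):
#
#   import boto3
#   client = boto3.client("cognito-idp", region_name="us-east-2")
#   print(find_worker(client, "<your-user-pool-id>",
#                     "d6947aeb-0650-447a-ab5d-894db61017fd"))
```

Passing the client in as an argument keeps the lookup easy to test without calling AWS.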

Conclusion

In this post, I introduced how to measure and track the throughput of your private labeling team using CloudWatch Logs and metrics. In addition, I walked through how to link the outputted worker ID to identifiable worker information, such as a user name. Visit the AWS Management Console to get started.

As always, AWS welcomes feedback. Please submit comments or questions below.

About the Authors

Vikram Madan is the Product Manager for Amazon SageMaker Ground Truth. He focuses on delivering products that make it easier to build machine learning solutions. In his spare time, he enjoys running long distances and watching documentaries.

Pranav Sachdeva is a Software Development Engineer in AWS AI. He is passionate about building high performance distributed systems to solve real life problems. He is currently focused on innovating and building capabilities in the AWS AI ecosystem that allow customers to give AI the much needed human aspect.

How AI Can Protect the World’s Woods from Deforestation

Written on September 4, 2019. Posted in NVIDIA.

For weeks, the Amazon rainforest has been burning at a startling rate. Tens of thousands of fires have been recorded this year — largely started by humans clearing land for logging, ranching or mining.

Weak regulations and the insufficient levels of forest monitoring personnel around the globe are no match for an illegal timber market worth up to $152 billion. Around a fifth of global carbon dioxide emissions come from deforestation.

But AI can give officials ears all over the forest, listening for chainsaws and unauthorized vehicles — warning signs of illegal logging in progress. Outland Analytics, a member of the NVIDIA Inception virtual accelerator, has developed a tree-mounted device that uses audio recognition algorithms to detect these signals and alert forest rangers.

“We have a dire law enforcement shortage,” said Elliot Richards, 20-year-old CEO of the Philadelphia-based startup, which began as a high school engineering project and is now a six-person company. “It’s a lot of not being in the right place at the right time.”

For every 300,000 acres of land managed by the U.S. Forest Service — an area equivalent to nearly 500 square miles — there’s just one law enforcement officer patrolling for illicit activity. A network of warning systems could help understaffed forest monitoring agencies worldwide better track and prevent illicit logging before it’s too late.

The AI algorithms behind Outland Analytics’ system are trained using NVIDIA GPUs, including a V100 Tensor Core GPU in the IBM Cloud. The company is working with the New York State Department of Environmental Conservation for field testing and plans to launch a paid pilot program in the fall.

If a Tree Falls in the Forest, AI Will Hear It

AI Speaks for the Trees: Outland Analytics edge devices can be mounted to a tree to listen for chainsaws and unauthorized vehicles.

Not every high school project turns into a full-fledged startup. But that’s how Outland Analytics got going, inspired by Richards and co-founder Edward Buckler’s love of nature and interest in land management.

Now undergrads at Drexel University and Stony Brook University, respectively, the founders started working on the company three years ago with the goal of improving forest protection.

While some organizations use satellite imagery or trail cameras that might provide notifications to forest rangers, those methods typically don’t provide immediate results — and it’s near impossible to identify individuals from the footage. Low-latency AI models that analyze audio could shorten response times, giving rangers minute-to-minute visibility into large areas of forest.

Using the TensorFlow deep learning framework, the team trained their AI algorithms with around 100 hours of audio from field recordings and publicly available data.

“GPUs in the cloud are nice because they’re preconfigured for you,” said Buckler. “We were blown away by how easy it was to tell a V100 on IBM Cloud to train our model, come back a few hours later and it’s all good to go.”

Buckler and Richards built a cellular-connected edge device about the size of a small backpack, topped with a solar panel and antenna. Strapped to a tree, a single device can monitor up to 150 acres of forest, collecting sound signals and sending them to the cloud for analysis.

If the neural network detects a chainsaw or unauthorized vehicle, it’ll contact officials through an email to a dispatch center or a text message to an individual ranger. Authorities can then head to the scene to catch potential environmental crimes in progress.

The low-maintenance device can be mounted at any height on a tree and is charged by solar power — though it can last a few days without sun. It’s so far been tested in the Adirondack and Catskill mountain ranges.

“The forests have the odds against them for protection,” said Richards. “We want to bolster the presence of specialized police forces by enabling them to respond to in-progress crimes.”

The post How AI Can Protect the World’s Woods from Deforestation appeared first on The Official NVIDIA Blog.

[Discussion] Category-theoretic approach to machine learning

Written on September 4, 2019. Posted in Reddit MachineLearning.

I’d like to start a thread about a small surge of recent papers studying machine learning from the perspective of functional programming/category theory. Plenty of interesting things are happening that most people don’t seem to be aware of, unless they are in these tight circles!

Category theory is a very general and rigorous mathematical theory of compositionality that seems to have become a powerful unifying force across mathematics and, very recently, the sciences. Its main concerns are akin to those of deep learning: finding compositional structure in data, such that the created abstractions are non-leaky and as general as possible.

Alongside the many papers linked below, the Symposium on Compositional Structures happening this week has two talks on abstract mathematical generalizations of machine learning.

Note: unlike most ML papers, which focus on experiments, almost all of these lean heavily toward theory and toward disentangling some of the existing structure, rather than providing new ad-hoc design choices in neural network architectures. They don’t offer a SOTA result or any immediate benefit you can implement right now, but focus on the long-term understanding of the relevant structures underlying neural networks.

I’ve compiled a list of these papers below. To me all these things are cool and I thought it might be useful for people to see these new approaches, as they might show us a shape of things to come.

Backprop As Functor

The simple essence of automatic differentiation

Lenses and Learners

Compositional Deep Learning

Generalized convolution and efficient language recognition

Towards formalizing and extending differential programming using tangent categories

Learning as change propagation with delta lenses

From open learners to open games

EDIT: Disclaimer: I am the author of the fourth paper “Compositional deep learning”

submitted by /u/totallynotAGI