Author: torontoai

[D] arxiv.org is down? use archive.org!

Written on November 6, 2019. Posted in Reddit MachineLearning.

For anyone (e.g. my roommate) who needs to access arXiv papers but can’t because the website is down, you can use the Wayback Machine to get copies of papers that were scraped in the past. For example:

https://web.archive.org/web/20190621093536/https://arxiv.org/pdf/1603.02754.pdf

I hope this is helpful for someone!
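
The example link above follows a simple pattern: the Wayback Machine prefix, a 14-digit timestamp, then the original URL. Here is a minimal sketch of building such links; the helper names `wayback_url` and `arxiv_pdf_url` are made up for illustration, and the archive generally serves the capture nearest to the requested timestamp, so an exact match is not required.

```python
from datetime import datetime, timezone
from typing import Optional

WAYBACK_PREFIX = "https://web.archive.org/web"


def arxiv_pdf_url(arxiv_id: str) -> str:
    """PDF location for an arXiv identifier such as '1603.02754'."""
    return f"https://arxiv.org/pdf/{arxiv_id}.pdf"


def wayback_url(original_url: str, when: Optional[datetime] = None) -> str:
    """Build a Wayback Machine link for original_url.

    The timestamp is formatted as YYYYMMDDhhmmss. If no time is given,
    the current UTC time is used, which asks for the most recent capture.
    """
    moment = when or datetime.now(timezone.utc)
    stamp = moment.strftime("%Y%m%d%H%M%S")
    return f"{WAYBACK_PREFIX}/{stamp}/{original_url}"


# Reproduces the link from the post.
url = wayback_url(arxiv_pdf_url("1603.02754"), datetime(2019, 6, 21, 9, 35, 36))
print(url)
```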

submitted by /u/InfiniteChaoxys

Highlights from the 2019 Google AI Residency Program

Written on November 6, 2019. Posted in Google.

Posted by Katie Meckley, Program Manager, Google AI Residency

This fall marks the successful conclusion to the fourth year of the Google AI Residency Program. Started in 2016 with 27 individuals in Mountain View, CA, the 12-month program has grown to nearly 100 residents from nine locations across the globe. Program participants have gone on to great success in PhD programs, academia, non-profits, and industry. Many have also become full-time Google researchers.

The program’s latest installment was our most successful yet, as residents advanced progress in a broad range of research fields, such as machine perception, algorithms and optimization, language understanding, healthcare and many more. Below are a handful of innovative projects from some of this year’s alumni.

A large-scale study on cross-lingual transfer in massive multilingual neural machine translation models (recently highlighted as part of this post), trained on billions of sentence pairs from more than 100 languages in order to significantly improve translation for both low- and high-resource languages.

Visualization of the clustering of encoder representations of all modeled languages, based on representational similarity. Encoder representations of different languages cluster according to linguistic similarity. Languages are color-coded by their linguistic family.

A generative model for Scalable Vector Graphics (SVGs), which can be used to aid designers in generating fonts.

Top: Unlike pixel representations of icons (right), in this case a “6”, SVGs (left; middle) are scale-invariant representations. Bottom: By modelling SVGs directly, we can aid artists in quickly and intuitively iterating over typography designs.

A method to learn GANs using discrepancy divergence, a measure that accounts for both the loss function and hypothesis set to provide theoretical learning guarantees.

As more generators are added to the DGAN ensemble, more modes in the real distribution are covered. From left to right: 1 generator, 5 generators, and 10 generators.

A likelihood ratio method for deep generative models that effectively corrects for confounding background statistics to improve out-of-distribution (OOD) detection, and a new benchmark dataset for OOD detection in genomics.

Log-likelihood (left) and log likelihood-ratio (right) of each pixel for Fashion-MNIST. The likelihood is dominated by the “background” pixels, whereas the likelihood ratio focuses on the “semantic” pixels and is thus better for OOD detection.

A study showing when label smoothing helps, focusing on its impact on calibration of predictions, representations learned by the penultimate layer and effectiveness of knowledge distillation.

2D-projection of representations of three CIFAR100 classes. Without label smoothing, examples are spread, but with label smoothing each example is encouraged to be equally distant to the clusters of the other classes, attenuating intra-class variation and inter-class similarity structure.
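
For readers unfamiliar with label smoothing, here is a minimal sketch of how smoothed training targets are usually built: the one-hot target is mixed with a uniform distribution over the classes. The function name `smooth_label` and the value alpha = 0.1 are illustrative assumptions, not details taken from the study.

```python
from typing import List


def smooth_label(label: int, num_classes: int, alpha: float = 0.1) -> List[float]:
    """Return the smoothed target distribution for one integer class label.

    The true class gets (1 - alpha) + alpha / K and every other class
    gets alpha / K, so the entries still sum to 1.
    """
    if not 0 <= label < num_classes:
        raise ValueError("label out of range")
    uniform = alpha / num_classes
    return [
        (1.0 - alpha) + uniform if k == label else uniform
        for k in range(num_classes)
    ]


# Example: class 2 out of 4 classes (alpha is an assumed value).
target = smooth_label(label=2, num_classes=4, alpha=0.1)
print(target)
```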

The successes of our AI residents go beyond academic publishing. Their achievements include:

Organizing a workshop, bringing together experts in theoretical physics and deep learning, to explore how tools from physics can shed light on the theory of deep learning.
Founding Queer in AI, an organization for fostering a community of queer researchers and raising awareness of queer issues in AI/ML.
Organizing a hands-on TensorFlow tutorial on using Deep Learning for Natural Language Processing.
Automatically learning neural net architectures with AdaNet, an open-source, TensorFlow-based framework.
Developing Coconet, the model behind the first AI-powered Doodle (created to celebrate renowned German composer and musician Johann Sebastian Bach).

Also, beginning with the next program cycle, residents will be hosted for a duration of 12 months, with the option of extending up to 18 months! This exciting shift comes as part of our effort to improve the overall program experience and outcomes for residents as the program continues to grow and scale.

If you are interested in joining our fifth cohort, applications for the 2020 Google AI Residency program are now open! Visit g.co/airesidency/apply for more information on how to apply. Please submit your application as soon as possible, as we will be considering candidates on a rolling basis. Please see g.co/airesidency for more resident profiles, past resident publications, blog posts and stories. We can’t wait to see where the next year will take us, and hope you’ll consider joining our research teams across the world!

SRL Diagnostics-Microsoft consortium creates new AI tool to diagnose cervical cancer faster

Written on November 6, 2019. Posted in Microsoft.

The post SRL Diagnostics-Microsoft consortium creates new AI tool to diagnose cervical cancer faster appeared first on The AI Blog.

[Discussion] Is MINE (Mutual Information Neural Estimation) also helpful for problems that reduce mutual information?

Written on November 6, 2019. Posted in Reddit MachineLearning.

Hello, I have an old-fashioned but confusing question about Mutual Information Neural Estimation (MINE), ICML 2018.

In the paper, a lower bound on mutual information is estimated with a neural-net-parameterized function (called the statistics network), and various experiments are run, including the information bottleneck, which reduces I(X; Z).

It’s very well written, with theoretical background, but I’m stuck reimplementing the IB results. Unfortunately the paper doesn’t provide full details for the IB section, so if you have any experience using MINE to reduce mutual information, I’d be very grateful if you shared it. I built a statistics network following the paper, and I optimize it while using its estimated MI lower bound as the I(X; Z) regularizer. But training seems very sensitive to the initial value of exponential_moving_average(exp(t)). My error rate hovers around 1.5%, which is even worse than a vanilla FCN.

Also, I’m not fully convinced that such MI lower-bound estimators are useful for problems that require reducing MI. Does reducing the ‘approximated’ lower bound of MI guarantee an actual reduction of MI? I think optimizing the MI estimator while also reducing the estimated lower bound might not be stable; like a GAN, it may be a kind of minimax training. On the other hand, if both the statistics network (increasing the lower bound) and our designed loss (also increasing the lower bound) pull in the same direction, I think there is no problem. What do you think?
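
To make the quantities in the question concrete, here is a rough sketch of the Donsker-Varadhan estimate that MINE maximizes, E_joint[T] - log E_marginal[exp(T)], plus a moving average of the exp(T) term. The inputs are plain lists of statistics-network outputs, and the class name, the rate of 0.01, and the choice to seed the average with the first batch are illustrative assumptions, not the paper's or the poster's exact setup.

```python
import math
from typing import Optional, Sequence


def log_mean_exp(values: Sequence[float]) -> float:
    """Numerically stable log(mean(exp(v))) over a batch."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values) / len(values))


def dv_lower_bound(t_joint: Sequence[float], t_marginal: Sequence[float]) -> float:
    """Donsker-Varadhan estimate: mean of T on joint samples minus
    log-mean-exp of T on samples from the product of marginals."""
    return sum(t_joint) / len(t_joint) - log_mean_exp(t_marginal)


class ExpMovingAverage:
    """Running estimate of E[exp(T)] on marginal samples.

    Assumption: the average is seeded with the first batch value instead
    of a fixed constant, which is one way to reduce sensitivity to the
    initial value.
    """

    def __init__(self, rate: float = 0.01) -> None:
        self.rate = rate
        self.value: Optional[float] = None

    def update(self, t_marginal: Sequence[float]) -> float:
        batch = math.exp(log_mean_exp(t_marginal))
        if self.value is None:
            self.value = batch
        else:
            self.value = (1.0 - self.rate) * self.value + self.rate * batch
        return self.value


# Toy numbers standing in for statistics-network outputs.
estimate = dv_lower_bound(t_joint=[1.0, 1.0], t_marginal=[0.0, 0.0])
print(estimate)
```

In an information-bottleneck setup like the one described, the statistics network would push this estimate up while the encoder pushes it down, which is the min-max structure the post worries about.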

submitted by /u/pky3436

[Discussion] Is MINE (Mutual Information Neural Estimation) suitable for reducing the mutual information?

Written on November 6, 2019. Posted in Reddit MachineLearning.

Hello, I have an old-fashioned but confusing question about Belghazi et al., Mutual Information Neural Estimation, ICML 2018.

In the paper, a lower bound on mutual information is obtained with a neural-net-parameterized function (what they call the ‘statistics network’), and various experiments are conducted, including the information bottleneck, which is a case of ‘reducing’ I(X; Z).

I’m quite interested in reducing mutual information, so I started reproducing their results, but I’m stuck.

Unfortunately the paper does not include many details about the IB implementation, so if you have any experience using MINE to reduce mutual information, I’d be glad if you shared your approach.

The paper is well written, with a clear theoretical background, but I’m not sure how lowering the ‘approximated lower bound’ helps reduce the actual mutual information. Do you think such lower-bound mutual information models are also practically useful for reducing MI?

submitted by /u/pky3436

Under the Microscope: Top Pathology Lab Fuses Data Sources to Develop Cancer-Detecting AI

Written on November 6, 2019. Posted in NVIDIA.

Pathologists agreed just three-quarters of the time when diagnosing breast cancer from biopsy specimens, according to a recent study.

The difficult, time-consuming process of analyzing tissue slides is why pathology is one of the most expensive departments in any hospital.

Faisal Mahmood, assistant professor of pathology at Harvard Medical School and the Brigham and Women’s Hospital, leads a team developing deep learning tools that combine a variety of sources — digital whole slide histopathology data, molecular information, and genomics — to aid pathologists and improve the accuracy of cancer diagnosis.

Mahmood, who heads his eponymous Mahmood Lab in the Division of Computational Pathology at Brigham and Women’s Hospital, spoke this week about this research at GTC DC, the Washington edition of our GPU Technology Conference.

The variability in pathologists’ diagnoses “can have dire consequences, because an uncertain determination can lead to more biopsies and unnecessary interventional procedures,” he said in a recent interview. “Deep learning has the potential to assist with diagnosis and therapeutic response prediction, reducing subjective bias.”

Depending on the type of cancer and the pathologist’s level of experience, it can take 15 minutes or more for a pathologist to analyze a biopsy slide. If a single patient has a couple dozen slides, it can add up quickly.

And to decide on a treatment plan, doctors also take into account other data sources like patient and familial medical history, as well as molecular and genomic data when it’s available.

Mahmood’s team uses NVIDIA GPUs on premises and in the cloud to develop its AI tools for pathology image analysis that incorporate all of these data sources.

“By working with whole slide images and fusing multimodal data sources we are algorithmically moving closer and closer to the clinical workflow,” Mahmood said. “This will enable us to run prospective studies with AI-assisted pathology diagnosis tools that use multimodal data.”

AI Sees the Big Picture

Digitized whole slide images taken during a tissue biopsy are huge — each can be more than 100,000 by 100,000 pixels. To efficiently compute with such large files, deep learning developers often choose to chop a slide into individual patches, making it easier for a neural network to process. But this tactic makes it incredibly time-consuming for researchers to hand-label the training data.
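
To give a sense of scale for the patch-based approach, here is a small sketch that counts the patches produced by tiling a slide and lists their top-left corners. The 256-pixel patch size and the function names are assumptions chosen for illustration; the article does not say which patch size developers use.

```python
from typing import Iterator, Tuple


def patch_grid(width: int, height: int, patch: int) -> Tuple[int, int]:
    """Number of patch columns and rows needed to cover the slide
    (integer ceiling division, so edge patches are counted)."""
    cols = -(-width // patch)
    rows = -(-height // patch)
    return cols, rows


def patch_origins(width: int, height: int, patch: int) -> Iterator[Tuple[int, int]]:
    """Yield the (x, y) top-left corner of every patch, row by row."""
    for y in range(0, height, patch):
        for x in range(0, width, patch):
            yield x, y


# A 100,000 x 100,000 pixel slide cut into assumed 256 x 256 patches.
cols, rows = patch_grid(100_000, 100_000, 256)
total = cols * rows
print(cols, rows, total)  # 391 391 152881
```

With these assumed numbers a single slide yields more than 150,000 patches, which illustrates why labeling patches by hand is so time-consuming.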

The Mahmood Lab is developing deep learning models that parse whole tissue slides at once in a data-efficient manner, using NVIDIA GPUs to accelerate training and inference of their neural networks. These models can be used for patient selection and stratification into treatment groups for precision therapies.

For prototyping their deep learning models, and for inference, the team relies on four on-prem machines with NVIDIA GPU clusters. To train graph convolutional networks and contrastive predictive coding models with large pathology images, the researchers use NVIDIA V100 Tensor Core GPUs in Google Cloud.

“The modern GPU is what gives us the ability to train deep learning models on whole slides,” said Max Lu, a researcher in the Mahmood Lab. “The benefit is that it doesn’t require modifying the current clinical workflow, because pathologists are analyzing and preparing reports for whole slides anyways.”

Joining Sources

Pathologists often make their determinations using a wealth of data ranging from tissue slides to immunohistochemistry markers and genomic profiles. But most current deep-learning based diagnosis methods rely on a single data source or on trivial methods of fusing information.

This led Mahmood Lab researchers to develop mechanisms that combine microscope and genomic data in a much more heuristic and holistic manner. Initial results suggest that adding information from genomic profiles and graph convolutional networks can improve diagnostic and prognostic models.

Sliding into the Pathology Workflow

Mahmood sees two potential ways in which deep learning could be incorporated into pathologists’ workflow. AI-annotated slide images could be used as a second opinion for pathologists to help improve the quality and consistency of diagnoses.

Or, computational pathology tools could screen out all the negative cases, so that pathologists only need to review biopsy slides that are likely positive, significantly reducing their workloads. There’s a precedent for this: In the 1990s, hospitals began using third-party companies to scan and stratify pap smear slides, throwing out all the negative cases.

“If there are 40,000 breast cancer tissue slides and 20,000 are negative, that half would be stratified out and the pathologist wouldn’t see it,” Mahmood said. “Just by reducing the pathologist’s burden, variability is likely to go down.”
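
The screening idea can be sketched as a simple threshold on a per-slide score. Everything here is hypothetical: the scores, the threshold, and the function name `triage` are illustrative, and this is not the Mahmood Lab's method. A real screening tool would need a threshold validated so that positive cases are almost never screened out.

```python
from typing import List, Sequence, Tuple


def triage(scores: Sequence[float], threshold: float) -> Tuple[List[int], List[int]]:
    """Split slide indices into (needs_review, screened_out).

    scores[i] is an assumed model estimate of how likely slide i is
    positive; slides scoring below the threshold are screened out.
    """
    needs_review: List[int] = []
    screened_out: List[int] = []
    for index, score in enumerate(scores):
        if score >= threshold:
            needs_review.append(index)
        else:
            screened_out.append(index)
    return needs_review, screened_out


# Made-up scores for four slides; a low threshold keeps borderline cases.
scores = [0.02, 0.91, 0.40, 0.01]
review, skipped = triage(scores, threshold=0.05)
print(review, skipped)  # [1, 2] [0, 3]
```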

To test and validate their algorithms, the researchers plan to conduct retrospective and prospective studies using biopsy data from the Dana-Farber Cancer Institute. They will study whether a pathologist’s analysis of a biopsy slide changes after seeing the algorithm’s determination, and whether using AI reduces variation in diagnosis.

Mahmood Lab researchers will present their deep learning projects at the NeurIPS conference’s ML4H workshop in December.

Main image shows a whole slide of keratoacanthoma, a type of skin tumor. Image by Alex Brollo, licensed from Wikimedia Commons under CC BY-SA 3.0.

The post Under the Microscope: Top Pathology Lab Fuses Data Sources to Develop Cancer-Detecting AI appeared first on The Official NVIDIA Blog.

[D] Deep learning- agent for stock investment.

Written on November 6, 2019. Posted in Reddit MachineLearning.

For the last few months I have been trying to create an agent that takes an input amount, a stock to invest in, and the time period for the investment.

It consists of a deep learning algorithm for prediction, sentiment analysis of news about that company and related companies, data scraped from Google Trends, and stock data for that company and related companies for training. But I am struggling to build the agent itself, which would use these data for dummy investments.
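
One possible starting point for the dummy-investment part is a tiny paper-trading loop that turns a per-day signal into simulated trades. This is only a sketch: the function name, the all-in/all-out rule, and the idea of collapsing the model, sentiment, and trends outputs into a single number per day are assumptions, and transaction costs are ignored.

```python
from typing import Sequence


def paper_trade(prices: Sequence[float], signals: Sequence[float], cash: float) -> float:
    """Simulate an all-in/all-out strategy and return the final value.

    signals[i] > 0 means hold the stock from day i to day i + 1;
    otherwise the simulated money stays in cash. One signal is needed
    per price interval, so len(signals) == len(prices) - 1.
    """
    if len(signals) != len(prices) - 1:
        raise ValueError("need exactly one signal per price interval")
    shares = 0.0
    for price, signal in zip(prices, signals):
        if signal > 0 and shares == 0.0:
            shares, cash = cash / price, 0.0  # buy with all available cash
        elif signal <= 0 and shares > 0.0:
            cash, shares = shares * price, 0.0  # sell the whole position
    # Value any remaining position at the last observed price.
    return cash + shares * prices[-1]


# Made-up prices and signals for illustration only.
final_value = paper_trade(prices=[100.0, 110.0, 99.0, 120.0],
                          signals=[1.0, -1.0, 1.0],
                          cash=1000.0)
print(round(final_value, 2))
```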

submitted by /u/tmaloo