Author: torontoai

[D] Best approach to fine-classification of image objects?

Written on December 10, 2019. Posted in Reddit MachineLearning.

I’m basically looking to make a classifier that gives a binary (yes/no) response to an image’s subject matching my model. For example let’s say I wanted it to match copperhead snakes and only copperhead snakes. Is there a good cloud solution for this?

I tried AWS Rekognition but it wanted a minimum of 2 labels and I felt the results wouldn’t be good enough after doing a test run with ~120 images. I know that’s relatively small but don’t want to waste time/money with a cloud solution if it won’t get me there.

Will Google Vision have better results? Would I need to roll my own solution in Python? Or is it unrealistic for a novice solution to tell one type of snake from another?

submitted by /u/TRAINS_CHOOCHOO

[D] Concerns about “Face Beautification: Beyond Makeup Transfer”

Written on December 10, 2019. Posted in Reddit MachineLearning.

I came across the paper “Face Beautification: Beyond Makeup Transfer” and was appalled at the poor ethical and scientific practice shown by the paper. I emailed the PC and the D&I chairs, but I wanted to share my critique with the community as well:

I came across the paper “Face Beautification: Beyond Makeup Transfer” that was published at NeurIPS this year. I was deeply concerned by the apparent complete lack of care for the social and ethical repercussions of the paper. The goal of the paper is to change photos of women to make them more attractive. While it may be possible to do this in a way that isn’t objectionable, the paper contains zero discussion or acknowledgement of the social, political, and power-dynamical (is that a word?) aspects of what is judged as attractive. The paper also contains serious methodological issues and blatantly contradicts itself in a fashion that I would expect to disqualify the paper from publication in the first place.

The examples in the paper make it clear that the algorithm’s concept of “attractive” is “light-skinned white people.” Of the 114 demo examples of computer-generated “attractive people” in the paper, 100% are white. Not only that, almost all of them have extremely light skin. Only a couple of the shown inputs appear to be non-white people (e.g., Table 2 appears to contain a South Asian woman), and the algorithm clearly makes them into white people both by lightening their skin and by changing other morphological features to make the person appear more white. None of the inputs appear to be black people. Even among white people it strongly prefers people with lighter skin; there are zero examples where the algorithm appears to darken the skin tone of the person to be beautified, and in the majority of cases it is significantly lightened.

This isn’t just an issue of using white people as “attractive references,” as it even happens when the reference attractive image is a photo of a non-white person, as seen in the table at the top of the first page. Two East Asian people are used as reference images to beautify white people, but the resulting image has typically white features such as a less oval face shape and doubled eyelids.

Not only does this appear as a persistent pattern in the images, the authors don’t even mention that it happens, let alone critically engage with this. Given how much NeurIPS appears to pride itself on social awareness in AI research, I am saddened and disheartened to see that this paper was viewed as having sufficient merit and ethical practice to warrant publication.

Another major issue is the very last paragraph of their paper. It says

>Personalized beautification is expected to attract increasingly more attention in the incoming years. This work we have only focused on the beautification of female Caucasian faces. A similar question can be studied for other populations even though the relationship between gender, race, cultural background and the perception of facial attractiveness has remained under-researched in the literature. How can AI help reshape the practice of personal makeup and plastic surgery is an emerging field for future research.

This paragraph is clearly false for several reasons. First, as I mentioned, they have reference photos of non-Caucasian people in the paper itself and appear to input at least a couple of non-Caucasian people. Second, the authors use data sets that contain a large number of non-Caucasian people. Since they mention the number of training and testing data points used, it is easy to verify that either their “Experimental Setup” section is wrong or this paragraph is. Given that the images given as examples in the paper itself appear to falsify this paragraph, it seems clear that this paragraph is not true. At no point in the entire paper other than this paragraph do they say anything about only being interested in Caucasian people, and they do not mention subsetting the data to Caucasians. They do mention subsetting the data to women.

While I generally believe in making the most charitable assumptions, it seems implausible that this might be a mistake or that the authors might be unaware that this paragraph is false. Not only do the reference images in their own paper falsify it, one of their data sets is drawn from a paper titled “SCUT-FBP5500: A Diverse Benchmark Dataset for Multi-Paradigm Facial Beauty Prediction.” The very first page of this paper prominently features a graphic showing non-Caucasian people and the word “diverse” appears in their abstract three times. For their other data set (CelebA), the project website shows demo examples of non-Caucasian people. It does not appear possible for the authors to have done their due diligence and not noticed this. Additionally, it took me only a couple of seconds to find dozens of papers studying “the relationship between gender, race, cultural background and the perception of facial attractiveness,” and even Googling that phrase brings up lots of papers. Given the location of this paragraph within the paper and the fact that the paragraph blatantly contradicts the description of the experiment in the paper itself, I fear this paragraph was added later, after concerns about the paper were raised, in order to mislead the reader and justify their poor ethical practice.

I also believe that the validation methodologies considered by the paper are extremely insufficient, even setting aside social and ethical concerns. The authors say

>To evaluate the image quality from human’s perception, we develop a user study and ask users to vote the most attractive one among ours and the baseline. 100 face images from testing set are submitted to Amazon Mechanical Turk (AMT), and each survey requires 20 users. We collect 2000 data points in total to evaluate human preference. The final results demonstrate the superiority of out model, showing in Table 1.

This is a rather small sample size, especially as no analysis of variance or estimation of uncertainty is done. Despite the extensive literature on how socioeconomic and racial factors influence assessments of attraction, these attributes are never discussed in the Mechanical Turk population. Additionally, they never actually assess if people find the computer generated images more attractive than the reference images, which is purportedly the entire purpose of the paper. They only ask if the image their algorithm generates is more attractive than other computer-generated images. The only further validation is that they ask their algorithm to score the beauty of the new images and find that on average the beauty rating goes up. This isn’t evidence of anything meaningful at all, as they’re using the same algorithm to evaluate if the beauty increased as they used inside their GAN to make the image more beautiful in the first place.

This is all the validation that they do in the paper.
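
To make the uncertainty point concrete, here is a minimal sketch of the kind of interval estimate that could accompany a vote share from 2000 responses. It uses a Wilson score interval, which is one standard choice and not anything the paper reports. The count of 1,200 votes is a hypothetical placeholder, since the raw vote counts are not reproduced here.

```python
import math


def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p_hat = successes / trials
    denom = 1 + z ** 2 / trials
    center = (p_hat + z ** 2 / (2 * trials)) / denom
    half_width = (
        z * math.sqrt(p_hat * (1 - p_hat) / trials + z ** 2 / (4 * trials ** 2)) / denom
    )
    return center - half_width, center + half_width


# Hypothetical count: 1,200 of the 2,000 collected votes favor the proposed model.
low, high = wilson_interval(1200, 2000)
print(f"vote share 0.600, approximate 95% interval [{low:.3f}, {high:.3f}]")
```

Note that this treats all 2000 votes as independent. Because the votes come from 20 raters on each of 100 images, they are clustered, so an interval like this is likely too narrow; resampling whole images would be a more cautious approach.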

You can find the paper on arXiv here: https://arxiv.org/abs/1912.03630

Edit: This post is based on the email I sent but I want to be clear that it is not the exact text. It has been edited for grammar and clarity. The content has not been substantively changed.

submitted by /u/StellaAthena

[D] What is the best implementation of a trainable TTS network for creating custom TTS voices?

Written on December 10, 2019. Posted in Reddit MachineLearning.

In this instance, TTS refers to Text-To-Speech.

As the title implies I am looking for the best way to train a network to produce high-quality text to speech results in a custom voice pulled from training data. Assuming access to large amounts of high-quality speech data from a single speaker, the English language, powerful machines, and extended training times what is the best implementation/codebase to use?

I have done quite a lot of research into this but have found my results to be quite confusing. Tacotron-2 seems to me to provide the highest quality results with an open-source implementation. However, implementations such as ESPnet(1) seem to be geared more towards testing different methods rather than developing your own custom voice. I am not new to Machine Learning but I am new to applying ML to audio or language-related problems thus I am very behind on my understanding of the state of such lines of research.

If I was looking to replicate something like the results from “Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions”(2), where they used 20+ hours of data from an English speaker to produce very natural-sounding speech(3), what would be my best option? I just figured I would ask the experts of Reddit before I took the plunge on setting up a codebase and dataset only to realize there were significantly better options available.

Thanks!

(1) https://github.com/espnet/espnet

(2)(paper link) https://arxiv.org/abs/1712.05884

(3)(audio sample link) https://google.github.io/tacotron/publications/tacotron2/index.html

submitted by /u/blackfish_88

[D] attribution models

Written on December 10, 2019. Posted in Reddit MachineLearning.

Hello,

Suppose I have the following data:

—A——–A——B——C—B———-A———–X

–B-A—-B——A——C—C—–B—X

A, B, and C represent different contacts or touch points between the firm and the customer. X is the desired event for the customer (for example, a purchase). The —— represents the time in between events.

What kind of attribution models can I build to understand the relationship between A, B, C, and X?

I can create variables based on the recency and frequency of A, B, and C. What else can I do?

I read somewhere that neural networks (maybe recurrent neural networks) can handle non-tabular data a lot better. Can I treat those sequences of events and their timing as non-tabular data and utilize them?
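
As a rough illustration of the two ideas above, the sketch below derives recency and frequency variables from one journey and also encodes the same journey as a sequence of (touchpoint, gap) steps, which is the kind of input a sequence model could take. The timestamps, function names, and feature names are all made up for illustration; this is not a full attribution model.

```python
from collections import Counter


def recency_frequency(events, conversion_time):
    """Per-touchpoint count and time since last contact, measured at the conversion X.

    events: list of (time, touchpoint) pairs.
    """
    freq = Counter()
    last_seen = {}
    for t, kind in events:
        if t >= conversion_time:
            continue  # only contacts before X can be credited
        freq[kind] += 1
        last_seen[kind] = max(t, last_seen.get(kind, t))
    features = {}
    for kind in sorted(freq):
        features[f"freq_{kind}"] = freq[kind]
        features[f"recency_{kind}"] = conversion_time - last_seen[kind]
    return features


def to_sequence(events):
    """Encode a journey as (touchpoint, time since previous event) steps."""
    steps = []
    prev_t = None
    for t, kind in sorted(events):
        gap = 0 if prev_t is None else t - prev_t
        steps.append((kind, gap))
        prev_t = t
    return steps


# Hypothetical timestamps (in days) for the journey A, A, B, C, B, A, then X at day 50.
journey = [(0, "A"), (9, "A"), (16, "B"), (23, "C"), (27, "B"), (38, "A")]
features = recency_frequency(journey, conversion_time=50)
sequence = to_sequence(journey)
print(features)
print(sequence)
```

The first output is one tabular row per customer, suitable for something like logistic regression. The second keeps the order and timing, which is what a recurrent model would consume after the touchpoint names are mapped to integer ids.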

Please educate me or provide pointers. Thanks!

submitted by /u/sonicking12

[D] Early Career Advice for a Machine Learning PhD Student

Written on December 10, 2019. Posted in Reddit MachineLearning.

I worked at company A this past summer as a PhD intern (the position only required the candidate to be pursuing a Bachelor’s degree) where about half of my responsibility was directly related to machine learning. Overall, the experience went very well and they offered me the opportunity to continue working for them part time during the semester while I was away at school; it was implied that this could lead to further employment in the form of an internship or full time position in the future, but it was never formally stated or agreed upon.

Fast forward to now: Companies B, C, and D are contacting me with further internship opportunities. These companies would likely look better on a CV (or at least offer my CV a larger diversity of experience), and put me in roles that appear to align more closely with my career goals; I’m also not crazy about the industry of company A.

My Question: Is it unethical for me to accept a position with company B, C, or D this summer? I think it’s better for me as a researcher to get a diversity of experience by joining a new company, and the roles could teach me things I won’t learn by going back to company A a second time. That being said, I’m not ruling out company A as a possible landing location when I get a full time position after graduation; I don’t want to burn any bridges.

submitted by /u/green-top

[R] Ab-Initio Solution of the Many-Electron Schrödinger Equation with Deep Neural Networks questions

Written on December 10, 2019. Posted in Reddit MachineLearning.

Hi,

My question is: how was the Laplacian implemented in TensorFlow?

Thank you!

submitted by /u/TotoroPet

Fairness Indicators: Scalable Infrastructure for Fair ML Systems

Written on December 10, 2019. Posted in Google.

Posted by Catherina Xu and Tulsee Doshi, Product Managers, Google Research

While industry and academia continue to explore the benefits of using machine learning (ML) to make better products and tackle important problems, algorithms and the datasets on which they are trained also have the ability to reflect or reinforce unfair biases. For example, consistently flagging non-toxic text comments from certain groups as “spam” or “high toxicity” in a moderation system leads to exclusion of those groups from conversation.

In 2018, we shared how Google uses AI to make products more useful, highlighting AI principles that will guide our work moving forward. The second principle, “Avoid creating or reinforcing unfair bias,” outlines our commitment to reduce unjust biases and minimize their impacts on people.

As part of this commitment, at TensorFlow World, we recently released a beta version of Fairness Indicators, a suite of tools that enable regular computation and visualization of fairness metrics for binary and multi-class classification, helping teams take a first step towards identifying unjust impacts. Fairness Indicators can be used to generate metrics for transparency reporting, such as those used for model cards, to help developers make better decisions about how to deploy models responsibly. Because fairness concerns and evaluations differ case by case, we also include in this release an interactive case study with Jigsaw’s Unintended Bias in Toxicity dataset to illustrate how Fairness Indicators can be used to detect and remediate bias in a production machine learning (ML) model, depending on the context in which it is deployed. Fairness Indicators is now available in beta for you to try for your own use cases.

What is ML Fairness?
Bias can manifest in any part of a typical machine learning pipeline, from an unrepresentative dataset, to learned model representations, to the way in which the results are presented to the user. Errors that result from this bias can disproportionately impact some users more than others.

To detect this unequal impact, evaluation over individual slices, or groups of users, is crucial as overall metrics can obscure poor performance for certain groups. These groups may include, but are not limited to, those defined by sensitive characteristics such as race, ethnicity, gender, nationality, income, sexual orientation, ability, and religious belief. However, it is also important to keep in mind that fairness cannot be achieved solely through metrics and measurement; high performance, even across slices, does not necessarily prove that a system is fair. Rather, evaluation should be viewed as one of the first ways, especially for classification models, to identify gaps in performance.

The Fairness Indicators Suite of Tools
The Fairness Indicators tool suite enables computation and visualization of commonly-identified fairness metrics for classification models, such as false positive rate and false negative rate, making it easy to compare performance across slices or to a baseline slice. The tool computes confidence intervals, which can surface statistically significant disparities, and performs evaluation over multiple thresholds. In the UI, it is possible to toggle the baseline slice and investigate the performance of various other metrics. The user can also add their own metrics for visualization, specific to their use case.
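
To illustrate what evaluating over slices and thresholds means, the following is a minimal plain-Python sketch that computes false positive rate and false negative rate per slice at several thresholds. It does not use the Fairness Indicators or TFMA API; the function name, slice names, and toy data are assumptions made for the example, and confidence intervals are omitted.

```python
from collections import defaultdict


def sliced_error_rates(examples, thresholds):
    """Compute FPR and FNR per slice at each threshold.

    examples: iterable of (slice_name, label, score) with label in {0, 1}.
    Returns {threshold: {slice_name: {"fpr": ..., "fnr": ...}}}.
    """
    examples = list(examples)
    results = {}
    for thr in thresholds:
        counts = defaultdict(lambda: {"fp": 0, "tn": 0, "fn": 0, "tp": 0})
        for slice_name, label, score in examples:
            predicted = 1 if score >= thr else 0
            if label == 1:
                key = "tp" if predicted == 1 else "fn"
            else:
                key = "fp" if predicted == 1 else "tn"
            counts[slice_name][key] += 1
        per_slice = {}
        for name, c in counts.items():
            negatives = c["fp"] + c["tn"]
            positives = c["fn"] + c["tp"]
            per_slice[name] = {
                "fpr": c["fp"] / negatives if negatives else None,
                "fnr": c["fn"] / positives if positives else None,
            }
        results[thr] = per_slice
    return results


# Hypothetical toy data: (slice, true label, model score).
data = [
    ("group_a", 0, 0.2), ("group_a", 0, 0.4), ("group_a", 1, 0.9), ("group_a", 1, 0.7),
    ("group_b", 0, 0.6), ("group_b", 0, 0.8), ("group_b", 1, 0.9), ("group_b", 1, 0.3),
]
rates = sliced_error_rates(data, thresholds=[0.5, 0.75])
for thr, per_slice in rates.items():
    print(thr, per_slice)
```

In this toy data the two groups have very different false positive rates at the same threshold, which is the kind of gap an overall metric would hide.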

Furthermore, Fairness Indicators is integrated with the What-If Tool (WIT): clicking on a bar in the Fairness Indicators graph will load those specific data points into the WIT widget for further inspection, comparison, and counterfactual analysis. This is particularly useful for large datasets, where Fairness Indicators can be used to identify problematic slices before the WIT is used for a deeper analysis.

Using Fairness Indicators to visualize metrics for fairness evaluation.

Clicking on a slice in Fairness Indicators will load all the data points in that slice inside the What-If Tool widget. In this case, all data points with the “female” label are shown.

The Fairness Indicators beta launch includes the following:

pip package: Includes TensorFlow Model Analysis (TFMA), Fairness Indicators, TensorFlow Data Validation (TFDV), What-If Tool, and example Colabs:
- Fairness Indicators Example Colab — an introduction to Fairness Indicators usage
- Fairness Indicators for TensorBoard — a TensorBoard plug-in usage example
- Fairness Indicators with TFHub Embeddings — a Colab that investigates the effects of different embeddings on downstream fairness metrics
- Fairness Indicators with Cloud Vision API’s Face Detection Model — a Colab showing how Fairness Indicators can be used to generate evaluation results for model cards
GitHub repository: Source code
Guidance for usage: Fairness is highly contextual, and it’s important to carefully think through each use case and potential implications for users. This document provides guidance for selecting groups and metrics, and highlights evaluation best practices.
Case Study: Interactive case study on using Fairness Indicators, showing how Jigsaw’s Conversation AI team detects bias in a classification model using the Toxic Comment Classification dataset.

How To Use Fairness Indicators in Models Today
Fairness Indicators is built on top of TensorFlow Model Analysis, a component of TensorFlow Extended (TFX) that can be used to investigate and visualize model performance. Based on the specific ML workflow, Fairness Indicators can be incorporated into a system in one of the following ways:

If using TensorFlow models and tools, such as TFX:
- Access Fairness Indicators as part of the Evaluator component in TFX
- Access Fairness Indicators in TensorBoard when evaluating other real-time metrics

If not using existing TensorFlow tools:
- Download the Fairness Indicators pip package, and use TensorFlow Model Analysis as a standalone tool

For non-TensorFlow models:
- Use Model Agnostic TFMA to compute Fairness Indicators based on the output of any model

Fairness Indicators Case Study
We created a case study and introductory video that illustrates how Fairness Indicators can be used with a combination of tools to detect and mitigate bias in a model trained on Jigsaw’s Unintended Bias in Toxicity dataset. The dataset was developed by Conversation AI, a team within Jigsaw that works to train ML models to protect voices in conversation. Models are trained to predict whether text comments are likely to be abusive along a variety of dimensions including toxicity, insult, and sexual explicitness.

The primary use case for models such as these is content moderation. If a model penalizes certain types of messages in a systematic way (e.g., often marks comments as toxic when they are not, leading to a high false positive rate), those voices will be silenced. In the case study, we investigated false positive rate on subgroups sliced by gender identity keywords that are present in the dataset, using a combination of tools (Fairness Indicators, TFDV, and WIT) to detect, diagnose, and take steps toward remediating the underlying problem.

What’s next?
Fairness Indicators is only the first step. We plan to expand vertically by enabling more supported metrics, such as metrics that enable you to evaluate classifiers without thresholds, and horizontally by creating remediation libraries that utilize methods such as active learning and min-diff. Because we believe it is important to learn through real examples, we hope to ground our work in more case studies to be released over the next few months, as more features become available.

To get started, see the Fairness Indicators GitHub repo. For more information on how to think about fairness evaluation in the context of your use case, see this link.

We would love to partner with you to understand where Fairness Indicators is most useful, and where added functionality would be valuable. Please reach out at tfx@tensorflow.org to provide any feedback on your experience!

Acknowledgements
The core team behind this work includes Christina Greer, Manasi Joshi, Huanming Fang, Shivam Jindal, Karan Shukla, Osman Aka, Sanders Kleinfeld, Alicia Chang, Alex Hanna, and Dan Nanas. We would also like to thank James Wexler, Mahima Pushkarna, Meg Mitchell and Ben Hutchinson for their contributions to the project.

[N] Kaggle Deep Fake detection: 470Gb of videos, $1M prize pool 💰💰💰

Written on December 10, 2019. Posted in Reddit MachineLearning.

https://www.kaggle.com/c/deepfake-detection-challenge

Some people were concerned with the possible flood of deep fakes. Some people were concerned with low prizes on Kaggle. This seems to address those concerns.

submitted by /u/sorrge

Amazon Polly Neural Text-to-Speech voices now available in Sydney Region

Written on December 10, 2019. Posted in Amazon.

Amazon Polly turns text into lifelike speech for voice-enabled applications. AWS is excited to announce the general availability of all Neural Text-to-Speech (NTTS) voices in the Asia Pacific (Sydney) Region. These voices deliver groundbreaking improvements in speech quality through a new machine learning approach. If you are in the Sydney Region, you can now synthesize 13 NTTS voices (eight US English, three UK English, one US Spanish, and one Brazilian Portuguese) available in the Amazon Polly portfolio.

In addition, Amazon Polly’s two speaking style voices are available in US English (Matthew and Joanna). Newscaster simulates the tone of a news anchor, and Conversational simulates the tone of a friendly conversation. Both are built using the same NTTS technology.

The entire Amazon Polly portfolio of 60+ voices (Neural and Standard) across 29 languages is now available in the Asia Pacific (Sydney) region. Visit the Amazon Polly documentation for the full list of text-to-speech voices, and log in to the Amazon Polly console to try them out! Simply set the ‘engine’ parameter to ‘neural’, and select one of the 4 AWS regions that support NTTS voices.
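
For readers calling the service from code instead of the console, here is a minimal sketch using the AWS SDK for Python (boto3). It assumes boto3 is installed and AWS credentials are configured; the helper names, sample text, and output path are illustrative, and `ap-southeast-2` is the region code for Asia Pacific (Sydney).

```python
def build_neural_request(text, voice_id="Joanna", output_format="mp3"):
    """Keyword arguments for Polly's SynthesizeSpeech call using the neural engine."""
    return {
        "Engine": "neural",
        "VoiceId": voice_id,
        "OutputFormat": output_format,
        "Text": text,
    }


def synthesize(text, out_path, region_name="ap-southeast-2"):
    """Call Amazon Polly and write the audio to out_path.

    Requires boto3 and valid AWS credentials; out_path is a hypothetical file name.
    """
    import boto3  # imported lazily so the request builder works without it

    polly = boto3.client("polly", region_name=region_name)
    response = polly.synthesize_speech(**build_neural_request(text))
    with open(out_path, "wb") as audio_file:
        audio_file.write(response["AudioStream"].read())


request = build_neural_request("Hello from Sydney.")
print(request)
# To actually generate audio (makes a billable API call):
# synthesize("Hello from Sydney.", "hello.mp3")
```

The request builder is kept separate from the network call so the parameters can be inspected or tested without contacting AWS.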

About the Author

Ankit Dhawan is a Senior Product Manager for Amazon Polly, a technology enthusiast, and a huge Liverpool FC fan. When not working on delighting our customers, you will find him exploring the Pacific Northwest with his wife and dog. He is an eternal optimist who loves reading biographies and playing poker. You can engage him in a conversation on technology, entrepreneurship, or soccer at any time of the day.

[D] Why is everyone against PhD for money/prestige/job security reasons?

Written on December 10, 2019. Posted in Reddit MachineLearning.

People who go to med school spend an equivalent amount of time and effort to become physicians/surgeons, and many of them openly admit to doing it for the money/prestige/job security. I am definitely interested in machine learning and would be happy to do research, but why is it wrong to say I also want to do it for those reasons?

Asking as someone considering a PhD in CS at top 20-40 after my B.S. in math given that it is funded…

submitted by /u/mfdoomlives
