📜Paper: https://joss.theoj.org/papers/571753bc54c5d6dd36382c3d801de41d
🔊Demo: https://open.unmix.app
🔥PyTorch: https://github.com/sigsep/open-unmix-pytorch
🔻NNabla: https://github.com/sigsep/open-unmix-nnabla
🔶TF2: t.b.a.
📓Colab: https://colab.research.google.com/drive/1mijF0zGWxN-KaxTnd0q6hayAlrID5fEQ
It is our great pleasure to announce the release of Open-unmix, an MIT-licensed Python implementation for DNN-based music separation.
In recent years, deep learning-based systems have broken a long-standing glass ceiling and finally allowed high-quality music separation. This has sparked rising interest from both industry and the machine learning community (like /r/ML).
However, until now, no open-source implementation was available that matches the performance of the best systems proposed more than four years ago. This led to wasted time, both in sheer performance optimization and in scientific comparison with the state of the art. Not being able to reproduce state-of-the-art performance makes it difficult to clearly identify the sources of discrepancies and the room for improvement.
In this context, we release Open-Unmix (UMX) to close this gap by providing a reference implementation for DNN-based music separation. It serves two main purposes. First, it is intended for academic researchers as a baseline method that is easy to compare to and build upon. Second, the availability of a pre-trained model brings music separation to enthusiastic end users and artists.
Open-unmix is presented in a paper that has just been published in the Journal of Open Source Software. You may download the paper PDF here
Open-unmix comes in several DNN frameworks:
https://sigsep.github.io/open-unmix/
Open-unmix has been especially designed to combine well with the following datasets:
=> Both datasets are available at https://sigsep.github.io/datasets/musdb.html
Note:
If you want to compare separation models to the existing source separation literature, or to SiSEC 2018 participants, please use the standard MUSDB18 dataset instead.
We provide pre-trained models trained on both MUSDB18 and MUSDB18-HQ that reach state-of-the-art performance of 6.32 dB SDR (median of medians) on vocals on MUSDB18 test data. This significantly outperforms any model we are aware of that was trained on MUSDB18 only.
The pre-trained models are automatically bundled/downloaded when using the PyTorch implementation.
Further information for both models, such as evaluation scores, can be downloaded from Zenodo:
Open-unmix was recently presented during a tutorial held at EUSIPCO 2019. This features:
The slides of the tutorial as well as self-contained colab notebooks can be found on the tutorial site.
Open-unmix is part of a whole ecosystem enabling easy research on source separation for Python users. Several distinct and independent projects have been released in recent years to make it possible for researchers to reproduce state-of-the-art performance in this domain.
A reliable Python package that implements the multichannel Wiener filter and related filtering methods.
https://github.com/sigsep/norbert
We released the new version 0.3.0 of our popular musdb tools. This release makes it simpler to use musdb inside your data loading framework thus we pro
https://github.com/sigsep/sigsep-mus-db
museval makes it easy to compare the performance of any new method under investigation to both Open-unmix and the participants of SiSEC18.
https://github.com/sigsep/sigsep-mus-eval
Please note that we are also working on some version of open-unmix that has been trained on a significantly larger dataset and that achieves unprecedented separation performance. Please feel free to contact us for demonstrations / industrial collaborations / licensing on this matter.
We look forward to your feedback and we hope that you will find Open-unmix useful!
submitted by /u/faroit
I’m looking to do some experiments using the Fast Fourier Transform in CNNs. From what I’ve seen, many common frameworks (Chainer, Keras, PyTorch, TensorFlow) don’t provide support for this. They typically implement an FFT or DFT function but not a Fourier-transform convolutional layer. I could implement it from scratch, but there are some finicky implementation aspects I was hoping to avoid worrying about.
Does any framework have native Fourier-based CNNs? Alternatively, pointers to SOTA implementations on GitHub I can use for reference would be highly appreciated. Ideally in Chainer, as that’s the framework I have experience with.
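For reference, the core of a Fourier-based convolutional layer is the convolution theorem: an element-wise product in the frequency domain equals a circular convolution in the spatial domain. Below is a minimal NumPy sketch, not Chainer and not taken from any framework. The names `fft_conv2d` and `direct_conv2d` are made up for this illustration, and it handles a single channel only. It computes the "valid" cross-correlation that CNN layers use and compares it against a direct loop.

```python
import numpy as np

def fft_conv2d(x, k):
    """'Valid' 2-D cross-correlation (what CNN layers call convolution) via FFT."""
    H, W = x.shape
    kh, kw = k.shape
    # Pad both operands to the full linear-convolution size so that the
    # circular convolution computed by the FFT does not wrap around.
    fh, fw = H + kh - 1, W + kw - 1
    X = np.fft.rfft2(x, s=(fh, fw))
    # Cross-correlation equals convolution with the kernel flipped on both axes.
    K = np.fft.rfft2(k[::-1, ::-1], s=(fh, fw))
    full = np.fft.irfft2(X * K, s=(fh, fw))
    # Keep only the region where the kernel fully overlaps the input.
    return full[kh - 1:H, kw - 1:W]

def direct_conv2d(x, k):
    """Reference implementation: slide the kernel and sum the products."""
    H, W = x.shape
    kh, kw = k.shape
    out = np.empty((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 8))
k = rng.standard_normal((3, 3))
# Largest absolute difference between the FFT result and the direct loop.
max_err = np.max(np.abs(fft_conv2d(x, k) - direct_conv2d(x, k)))
```

The finicky parts are mostly the padding, the kernel flip and the final crop. A real layer would also need to sum over input channels, handle batches and strides, and define a backward pass, none of which this sketch attempts.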
submitted by /u/StellaAthena
Hello /r/ML,
We are pleased to share with you our meta-learning library, which started as a project at the PyTorch hackathon.
learn2learn is a PyTorch library for all things meta-learning. Our goal is to support as many meta-learning algorithms as possible (be it few-shot, meta-descent, or meta-RL) and to enable researchers to develop better methods and easily compare against existing literature.
Our current features include:
If this is of interest to you, have a look at the following links:
Let us know what you think and how we can help you in your research!
PS: learn2learn was also accepted as a poster to the PyTorch Dev Conference, so you’ll know all about it there!
submitted by /u/praat33k
Abstract – Object detection, the computer vision task dealing with detecting instances of objects of a certain class (e.g., 'car', 'plane', etc.) in images, has attracted a lot of attention from the community during the last six years. This strong interest can be explained not only by the importance this task has for many applications but also by the phenomenal advances in this area since the arrival of deep convolutional neural networks (DCNNs). This article comprehensively reviews the recent literature on object detection with deep CNNs. This study not only covers the design decisions made in modern deep (CNN) object detectors, but also provides an in-depth perspective on the set of challenges currently faced by the computer vision community, as well as some complementary and new directions on how to overcome them. In its last part it goes on to show how object detection can be extended to other modalities and conducted under different constraints. This survey also reviews in its appendix the public datasets and associated state-of-the-art algorithms.
Page -> https://arxiv.org/abs/1809.03193
submitted by /u/gnavihs
This is a two-part blog post on a project that aims to do question answering using a pretrained BERT.
The first part teaches about Transformers and the history that led up to this architecture. -> https://blog.scaleway.com/2019/building-a-machine-reading-comprehension-system-using-the-latest-advances-in-deep-learning-for-nlp/
The second part focuses on using a pre-trained BERT (in PyTorch) and how to do question answering. There’s code and you can try it on your own dataset easily 🙂
-> https://blog.scaleway.com/2019/understanding-text-with-bert/
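To give a rough idea of the question answering step: an extractive BERT model typically outputs one "start" score and one "end" score per token, and the answer is the span with the highest combined score. The sketch below is not taken from the linked posts. The function `best_span`, the `max_len` limit and the toy scores are assumptions made up for this illustration, and no model is actually run.

```python
def best_span(start_logits, end_logits, max_len=10):
    """Return the (start, end) token indices maximizing start + end score.

    Only spans with start <= end and at most max_len tokens are considered.
    """
    best = (0, 0)
    best_score = float("-inf")
    for s, s_score in enumerate(start_logits):
        for e in range(s, min(s + max_len, len(end_logits))):
            score = s_score + end_logits[e]
            if score > best_score:
                best_score = score
                best = (s, e)
    return best, best_score

# Toy passage and hand-written scores standing in for real model outputs.
tokens = ["the", "eiffel", "tower", "is", "in", "paris", "."]
start_logits = [0.1, 0.2, 0.0, -1.0, 0.3, 4.0, -2.0]
end_logits = [-1.0, 0.0, 0.5, -1.0, 0.0, 3.5, 0.2]

(span_start, span_end), score = best_span(start_logits, end_logits)
answer = " ".join(tokens[span_start:span_end + 1])
```

With a real model, the two score lists would come from the network's output for the concatenated question and passage, and the token indices would be mapped back to the original text.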
submitted by /u/ilnmtlbnm
Doctors’ handwriting is notoriously difficult to read. Even more cryptic is medical coding — the process of turning a clinician’s notes into a set of alphanumeric codes representing every diagnosis and procedure.
Although this system is used in over 100 countries worldwide, accurate coding is of particular significance in the U.S., where medical codes form the basis for the bills doctors, clinics and hospitals issue to insurance providers and patients.
More than 150,000 codes are used in the U.S.’s adaptation of the International Classification of Diseases, a cataloging standard developed by the World Health Organization.
The diagnostic code for a pedestrian hit by a pickup truck? V03.10XA. Type 2 diabetes diagnosis? E11.9. There is also a set of procedural codes for everything a doctor might do, like put a cast on a patient’s broken right forearm (2W3CX2Z) or insert a pacemaker into a coronary vein (02H40NZ).
After every doctor’s appointment or procedure, a clinician’s summary of the interaction is converted into these codes. When done by humans, the turnaround time for medical chart coding — within a healthcare organization or at a private firm — is often two days or more. Natural language processing AI, accelerated by GPUs, can shrink that time to minutes or seconds.
San Francisco-based Fathom is developing deep learning tools to automate the painstaking medical coding process while increasing accuracy. The startup’s tools can help address the shortage of trained clinical coders, improve the speed and precision of billing, and allow human coders to focus on complex cases and follow-up queries.
“Sometimes you have to go back to the doctor to ask for clarification,” said Christopher Bockman, co-founder and chief technology officer of Fathom, a member of the NVIDIA Inception virtual accelerator program. “The longer that process takes, the harder it is for the doctor to remember what happened.”
Fathom uses NVIDIA P100 and V100 Tensor Core GPUs in Google Cloud for both training and inference of its deep learning algorithms. Founded in 2016, the company now works with several of the largest medical coding operations in the U.S., representing more than 200 million annual patient encounters. Its tools can reduce human time spent on medical coding by as much as 90 percent.
At any doctor’s appointment, emergency room visit or surgical procedure, healthcare providers type up notes describing the interaction. While there are some standardized formats, these medical records differ by hospital, by type of appointment or procedure, and by whether the note is written during the patient interaction or after.
Medical coders make sense of this unstructured text, categorizing every test, treatment and procedure into a list of codes. Once coded, a healthcare provider’s billing department turns the reports into an invoice to collect payments from insurance providers and patients.
It’s a messy process — for a human or an AI. Human coders agree with each other less than two-thirds of the time in key scenarios, studies show. And research has found that half or more medical charts have coding errors.
“The challenge for us is these notes can vary quite a bit,” Bockman said. “There’s a push to standardize, but that tends to make the doctor’s job a lot harder. Human health is complex, so it’s hard to come up with a format that works for every case.”
As a machine learning problem, medical coding shares elements of two kinds of tasks: multilabel classification and sequence-to-sequence NLP. An effective AI must understand the text in a doctor’s note and accurately tag it with a list of diagnoses and procedures organized in the right order for billing.
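The multilabel side of this can be illustrated with a short sketch. It is not Fathom's method: the scores are invented, the 0.5 threshold is arbitrary, and sorting by probability is only a stand-in for a real billing order. The function name `predict_codes` is hypothetical. Each code gets its own independent probability, so a single note can receive several codes at once.

```python
import math

def sigmoid(z):
    """Map a raw score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_codes(logits, threshold=0.5):
    """Keep every code whose probability clears the threshold.

    Unlike single-label classification, the probabilities are independent
    and need not sum to one, so zero, one or many codes may be returned.
    """
    probs = {code: sigmoid(z) for code, z in logits.items()}
    kept = [(code, p) for code, p in probs.items() if p >= threshold]
    # Assumption: order by confidence, as a placeholder for billing order.
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return [code for code, _ in kept]

# Invented per-code scores for one hypothetical note.
logits = {"E11.9": 2.0, "V03.10XA": 0.4, "2W3CX2Z": -3.0, "02H40NZ": -1.5}
codes = predict_codes(logits)
```

In this toy case the note is tagged with two codes, E11.9 and V03.10XA, while the two procedural codes fall below the threshold.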
Fathom is tackling this challenge, aided by tools such as NVIDIA’s GPU-optimized version of BERT, a leading natural language understanding model. The team uses the TensorFlow deep learning framework and relies on the mixed-precision training provided by Tensor Cores to accelerate the large-scale processing of medical documents that vary widely in size.
Using NVIDIA GPUs for inference allows Fathom to easily scale up to process upwards of millions of healthcare encounters per hour.
“While lowering costs matter, the ability to instantly add the capacity of thousands of medical coders to their operations has been the game-changer for our clients,” said Andrew Lockhart, Fathom’s co-founder and CEO.
Relying on NVIDIA GPUs on Google Cloud helps the team ramp its usage up and down based on demand.
“We have very bursty needs,” Bockman said, referring to the team’s fluctuating computational workload. “Sometimes we might be trying to retrain different variants of the same large model, while other times we’re doing a lot of experimentation or just doing inference. We might need a single GPU or many dozens of them.”
The startup chose Google Cloud, Bockman said, in part because the data is encrypted by default, one of the requirements for HIPAA and SOC 2 compliance.
While medical coding is the main activity done today with doctor’s notes, unlocking the information contained in these health records could enable a wide range of use cases beyond billing and reimbursement, Bockman says.
AI that quickly and accurately analyzes medical charts and appointment records at scale can help doctors spot patient illnesses that may otherwise have been missed, predict likely patient outcomes, suggest treatment options — and even identify promising patient candidates for clinical trials.
The post Cure for the Common Code: San Francisco Startup Uses AI to Automate Medical Coding appeared first on The Official NVIDIA Blog.
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating output sequences from input sequences. One of the methods includes obtaining an input sequence having a first number of inputs arranged according to an input order; processing each input in the input sequence using an encoder recurrent neural network to generate a respective encoder hidden state for each input in the input sequence; and generating an output sequence having a second number of outputs arranged according to an output order, each output in the output sequence being selected from the inputs in the input sequence, comprising, for each position in the output order: generating a softmax output for the position using the encoder hidden states that is a pointer into the input sequence; and selecting an input from the input sequence as the output at the position using the softmax output.
http://www.freepatentsonline.com/10402719.html
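For readers who want to see what the abstract describes, here is a minimal NumPy sketch with random, untrained weights. The abstract only specifies the encoder hidden states, a softmax that acts as a pointer, and the selection of an input. Everything else here is an assumption made for illustration: the plain tanh RNN, the additive scoring form, the decoder update, and the names `encode` and `point`.

```python
import numpy as np

def softmax(z):
    z = z - np.max(z)  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def encode(inputs, W_in, W_h):
    """Plain tanh RNN: one hidden state per input, in input order."""
    h = np.zeros(W_h.shape[0])
    states = []
    for x in inputs:
        h = np.tanh(W_in @ x + W_h @ h)
        states.append(h)
    return np.stack(states)

def point(inputs, n_out, params):
    """Produce n_out pointers into the input sequence."""
    W_in, W_h, W1, W2, v, W_d = params
    enc = encode(inputs, W_in, W_h)
    d = enc[-1]  # assumption: decoder starts from the last encoder state
    picks, dists = [], []
    for _ in range(n_out):
        # Assumed additive scoring of each encoder state against the decoder state.
        scores = np.tanh(enc @ W1.T + d @ W2.T) @ v
        p = softmax(scores)    # distribution over input positions
        j = int(np.argmax(p))  # the pointer: index of the selected input
        picks.append(j)
        dists.append(p)
        d = np.tanh(W_d @ d + enc[j])  # toy decoder update (assumption)
    return picks, np.stack(dists)

rng = np.random.default_rng(0)
d_in, hid, n = 2, 8, 5
params = (
    rng.standard_normal((hid, d_in)),       # W_in
    rng.standard_normal((hid, hid)) * 0.1,  # W_h
    rng.standard_normal((hid, hid)),        # W1
    rng.standard_normal((hid, hid)),        # W2
    rng.standard_normal(hid),               # v
    rng.standard_normal((hid, hid)) * 0.1,  # W_d
)
inputs = rng.standard_normal((n, d_in))
picks, dists = point(inputs, n_out=3, params=params)
outputs = inputs[picks]  # every output is one of the inputs
```

Since the weights are random, the chosen indices are meaningless. The point is only the shape of the computation: each output position yields a softmax over input positions, and the output is copied from the input it points at.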
News from the UK is that the grave of some guy named Turing has been heard making noises since this came out.
What would happen if, by some stroke of luck, Google collapses and some company like Oracle buys its IP and then goes after any dude who installed PyTorch?
Why doesn’t Google come out with a systematic approach to secure these patents?
I am not too sure they are doing this *only* for defending against patent trolls anymore.
submitted by /u/metacurse