Author: torontoai

[D] How to handle variable input lengths for transformer encoder?

Written on November 30, 2019. Posted in Reddit MachineLearning.

I have variable text lengths and want to use an encoder like in this example

Example link

For RNNs I should use pack_padded_sequence; what should I do for a transformer?

Or is it OK for a transformer to get zeros in its inputs?
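One common approach, sketched here on the assumption that the encoder is PyTorch's nn.TransformerEncoder (the linked example is not visible, so this is a guess), is to pad each batch to a common length and pass a boolean src_key_padding_mask so that attention ignores the padded positions. The helper pad_batch, the vocabulary size and the layer sizes below are all hypothetical values chosen for illustration.

```python
import torch
import torch.nn as nn


def pad_batch(seqs, pad_id=0):
    """Pad variable-length token lists to one tensor and build a padding mask."""
    lengths = torch.tensor([len(s) for s in seqs])
    max_len = int(lengths.max())
    tokens = torch.full((len(seqs), max_len), pad_id, dtype=torch.long)
    for i, s in enumerate(seqs):
        tokens[i, : len(s)] = torch.tensor(s, dtype=torch.long)
    # True marks padded positions that attention should ignore.
    padding_mask = torch.arange(max_len)[None, :] >= lengths[:, None]
    return tokens, padding_mask


torch.manual_seed(0)
d_model = 16  # hypothetical model width
embed = nn.Embedding(100, d_model, padding_idx=0)
layer = nn.TransformerEncoderLayer(
    d_model=d_model, nhead=4, dim_feedforward=32, dropout=0.0
)
encoder = nn.TransformerEncoder(layer, num_layers=2)

tokens, padding_mask = pad_batch([[5, 7, 9, 3], [4, 2]])

with torch.no_grad():
    # Default layout is (seq_len, batch, d_model); the mask stays (batch, seq_len).
    x = embed(tokens).transpose(0, 1)
    out = encoder(x, src_key_padding_mask=padding_mask)

# Mean-pool over real tokens only, so padding does not dilute the summary.
valid = (~padding_mask).transpose(0, 1).unsqueeze(-1).float()
pooled = (out * valid).sum(dim=0) / valid.sum(dim=0)
```

The mask only stops other positions from attending to padding; the encoder still produces outputs at the padded positions, so anything downstream (pooling, loss) should skip them as well, as the pooling step does here.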

submitted by /u/hadaev

[D] Using ML for a Graduate Project

Written on November 30, 2019. Posted in Reddit MachineLearning.

Hello,

As the title states, I will be using ML for a project of mine, and was hoping I could get some direction on where to start. I have some experience with ML in the past, but not using it in the way that I would like to for this project.

I always thought it was pretty cool watching people train models to play games, and decided that I would like to do something like that for my project. However, I am having some trouble getting started.

The game I decided on using for this project is an open-source game called SuperTuxKart. Basically, a racing game like Mario Kart. I wanted to train a model to run the time trials/beat the AI/use the abilities/etc.

The game is built using C++, but I was thinking of maybe using Python to hook into the game process and send commands to it like that. But again, this is not something I have done before, so if that is not the correct approach here, please let me know.

Any and all advice would be greatly appreciated.

Thanks!

submitted by /u/pilpod

[D] What do you do when your models are training?

Written on November 30, 2019. Posted in Reddit MachineLearning.

Besides browse /r/MachineLearning

submitted by /u/hdplus

[D] Interesting papers for RNN problem / Time-Series prediction with largely _known_ underlying actuators?

Written on November 30, 2019. Posted in Reddit MachineLearning.

A bit of background: For my master's thesis I will predict the inverse dynamics of a robotic arm by matching the measured position, speed and acceleration of each joint to the measured torque applied to the respective joint. For the purpose of this machine learning task, the measured torque can be seen as the ground truth. It is currently determined through feedback controllers that measure how far the actual trajectory deviates from the intended trajectory (it would be advantageous to know this beforehand, hence the ML application). This basically makes the problem a supervised learning problem.

For a quick visualization, see the following graphs: I have to match the [position, speed and acceleration] of a 7-joint robot to the [applied torque] (the blue line is the actual measurement of applied torque to the first joint, the others are quick drafts of ML prediction systems). If you would like to read more about this problem, I can recommend this recent paper.

Now to my issue: As I understand it, this problem seems like a typical time-series problem well suited for RNNs. However, all obvious underlying actuators (position, speed, acceleration) that influence the predicted torque are known. That makes it unnecessary for an RNN to detect some underlying time-dependent pattern, and turns my problem into a relatively simple nondiscrete classification problem (I hope this is the correct term). Correct? On the other hand, the non-obvious and hard-to-measure or unmeasurable underlying actuators (e.g. friction, inertia, deflection, etc.), which make inverse dynamics prediction a machine learning problem in the first place, may (or may not…) be time dependent. Considering this, the problem is still a viable RNN problem as I understand it, even though the underlying actuators are largely known.

My question: Aside from being very happy to have you check my logic (and give general feedback), I would also appreciate any links/keywords to research that looks into this RNN problem / time-series prediction with the twist that the underlying actuators are mostly known. I would also appreciate any links/keywords to promising recent research in the field of nondiscrete classification, as I will approach the inverse dynamics problem both as a nondiscrete classification problem and as a time-series prediction problem and compare the viability of both over the course of my thesis. I have more or less read most of the research that specifically looks into the inverse dynamics prediction problem, so I am hoping to find good research that looks at this problem in a more general way, or at a similar problem in another application domain, that I might employ for the inverse dynamics prediction task.
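For readers searching for keywords: predicting a continuous target such as torque is usually called regression rather than nondiscrete classification. The two framings described above can be sketched side by side as follows, assuming PyTorch and purely synthetic stand-in data. The class names, hidden sizes and tensor shapes are hypothetical and are not taken from the thesis setup.

```python
import torch
import torch.nn as nn

N_JOINTS = 7
N_FEATURES = 3 * N_JOINTS  # position, speed and acceleration per joint


class StaticRegressor(nn.Module):
    """Maps the state at one time step to torques; no memory of the past."""

    def __init__(self, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(N_FEATURES, hidden), nn.ReLU(), nn.Linear(hidden, N_JOINTS)
        )

    def forward(self, x):  # x: (batch, time, features)
        return self.net(x)  # applied independently at every time step


class RecurrentRegressor(nn.Module):
    """Same inputs, but a GRU can carry hidden state (e.g. friction effects)."""

    def __init__(self, hidden=64):
        super().__init__()
        self.gru = nn.GRU(N_FEATURES, hidden, batch_first=True)
        self.head = nn.Linear(hidden, N_JOINTS)

    def forward(self, x):  # x: (batch, time, features)
        h, _ = self.gru(x)
        return self.head(h)


torch.manual_seed(0)
states = torch.randn(4, 50, N_FEATURES)  # synthetic trajectories, not real data
torques = torch.randn(4, 50, N_JOINTS)   # synthetic targets, not real data

loss_fn = nn.MSELoss()
static_pred = StaticRegressor()(states)
recurrent_pred = RecurrentRegressor()(states)
static_loss = loss_fn(static_pred, torques)
recurrent_loss = loss_fn(recurrent_pred, torques)
```

Because both models share the same inputs, targets and loss, any gap between them on held-out trajectories is one way to gauge how much time-dependent structure the unmeasured effects contribute.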

Thank you for taking the time to read my question, I very much appreciate it!

submitted by /u/OnePaulToRuleThemAll

[R] Is this NAS method beating EfficientNet in accuracy vs latency/FLOPs tradeoff? Once for All: Train One Network and Specialize it for Efficient Deployment

Written on November 30, 2019. Posted in Reddit MachineLearning.

This is the paper:

https://openreview.net/forum?id=HylxE1HKwS

They compare with MobileNet V3 and get 76.4% top-1 accuracy with 238 MAdds (the paper says FLOPs, but I think that is wrong), while EfficientNet-B0 ( https://arxiv.org/pdf/1905.11946.pdf ) gets 77.3% for 390 MAdds (again, it says FLOPs, but that’s just wrong?).

So B0 gets a 0.9% accuracy advantage for 1.6x the FLOPs. There is no direct comparison between the two, but according to Figure 5 the accuracy of the 76.4% model seems to keep going up with more FLOPs.

Any thoughts on this? It seems it should get more attention.

submitted by /u/skariel

NVIDIA Clara Federated Learning to Deliver AI to Hospitals While Protecting Patient Data

Written on November 30, 2019. Posted in NVIDIA.

With over 100 exhibitors at the annual Radiological Society of North America conference using NVIDIA technology to bring AI to radiology, 2019 looks to be a tipping point for AI in healthcare.

Despite AI’s great potential, a key challenge remains: gaining access to the huge volumes of data required to train AI models while protecting patient privacy. Partnering with the industry, we’ve created a solution.

Today at RSNA, we’re introducing NVIDIA Clara Federated Learning, which takes advantage of a distributed, collaborative learning technique that keeps patient data where it belongs — inside the walls of a healthcare provider.

Clara Federated Learning (Clara FL) runs on our recently announced NVIDIA EGX intelligent edge computing platform.

Federated Learning — AI with Privacy

Clara FL is a reference application for distributed, collaborative AI model training that preserves patient privacy. Running on NVIDIA NGC-Ready for Edge servers from global system manufacturers, these distributed client systems can perform deep learning training locally and collaborate to train a more accurate global model.

Here’s how it works: The Clara FL application is packaged into a Helm chart to simplify deployment on Kubernetes infrastructure. The NVIDIA EGX platform securely provisions the federated server and the collaborating clients, delivering everything required to begin a federated learning project, including application containers and the initial AI model.

NVIDIA Clara Federated Learning uses distributed training across multiple hospitals to develop robust AI models without sharing patient data.

Participating hospitals label their own patient data using the NVIDIA Clara AI-Assisted Annotation SDK integrated into medical viewers like 3D Slicer, MITK, Fovia and Philips IntelliSpace Discovery. Using pre-trained models and transfer learning techniques, NVIDIA AI assists radiologists in labeling, reducing the time for complex 3D studies from hours to minutes.

NVIDIA EGX servers at participating hospitals train the global model on their local data. The local training results are shared back to the federated learning server over a secure link. This approach preserves privacy by only sharing partial model weights and no patient records in order to build a new global model through federated averaging.

The process repeats until the AI model reaches its desired accuracy. This distributed approach delivers exceptional performance in deep learning while keeping patient data secure and private.
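The federated averaging step described above can be illustrated with a minimal sketch. This is not the Clara FL API. It is plain Python, and it assumes each client's contribution is weighted by its local sample count, a common convention that the text above does not specify. The function name, weight names and sample counts are hypothetical.

```python
from typing import Dict, List, Sequence

Weights = Dict[str, List[float]]


def federated_average(
    client_weights: Sequence[Weights], client_sizes: Sequence[int]
) -> Weights:
    """Combine client weights into global weights, weighted by sample count."""
    total = sum(client_sizes)
    averaged: Weights = {}
    for name in client_weights[0]:
        length = len(client_weights[0][name])
        averaged[name] = [
            sum(w[name][i] * n for w, n in zip(client_weights, client_sizes)) / total
            for i in range(length)
        ]
    return averaged


# Two hypothetical hospitals share only weights, never patient records.
hospital_a = {"layer.weight": [1.0, 2.0], "layer.bias": [0.0]}
hospital_b = {"layer.weight": [3.0, 6.0], "layer.bias": [1.0]}

global_weights = federated_average([hospital_a, hospital_b], client_sizes=[100, 300])
print(global_weights)
# {'layer.weight': [2.5, 5.0], 'layer.bias': [0.75]}
```

In each round the server would send the averaged weights back to the clients, which continue training locally before the next averaging step.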

US and UK Lead the Way

Healthcare giants around the world — including the American College of Radiology, MGH and BWH Center for Clinical Data Science, and UCLA Health — are pioneering the technology. They aim to develop personalized AI for their doctors, patients and facilities where medical data, applications and devices are on the rise and patient privacy must be preserved.

ACR is piloting NVIDIA Clara FL in its AI-LAB, a national platform for medical imaging. The AI-LAB will allow the ACR’s 38,000 medical imaging members to securely build, share, adapt and validate AI models. Healthcare providers that want access to the AI-LAB can choose a variety of NVIDIA NGC-Ready for Edge systems, including from Dell, Hewlett Packard Enterprise, Lenovo and Supermicro.

UCLA Radiology is also using NVIDIA Clara FL to bring the power of AI to its radiology department. As a top academic medical center, UCLA can validate the effectiveness of Clara FL and extend it in the future across the broader University of California system.

Partners HealthCare in New England also announced a new initiative using NVIDIA Clara FL. Massachusetts General Hospital and Brigham and Women’s Hospital’s Center for Clinical Data Science will spearhead the work, leveraging data assets and clinical expertise of the Partners HealthCare system.

In the U.K., NVIDIA is partnering with King’s College London and Owkin to create a federated learning platform for the National Health Service. The Owkin Connect platform running on NVIDIA Clara enables algorithms to travel from one hospital to another, training on local datasets. It provides each hospital a blockchain-distributed ledger that captures and traces all data used for model training.

The project is initially connecting four of London’s premier teaching hospitals, offering AI services to accelerate work in areas such as cancer, heart failure and neurodegenerative disease, and will expand to at least 12 U.K. hospitals in 2020.

Making Everything Smart in the Hospital

With the rapid proliferation of sensors, medical centers like Stanford Hospital are working to make every system smart. To make sensors intelligent, devices need a powerful, low-power AI computer.

That’s why we’re announcing NVIDIA Clara AGX, an embedded AI developer kit that can handle image and video processing at high data rates, bringing AI inference and 3D visualization to the point of care.

NVIDIA Clara AGX scales from small, embedded devices to sidecar systems to full-size servers.

Clara AGX is powered by NVIDIA Xavier SoCs, the same processors that control self-driving cars. They consume as little as 10W, making them suitable for embedding inside a medical instrument or running in a small adjacent system.

A perfect showcase of Clara AGX is Hyperfine, the world’s first portable point-of-care MRI system. The revolutionary Hyperfine system will be on display in NVIDIA’s booth at this week’s RSNA event.

Hyperfine’s system is among the first of many medical instruments, surgical suites, patient monitoring devices and smart medical cameras expected to use Clara AGX. We’re witnessing the beginning of an AI-enabled internet of medical things.

Hyperfine’s mobile MRI system uses an NVIDIA GPU and will be on display at NVIDIA’s booth.

The NVIDIA Clara AGX SDK will be available soon through our early access program. It includes reference applications for two popular uses — real-time ultrasound and endoscopy edge computing.

NVIDIA at RSNA 2019

Visit NVIDIA and our many healthcare partners in booth 10939 in the RSNA AI Showcase. We’ll be showing our latest AI-driven medical imaging advancements, including keeping patient data secure with AI at the edge.

Find out from our deep learning experts how to use AI to advance your research and accelerate your clinical workflows. See the full lineup of talks and learn more on our website.

The post NVIDIA Clara Federated Learning to Deliver AI to Hospitals While Protecting Patient Data appeared first on The Official NVIDIA Blog.

[D] How ICLR review scores changed during the discussion period

Written on November 30, 2019. Posted in Reddit MachineLearning.

Hi,

I have gathered some statistics about the publicly available ICLR review data. In particular, I was interested in how the scores were affected by the discussions.

In summary, 11.79% of all reviews changed their score (mostly improvements).

Surprisingly, reviewers who stated “I do not know much about this area” or “I made a quick assessment of this paper” were less likely to change their score.

Detailed results and code can be found here

submitted by /u/mlechlll

VP of Engineering – Intelletec – Toronto, ON

Written on November 30, 2019. Posted in Toronto Job Postings.

Strong knowledge of TensorFlow or other deep learning libraries (PyTorch, Keras). The software is powered by a proprietary machine vision neural network that…
From LocalWorkBC.ca – Sun, 01 Dec 2019 11:41:38 GMT – View all Toronto, ON jobs

[R] Filter Response Normalization Layer: Eliminating Batch Dependence in the Training of Deep Neural Networks

Written on November 30, 2019. Posted in Reddit MachineLearning.

submitted by /u/hardmaru

[Research] STFT within neural network pipeline

Written on November 30, 2019. Posted in Reddit MachineLearning.

I have been thinking about this and looking for an answer for a while, but I couldn’t really find a satisfactory solution on Google (or maybe I’m not looking for the right thing), hence my post here.

Assume I have a GAN that generates raw audio waveforms. The generator is a convolutional neural network that produces raw audio waveforms, which are passed to a discriminator that evaluates them, and backprop is performed. This is pretty straightforward.

But I found that my discriminator is pretty bad at distinguishing real from fake waveforms, so I would find it beneficial if I could convert the generated waveform to a spectrogram with an STFT and discriminate real from fake spectrograms.

I understand how the forward pass is performed, but my problem is with backprop. I understand that we compute an error based on the discriminator predictions and backpropagate it through the discriminator, which is a standard CNN classifier. But now what happens in between the discriminator and generator? Do we perform an ISTFT on the backpropagated error? And how is this done in Keras or PyTorch? Would it be some special kind of intermediary layer? I would like to implement this, but I have no idea where to even start.

In general, how is a domain conversion handled within a neural network pipeline?
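On the backprop question: the STFT is a differentiable operation (a windowed linear transform followed by a magnitude), so an explicit ISTFT on the error should not be needed when the framework's autograd covers it. A minimal sketch, assuming a recent PyTorch in which torch.stft accepts return_complex=True: the helper name spectrogram, the FFT settings, and the random stand-ins for the generator output and the discriminator loss are all hypothetical.

```python
import torch


def spectrogram(wave, n_fft=256, hop_length=64):
    """Differentiable log-magnitude spectrogram of a (batch, samples) tensor."""
    window = torch.hann_window(n_fft, device=wave.device)
    spec = torch.stft(
        wave,
        n_fft=n_fft,
        hop_length=hop_length,
        window=window,
        return_complex=True,
    )
    # log1p compresses the dynamic range and stays finite at zero magnitude.
    return torch.log1p(spec.abs())


torch.manual_seed(0)
# Stand-in for the generator output: two waveforms of 1024 samples each.
fake_wave = torch.randn(2, 1024, requires_grad=True)

mag = spectrogram(fake_wave)  # (batch, freq_bins, frames)
loss = mag.mean()             # stand-in for the discriminator loss
loss.backward()               # gradients flow back through the STFT

print(mag.shape, fake_wave.grad.shape)
```

In a real GAN, spectrogram would sit between the generator and the discriminator, and the same function would be applied to the real waveforms so that both inputs are in the same domain.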

It would be really helpful if you could share your thoughts on this, or point me towards some work that has already been done on this. Thanks in advance and cheers!

submitted by /u/khawarizmy
