Author: torontoai

[D] Kernel functions and neural networks

I’ve been pondering this question and wanted to get some of your thoughts on it.

Kernel functions find the distance between two inputs relative to each other in some transformed space. Neural networks, on the other hand, find the exact location of the input in the transformed space. What are the benefits and downsides of each transformation? Why are kernel functions used instead of specifying the direct transformation from input to transformed space?
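To make the question concrete, here is a minimal sketch (not from the original post) of the two routes to the same number. It assumes a degree-2 polynomial kernel on 2D inputs, whose explicit feature map is small enough to write down; the function names are hypothetical. Strictly speaking a kernel returns an inner product in the transformed space, from which a distance can be derived. The RBF kernel is included as a case where the transformed space is infinite-dimensional, so the direct transformation cannot be written out at all.

```python
import math


def poly_kernel(x, z):
    """Degree-2 polynomial kernel: computed directly on the raw 2D inputs."""
    return (x[0] * z[0] + x[1] * z[1]) ** 2


def feature_map(x):
    """Explicit transformation into the 3D space that poly_kernel implies."""
    return (x[0] ** 2, math.sqrt(2) * x[0] * x[1], x[1] ** 2)


def dot(a, b):
    """Plain inner product of two equal-length vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))


def rbf_kernel(x, z, gamma=0.5):
    """RBF kernel: its feature space is infinite-dimensional, so only the
    kernel route is available here."""
    squared_distance = sum((xi - zi) ** 2 for xi, zi in zip(x, z))
    return math.exp(-gamma * squared_distance)


x, z = (1.0, 2.0), (3.0, -1.0)
k_implicit = poly_kernel(x, z)                      # never leaves input space
k_explicit = dot(feature_map(x), feature_map(z))    # goes through the mapping
print(k_implicit, k_explicit, rbf_kernel(x, z))
```

Both routes give the same inner product for the polynomial kernel. The kernel route never needs the coordinates in the transformed space, which is the usual argument for using it when that space is very large.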

submitted by /u/dramanautica

[P] MelGAN vocoder implementation in PyTorch

Disclaimer: This is a third-party implementation. The original authors stated that they will be releasing code soon.

Recent research showed that a fully convolutional GAN called MelGAN can invert a mel-spectrogram into raw audio in a non-autoregressive manner. The authors showed that MelGAN is lighter and faster than WaveGlow, and that it can even generalize to unseen speakers when trained on speech from 3 male and 3 female speakers.

I think this is a major breakthrough in TTS research, since both researchers and engineers can benefit from this fast and lightweight neural vocoder. So I’ve tried to implement it in PyTorch: see the GitHub link with audio samples below.

Debugging was quite painful while implementing this. Changing the update order of G/D mattered a lot, and my generator’s loss curve is still going up. (The results look good compared to the original paper’s, though.)

Figure 1 from “MelGAN: Generative Adversarial Networks for Conditional Waveform Synthesis”

submitted by /u/seungwonpark

[D] Overfitting vs. Generalization – a subtle difference

In my view, overfitting does not necessarily imply a lack of generalization, just as generalization cannot be directly tied to the degree of overfitting.

An overfit model is one that is tuned to achieve the highest performance (e.g. lowest loss) on the dataset it was trained on. This can be measured by the difference between the losses on the validation set and on the training set. To test for overfitting, the training and validation sets should have similar distributions. If that’s the case, an overfit model’s performance on the validation set will deviate from its training performance. This is because, even if the distributions are similar, the model is tuned to correctly handle only the samples it has seen in the training set.

As for generalization, it can only be evaluated between datasets (test and training) that have different distributions. Ideally, the test distribution will be the most heterogeneous of them all. In my opinion, this is the only way to really assess generalization: the difference between the losses on the training set and the test set.

TLDR: Overfitting is indicated when a model underperforms on unseen data whose distribution is similar to that of the seen data. Generalization, on the other hand, is indicated by the performance difference between seen and unseen data with different distributions, where the unseen data ideally represents real-world distributions.
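The two gaps described above can be sketched on a toy problem. This is an illustration under made-up assumptions, not something from the post: the data generator, the 20% label noise, the shift of the test distribution, and all function names are hypothetical. A 1-nearest-neighbour classifier stands in for an overfit model because it memorizes its training set.

```python
import random


def make_data(n, mean, noise, rng):
    """Draw n points x ~ N(mean, 1); the label is 1 if x > 0, flipped with
    probability `noise` to simulate label noise."""
    data = []
    for _ in range(n):
        x = rng.gauss(mean, 1.0)
        y = 1 if x > 0 else 0
        if rng.random() < noise:
            y = 1 - y
        data.append((x, y))
    return data


def predict_1nn(train, x):
    """Return the label of the training point closest to x (pure memorization)."""
    return min(train, key=lambda point: abs(point[0] - x))[1]


def error_rate(train, data):
    """Fraction of points in `data` that the memorizing model gets wrong."""
    wrong = sum(predict_1nn(train, x) != y for x, y in data)
    return wrong / len(data)


rng = random.Random(0)
train = make_data(200, mean=0.0, noise=0.2, rng=rng)
val = make_data(200, mean=0.0, noise=0.2, rng=rng)    # same distribution
test = make_data(200, mean=1.5, noise=0.2, rng=rng)   # shifted distribution

train_err = error_rate(train, train)
val_err = error_rate(train, val)
test_err = error_rate(train, test)

overfitting_gap = val_err - train_err        # similar distributions
generalization_gap = test_err - train_err    # different distributions
print(f"train={train_err:.3f} val={val_err:.3f} test={test_err:.3f}")
print(f"overfitting gap={overfitting_gap:.3f} "
      f"generalization gap={generalization_gap:.3f}")
```

The memorizing model gets every training point right, so any error on the validation set shows up directly as the overfitting gap in the post's sense. The second gap is measured against the shifted set.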

I think this is a misconception most have, even in industry.

What are your thoughts?

submitted by /u/eigenlaplace

[D] Useful tools to help visualize matching data across multiple files or tables?

I’m in the process of trying to get a handle on some datasets. I know there are identical entries spread across several files, but I’d like to find a way to visualize those connections, either in a map or even just a table.

My immediate task just involves three smallish CSVs, so I could easily write an R script to pull out the matches, but I’d prefer a more visual tool that can operate across larger bodies of data.
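For the script-based route, a minimal sketch might look like the following (in Python rather than R). It assumes the files share a key column, here hypothetically called `id`; the file names and contents are invented, and in-memory strings stand in for real files so the example is self-contained.

```python
import csv
import io
from collections import defaultdict


def index_matches(sources, key):
    """Map each value of `key` to the sorted list of sources it appears in,
    keeping only values found in more than one source."""
    seen = defaultdict(set)
    for name, handle in sources.items():
        for row in csv.DictReader(handle):
            seen[row[key]].add(name)
    return {value: sorted(names) for value, names in seen.items()
            if len(names) > 1}


# Hypothetical stand-ins for three small CSV files; swap in open(path,
# newline="") handles to run this against real files.
sources = {
    "a.csv": io.StringIO("id,name\n1,alice\n2,bob\n"),
    "b.csv": io.StringIO("id,city\n2,Toronto\n3,Ottawa\n"),
    "c.csv": io.StringIO("id,score\n2,0.9\n1,0.4\n"),
}

matches = index_matches(sources, "id")
for value, names in sorted(matches.items()):
    print(value, "->", ", ".join(names))
```

The resulting dictionary is effectively an edge list (value to file), so it could be handed to a graph or table viewer for the visual side of the problem.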

I remember seeing a Defcon presentation where a similar tool was described for matching metadata, so I’m going back through old videos to try to find it, but I’m hoping someone here might have some good suggestions.

Thanks!

submitted by /u/QuerulousPanda

Keys to the (Smart) City: NVIDIA Powers Mini AI Metropolis at MWC Los Angeles

This neighborhood has shopping, cafes, even a place to go to school. But it happens to fit neatly inside a 2,000-square-foot trade show booth, showing off smart city technology.

With a miniature town erected on the show floor, NVIDIA welcomed more than 22,000 telecom industry professionals attending the Mobile World Congress in Los Angeles.

Crowds jammed into the booth to see how pervasive AI and connectivity can elevate experiences in the world around us. While most cities are powered by elaborate power and water grids, this one’s infrastructure is built on NVIDIA AI technologies.

They include the just-introduced NVIDIA EGX Edge Supercomputer Platform, the NVIDIA Metropolis smart city developer kit, NVIDIA Xavier (the world’s most powerful system-on-a-chip), and a multitude of others.

A Vibrant Ecosystem

At the front of the booth, a huge display monitoring the heart of our virtual city told the stories of a diverse array of NVIDIA EGX-powered technologies.

Qwake Technologies, for example, creates augmented reality maps to guide firefighters. Volvo Trucks has built Vera, the first cabin-less autonomous truck to move cargo. Blue River Technologies is using AI to apply tiny doses of pesticides with incredible precision.

Shopping Spree

Steps away, showgoers could stroll into a convenience store stocked with everything from Fuji spring water to KitKats to Chex Mix. Thanks to startup AIFi’s EGX-powered systems, customers could simply grab what they needed and go, and be invisibly charged for the sale.

Another startup, AnyVision, showed how it’s using EGX to do real-time analytics of customers’ shopping behaviors, giving real-world stores the same kind of insights into shopping behavior long enjoyed by online ones.

Nearby, Malong Technologies showed how its GPU-powered system allows shoppers to grab a bunch of grapes or a banana, and have the checkout system instantly recognize it — no bar codes needed.

Around the corner, food delivery service Postmates showed off what it describes as the first socially aware food delivery robot. Its diminutive yellow robot, equipped with powerful lidar sensors and a playful digital face, is powered by NVIDIA EGX servers running in a data center, Xavier, and the NVIDIA JetPack software development kit.

No modern city is complete without a gaming café. The NVIDIA Edge Cafe’s games, however, are hosted on a data center miles away and beamed to devices over Wi-Fi and Verizon’s 5G network.

The result: cheap, light laptops and smartphones equipped with game controllers were able to play the latest games in stunning high-definition quality at 60 frames per second.

Gawking at an Invisible Car

Of course no great street scene — or trade show booth — is complete without a car. In a witty twist, this car’s invisible until you pick up an ordinary smartphone.

Looking through the phone’s screen, you can check out a million-dollar, cherry-red McLaren Senna sports coupe mounted on a pedestal at the front of the booth.

“Okay, so that was pretty cool,” said Danny Miller, a car aficionado who works in sales and marketing for a media company after taking a long look at the curvaceous virtual coupe.

The demo relied on the NVIDIA CloudXR software development kit, which lets enterprises deliver virtual and augmented reality experiences across 5G networks. It let showgoers see a virtual car created out of 28 million polygons and running on an NVIDIA Quadro RTX 8000 GPU.

No City’s Complete Without Top-Rated Schools

This city is even equipped with not one, but two places where you can go to school to learn more.

The NVIDIA theater features speakers from around the industry — like Kundana Palagiri, principal program manager for Microsoft Azure; and Usman Sadiq, deep learning product manager at Cisco — who shared their real-world experience with scores of listeners.

Around the corner, NVIDIA’s Deep Learning Institute had set up a bank of 15 laptops, where expert instructors gave developers, data scientists, researchers and students hands-on training in using AI and accelerated computing to solve real-world problems.

Attendees from marketing, security and customer service companies — among others — inspired by what they’ve heard at the show, signed up for hands-on training.

Join the Crowd

This tiny city, in short, has almost anything you could need. The only downside: like any bustling city, there’s plenty of traffic. All of it, in this case, on foot.

Scores of attendees crowded into the booth to gawk, snap photos and grab black-shirted NVIDIA employees to ask questions and exchange business cards.

In town? Stop by our town at booth 1745 in the South Hall of the Los Angeles Convention Center.

The post Keys to the (Smart) City: NVIDIA Powers Mini AI Metropolis at MWC Los Angeles appeared first on The Official NVIDIA Blog.

[D] Ideas and advice how to improve accuracy score using Random Forest and Extra Trees classifier.

My project is the classification of 2D ultrasound images; the full data set contains approximately 1000 images. For this analysis, 250 features were handcrafted by calculating different parameters of whole images or of horizontal slices of the images. For feature selection, SelectKBest with chi2 is used to select the best 50 features. To calculate balanced accuracy, I am using sklearn.model_selection.cross_val_score with Random Forest and Extra Trees (1000 trees).

What confuses me is that when I split the data randomly with train_test_split at a 9:1 ratio and run cross_val_score on only the 90% portion, the highest accuracy score is 80% with Random Forest and 85% with Extra Trees. But when I don’t apply train_test_split and calculate the balanced accuracy score on the full data set, the highest score is no higher than 60%. I expected better results with more data, but the opposite happened. I would appreciate any advice or ideas on how to improve the accuracy score.
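When comparing the two numbers, it may help to recall what balanced accuracy measures: the mean of the per-class recalls, which is (to my understanding) what scikit-learn's balanced accuracy scoring reports. Below is a small hand-rolled sketch of that definition; the labels and predictions are invented for illustration and the function name is hypothetical.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall: each class counts equally, however rare."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    recalls = [correct[label] / total[label] for label in total]
    return sum(recalls) / len(recalls)


# Invented example: 8 samples of class 0, 2 of class 1, and a model that
# always predicts the majority class.
y_true = [0] * 8 + [1] * 2
y_pred = [0] * 10

plain = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
balanced = balanced_accuracy(y_true, y_pred)
print(f"plain accuracy={plain:.2f} balanced accuracy={balanced:.2f}")
```

Because the two metrics can diverge sharply on imbalanced classes, it is worth confirming that both of the scores being compared were computed with the same metric and the same class proportions.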

submitted by /u/glitchdot2

US Spanish and Brazilian Portuguese neural voices join Amazon Polly

Amazon Polly turns text into lifelike speech. In July 2019, AWS launched eight US English and three UK English voices in Neural Text-to-Speech (NTTS) technology, which delivers ground-breaking improvements in speech quality through a new machine learning approach. Polly is now adding the first non-English NTTS voices, in US Spanish and Brazilian Portuguese. Introducing Lupe and Camila!

Why US Spanish?

There are an estimated 59.8 million Hispanic people in the United States. (This figure comes from the US Census annual estimate as of July 1, 2018.) Companies that provide engaging online content to their Hispanic audience set themselves up for success. The new US Spanish voice, Lupe, joins this trend. After Miguel and Penélope, it is the third US Spanish TTS voice in the Amazon Polly portfolio. Lupe offers a human-like quality with enhanced intonation, especially in the neural version of the voice. Lupe not only speaks Spanish but also handles English very well, providing a fully bilingual Spanish-English experience. This is thanks to extended phoneme coverage comprising 72 English and Spanish phoneme variants. In contrast, the phone set for Penélope and Miguel contains only 29 Spanish phonemes.

Why Brazilian Portuguese?

Camila, the new Brazilian Portuguese TTS voice, supports customers whose priority is to provide best-in-class TTS voices for their Brazilian Portuguese-speaking audience. Similar to Lupe, Camila is a natural-sounding TTS voice that demonstrates a high prosodic quality. The synthesis generated by this voice is smooth and clear, which makes Camila a pleasant voice to listen to. Amazon Polly customers can now enjoy a selection of three Brazilian Portuguese voices: Ricardo, Vitória, and Camila.

 

The neural versions of Camila and Lupe are the first two non-English NTTS voices that Amazon Polly offers, and are available in US East (N. Virginia), US West (Oregon), and EU (Ireland) Regions. Standard versions of these voices are also available across 18 AWS Regions.

Amazon Polly now offers a selection of 61 voices across 29 languages. Of these, thirteen voices in four languages are available in both standard and neural technology.

Try these new voices and experience for yourself the natural-sounding NTTS technology powering Camila and Lupe.

 


About the Author

Marta Smolarek is a Program Manager in the Amazon Text-to-Speech team. At work she connects the dots. In her spare time, she loves to go camping with her family.

[P] Quantum optical neural networks

Nanophotonic neural networks are an exciting emerging technology that promises low-energy, ultra high-throughput machine learning systems implemented purely optically. Our lab has previously done work on these devices, and our new paper, which extends programmable photonics to the quantum domain, is now on arXiv!

In this paper, we describe a photonic architecture for a quantum programmable gate array (QPGA) which can be dynamically reprogrammed to perform any quantum computation. We show how to exactly prepare arbitrary quantum states and operators on the device, and we apply machine learning techniques to automatically implement highly compact approximations to important quantum circuits.

Below is an animation of a simulated QPGA being trained to implement a quantum Fourier transform on five qubits. Supplementary materials and the TensorFlow code for the quantum circuit optimization section of the paper can be found in the GitHub repository for the paper.
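For readers unfamiliar with the training target, here is a minimal sketch of the operator a five-qubit quantum Fourier transform corresponds to: a 32 x 32 unitary matrix. This is a plain-Python illustration of the standard definition, not the TensorFlow code from the repository, and the function names are hypothetical.

```python
import cmath
import math


def qft_matrix(n_qubits):
    """Build the quantum Fourier transform on n_qubits as a dense matrix:
    entry (j, k) is exp(2*pi*i*j*k / N) / sqrt(N), with N = 2**n_qubits."""
    dim = 2 ** n_qubits
    scale = 1.0 / math.sqrt(dim)
    return [
        # Reduce j*k modulo dim before exponentiating to keep phases accurate.
        [scale * cmath.exp(2j * cmath.pi * ((j * k) % dim) / dim)
         for k in range(dim)]
        for j in range(dim)
    ]


def is_unitary(matrix, tol=1e-9):
    """Check that the rows are orthonormal, i.e. M times its conjugate
    transpose is the identity within the tolerance."""
    dim = len(matrix)
    for i in range(dim):
        for j in range(dim):
            inner = sum(matrix[i][k] * matrix[j][k].conjugate()
                        for k in range(dim))
            expected = 1.0 if i == j else 0.0
            if abs(inner - expected) > tol:
                return False
    return True


qft5 = qft_matrix(5)
print(len(qft5), is_unitary(qft5))
```

A trained device would be judged by how closely the operator it implements matches this matrix. The sketch only constructs the target and says nothing about the photonic architecture itself.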

Paper: arxiv.org/abs/1910.10141

GitHub repo: github.com/fancompute/qpga

Simulated QPGA learning to implement a 5-qubit quantum Fourier transform

submitted by /u/bencbartlett