Category: Reddit MachineLearning

[D] Useful tools to help visualize matching data across multiple files or tables?

Written on October 22, 2019. Posted in Reddit MachineLearning.

I’m in the process of trying to get a handle on some datasets. I know there are identical entries spread across several files, but I’d like to find a way to visualize those connections, either in a map or even just a table.

My immediate task involves just three smallish CSVs, so I could easily write an R script to pull out the matches, but I’d prefer a more visual tool that can operate across larger bodies of data.
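For the small case, the script approach mentioned above could look something like the sketch below. It is written in Python rather than R and uses only the standard library. The file names, the `id` key column, and the in-memory CSV contents are all hypothetical stand-ins, not details from the post. It builds a simple presence table showing which key values appear in which files.

```python
import csv
import io
from collections import defaultdict


def index_matches(sources, key):
    """Map each value of the key column to the set of source names containing it."""
    seen = defaultdict(set)
    for name, handle in sources.items():
        for row in csv.DictReader(handle):
            seen[row[key]].add(name)
    return seen


# Hypothetical in-memory stand-ins for three small CSV files.
sources = {
    "a.csv": io.StringIO("id,value\n1,x\n2,y\n3,z\n"),
    "b.csv": io.StringIO("id,value\n2,q\n3,r\n4,s\n"),
    "c.csv": io.StringIO("id,value\n3,m\n5,n\n"),
}
matches = index_matches(sources, key="id")

# Presence table: one row per key value, one column per file.
names = sorted(sources)
print("id", *names)
for value in sorted(matches):
    print(value, *("x" if n in matches[value] else "." for n in names))
```

With real files, each `io.StringIO` would be replaced by an opened file handle. The same mapping could also feed a graph or heatmap tool for a more visual view.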

I remember seeing a Defcon presentation where a similar tool was described for matching metadata, so I’m going back through old videos to try and find that, but I’m hoping someone here might know some good suggestions.

Thanks!

submitted by /u/QuerulousPanda

[D] Ideas and advice how to improve accuracy score using Random Forest and Extra Trees classifier.

Written on October 22, 2019. Posted in Reddit MachineLearning.

My project is the classification of 2D ultrasound images; the full data set is approximately 1000 images. For this analysis, 250 features were handcrafted by calculating different parameters of the whole images or of horizontal slices of the images. For feature selection, SelectKBest with chi2 is used to select the best 50 features. To calculate balanced accuracy I am using sklearn.model_selection.cross_val_score with Random Forest and Extra Trees (1000 trees). What confuses me is that when I split the data randomly with train_test_split at a 9:1 ratio and run cross_val_score on only the 90% portion, the highest balanced accuracy is 80% with Random Forest and 85% with Extra Trees. But when I skip train_test_split and calculate the balanced accuracy on the full data set, the highest score is no higher than 60%. I expected better results with more data, but the opposite happened. I would appreciate any advice or ideas on how to improve the accuracy score.
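One way to set up the evaluation described above is to put the feature selection inside a pipeline, so that cross_val_score refits SelectKBest on each training fold and both classifiers are scored on the same splits. This is a minimal sketch, assuming scikit-learn is installed. The synthetic data from make_classification, the MinMaxScaler step (chi2 requires non-negative inputs), the 5-fold stratified split, and the use of 100 trees instead of 1000 are all assumptions made to keep the example small, not details from the post.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in for ~1000 images with 250 handcrafted features each.
X, y = make_classification(
    n_samples=1000, n_features=250, n_informative=20, random_state=0
)

# Fixed, shuffled, stratified folds so both models see identical splits.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

models = {
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "extra_trees": ExtraTreesClassifier(n_estimators=100, random_state=0),
}

results = {}
for name, model in models.items():
    # Scaling and feature selection are refit on each training fold only.
    pipe = make_pipeline(MinMaxScaler(), SelectKBest(chi2, k=50), model)
    results[name] = cross_val_score(
        pipe, X, y, cv=cv, scoring="balanced_accuracy"
    )
    print(name, results[name].mean())
```

Running the same pipeline and the same `cv` object on the 90% subset and on the full data set makes the two numbers directly comparable, which can help narrow down where a gap like the one described comes from.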

submitted by /u/glitchdot2

[D] Published Machine Learning Papers that made poor assumptions/judgements

Written on October 22, 2019. Posted in Reddit MachineLearning.

I’m currently taking a Machine Learning class and need to create a critique of a published work. I was looking for some published papers in the field that weren’t really based on a valid foundation and/or made poor decisions in designing their model. Any suggestions or links would be appreciated.

submitted by /u/TheSerialTaco

[P] Quantum optical neural networks

Written on October 22, 2019. Posted in Reddit MachineLearning.

Nanophotonic neural networks are an exciting emerging technology which promises low-energy, ultra high-throughput machine learning systems implemented purely optically. Our lab has previously done work on these devices, and our new paper which extends programmable photonics to the quantum domain is now on arXiv!

In this paper, we describe a photonic architecture for a quantum programmable gate array (QPGA) which can be dynamically reprogrammed to perform any quantum computation. We show how to exactly prepare arbitrary quantum states and operators on the device, and we apply machine learning techniques to automatically implement highly compact approximations to important quantum circuits.

Below is an animation of a simulated QPGA being trained to implement a quantum Fourier transform on five qubits. Supplementary materials and the TensorFlow code for the quantum circuit optimization section of the paper can be found in the GitHub repository for the paper.

Paper: arxiv.org/abs/1910.10141

GitHub repo: github.com/fancompute/qpga

Simulated QPGA learning to implement a 5-qubit quantum Fourier transform

submitted by /u/bencbartlett

[N] ImageNet found to have questionable content (nude kids, porn stars, etc)

Written on October 22, 2019. Posted in Reddit MachineLearning.

https://www.theregister.co.uk/2019/10/23/top_ai_dataset_imagenet/

The labels within are claimed to be biased and sometimes racist… TIL ImageNet has been unavailable for download since January because of this problem, I wonder if folks are working on a cleaner version of the dataset.

submitted by /u/Mister_Abc

[D] Are there any examples of CNTK linear regression with more than 1 parameter? C# / .NET

Written on October 22, 2019. Posted in Reddit MachineLearning.

Is this even a thing when the formula for a linear function is only y=kx+d? If so, are there any other models in CNTK that are able to find variables in correlation with input values?
I have like 2-3 input values and 1 output.
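With several inputs, the line y = kx + d generalizes to y = k1\*x1 + k2\*x2 + k3\*x3 + d, which is still linear regression, just with one coefficient per input. The sketch below illustrates that idea with NumPy least squares. It is not CNTK or C# code, and the synthetic data and the "true" coefficients are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up data: 200 samples, 3 input values, 1 output.
X = rng.normal(size=(200, 3))
true_k = np.array([2.0, -1.0, 0.5])  # one slope per input (hypothetical)
true_d = 4.0                         # intercept (hypothetical)
y = X @ true_k + true_d              # noise-free for clarity

# Append a column of ones so the intercept d is fitted like any other weight.
A = np.hstack([X, np.ones((len(X), 1))])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
k, d = coef[:-1], coef[-1]

print("slopes:", k)
print("intercept:", d)
```

In a neural network framework the same model is a single dense layer with three inputs, one output, and no activation, trained with a squared-error loss.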

submitted by /u/actopozipc

[R] Research Guide: Image Quality Assessment for Deep Learning

Written on October 22, 2019. Posted in Reddit MachineLearning.

The quality of images is relevant in building compression and image enhancement algorithms. Image Quality Assessment (IQA) is divided into two main areas: reference-based evaluation and no-reference evaluation.

In this guide, we’ll look at how deep learning has been used in image quality analysis.

https://heartbeat.fritz.ai/research-guide-image-quality-assessment-c4fdf247bf89

submitted by /u/mwitiderrick

[R] Microsoft Research Face Swapping/deepfake + Hair (CVPR 2019)

Written on October 22, 2019. Posted in Reddit MachineLearning.

pdf: http://openaccess.thecvf.com/content_CVPR_2019/papers/Gu_Mask-Guided_Portrait_Editing_With_Conditional_GANs_CVPR_2019_paper.pdf

github repo: https://github.com/cientgu/Mask_Guided_Portrait_Editing

https://i.redd.it/o895carvnau31.png

submitted by /u/PuzzledProgrammer3

[D] Generative Tensorial Reinforcement Learning for medical applications [live session]

Written on October 22, 2019. Posted in Reddit MachineLearning.

We are hosting a live session with the authors of the recent Nature paper that used generative tensorial RL for medical applications at noon EST today; watch the live session (or the recording afterwards) here: https://aisc.ai.science/events/2019-10-23

submitted by /u/tdls_to

[D] Feature Loss vs. GANs – what are the trade offs?

Written on October 22, 2019. Posted in Reddit MachineLearning.

I’m doing a bit of reading on the speech enhancement problem, where you have an audio signal containing human speech plus some noise, and you want to extract just the human speech. It’s pretty analogous to image denoising or “super-resolution”, and a lot of the techniques from the image domain are being borrowed and re-applied to audio quite successfully (eg. repurposing the U-Net architecture from image processing to spectrograms and then raw audio). It’s all pretty cool.

There’s some interesting work being done with loss functions in this space and I’m looking for some clarification as to why you’d choose one approach over another. You want to compare a target image, or audio waveform, with a predicted sample, and you need to define a loss function which measures how “close” they are. The Related work – Loss functions (1.1.3) section of this paper gives a pretty good overview of the different approaches, which I’ll try to summarize here.

Mean squared error loss: A pretty standard regression loss as far as I know, but it’s limited to only considering one pixel at a time: “minimizing MSE encourages finding pixel-wise averages of plausible solutions which are typically overly-smooth and thus have poor perceptual quality”.
Feature loss: This is where you pre-train a network on a similar problem, such as image classification, and then you freeze the weights. For both the target and predicted sample, you run each through the classification network, then grab some internal activations from that network and call them “features”. You compute some distance between these feature vectors to get your loss. The key idea is that the classification network is able to capture important features that MSE loss cannot (more detail here).
GAN loss: A discriminator network trains in tandem with the generator network, where the job of the discriminator is to classify whether its input is “real” or “generated”. Like the feature loss network, it can detect features that MSE loss cannot, but it can also punish identifiable quirks of the generator network, whereas feature loss can potentially be “hacked” by the generator network.
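The three losses summarized above can be sketched side by side in NumPy. This is only an illustration under stated assumptions: the "frozen feature extractor" is a hypothetical fixed random projection with a ReLU standing in for a pre-trained network, the GAN term shown is the common non-saturating generator loss -log D(G(x)) with the discriminator's output passed in as a plain probability, and the sample shapes are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)


def mse_loss(target, predicted):
    """Element-wise squared error, averaged over all samples and positions."""
    return float(np.mean((target - predicted) ** 2))


# Hypothetical stand-in for a frozen, pre-trained network:
# a fixed random projection from 64 inputs to 16 features, followed by ReLU.
W_frozen = rng.normal(size=(64, 16))


def features(x):
    """Internal activations of the frozen stand-in network."""
    return np.maximum(x @ W_frozen, 0.0)


def feature_loss(target, predicted):
    """Squared distance between activations instead of raw samples."""
    return float(np.mean((features(target) - features(predicted)) ** 2))


def generator_gan_loss(disc_prob_on_predicted, eps=1e-12):
    """Non-saturating generator loss: low when the discriminator says 'real'."""
    return float(-np.mean(np.log(disc_prob_on_predicted + eps)))


# Made-up batch: 8 target samples and slightly noisy predictions of them.
target = rng.normal(size=(8, 64))
predicted = target + 0.1 * rng.normal(size=(8, 64))

print("MSE loss:     ", mse_loss(target, predicted))
print("Feature loss: ", feature_loss(target, predicted))
print("GAN loss when D outputs 0.9:", generator_gan_loss(np.array([0.9])))
print("GAN loss when D outputs 0.1:", generator_gan_loss(np.array([0.1])))
```

The structural difference is visible in the signatures: MSE and feature loss both need the target, while the generator's GAN term depends only on what a separately trained discriminator thinks of the prediction.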

So my questions are:

Have I characterised these approaches well?
Why would you ever choose feature loss over using a discriminator network (ie. GAN)?
- Discriminators can punish the generator for being predictably wrong (ie. common artifacts)
- Pre-trained feature loss networks may better represent image features, if they have been trained for longer, on larger data sets
- Apparently GANs can have stability issues when training
The SRGAN paper suggests using both feature loss and a GAN for their loss function – is this the best known approach?

submitted by /u/The_Amp_Walrus
