Category: Reddit MachineLearning

[D] Jurgen Schmidhuber on Alexey Ivakhnenko, godfather of deep learning 1965

Jurgen’s famous blog post on their miraculous year mentions Alexey Grigorevich Ivakhnenko several times and links to another page, which states:

In 1965, Ivakhnenko and Lapa [71] published the first general, working learning algorithm for supervised deep feedforward multilayer perceptrons [A0] with arbitrarily many layers of neuron-like elements, using nonlinear activation functions based on additions (i.e., linear perceptrons) and multiplications (i.e., gates). They incrementally trained and pruned their network layer by layer to learn internal representations, using regression and a separate validation set. (They did not call this a neural network, but that’s what it was.) For example, Ivakhnenko’s 1971 paper [72] already described a deep learning net with 8 layers, trained by their highly cited method (the “Group Method of Data Handling”) which was still popular in the new millennium, especially in Eastern Europe, where much of Machine Learning was born.

That is, Minsky & Papert’s later 1969 book about the limitations of shallow nets with a single layer (“Perceptrons”) addressed a “problem” that had already been solved for 4 years 🙂 Maybe Minsky did not even know, but he should have. Some claim that Minsky’s book killed NN-related research, but of course it didn’t, at least not outside the US.

His Scholarpedia article on deep learning says:

Like later deep NNs, Ivakhnenko’s nets learned to create hierarchical, distributed, internal representations of incoming data.

and his blog says:

In surveys from the Anglosphere it does not always become clear [DLC] that Deep Learning was invented where English is not an official language. It started in 1965 in the Ukraine (back then the USSR) with the first nets of arbitrary depth that really learned

The link in the quote is Jurgen’s famous critique of Yann & Yoshua & Geoff, who failed to cite Ivakhnenko although they should have known his work, which was prominently featured in Jurgen’s earlier deep learning survey. It looks as if they wanted to credit Geoff for learning internal representations, although Ivakhnenko & Lapa did this 20 years earlier. Geoff’s 2006 paper on layer-wise training in deep belief networks also did not cite Ivakhnenko’s layer-wise training, and neither did Yoshua’s deep learning book. How crazy is that, a book that fails to mention the very inventors of its very topic?

I also saw several recent papers on pruning deep networks, but few cite Ivakhnenko & Lapa, who did this first. I bet this will change; science is self-healing.

Notably, Ivakhnenko did not use backpropagation but regression to adjust the weights layer by layer, both for linear units and for “gates” with polynomial activation functions.
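For readers curious what layer-wise regression training looks like in practice, here is a minimal NumPy sketch of the GMDH idea. This is an illustration under my own assumptions, not Ivakhnenko's exact procedure: each layer fits quadratic polynomial units on pairs of inputs by least squares and prunes them with a separate validation set; the function name and the `keep` parameter are mine.

```python
import numpy as np

def fit_gmdh_layer(X_tr, y_tr, X_va, y_va, keep=4):
    """Illustrative sketch of one GMDH layer (not the original procedure):
    quadratic units on input pairs, fit by regression, pruned by validation error."""
    candidates = []
    n = X_tr.shape[1]
    for i in range(n):
        for j in range(i + 1, n):
            # Ivakhnenko-style polynomial: additions and multiplications of two inputs.
            def design(X, i=i, j=j):
                a, b = X[:, i], X[:, j]
                return np.column_stack([np.ones(len(a)), a, b, a * b, a**2, b**2])
            # Adjust the unit's weights by least-squares regression, not backprop.
            w, *_ = np.linalg.lstsq(design(X_tr), y_tr, rcond=None)
            err = np.mean((design(X_va) @ w - y_va) ** 2)
            candidates.append((err, i, j, w))
    # Prune: keep only the units that generalize best on the validation set.
    candidates.sort(key=lambda c: c[0])
    kept = candidates[:keep]

    def transform(X):
        cols = []
        for _, i, j, w in kept:
            a, b = X[:, i], X[:, j]
            D = np.column_stack([np.ones(len(a)), a, b, a * b, a**2, b**2])
            cols.append(D @ w)
        return np.column_stack(cols)

    return transform, kept[0][0]
```

Stacking layers is then just feeding each layer's `transform` output into the next `fit_gmdh_layer` call, growing the net layer by layer.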

Five years later, modern backpropagation was published “next door” in Finland.

We already had a Reddit discussion on Seppo Linnainmaa, inventor of backpropagation in 1970.

Anyway, all hail Alexey Ivakhnenko and Valentin Lapa, who had the first deep learning feedforward networks with many hidden layers. Too bad they are dead, so no award for them.

submitted by /u/siddarth2947

[P] 3D Terrain GAN

Trained a GAN to generate images of realistic terrain and associated height maps for rendering in 3D, based on a user-created segmentation map. Details, code, and dataset are on my GitHub:

https://github.com/tpapp157/SPADE-Terrain-GAN

The GAN itself uses the NVIDIA SPADE network as a starting point and makes quite a few modifications to better suit the application. As next steps, I’d like to explore incorporating some of the features of the recent StyleGAN 2. Open to any thoughts or suggestions.
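For readers unfamiliar with SPADE, its core operation is spatially-adaptive normalization: activations are normalized parameter-free, then modulated per pixel by a scale and shift predicted from the segmentation map. Below is a rough NumPy sketch of my own, not the project's code: the real SPADE predicts gamma and beta with a small convnet, which a per-class weight matrix stands in for here.

```python
import numpy as np

def spade_norm(x, segmap, w_gamma, w_beta, eps=1e-5):
    """Simplified sketch of SPADE normalization (illustrative, not NVIDIA's code).

    x:       activations, shape (N, C, H, W)
    segmap:  one-hot segmentation map, shape (N, K, H, W)
    w_gamma, w_beta: (K, C) weights standing in for the small convnet
                     that maps the segmentation map to scale and shift.
    """
    # Parameter-free normalization over batch and spatial dims, per channel.
    mu = x.mean(axis=(0, 2, 3), keepdims=True)
    var = x.var(axis=(0, 2, 3), keepdims=True)
    x_hat = (x - mu) / np.sqrt(var + eps)
    # Per-pixel modulation predicted from the segmentation map, so the
    # class layout survives normalization instead of being washed out.
    gamma = np.einsum("nkhw,kc->nchw", segmap, w_gamma)
    beta = np.einsum("nkhw,kc->nchw", segmap, w_beta)
    return x_hat * (1 + gamma) + beta
```

The point of the design is that a plain normalization layer would erase the semantic layout of a flat segmentation input; conditioning gamma and beta on the map re-injects it at every layer.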

submitted by /u/tpapp157

[D] Evolution Simulator

I am new to machine learning, but I have taken quite a few linear algebra and statistics courses. I am going to take an ML course next semester, and I hope to finish an ML personal project next year.

Is this project idea feasible? How should I get started on it? Any suggestions?

– I will work in the Unity engine

– A random environment (with randomized food sources and terrain) and creatures will be generated at the beginning of the simulation according to the user’s inputs

– Creatures survive by eating other creatures or plants. They die if they starve, grow too old, or get eaten by other creatures.

– Surviving creatures will reproduce the next generation. Mutations such as small structural changes, or changes to HP, size, and speed, might happen here.
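A feasibility note: the core loop described above (survive, reproduce, mutate) can be prototyped in a few lines before touching Unity. Here is a minimal Python sketch; the traits, fitness function, and mutation scheme are placeholders of mine, not a prescription for the project.

```python
import random

def evolve(pop, fitness, generations=50, mut_rate=0.1, mut_scale=0.2):
    """Minimal generational loop (illustrative sketch):
    select survivors by fitness, reproduce with small random mutations."""
    for _ in range(generations):
        # Survivors: top half by fitness (stand-in for eating enough
        # food and not being eaten).
        pop.sort(key=fitness, reverse=True)
        survivors = pop[: len(pop) // 2]
        # Next generation: each survivor reproduces; traits mutate slightly.
        children = []
        for parent in survivors:
            child = dict(parent)
            for trait in child:
                if random.random() < mut_rate:
                    child[trait] *= 1 + random.uniform(-mut_scale, mut_scale)
            children.append(child)
        pop = survivors + children
    return pop

# Toy traits and fitness: speed helps find food, size costs energy.
random.seed(0)
population = [{"speed": random.uniform(0, 1), "size": random.uniform(0, 1)}
              for _ in range(20)]
result = evolve(population, fitness=lambda c: c["speed"] - 0.5 * c["size"])
```

Once a loop like this behaves sensibly, the same selection/mutation logic can drive Unity creatures, with the fitness signal coming from the simulated environment instead of a formula.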

submitted by /u/jamesgz

[R] How to use tf’s KFAC package to set up “Ab-Initio Solution of the Many-Electron Schrödinger Equation with Deep Neural Networks”‘s optimization procedure?

I have a question about the usage of TensorFlow’s kfac package for this paper (https://arxiv.org/abs/1909.02487).

How does one register a custom loss with the LayerCollection object, and what format is expected of this custom loss?

It is also not clear to me how to register custom layers (such as a determinant layer).

Any help would be greatly appreciated 🙂

submitted by /u/TotoroPet

[P] How to structure InfoGAN with the control variables conditioned on known external parameters?

I have a large set of simulation data that was generated using a Monte Carlo procedure that samples a distribution parameterized by (h, t). I want to use an InfoGAN model to provide unsupervised classifications c of the simulation samples.

Now, it would be easy enough to just use the standard InfoGAN structure to do the unsupervised classification without using the external parameters (h, t), but I want the neural network to learn how c depends on (h, t). I would think that simply providing (h, t) as additional inputs to the generator and the discriminator will not give the best results, considering that for some parameters h_i and t_j a given class c_k is not guaranteed to exist. So, I cannot just independently sample h, t, and c when providing input for the generator. Instead, I would like InfoGAN to learn and optimize I(c(h, t); G(z, c(h, t))) instead of I(c; G(z, c)), where c(h, t) is not known a priori.

In short, I suppose I am looking for a way to combine the unsupervised classification of InfoGAN while also providing additional conditional information as with CGAN in a way such that the relationship between the classification and the conditional information is learned.
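One way to make that objective concrete: sample the code c from a learned conditional prior p(c | h, t) instead of InfoGAN's fixed uniform prior, and keep the usual variational lower bound E[log Q(c | x)] on the mutual information. Below is a NumPy sketch of just the auxiliary loss; the function names are hypothetical, and in a real model the `q_logits` would come from Q applied to the generator's output for the sampled codes rather than being passed in.

```python
import numpy as np

def log_softmax(z):
    """Numerically stable log-softmax over the last axis of a 2D array."""
    z = z - z.max(axis=1, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=1, keepdims=True))

def conditional_infogan_loss(prior_logits, q_logits, rng):
    """Sketch of the conditioned auxiliary objective (hypothetical names).

    prior_logits: logits of a learned prior p(c | h, t), shape (N, K).
                  Codes are sampled from this instead of a fixed uniform
                  prior, so classes that cannot occur for a given (h, t)
                  receive negligible mass.
    q_logits:     logits of the recognition network Q(c | G(z, c, h, t)).
    Returns the negative variational lower bound on I(c(h,t); G(z, c(h,t)))
    and the sampled codes.
    """
    # Sample codes from the conditional prior rather than independently of (h, t).
    probs = np.exp(log_softmax(prior_logits))
    codes = np.array([rng.choice(len(p), p=p) for p in probs])
    # Variational lower bound E[log Q(c | x)]; minimize its negation.
    logq = log_softmax(q_logits)
    return -logq[np.arange(len(codes)), codes].mean(), codes
```

The prior network's parameters would be trained jointly with G and Q, which is one route to letting the model discover which codes exist for which (h, t) regions.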

submitted by /u/sifodeas

[P] Hierarchical self-organizing maps for unsupervised pattern recognition

From the project on GitHub:

A hierarchical self-organizing map (HSOM) is an unsupervised neural network that learns patterns from high-dimensional space and represents them in lower dimensions.

HSOM networks receive inputs and feed them into a set of self-organizing maps, each learning individual features of the input space. These maps produce sparse output vectors with only the most responsive nodes activating, a result of competitive inhibition which restricts the number of ‘winners’ (i.e. active nodes) allowed at any given time.

Each layer in an HSOM network contains a set of maps that view part of the input space and generate sparse output vectors, which together form the input for the next layer in the hierarchy. Information becomes increasingly abstract as it is passed through the network and ultimately results in a low-dimensional sparse representation of the original data.

The training process results in a model that maps certain input patterns to certain labels, corresponding to high-dimensional and low-dimensional data respectively. Given that training is unsupervised, the labels have no intrinsic meaning but rather become meaningful through their repeated association with certain input patterns and their relative lack of association with others. Put simply, labels come to represent higher-dimensional patterns over time, allowing them to be distinguished from one another in a meaningful way.
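The competitive-inhibition and stacking steps described above can be sketched in a few lines of NumPy. This is a hypothetical illustration of the forward pass only (no training), with made-up function names, not the project's actual code.

```python
import numpy as np

def som_layer_response(x, weights, k=2):
    """One SOM's sparse response under competitive inhibition (sketch).

    x:       input vector, shape (d,)
    weights: node prototype vectors, shape (n_nodes, d)
    k:       number of 'winners' allowed to stay active
    Returns a sparse activation vector: only the k most responsive nodes fire.
    """
    # Responsiveness: closeness of the input to each node's prototype.
    dist = np.linalg.norm(weights - x, axis=1)
    act = np.zeros(len(weights))
    winners = np.argsort(dist)[:k]
    # Winners activate with strength inversely related to distance.
    act[winners] = 1.0 / (1.0 + dist[winners])
    return act

def hsom_forward(x, layers, k=2):
    """Pass data up the hierarchy: each layer's maps view the current
    representation, and their concatenated sparse outputs form the
    next layer's input."""
    for maps in layers:
        x = np.concatenate([som_layer_response(x, w, k) for w in maps])
    return x
```

Each pass through `hsom_forward` shrinks and sparsifies the representation, which is the "increasingly abstract" low-dimensional code the description refers to.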

submitted by /u/sterntree

[D] Can ML models output ideas and concepts?

I have very limited experience with ML, so apologies if this question is silly or too abstract.

Many problems (e.g. abstractive text summarization) do not have well-developed and robust solutions yet, even though the progress is reasonably fast and there is a lot of theoretical research that allows us to move forward.

For example, developments in the area of word embeddings have made a huge impact on the quality of text processing models. We have come a long way from the simplest Bag-of-Words model to more modern Word2Vec variants to GloVe and FastText. Thanks to these developments, we are able to train models which successfully capture the semantics of text. However, this required decades of research, which is a long time.

This kind of research yields new ideas and concepts, not just results of computations or definite answers to specific questions. This applies to any area (biology, chemistry, physics), not just text processing.

So, my basic question is: could we make a computer research this kind of problem instead of spending the time of actual humans? I’m not even sure if this lies in the realm of ML, but it doesn’t seem as hard as creating a “true AI”, because such a “machine thinker” would only need knowledge of some subject area, not a complete memory of an adult human.

Basically, can we create ML models which output ideas and concepts as opposed to specific answers to classification or prediction problems? E.g. can we have a computer “invent” the next approach to word embeddings (better than current state-of-the-art) faster than the human researchers will?

It’s not even necessary that the resulting approach is understood by humans, it just needs to be implementable.

I see a lot of unsolved problems here (how do we formalize ideas to make the machine process them? where do we get training datasets with “good” and “bad” ideas?), but is there any research at all into this sort of thing?

Thanks!

P.S. Let’s keep jokes about AI apocalypse out of this

submitted by /u/smthamazing