Author: torontoai

[P][OC] An Empirical Comparison of Optimizers for Machine Learning Models

Written on December 3, 2019. Posted in Reddit MachineLearning.

There are so many different optimizers to choose from when doing ML projects and it was never clear to me which one should be used. So I decided to do a little research project and see for myself how they compared. I’ve written up the results here. Code: https://github.com/rickwierenga/BenchmarkingMLOptimizers

Let me know if you have any questions/thoughts.

submitted by /u/RickDeveloper

[R] Badger architecture by GoodAI

Written on December 3, 2019. Posted in Reddit MachineLearning.

Blog: https://blog.marekrosa.org/2019/12/badger.html

Paper: https://arxiv.org/abs/1912.01513

Badger = an architecture and a learning procedure where:

An agent is made up of many experts

All experts share the same communication policy (expert policy), but have different internal memory states

There are two levels of learning, an inner loop (with a communication stage) and an outer loop

Inner loop – Agent’s behavior and adaptation emerge as a result of experts communicating with each other. Experts send messages (of any complexity) to each other and update their internal memories/states based on observations/messages and their internal state from the previous time-step. Expert policy is fixed and does not change during the inner loop.

Inner loop loss need not even be a proper loss function. It can be any kind of structured feedback guiding the adaptation during the agent’s lifetime.

Outer loop – An expert policy is discovered over generations of agents, ensuring that strategies that find solutions to problems in diverse environments can quickly emerge in the inner loop.

Agent’s objective is to adapt fast to novel tasks

Exhibiting the following novel properties:

Roles of experts and connectivity among them assigned dynamically at inference time

Learned communication protocol with context-dependent messages of varied complexity

Generalizes to different numbers and types of inputs/outputs

Can be trained to handle variations in architecture during both training and testing
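
As a rough illustration of the inner loop described above (not GoodAI's implementation), the toy sketch below gives every expert the same fixed policy function but its own internal state. The names `inner_loop` and `averaging_policy`, the scalar states, and the "everyone broadcasts its state" communication scheme are all assumptions made for illustration. In Badger the policy would be learned in the outer loop, and messages could be arbitrarily rich.

```python
from typing import Callable, List, Sequence

# Hypothetical shared expert policy:
# (own state, mean incoming message, observation) -> new state.
Policy = Callable[[float, float, float], float]


def inner_loop(policy: Policy, states: Sequence[float],
               observation: float, steps: int) -> List[float]:
    """Run `steps` rounds of communication with a fixed, shared policy."""
    states = list(states)
    for _ in range(steps):
        # Communication stage: each expert broadcasts its state (assumption).
        messages = list(states)
        new_states = []
        for i, state in enumerate(states):
            others = [m for j, m in enumerate(messages) if j != i]
            incoming = sum(others) / len(others) if others else 0.0
            # The same policy is applied by every expert; only states differ.
            new_states.append(policy(state, incoming, observation))
        states = new_states
    return states


def averaging_policy(state: float, incoming: float, observation: float) -> float:
    """Stand-in for a learned policy: blend state, messages, and observation."""
    return (state + incoming + observation) / 3.0


final_states = inner_loop(averaging_policy, [0.0, 3.0, 6.0], observation=1.0, steps=5)
```

The policy never changes inside `inner_loop`. Only the per-expert states do, which is the property the inner loop description relies on.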

submitted by /u/sorrge

[D] – How to determine “value” of items in bartering system?

Written on December 3, 2019. Posted in Reddit MachineLearning.

Hey r/MachineLearning, I have a question for you.

Given a series of bartering transactions, example below:

Item 1 was traded for Item 2 & Item 3
Item 4 was traded for Item 5 & Item 6
Item 3 was traded for Item 7
Item 7 was traded for Item 8
etc

What algorithm could you use to determine the relative value of each item?

Example from above:

Item 1 was traded for Items 2 & 3, which would imply that Item 1 is roughly equal in value to Items 2 & 3 combined (and more valuable than either one individually)
Item 7 was traded for Item 3 (a 1-for-1 trade), which would imply they have the same value

Now, if I had, say, 300 items and thousands of “transaction” data points for the whole system, what’s the best way to determine the relative value of each item in the group?

Example Questions:

How much is Item 1 worth compared to Item 3 (on a 0.00 – 1.00 scale)?
How much is Item 1 worth compared to the rest of the pack?

Bonus Points: A python package would be awesome.
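
One way to make the reasoning above concrete (a sketch, not a recommendation of a specific package) is to treat every trade as a linear equation, "value given = value received", and solve for all item values by least squares. The function name `estimate_values`, the "average value is 1" normalization, and the choice of `numpy.linalg.lstsq` are assumptions made for illustration.

```python
import numpy as np


def estimate_values(trades):
    """Estimate relative item values from (given, received) trades.

    Each trade contributes one equation: sum(given) - sum(received) = 0.
    One extra equation fixes the scale so that the average value is 1.
    """
    items = sorted({item for given, received in trades
                    for item in (*given, *received)})
    index = {item: k for k, item in enumerate(items)}

    rows, targets = [], []
    for given, received in trades:
        row = np.zeros(len(items))
        for item in given:
            row[index[item]] += 1.0
        for item in received:
            row[index[item]] -= 1.0
        rows.append(row)
        targets.append(0.0)

    # Normalization row: mean of all values equals 1 (assumed scale).
    rows.append(np.full(len(items), 1.0 / len(items)))
    targets.append(1.0)

    values, *_ = np.linalg.lstsq(np.array(rows), np.array(targets), rcond=None)
    return dict(zip(items, values))


# The transactions from the example above.
trades = [((1,), (2, 3)), ((4,), (5, 6)), ((3,), (7,)), ((7,), (8,))]
values = estimate_values(trades)
ratio_1_to_3 = values[1] / values[3]  # how much Item 1 is worth relative to Item 3
```

With only a few trades the system is underdetermined, and `lstsq` returns the minimum-norm solution. With thousands of noisy transactions it returns a best fit instead. Nothing here forces values to be positive, so a non-negative solver such as `scipy.optimize.nnls` may be a better fit for real data.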

submitted by /u/gkamradt

[P] A semi-automated machine learning pipeline

Written on December 3, 2019. Posted in Reddit MachineLearning.

https://github.com/ozanzgur/mlpl

This is a project that I have been working on to automate testing my experiments in data science projects. It also involves some automated experiments. However, there is still a lot of work to be done. I will be writing unit tests and making some parts simpler. It can work with tensorflow and lgbm, but some work needs to be done before that.

Would you consider using such tools in your projects? What features should I add? Any advice is greatly appreciated.

submitted by /u/mozart11111

Using Keras with Multiband TIF [D]

Written on December 3, 2019. Posted in Reddit MachineLearning.

Hey all,

Does anyone have experience feeding multiband TIF files to Keras? We have 7-band, 8-bit, 74×74 TIF files, and for whatever reason it seems like Pillow can’t recognize them. 3-band TIFs work just fine.

OSError: cannot identify image file ‘/content/drive/My Drive/raster/Twin_Cities_EW_Data/train/1/landsat_twin_cities_2000_with_no_data_values10149.TIF’

We’ve gone off the deep end: updating Pillow to the most recent version, trying all manner of TIF parameter changes, etc.

Any insight is appreciated.

submitted by /u/UmnML

[P] Deploying generalizable deep learning models to production search engines

Written on December 3, 2019. Posted in Reddit MachineLearning.

We recently added new features, such as Kubernetes support and a frontend, to NBoost, an open-source project I’ve been working on.

So I wrote an article to talk about some of the hurdles of building a production-scale domain-specific deployment of SoTA models.

Some of the main features are:

– open-source hosting of finetuned models for domain-specific knowledge

– Kubernetes deployment via Helm

– A frontend for tracking model and network latency

submitted by /u/colethienes

Senior Data Scientist – Ontario Teachers’ Pension Plan – Toronto, ON

Written on December 3, 2019. Posted in Toronto Job Postings.

Actively participate/champion Machine Learning development life cycle. Research and develop new Machine Learning algorithms as needed.
From Ontario Teachers’ Pension Plan – Wed, 04 Dec 2019 16:34:56 GMT

[D] What is going on here? NeurIPS 2018 paper on Automated Theorem Proving

Written on December 3, 2019. Posted in Reddit MachineLearning.

This year, I have been talking to quite a lot of people from different research groups who said that they are working on accelerating automated theorem proving using machine learning. The principal idea is to guide the derivations performed by the theorem prover using a statistical model that has learned which steps, in which situations, lead to the completion of the proof.

Hearing this idea from multiple research groups at roughly the same time made me interested in what progress has already been made on this topic.

I stumbled across the paper “Reinforcement Learning of Theorem Proving” (https://papers.nips.cc/paper/8098-reinforcement-learning-of-theorem-proving) presented at NeurIPS 2018. Essentially, the paper implements a simple version of the idea described above using reinforcement learning.

At first glance, it appears to present a significant improvement compared to “classical” theorem proving methods (mlCoP-vs-rlCoP). However, after more thorough research, it seems that the reported improvements are relative to the “leanCoP” algorithm, i.e., a very simple algorithm for theorem proving that is by no means comparable to state-of-the-art automated theorem proving systems.

Compared to state-of-the-art theorem proving algorithms, their approach performs significantly worse than these hand-tuned heuristics.

That’s when I became a bit confused.

NeurIPS is the top venue for machine learning research, so I assumed that papers presented there provide either some significant novelty in terms of learning or major advances in empirical performance.

However, it seems that this paper provides neither of those two.

As stated by the authors, the idea is a straightforward application of an AlphaGo style RL setting to the context of theorem proving. Moreover, as mentioned before, the experimental improvements are only relative to a very basic algorithm.

The reviews give some additional perspective and are accessible here (https://media.nips.cc/nipsbooks/nipspapers/paper_files/nips31/reviews/5309.html).

Reading the reviews suggests that for the initial submission, the authors did not include the comparison with state-of-the-art theorem proving systems (Vampire and E) but only the “baseline” of their basic leanCoP implementation. In other words, in their initial submission, it must have appeared to the reviewers that their method presented a significant advancement compared to existing (non-learning) approaches.

I am not saying this paper is terrible. I just had quite high expectations before reading it, given that it was published at NeurIPS 2018. However, I was utterly disappointed after realizing that there is virtually no contribution in terms of novel learning concepts or actual advances in automated theorem proving.

What is your perspective and opinion on this paper?

Do you know any research groups that work on automated theorem proving using machine learning?
What do you think of this paper?
Do you think this paper provides a significant contribution to our research community?
Do you think the reviewers had a misleading impression of the paper?

submitted by /u/yusuf-bengio

[D] Decoding for the transformer in inference mode time series data

Written on December 3, 2019. Posted in Reddit MachineLearning.

With the Transformer model from “Attention Is All You Need”, you have to feed in the actual target during training. However, this obviously cannot be done at inference time. Usually, greedy decoding or beam search is used at inference to generate the target sequence iteratively. However, from my understanding (I could be wrong), beam search and greedy decoding generally work in conjunction with a softmax function, and this is generally done over a vocabulary. How would we use the Transformer model in inference mode for a time series forecasting task? What is the best way to generate the target values for the decoder? Could beam search still work?
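
For real-valued forecasts, one common option is plain autoregressive decoding: the model's own prediction for step t is appended to the decoder input and used to predict step t+1, with no softmax or vocabulary involved. The sketch below is framework-agnostic and only shows the loop. `step_fn` stands in for a trained Transformer (a hypothetical signature), and seeding the decoder with the last observed value is an assumption, not something prescribed by the paper.

```python
from typing import Callable, List, Sequence

# Hypothetical model interface: given the encoder input (history) and the
# decoder inputs generated so far, return the prediction for the next step.
StepFn = Callable[[Sequence[float], Sequence[float]], float]


def autoregressive_forecast(step_fn: StepFn, history: Sequence[float],
                            horizon: int) -> List[float]:
    """Generate `horizon` future values by feeding predictions back in."""
    # Seed the decoder with the last observed value (assumed "start token").
    decoder_inputs = [history[-1]]
    forecast = []
    for _ in range(horizon):
        next_value = step_fn(history, decoder_inputs)
        forecast.append(next_value)
        # The prediction becomes the next decoder input, replacing the
        # ground-truth target that would be fed in during training.
        decoder_inputs.append(next_value)
    return forecast


def toy_step(history: Sequence[float], decoder_inputs: Sequence[float]) -> float:
    """Stand-in for a trained model: continue the average slope of the history."""
    slope = (history[-1] - history[0]) / (len(history) - 1)
    return decoder_inputs[-1] + slope


forecast = autoregressive_forecast(toy_step, [1.0, 2.0, 3.0], horizon=3)
```

This loop is the regression analogue of greedy decoding. Beam search needs several scored candidates per step, so it would only apply if the model produced a distribution (for example over discretized values) rather than a single point estimate.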

submitted by /u/svpadd3

[P] How to increase the rate of network snapshots in StyleGAN?

Written on December 3, 2019. Posted in Reddit MachineLearning.

I am doing transfer learning from an existing model, and in my experience there is a sweet spot between letting the model fit the data set and the point where training becomes destructive to the pretrained model.

I want to save as many network snapshots as possible.

As far as I know, this value

network_snapshot_ticks = 1,

governs the snapshot creation. I have tried increasing this value and also decreasing it to decimal values, yet have not gotten any results.

submitted by /u/SuchMore