Author: torontoai
[R] DCTD: Deep Conditional Target Densities for Accurate Regression
We propose Deep Conditional Target Densities (DCTD), a novel and general regression method with a clear probabilistic interpretation. DCTD models the conditional target density p(y|x) by using a neural network to directly predict the un-normalized density from the input-target pair (x, y). This model of p(y|x) is trained by minimizing the associated negative log-likelihood, approximated using Monte Carlo sampling. Notably, our method achieves a 1.9% AP improvement over Faster-RCNN for object detection on COCO, and sets a new state-of-the-art on visual tracking when applied for bounding box regression.
arXiv: https://arxiv.org/abs/1909.12297
Project page: http://www.fregu856.com/publication/dctd/
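To make the "negative log-likelihood, approximated using Monte Carlo sampling" step in the abstract concrete, here is a minimal sketch, not the authors' implementation. It assumes the network output f(x, y) is treated as an unnormalized log-density, so that p(y|x) = exp(f(x, y)) / Z(x), and it estimates Z(x) by importance sampling from a Gaussian proposal centred on the true target. The proposal choice, `toy_score`, and all function names are assumptions made for illustration.

```python
import math
import random


def log_mean_exp(values):
    """Numerically stable log(mean(exp(v))) over a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values) / len(values))


def gaussian_log_pdf(y, mean, std):
    """Log-density of a 1-D Gaussian."""
    return -0.5 * ((y - mean) / std) ** 2 - math.log(std * math.sqrt(2.0 * math.pi))


def mc_nll(score_fn, x, y_true, num_samples=1024, proposal_std=1.0, rng=None):
    """Estimate -log p(y_true | x) for p(y|x) = exp(score_fn(x, y)) / Z(x).

    Z(x) is approximated by importance sampling:
        Z(x) ~= mean_k exp(score_fn(x, y_k)) / q(y_k),  y_k ~ q.
    The proposal q (a Gaussian centred on y_true) is an assumption.
    """
    rng = rng or random.Random(0)
    log_weights = []
    for _ in range(num_samples):
        y_k = rng.gauss(y_true, proposal_std)
        log_q = gaussian_log_pdf(y_k, y_true, proposal_std)
        log_weights.append(score_fn(x, y_k) - log_q)
    log_z = log_mean_exp(log_weights)
    return log_z - score_fn(x, y_true)


def toy_score(x, y):
    """Hypothetical stand-in for the network: unnormalized log N(y; 2x, 0.5^2)."""
    return -0.5 * ((y - 2.0 * x) / 0.5) ** 2


if __name__ == "__main__":
    print(mc_nll(toy_score, x=1.0, y_true=2.0))
```

Because `toy_score` is a Gaussian with a known normalizer, the estimate can be compared against the exact value log(0.5 * sqrt(2 * pi)). With a real network, `score_fn` would be the model's forward pass and the estimate would be averaged over a mini-batch.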
submitted by /u/dirac-hatt
Data Scientist II, Machine Learning Model Validation – TD Bank – Toronto, ON
From TD Bank – Fri, 27 Sep 2019 02:48:02 GMT
Data Scientist I, Machine Learning Model Validation – TD Bank – Toronto, ON
From TD Bank – Fri, 27 Sep 2019 02:47:46 GMT
[D] No Free Lunch theorems do not compare functions that can utilize cost-information from partial solutions. So why care about NFL?
I see No Free Lunch theorems discussed enough that I decided to check my understanding, and sit down with the original paper.
They prove bold (but contextualized) claims, and I feel like the bold claims have really taken on a life of their own (absent context):
“one might expect that hill climbing usually outperforms hill descending if one’s goal is to find a maximum […] such expectations are incorrect”
“the average performance of any pair of algorithms across all possible problems is identical”
Very interesting, to be sure. But this all hinges on a specific assumption:
“[…] our decision to only measure distinct [oracle] function evaluations”
meaning:
“techniques like branch and bound are not included since they rely explicitly on the cost structure of partial solutions.”
I think their framework is interesting and useful for describing algorithms like Simulated Annealing or Genetic Algorithms.
But since it doesn’t apply to an entire class of algorithms (those that can reason from partial solutions), it seems to me that we should really rein in our claims about NFL.
I must be missing something.
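For what it's worth, the "identical average performance" claim can be checked by brute force on a tiny search space. The sketch below is my own toy setup, not code from the paper. It enumerates every function from a 3-point domain to {0, 1}, runs two deterministic searchers that never revisit a point (so only distinct evaluations are counted), and tallies the sequences of observed values. The adaptive rule is a made-up example of a searcher that reacts to what it has seen.

```python
from collections import Counter
from itertools import product

DOMAIN = (0, 1, 2)
VALUES = (0, 1)


def fixed_order(history):
    """Visit points left to right, ignoring the observed values."""
    visited = {x for x, _ in history}
    return next(x for x in DOMAIN if x not in visited)


def adaptive(history):
    """Hypothetical adaptive rule: start at 2, then move to the highest
    unvisited point after seeing a 1 and to the lowest after seeing a 0."""
    if not history:
        return 2
    visited = {x for x, _ in history}
    remaining = [x for x in DOMAIN if x not in visited]
    return max(remaining) if history[-1][1] == 1 else min(remaining)


def run(algorithm, f, m):
    """Run m distinct evaluations and return the tuple of observed values."""
    history = []
    for _ in range(m):
        x = algorithm(history)
        history.append((x, f[x]))
    return tuple(y for _, y in history)


def value_histogram(algorithm, m):
    """Count observed value sequences over all functions DOMAIN -> VALUES."""
    hist = Counter()
    for outputs in product(VALUES, repeat=len(DOMAIN)):
        f = dict(zip(DOMAIN, outputs))
        hist[run(algorithm, f, m)] += 1
    return hist


if __name__ == "__main__":
    for m in (1, 2, 3):
        print(m, value_histogram(fixed_order, m) == value_histogram(adaptive, m))
```

Both searchers produce the same histogram for every m, so any performance measure computed from the observed values averages out the same. The result relies on averaging uniformly over all functions and on counting only distinct evaluations, which is exactly the assumption being questioned above.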
submitted by /u/BayesMind
[R] Introducing the CodeSearchNet challenge
“Searching for code to reuse, call into, or to see how others handle a problem is one of the most common tasks in a software developer’s day. However, search engines for code are often frustrating and never fully understand what we want, unlike regular web search engines. We started using modern machine learning techniques to improve code search but quickly realized that we were unable to measure our progress. Unlike natural language processing with GLUE benchmarks, there is no standard dataset suitable for code search evaluation.
With our partners from Weights & Biases, today we’re announcing the CodeSearchNet Challenge evaluation environment and leaderboard. We’re also releasing a large dataset to help data scientists build models for this task, as well as several baseline models showing the current state of the art. Our leaderboard uses an annotated dataset of queries to evaluate the quality of code search tools.”
submitted by /u/youali
[D] Why do effective activation functions have a bounded derivative?
Is there a reason why almost every modern activation function in deep learning has a bounded derivative? ReLU, Swish, tanh, sigmoid and other activation functions mentioned here all have bounded derivatives.
My intuition says it is because we use backprop to train our networks. A bounded derivative should restrict the amount of gradient flow during the backward phase, preventing a blowup of gradients. What do you guys think?
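A quick numerical check of the premise, as a rough sketch: the code below estimates the largest absolute slope of each activation on a grid using central differences. The grid range, step size, and the `square` function (added only as an unbounded-derivative contrast) are my own choices, not taken from any reference.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


ACTIVATIONS = {
    "relu": lambda x: max(0.0, x),
    "tanh": math.tanh,
    "sigmoid": sigmoid,
    "swish": lambda x: x * sigmoid(x),
    # Contrast case (my addition): the derivative 2x grows without bound.
    "square": lambda x: x * x,
}


def max_abs_slope(fn, lo=-20.0, hi=20.0, steps=4001, eps=1e-4):
    """Largest |central-difference slope| of fn on an evenly spaced grid."""
    best = 0.0
    for i in range(steps):
        x = lo + i * (hi - lo) / (steps - 1)
        slope = (fn(x + eps) - fn(x - eps)) / (2.0 * eps)
        best = max(best, abs(slope))
    return best


if __name__ == "__main__":
    for name, fn in ACTIVATIONS.items():
        print(f"{name:8s} {max_abs_slope(fn):.4f}")
```

For the first four functions the estimate stops changing once the grid is wide enough, while for `square` it keeps growing with the range. This only illustrates the boundedness itself; it says nothing about whether boundedness is the reason these activations train well.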
submitted by /u/TheSilenceOfTheBakra
[P] Given a predefined questionnaire, how do I write a system to extract the data?
There is an extremely inefficient process in my city office. Each year, data is collected from citizens through an online form and an offline paper form. The paper form is the problem:
- The forms are given to the people.
- People fill in the forms by hand and return them to the office.
- The clerks have about 2-4 weeks to type the forms into the system.
- Each form contains control data; if it is incorrect, the form is ignored in further processing.
There are about 15-25K paper forms each year, and the graphics and content change yearly.
I have a template of this year’s form. It’s a single A4 page. There are two types of information we want to extract: small boxes that each hold a single digit, and free-text boxes (which can contain any text). I don’t have data samples, but I can generate a few.
The forms contain sensitive data and cannot be processed outside the internal network. How would you approach such a problem? I would appreciate any help.
Usually I would just use the Google Vision API for text extraction and then write a decision tree to classify the bounding boxes as pieces of information, but in this case I cannot use external services.
This is a non-profit project. If I cannot solve it, they will just type the forms in by hand.
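One way to use the known template, sketched under strong assumptions: each scan is already deskewed and scaled so that it lines up with the template, and pages are plain grayscale grids (0 = black, 255 = white). The code cuts out each field by its template coordinates and skips boxes that contain almost no ink. Recognizing the digits and text would still need a classifier running inside the internal network. All names, coordinates, and thresholds here are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    """A rectangular region of the template, in pixel coordinates."""
    name: str
    top: int
    left: int
    height: int
    width: int
    kind: str  # "digit" or "text"


def crop(page, field):
    """Cut the field's rectangle out of a page given as a list of pixel rows."""
    rows = page[field.top:field.top + field.height]
    return [row[field.left:field.left + field.width] for row in rows]


def ink_fraction(patch, threshold=128):
    """Share of pixels darker than the threshold."""
    pixels = [p for row in patch for p in row]
    return sum(p < threshold for p in pixels) / len(pixels)


def extract_patches(page, template, blank_below=0.02):
    """Map field name -> cropped patch, or None when the box looks empty."""
    result = {}
    for field in template:
        patch = crop(page, field)
        result[field.name] = None if ink_fraction(patch) < blank_below else patch
    return result


if __name__ == "__main__":
    # Tiny synthetic page: white background with a few dark pixels in one box.
    page = [[255] * 20 for _ in range(10)]
    page[3][3] = page[4][3] = 0
    template = [
        Field("digit_1", top=2, left=2, height=4, width=4, kind="digit"),
        Field("digit_2", top=2, left=10, height=4, width=4, kind="digit"),
    ]
    patches = extract_patches(page, template)
    print({name: patch is not None for name, patch in patches.items()})
```

Since the layout changes yearly, keeping the field coordinates in a separate template definition means only that list has to be redrawn each year. The non-empty patches for single-digit boxes could then go to a small digit classifier, and the free-text patches to whatever offline handwriting recognition is available.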
submitted by /u/janiedebica
[R] Deep Learning For Symbolic Mathematics (ICLR 2020 submission)
TL;DR: We train a neural network to compute function integrals, and to solve complex differential equations.
Abstract: Neural networks have a reputation for being better at solving statistical or approximate problems than at performing calculations or working with symbolic data. In this paper, we show that they can be surprisingly good at more elaborated tasks in mathematics, such as symbolic integration and solving differential equations. We propose a syntax for representing these mathematical problems, and methods for generating large datasets that can be used to train sequence-to-sequence models. We achieve results that outperform commercial Computer Algebra Systems such as Matlab or Mathematica.
Keywords: symbolic, math, deep learning, transformers
https://openreview.net/forum?id=S1eZYeHFDS&noteId=S1eZYeHFDS
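To give a feel for what "a syntax for representing these mathematical problems" could look like for a sequence-to-sequence model, here is a small sketch that flattens an expression tree into a prefix (Polish) token sequence and parses it back. The abstract does not spell out the syntax, so treating it as prefix notation, along with the operator names and arity table below, is an assumption made for illustration.

```python
# Hypothetical operator table: token -> number of arguments.
ARITY = {"add": 2, "mul": 2, "pow": 2, "cos": 1, "sin": 1}


def to_prefix(tree):
    """Flatten a nested-tuple expression tree into a list of prefix tokens."""
    if isinstance(tree, str):
        return [tree]
    op, *args = tree
    tokens = [op]
    for arg in args:
        tokens.extend(to_prefix(arg))
    return tokens


def from_prefix(tokens):
    """Rebuild the nested-tuple tree from a prefix token list."""

    def parse(pos):
        token = tokens[pos]
        if token not in ARITY:
            return token, pos + 1  # leaf: variable or constant
        pos += 1
        args = []
        for _ in range(ARITY[token]):
            arg, pos = parse(pos)
            args.append(arg)
        return (token, *args), pos

    tree, end = parse(0)
    if end != len(tokens):
        raise ValueError("trailing tokens after a complete expression")
    return tree


if __name__ == "__main__":
    # 3 * x^2 + cos(x)
    expr = ("add", ("mul", "3", ("pow", "x", "2")), ("cos", "x"))
    tokens = to_prefix(expr)
    print(" ".join(tokens))
    print(from_prefix(tokens) == expr)
```

Prefix notation needs no parentheses because each operator's arity fixes how many sub-expressions follow it, which keeps the token sequences short and unambiguous.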
submitted by /u/youali