Category: Reddit MachineLearning

[D] How patentable are domain-specific ML applications?

I’ve built a Mask RCNN-based tool for processing images in a specific scientific sub-domain (that has commercial applications). I was all set to submit it to The Journal of Open Source Software — however, my research group is funded by a large, deep-pocketed company that shall not be named. After months of not responding to multiple emails about the project and my intent to publish it, they’ve decided at the last minute that they want to “look into its patentability.”

What is the likelihood that this is a patentable application? (It may be relevant that I didn’t write the Mask R-CNN model from scratch — I use an implementation that is MIT-licensed.)

In general, how patentable are domain-specific ML applications (where the ML is not novel)?

submitted by /u/pedalstiffcranks

[D] Transformer predicts own input on time-series data

Hi, I’m working on a project which requires predicting the remainder of a time series given many examples. For example, given a time series from T=0 to T=m, I need to predict T=m to T=n.

I’ve trained a few different autoregressive models for the task: (1) a pure decoder Transformer, where I learn the joint probability over the full sequence and then, given the known part of the sequence, impute the part that I’m interested in, and (2) an encoder-decoder Transformer, where I provide the data up to the point I’d like to predict as conditioning information and then model the joint conditional probability over just the region of interest.

In both cases, I’m finding a very strong effect where the network learns to simply predict whatever the input value is at each time step (and I’ve confirmed that the labels are being passed in correctly, shifted right relative to the input). This means that during inference it will always predict a straight line. In neural machine translation or language modeling, the next token may be very different from the previous one; with a high-resolution time series, the next value is almost always nearly identical to the input value, because the series samples a continuous function. I’ve also tried a continuous version of the Transformer, and it simply picks out a few common modes and predicts these each time during inference. Strangely, I found I can do better in terms of RMSE and MAE by just using a fully connected network that predicts the entire region of interest simultaneously (making the assumption that each point is independent of the others).

Does anyone have experience with a similar task and suggestions on how to handle this? I imagine using an artificially lower time resolution would make this better but that solution is rather unsatisfying.

(I’ve seen a few blog posts and a previous r/MachineLearning post about Transformers on time series data but none address this problem.)
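The copying effect described above can be made concrete with a small plain-Python sketch. The sine-wave data, the sampling step, and the helper names (`make_series`, `rmse`, `rollout`) are all assumptions for illustration; none of this comes from the poster’s code. It shows that on a densely sampled series the "copy the input" (persistence) predictor already has a very low one-step error, which is the solution a model trained on raw next values tends to fall into. It also shows one commonly suggested alternative, training on first differences, where predicting a flat line corresponds to predicting zero deltas. Whether this helps on a given dataset is not guaranteed, since a model can still learn to copy the previous delta.

```python
import math


def make_series(n, step=0.01):
    """Densely sampled sine wave (toy data): neighbouring samples are nearly equal."""
    return [math.sin(2 * math.pi * step * t) for t in range(n)]


def rmse(pred, target):
    """Root-mean-square error between two equally long sequences."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target))


def rollout(start, predicted_deltas):
    """Rebuild values from a start value and a sequence of predicted differences."""
    values = [start]
    for delta in predicted_deltas:
        values.append(values[-1] + delta)
    return values[1:]


series = make_series(1000)
inputs, targets = series[:-1], series[1:]  # labels shifted by one step

# Persistence baseline: predict that the next value equals the current one.
persistence_rmse = rmse(inputs, targets)

# Mean baseline: always predict the average target value.
mean_value = sum(targets) / len(targets)
mean_rmse = rmse([mean_value] * len(targets), targets)

# Differenced targets: the model would be trained to predict x[t+1] - x[t].
deltas = [nxt - cur for cur, nxt in zip(inputs, targets)]

# With perfect delta predictions the original targets are recovered.
reconstructed = rollout(inputs[0], deltas)

# Predicting zero deltas everywhere reproduces the flat-line behaviour.
flat_line = rollout(0.5, [0.0] * 5)

print(f"persistence RMSE: {persistence_rmse:.4f}, mean-baseline RMSE: {mean_rmse:.4f}")
```

Comparing any trained model against `persistence_rmse` rather than only against the mean baseline makes it easier to tell whether the model has learned more than copying.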

submitted by /u/collider_in_blue

[D] Converting HuggingFace GPT2 Models to Tensorflow 1.x

> HuggingFace Transformers is a wonderful suite of tools for working with transformer models in both Tensorflow 2.x and Pytorch. However, many tools are still written against the original TF 1.x code published by OpenAI. Unfortunately, the model format is different between the TF 2.x models and the original code, which makes it difficult to use models trained on the new code with the old code. There are many tools for converting the old format to TF 2.x and Pytorch, but not vice versa. In this blog post, I will share the (frustrating) process of getting the conversion to work.

https://leogao.dev/2019/11/09/Converting-HuggingFace-GPT2-Models-to-Tensorflow-1/

https://github.com/leogao2/gpt2-hf-to-tf1

submitted by /u/leogao2

[D] Tips for a high schooler in terms of eventually getting into a top program for a PhD in ML/AI?

Hello! I’m a senior who is currently applying to undergrad programs.

I was wondering if you guys have any suggestions for what programs I should apply to and what I need to do to make it into a top 5 PhD program (i.e. UCB, Stanford, CMU, Caltech, MIT).

I’m mostly interested in what steps I can take as a high schooler to increase my chances, as I hear that these programs are incredibly competitive. I don’t think I will be competitive for a T20 school, but I hope I will get into a T50 school like UW, Colorado School of Mines, Grinnell, or RPI for my undergrad (all of which I am happy with). Are there any other good programs for grad placement that I should apply to? Should I take any programs out?

Also, what should I do in order to help me later on? I’m planning on taking some AI related online courses.

Should I try to get a research position (btw, can I get this at a community college in Washington or no) or should I try to get an internship at a company like Microsoft or a startup?

My stats can be found here: https://old.reddit.com/r/chanceme/comments/dtbmqp/chance_me_grinnell_uiuc_cornell_gatech_uw_cs/

submitted by /u/Agile_Musician

[R] AllenNLP interpret: EMNLP Best Demo Paper Award

AllenNLP has won the EMNLP 2019 Best Demo Paper Award:

https://allennlp.org/interpret

From their website:

We present AllenNLP Interpret, a toolkit built on top of AllenNLP for interactive model interpretations. The toolkit makes it easy to apply gradient-based saliency maps and adversarial attacks to new models, as well as develop new interpretation methods. AllenNLP interpret contains three components: a suite of interpretation techniques applicable to most models, APIs for developing new interpretation methods (e.g., APIs to obtain input gradients), and reusable front-end components for visualizing the interpretation results.

For people doing NLP: do you feel this is going to be useful to you? If so, in which contexts?

(Disclaimer: I’m not associated with this project in any way, just curious about how best to use it.)

submitted by /u/rstoj

[D] Models which try to learn operations like mirroring

Hi,

Some time back someone jested on this sub about swapping a pair of images (x, y = y, x) using a ‘model’ (I believe the context was a faceswap news item), and I was wondering whether there are works that try to learn discrete-ish operations like swap, mirror, etc. using a neural network (possibly just to poke around the ‘learnings’). For example, a network that learns to mirror images horizontally (the third panel is the prediction from a bad feed-forward model trying to minimize MSE on the flipped image).

Considering that data augmentation in most fields uses a lot of similar operations, I tried looking around there but didn’t find anything (not sure what to search for exactly). A recent one does some sort of learned data augmentation, but its output is built from these operations rather than discovering them. I guess the problem doesn’t make much sense in the context of models like neural nets and fits in better with a more GOFAI approach, where we impose semantics and then extract rules from data. In any case, I wanted to know of sources that have inspected this or similar ideas.
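For the mirroring example specifically, a horizontal flip is a fixed permutation of pixel positions, so it is a linear map that a single bias-free linear layer can represent exactly. The sketch below, in plain Python on 4-pixel rows, builds that permutation matrix and then fits a linear map to (row, flipped row) pairs with stochastic gradient descent on squared error. The function names, row width, learning rate, and step count are illustrative assumptions, not taken from the post or from any library.

```python
import random


def flip_matrix(width):
    """Permutation matrix P such that (P @ row)[i] == row[width - 1 - i]."""
    return [[1.0 if j == width - 1 - i else 0.0 for j in range(width)]
            for i in range(width)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def fit_linear_flip(width, steps=2000, lr=0.1, seed=0):
    """Fit a bias-free linear map to (row, mirrored row) pairs with SGD."""
    rng = random.Random(seed)
    weights = [[0.0] * width for _ in range(width)]
    for _ in range(steps):
        x = [rng.uniform(-1.0, 1.0) for _ in range(width)]  # random toy "image row"
        y = x[::-1]                                         # its mirrored version
        pred = matvec(weights, x)
        for i in range(width):
            err = pred[i] - y[i]
            for j in range(width):
                # Gradient of 0.5 * err**2 with respect to weights[i][j].
                weights[i][j] -= lr * err * x[j]
    return weights


exact = flip_matrix(4)
learned = fit_linear_flip(4)

print(matvec(exact, [1, 2, 3, 4]))
for row in learned:
    print([round(w, 3) for w in row])
```

In this toy setting the learned weights approach the permutation matrix, so the "rule" can be read directly off the weights. This says nothing about deeper nonlinear models or full-size images, where the same solution exists but may be harder to find or to inspect.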

submitted by /u/gwynbleiddeyr

[P] A Deep Dream implementation in PyTorch

Hi, I know I am kinda late to this topic, but this effect still blows my mind. I have been working on an implementation in PyTorch that you can see in this repo:

https://github.com/juanigp/Pytorch-Deep-Dream

And here are some images I got using my code

https://preview.redd.it/u6l5v9gotpx31.png?width=583&format=png&auto=webp&s=a80bf428d727c7eaf93f3f3e69391eee7f8d275b

https://preview.redd.it/1a3cu9gotpx31.png?width=598&format=png&auto=webp&s=4fc950267f2726ffbd6f7192d9e49185699dfbd3

Hope someone finds this useful!

submitted by /u/juanigp

[D] I am confused. It appears that the Tensor2Tensor Transformer requires the actual target as an input?

If, say, we translate from English to German, then it appears that the decoder wants the German text as an input before it produces an output that could be used for a loss calculation and then training.
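This behaviour matches teacher forcing, the standard way Transformer decoders are trained: during training the decoder receives the target sequence shifted right by one position, and a causal mask keeps each position from attending to later tokens, so the true next token is never visible at the step where it is predicted. At inference time no target is available, and the decoder consumes its own previous outputs instead. The sketch below illustrates the idea in plain Python; the token ids, the `BOS` and `EOS` values, and the function names are hypothetical, and this is not Tensor2Tensor’s code.

```python
BOS = 0  # hypothetical start-of-sequence token id
EOS = 1  # hypothetical end-of-sequence token id


def shift_right(target_ids, bos_id=BOS):
    """Decoder input for training: the target delayed by one position."""
    return [bos_id] + target_ids[:-1]


def causal_mask(length):
    """mask[i][j] is True when position i is allowed to attend to position j."""
    return [[j <= i for j in range(length)] for i in range(length)]


def greedy_decode(next_token_fn, max_len, bos_id=BOS, eos_id=EOS):
    """Inference: feed the decoder its own outputs, one token at a time."""
    generated = [bos_id]
    for _ in range(max_len):
        token = next_token_fn(generated)
        generated.append(token)
        if token == eos_id:
            break
    return generated[1:]


german_ids = [5, 6, 7, EOS]              # toy target sentence as token ids
decoder_input = shift_right(german_ids)  # what the decoder sees during training
labels = german_ids                      # what the loss compares against

# Stand-in for a trained model: returns the "correct" next token for a prefix.
toy_model = lambda prefix: german_ids[len(prefix) - 1]
decoded = greedy_decode(toy_model, max_len=10)

print(decoder_input, labels, decoded)
```

At training position `i` the decoder input holds target tokens up to `i - 1` while the label is target token `i`, which is why the German text appears as an input without leaking the answer.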

https://github.com/tensorflow/tensor2tensor/blob/master/tensor2tensor/models/transformer.py

*The title should read “It appears that the Tensor2Tensor Transformer Decoder requires the actual target as an input?”, sorry

submitted by /u/ReasonablyBadass