Category: Reddit MachineLearning

[D] MCTS on raw network not trained with MCTS

In the AlphaGo Zero paper, Figure 6b shows the performance of a raw network, which directly takes the action with the highest q-value(?), versus an MCTS approach that gets 5 seconds of thinking time. The MCTS approach shows a large performance gain over the raw network.

I have now trained a network with a policy head and a value head using the first approach, so there is no tree structure and none of the accompanying data (such as visit counts per node). I’m wondering whether I can skip training with MCTS and only use the network to build a tree in the simulation phase, and whether there’s any precedent for this technique.
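To make the question concrete, here is a minimal sketch of search at inference time only, with PUCT-style selection driven by a network that was never trained with MCTS. Everything in it is an assumption for illustration: the toy line-walking environment (`step`, `is_goal`), the stand-in `network` function (uniform priors, distance-based value), and the `c_puct` and `discount` settings are hypothetical and not taken from the post or the paper.

```python
import math

GOAL = 3                # hypothetical toy problem: walk along 0..GOAL
ACTIONS = (-1, +1)

def step(state, action):
    return max(0, min(GOAL, state + action))

def is_goal(state):
    return state == GOAL

def network(state):
    """Stand-in for the trained policy/value net (assumption, not a real model)."""
    priors = {a: 1.0 / len(ACTIONS) for a in ACTIONS}
    value = 1.0 - (GOAL - state) / GOAL
    return priors, value

class Node:
    def __init__(self, prior):
        self.prior = prior        # P(s, a) from the policy head
        self.visits = 0           # N(s, a)
        self.value_sum = 0.0      # W(s, a)
        self.children = {}        # action -> Node

    def q(self):
        return self.value_sum / self.visits if self.visits else 0.0

def select(node, c_puct):
    """Pick the child maximising Q + U (one common PUCT variant)."""
    total = sum(child.visits for child in node.children.values())
    def score(item):
        child = item[1]
        u = c_puct * child.prior * math.sqrt(total + 1) / (1 + child.visits)
        return child.q() + u
    return max(node.children.items(), key=score)

def search(root_state, simulations=200, c_puct=1.5, max_depth=20, discount=0.95):
    root = Node(prior=1.0)
    for _ in range(simulations):
        node, state, path = root, root_state, [root]
        # Selection: descend until an unexpanded node, the goal, or the depth cap.
        while node.children and not is_goal(state) and len(path) <= max_depth:
            action, node = select(node, c_puct)
            state = step(state, action)
            path.append(node)
        # Expansion and evaluation: the network replaces random rollouts.
        if is_goal(state):
            value = 1.0
        else:
            priors, value = network(state)
            if not node.children:
                node.children = {a: Node(p) for a, p in priors.items()}
        # Backup: single agent, so no sign flip. Each step up the tree
        # discounts the value, so shorter routes to the goal score higher.
        for visited in reversed(path):
            visited.visits += 1
            visited.value_sum += value
            value *= discount
    # Act on the most-visited root action.
    return max(root.children.items(), key=lambda kv: kv[1].visits)[0]

print(search(0))
```

The visit counts live only in the tree built at decision time, so nothing about the network's training has to change. Whether the priors and values of a network trained without search are good enough to guide the tree is exactly the open question.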

The problem is a deterministic RL problem with only one goal state and no other rewards. The same state can be reached twice, and this often happens when I use the raw network approach. The agent then gets stuck in a loop. In a previous post, someone suggested taking the next best option once a certain state is reached more than once. This worked like a charm. But for real-world applications, I would like to keep the number of actions taken as low as possible. This is why I think MCTS might be an improvement.
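For reference, the "next best option on a revisit" workaround might look roughly like the sketch below. The toy environment and the `policy` stand-in (deliberately wrong in state 1 so that a loop occurs) are hypothetical. Cycling through the ranked actions with a modulo is one possible reading of the suggestion, not necessarily what was proposed in the earlier post.

```python
from collections import Counter

GOAL = 3  # hypothetical toy problem: walk along 0..GOAL

def step(state, action):
    return max(0, min(GOAL, state + action))

def policy(state):
    """Stand-in for the policy head; deliberately wrong in state 1 to force a loop."""
    if state == 1:
        return {-1: 0.9, +1: 0.1}
    return {+1: 0.9, -1: 0.1}

def rollout_with_fallback(start, max_steps=50):
    """Greedy rollout that takes the next-ranked action each time a state repeats."""
    visits = Counter()
    state, actions = start, []
    for _ in range(max_steps):
        if state == GOAL:
            break
        probs = policy(state)
        ranked = sorted(probs, key=probs.get, reverse=True)
        # First visit: best action. k-th revisit: (k+1)-th best, wrapping around.
        action = ranked[visits[state] % len(ranked)]
        visits[state] += 1
        actions.append(action)
        state = step(state, action)
    return actions if state == GOAL else None

print(rollout_with_fallback(0))  # [1, -1, -1, 1, 1, 1]
```

The rollout escapes the loop, but it spends six actions on a problem that needs three, which illustrates the concern about the number of actions taken.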

submitted by /u/matigekunst

[P] Abusing text synthesis into SVG image generation for fun and art

Released a little utility earlier this year where you can feed raw SVG files as text into your favorite text synthesis engine, such as charRNN, and then attempt to fix the resulting output(s) back into a valid SVG file.

https://github.com/artBoffin/GAN-XML-Fixer
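To give a feel for the "fix the output back into a valid SVG" step, here is a naive sketch that drops a truncated trailing tag, skips stray closing tags, and closes whatever is still open. It is an illustrative stand-in only: it is not how GAN-XML-Fixer works internally, the function name `close_open_tags` is made up, and it will not repair malformed attributes or stray `<` characters.

```python
import re
import xml.etree.ElementTree as ET

# Matches opening, closing, and self-closing tags with a simple name.
TAG = re.compile(r"<(/?)([A-Za-z][\w:.-]*)([^<>]*?)(/?)>")

def close_open_tags(text):
    """Naively rebalance tags in generated SVG text (sketch, not a full repair)."""
    # Cut off anything after the last complete tag, e.g. a truncated '<path d="M0'.
    text = text[: text.rfind(">") + 1]
    out, stack, pos = [], [], 0
    for match in TAG.finditer(text):
        out.append(text[pos:match.start()])
        pos = match.end()
        closing, name, _attrs, self_closing = match.groups()
        if closing:
            if name not in stack:
                continue                          # stray closing tag: drop it
            while stack[-1] != name:
                out.append(f"</{stack.pop()}>")   # close inner tags first
            stack.pop()
        elif not self_closing:
            stack.append(name)
        out.append(match.group(0))
    out.append(text[pos:])
    out.extend(f"</{name}>" for name in reversed(stack))
    return "".join(out)

broken = '<svg xmlns="http://www.w3.org/2000/svg"><g><circle r="1"/></rect><path d="M0 0'
fixed = close_open_tags(broken)
ET.fromstring(fixed)  # raises ParseError if the result is still not well-formed
print(fixed)
```

Parsing the result with `xml.etree.ElementTree` is a cheap way to check well-formedness before opening the file in a viewer.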

The philosophy behind this method is not about accuracy, but more about discovery.
Anyone else have ideas of how to creatively “bend” some ML systems outside their intended use?

Some examples of art prints I’ve made using the utility (some are assembled in groups / modified)

https://i.redd.it/suzvevk2l6f31.png

https://i.redd.it/u50ypxk2l6f31.png

https://i.redd.it/axrffzk2l6f31.png

https://i.redd.it/6kj6hyk2l6f31.png

LEGO Minifigs

https://i.redd.it/lgk5vrn4l6f31.png

https://i.redd.it/r9o41go4l6f31.png

https://i.redd.it/n6awxrn4l6f31.png

https://i.redd.it/26z9orn4l6f31.png

submitted by /u/shoeblade

[D] Keras vs tensorflow: Performance, GPU utilization and data pipeline

Hi folks,

I was recently dealing with some performance issues related to Keras image preprocessing. After several experiments, I thought it might be helpful to share my insights. There are several possible fixes:

  • Update all packages, especially keras-preprocessing.
  • Deactivate your virus scanner (or whitelist your data folder) and check whether your data is on an internal SSD.
  • Try tweaking the configuration of fit_generator (workers and max_queue_size). If you are using Linux, try multiprocessing and a thread-safe generator.
  • Convert your dataset to TFRecords and use it with Keras, or move directly to TensorFlow. If you are already using TensorFlow 2.0, you can fit Keras models directly on TFRecord datasets.
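A thread-safe generator, as mentioned in the third point, can be sketched with a lock around `next()`. This is a common pattern rather than the exact code from my experiments. The `batches` generator and its arguments are hypothetical stand-ins for a real image batch generator.

```python
import threading

class ThreadSafeIterator:
    """Wrap an iterator so that only one thread at a time can advance it."""

    def __init__(self, iterator):
        self._iterator = iterator
        self._lock = threading.Lock()

    def __iter__(self):
        return self

    def __next__(self):
        with self._lock:
            return next(self._iterator)

def threadsafe_generator(generator_fn):
    """Decorator: make a generator function return a thread-safe iterator."""
    def wrapper(*args, **kwargs):
        return ThreadSafeIterator(generator_fn(*args, **kwargs))
    return wrapper

@threadsafe_generator
def batches(samples, batch_size):
    """Hypothetical stand-in for an image batch generator; loops forever."""
    while True:
        for start in range(0, len(samples), batch_size):
            yield samples[start:start + batch_size]

gen = batches(list(range(10)), batch_size=5)
print(next(gen))  # [0, 1, 2, 3, 4]
```

Without the lock, several worker threads calling `next()` on the same plain generator can raise `ValueError: generator already executing`. A wrapped generator can then be passed to `fit_generator` with `workers` greater than 1.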

Furthermore, the TensorFlow implementation was always (slightly) faster.

Here is a more detailed explanation.

Cheers

submitted by /u/ixeption

[D] Help with fine-tuning for text classification task

Hi r/MachineLearning,

I’m testing out fine-tuning BERT and ULMFiT for text classification. I’ve followed various tutorials using FastAI and PyTorch, but haven’t gotten good results yet. I would love to get some input on whether my approach to this problem is reasonable.

My problem is to take a short snippet of text, anywhere from 10 to 200 characters, and predict one of 2,510 categories that represent words (one-hot vector). I’m using CrossEntropyLoss, and neither BERT nor ULMFiT seems to be doing well after fine-tuning. I can’t even get BERT to begin making non-naive predictions: it ends up just predicting the most common class. I believe this is due to the large number of classes, but I had thought there would be enough signal in the text for BERT to make a guess at one of the categories after seeing some examples.
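One way to quantify "just predicting the most common class" is to compare against the majority-class baseline and measure how concentrated the predictions are. The helper names and the toy labels below are hypothetical, since I can’t post the real data.

```python
from collections import Counter

def majority_baseline(labels):
    """Return the most common label and the accuracy of always predicting it."""
    label, count = Counter(labels).most_common(1)[0]
    return label, count / len(labels)

def collapse_ratio(predictions):
    """Fraction of predictions that land on the single most predicted class."""
    return Counter(predictions).most_common(1)[0][1] / len(predictions)

# Toy data standing in for the real labels and model outputs (assumption).
labels = ["the", "the", "cat", "dog"]
predictions = ["the", "the", "the", "the"]

print(majority_baseline(labels))    # ('the', 0.5)
print(collapse_ratio(predictions))  # 1.0
```

A collapse ratio near 1.0 together with an accuracy equal to the majority baseline would confirm that the model has learned only the class prior.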

I have ~10MM samples in total, and I’ve been testing on ~500K examples in the hope of seeing some basic results. Is this reasonable? Should I just try throwing all of the data in there and see what happens?

My data is fairly messy as well since it’s raw text. Should I be cleaning it much? I haven’t found many recommendations on this front yet, and all of the SOTA metrics appear to be on clean text afaik. I’m also wondering whether spelling errors are an issue.

Any comments/thoughts on the approach? If it would help I can post the code (can’t post the data unfortunately), but it’s mostly copied from some tutorials with a few modifications, so I don’t know how much that will help.

Thanks in advance for the help.

submitted by /u/snowcrashed617