Category: Reddit MachineLearning

[P] Would like some ideas for a student project based on city data

Hello everyone, I have a project for my AI machine learning course that will be based on city data, namely Vancouver's. The link below lists the data sets our project can be based on. We are free to use outside data sets, but they must relate to our city.

https://opendata.vancouver.ca/explore

I would really appreciate any guidance or input. We're having a difficult time coming up with ideas. The only ones we have so far are house property prediction, bike theft prediction, and general crime prediction, all of which would combine features from other data sets.

Thank you for reading my post!

submitted by /u/JohnMcClapperson

[D] I need to interpolate some of my data and have some design decisions about where in my pipeline this should happen.

I am working with a database that is spread across 7 tables, and for ML stuff I need to join them together. However, some features are not sampled as frequently as others, which leaves a lot of nulls in the joined rows. I want to interpolate these values, but I'm not sure of the most efficient way to do so. For example, let's say feature X is sampled every 10 ms, feature Y every 1 second, and a third feature Z every 15 seconds. I could store the interpolated values in the database, but I don't know if that much storage capacity is feasible for us. Alternatively, I could calculate them for each row when I fetch batches for training, but I'm afraid that will become a bottleneck, depending on how fast the interpolation is. Is there some obvious way of interpolating this efficiently that I'm not thinking of and that would let me save on storage?
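One way to picture the second option (interpolating at batch time) is the sketch below. It assumes each slow feature is kept at its native rate as a pair of arrays (timestamps and values) and is interpolated linearly onto the timestamps of the rows in a batch with NumPy's np.interp. The helper name interpolate_to and all the sample data are made up for illustration; whether linear interpolation is appropriate depends on the features.

```python
import numpy as np

def interpolate_to(batch_t, sample_t, sample_v):
    """Linearly interpolate a sparsely sampled feature at the batch timestamps."""
    # np.interp expects sample_t to be increasing; timestamps outside the
    # sampled range are clamped to the first/last sampled value.
    return np.interp(batch_t, sample_t, sample_v)

# Hypothetical data: Y sampled every 1 s, Z every 15 s (times in seconds).
y_t = np.arange(0.0, 31.0, 1.0)
y_v = 2.0 * y_t
z_t = np.array([0.0, 15.0, 30.0])
z_v = np.array([0.0, 30.0, 15.0])

# Timestamps of the X rows (the 10 ms feature) that landed in one batch.
batch_t = np.array([0.005, 7.5, 15.0, 22.5])

# Only the sparse series are stored; dense values exist only per batch.
batch_y = interpolate_to(batch_t, y_t, y_v)
batch_z = interpolate_to(batch_t, z_t, z_v)
```

With this layout the stored data stays at the native sampling rates, and the per-batch cost is one vectorized call per slow feature, so it may be worth timing before ruling it out as a bottleneck.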

submitted by /u/zcleghern

[D] For GNNs, are gradients normally tracked through neighborhood aggregation operations (e.g. max, mean)?

I am writing a GNN from scratch, to demonstrate to myself that I understand all the required concepts.

I am a bit confused about whether gradients need to be tracked through neighborhood aggregation operations, such as the mean and max of neighbors' embeddings. In my code, I currently perform these operations inside a with torch.no_grad() block, because if I don't, each epoch takes forever.

Here is my code for those operations:

def neighborhood_aggregation(self, adj_lists, feat, agg_method):
    # adj_lists is a dict of neighbors for every node in graph
    # e.g. adj_list = {0:{1, 4, 5, 6}, 1: {2, 4, 5}, ...}
    # node 0 has neighbors 1, 4, 5, 6
    with torch.no_grad():
        # construct aggregated neighborhood embedding
        dim = list(feat.size())
        n_nodes = dim[0]
        feat_dim = dim[1]
        aggregated_embed = torch.Tensor(n_nodes, feat_dim)  # aggregated embeddings for all nodes in graph.
        embed_element_vec = torch.arange(feat_dim)
        for node_id, neighbor_node_ids in adj_lists.items():
            neighborhood_embedding = feat[list(neighbor_node_ids), :]
            if agg_method == 'mean':
                aggregated_neigborhood_embedding = torch.mean(neighborhood_embedding, 0)
            elif agg_method == 'pool':
                aggregated_neigborhood_embedding = torch.max(neighborhood_embedding, 0)[0]
            else:
                raise KeyError('Aggregator type {} not recognized.'.format(agg_method))
            aggregated_embed[node_id, embed_element_vec] = aggregated_neigborhood_embedding
    return aggregated_embed

Note: The above code works, and I am getting very good results with it. I am just not sure whether what I am doing is wrong. If it is wrong, I was thinking that I would need a 3D tensor for aggregated_embed, shaped [n_nodes, n_neighbors, embed_dim] (with requires_grad=False), and would perform the mean/max on that tensor, which would track gradients.
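For reference, here is a minimal sketch of a mean aggregation that keeps the autograd graph. Whether tracking matters depends on the input: if feat is raw input data, nothing upstream needs gradients, but if feat comes out of a learnable layer, torch.no_grad() stops gradients from reaching that layer's parameters. The sketch assumes the slowness comes mostly from the per-node Python loop rather than from autograd itself, so it replaces the loop with one index_add call. The function name aggregate_mean is hypothetical, and only the 'mean' case is covered.

```python
import torch

def aggregate_mean(adj_lists, feat):
    """Mean-aggregate neighbor embeddings while keeping the autograd graph."""
    # Flatten the adjacency dict into (target, source) index pairs.
    targets, sources = [], []
    for node_id, neighbor_ids in adj_lists.items():
        for neighbor_id in neighbor_ids:
            targets.append(node_id)
            sources.append(neighbor_id)
    targets = torch.tensor(targets, dtype=torch.long)
    sources = torch.tensor(sources, dtype=torch.long)

    n_nodes, feat_dim = feat.shape
    # Sum each node's neighbor rows in a single differentiable call.
    zeros = torch.zeros(n_nodes, feat_dim, dtype=feat.dtype)
    summed = zeros.index_add(0, targets, feat[sources])
    # Neighbor count per node; clamp avoids dividing by zero for isolated nodes.
    degree = torch.bincount(targets, minlength=n_nodes).clamp(min=1).unsqueeze(1)
    return summed / degree

# Tiny made-up graph: 3 nodes with 2-dimensional embeddings.
feat = torch.tensor([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]], requires_grad=True)
adj_lists = {0: {1, 2}, 1: {0}, 2: {0, 1}}
out = aggregate_mean(adj_lists, feat)
out.sum().backward()  # gradients flow back through the aggregation into feat
```

The index pairs only depend on the graph, so they could be built once and reused across epochs instead of being rebuilt on every call.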

Thanks for any help.

submitted by /u/Muunich

[N] Even notes from Siraj Raval’s course turn out to be plagiarized.

More odd paraphrasing and word replacements.

From this article: https://medium.com/@gantlaborde/siraj-rival-no-thanks-fe23092ecd20

Left is from Siraj Raval's course; right is from the original article:

‘quick way’ -> ‘fast way’

‘reach out’ -> ‘reach’

‘know’ -> ‘probably familiar with’

‘existing’ -> ‘current’

The original article Siraj plagiarized from is here: https://www.singlegrain.com/growth/14-ways-to-acquire-your-first-100-customers/

submitted by /u/Kitchen_Extreme

[P] Lyrics Generator Twitter Bot

I fine-tuned two small GPT-2 models (124M parameters) and created Twitter bots that interact with Twitter users.

I have shared the code, along with useful things I learned and used, in the following repository, hoping it will help somebody:

https://jsalbert.github.io/lyrics-generator-twitter-bot/

The following samples are outputs of these models.

Eminem Bot Lyrics (@rap_god_bot)

https://preview.redd.it/anndufmguhv31.png?width=600&format=png&auto=webp&s=e027a50442f71b64fbcbe8821ed843c6d6823ead

Music Storytelling Bot Lyrics (@musicstorytell)

https://preview.redd.it/lo8qhzshuhv31.png?width=600&format=png&auto=webp&s=c0a609f649bb3daeeea18aa91c43165c7216f038

submitted by /u/jsalbert_

[D] The roots of natural language processing can be traced back to Kabbalist mystics

For people interested in the history of technology, here's an eccentric essay arguing that the first examples of NLP date back to medieval times. Mystics studying the Kabbalah devised "sacred rules" for combining letters to generate prophetic texts and, sometimes, to create golems.

https://spectrum.ieee.org/tech-talk/robotics/artificial-intelligence/natural-language-processing-dates-back-to-kabbalist-mystics

“While specific technologies have changed over time, the basic idea of treating language as a material that can be artificially manipulated by rule-based systems has been pursued by many people in many cultures and for many different reasons. These historical experiments reveal the promise and perils of attempting to simulate human language in non-human ways—and they hold lessons for today’s practitioners of cutting-edge NLP techniques.”

submitted by /u/newsbeagle