Category: Reddit MachineLearning

[Discussion] What is the state-of-the-art for entity extraction and relation extraction?

Hi,

I am looking for the state-of-the-art entity extraction/relation extraction algorithms that are practical to implement and use for commercial information extraction. An example:

Mr. Wilken is the CEO of Foobar, inc.

The entities are Mr. Wilken, CEO, and Foobar, Inc. Mr. Wilken’s title is CEO, and Mr. Wilken works at Foobar, Inc. (so, transitively, he is the CEO of Foobar, Inc.).

In the past I’ve used a CRF with hand-crafted features for entity tagging, followed by a classifier, also built on hand-crafted features, to determine the relations between entities. This is a pretty old-school approach and does not leverage any of the advances in word embeddings (GloVe, BERT, etc.). I know there are also methods for joint entity + relation extraction.
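To make the "hand-crafted features" idea concrete, here is a minimal sketch of the kind of per-token feature function that is typically fed to a CRF tagger. The function name `token_features`, the specific features, and the tokenization of the example sentence are all illustrative assumptions, not the setup described in the post.

```python
def token_features(tokens, i):
    """Hand-crafted features for the token at position i (illustrative choices)."""
    word = tokens[i]
    return {
        "lower": word.lower(),          # normalized surface form
        "is_title": word.istitle(),     # capitalized like a name
        "is_upper": word.isupper(),     # all caps, e.g. an acronym or title
        "suffix3": word[-3:],           # crude morphological cue
        # Neighbouring tokens give the tagger local context.
        "prev_lower": tokens[i - 1].lower() if i > 0 else "<BOS>",
        "next_lower": tokens[i + 1].lower() if i < len(tokens) - 1 else "<EOS>",
    }

# Assumed tokenization of the example sentence from the post.
tokens = ["Mr.", "Wilken", "is", "the", "CEO", "of", "Foobar", ",", "inc", "."]
features = [token_features(tokens, i) for i in range(len(tokens))]
```

Each dictionary in `features` would be one observation in the sequence passed to a CRF library. A relation classifier could reuse similar features computed over a pair of tagged entities.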

There are dozens of papers on Google Scholar, and I’m not sure which ones are worth implementing. I’m looking for recommendations of recent papers to read that would get me started.

submitted by /u/Tash_is_Aslan

[R] Emergent Tool Use from Multi-Agent Interaction

Paper: https://d4mucfpksywv.cloudfront.net/emergent-tool-use/paper/Multi_Agent_Emergence_2019.pdf

Blog: https://openai.com/blog/emergent-tool-use/

TLDR: A hide-and-seek game with moveable blocks and ramps. Hiders and seekers learn to use them to complete their task. Trained with PPO, with a transformer NN for representing objects.

Abstract: Through multi-agent competition, the simple objective of hide-and-seek, and standard reinforcement learning algorithms at scale, we find that agents create a self-supervised autocurriculum inducing multiple distinct rounds of emergent strategy, many of which require sophisticated tool use and coordination. We find clear evidence of six emergent phases in agent strategy in our environment, each of which creates a new pressure for the opposing team to adapt; for instance, agents learn to build multi-object shelters using moveable boxes which in turn leads to agents discovering that they can overcome obstacles using ramps. We further provide evidence that multi-agent competition may scale better with increasing environment complexity and leads to behavior that centers around far more human-relevant skills than other self-supervised reinforcement learning methods such as intrinsic motivation. Finally, we propose transfer and fine-tuning as a way to quantitatively evaluate targeted capabilities, and we compare hide-and-seek agents to both intrinsic motivation and random initialization baselines in a suite of domain-specific intelligence tests.

submitted by /u/ivalm

[D] Suggestions on good practice when merging k-means centroids

Hi, I posted this on /r/datascience but thought I’d x-post for visibility.

I was wondering if I could get some feedback on whether my methodology is problematic.

I’m working with a pre-established set of 12 cluster centroids in a classification problem, based on the output of a 42-element 2D joint histogram. When classifying points, these histograms are collapsed so that the classification is only done on a 3-element vector representing the mean quantities of the data point.

The purpose of this is to identify cloud types, to feed into some of my cluster-specific analysis. Now, three adjacent cluster centroids all refer to the same cloud type, but they have different ‘thicknesses’. In my final work, I’d like there to be just a single cluster to represent this type. I worry, though, that if I merge them by, for instance, taking the mean of these centroids, the classification step will miss many points that would ordinarily have been assigned to these clusters, because they might then be closer to another centroid which doesn’t represent the data point accurately.

My idea is to classify my datapoints with a codebook containing the three centroids (say, clusters 1, 2, and 3). After allocating all my points, I’d then merge the clusters together into a single classification. This would then result in a cluster which has been manually extended to capture points that wouldn’t ordinarily be in it.

Is this a problematic way of merging clusters, as opposed to say, taking the mean of the three cluster centroids? Or are there better ways of doing this?
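The two options being compared can be sketched side by side. This is a minimal, stdlib-only illustration on made-up 2D toy centroids (the post's real data uses 12 centroids and 3-element vectors); the helper names `nearest`, `classify_then_merge`, and `merge_by_mean` are hypothetical.

```python
import math

def nearest(point, centroids):
    """Index of the centroid closest to `point` (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))

def classify_then_merge(points, centroids, merged_label):
    """Classify against all original centroids, then relabel the merged ones.

    `merged_label` maps an original centroid index to its final label;
    indices not listed keep their own index as the label.
    """
    labels = []
    for p in points:
        idx = nearest(p, centroids)
        labels.append(merged_label.get(idx, idx))
    return labels

def merge_by_mean(centroids, group):
    """Replace the centroids in `group` with their mean; keep the rest."""
    dim = len(centroids[0])
    members = [centroids[i] for i in group]
    mean = tuple(sum(c[d] for c in members) / len(members) for d in range(dim))
    kept = [c for i, c in enumerate(centroids) if i not in group]
    return [mean] + kept

# Toy data (assumed): centroids 0-2 are the same type, centroid 3 is another type.
centroids = [(0.0, 0.0), (4.0, 0.0), (8.0, 0.0), (0.0, 3.0)]
group = {0, 1, 2}
point = (0.5, 1.0)  # sits next to centroid 0

# Option A: classify with the full codebook, then merge the labels.
relabelled = classify_then_merge([point], centroids, {i: "merged" for i in group})

# Option B: collapse the three centroids to their mean, then classify.
mean_centroids = merge_by_mean(centroids, group)
mean_idx = nearest(point, mean_centroids)  # index 0 is the merged centroid
```

In this toy case option A labels the point as part of the merged cluster, while option B assigns it to the other type, which is the failure mode the post worries about. Relabelling after classification keeps the original decision boundaries and simply takes their union.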

I’ve drawn out a basic diagram attempting to illustrate what I mean – https://i.imgur.com/aGIW3f5.jpg Thanks a lot in advance

EDIT: I’ve looked at agglomerative clustering but I’m working off an (almost) nicely defined set of clusters, aside from this issue. I tried merging cluster centroids using agglomerative but unfortunately it agglomerated together two which I didn’t want merged. (PS. is this how you do agglom clustering? Can you just train the agglom algorithm by passing it the original k-means centroids?)

submitted by /u/The_Foetus

[D] Consistency of Impact for Selected Features Across Model Types

I’ve been experimenting with DataRobot as of late and I’ve noticed that given a fixed set of features, the choice of n (e.g. 10) “most impactful” features differs significantly from one model type to another. Given that different algorithms may have differing sensitivity to input data type and specific effects present in the training data set, how would one go about shortlisting a set of features that would be consistently impactful across a variety of algorithms?

Current train of thought has me stuck at two options:

1. Rank top n features by model impact for various models (say top 5 best performing models) and select features that show the least change in ranking; or
2. Examine pairwise correlation with target for all features and if variables with low correlation are selected in the final model, sanity check for nonlinear relationship using contour plots.
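The first option (rank features per model and prefer those whose rank changes least) can be sketched as follows. The model names, feature names, and impact scores are invented for illustration, and the function names are hypothetical; this does not use or represent any DataRobot API.

```python
from statistics import mean, pstdev

def rank_features(impact):
    """Map feature -> rank (1 = most impactful) for one model's impact scores."""
    ordered = sorted(impact, key=impact.get, reverse=True)
    return {feature: position for position, feature in enumerate(ordered, start=1)}

def rank_stability(impacts_by_model):
    """Return (feature, mean rank, rank std dev), best and most stable first."""
    ranks = [rank_features(impact) for impact in impacts_by_model.values()]
    rows = []
    for feature in ranks[0]:
        feature_ranks = [r[feature] for r in ranks]
        rows.append((feature, mean(feature_ranks), pstdev(feature_ranks)))
    # Sort by average rank, breaking ties by how much the rank moves around.
    return sorted(rows, key=lambda row: (row[1], row[2]))

# Made-up impact scores for three hypothetical models over the same features.
impacts_by_model = {
    "gbm": {"a": 0.5, "b": 0.3, "c": 0.2},
    "rf": {"a": 0.4, "b": 0.1, "c": 0.5},
    "lr": {"a": 0.6, "b": 0.3, "c": 0.1},
}
stability = rank_stability(impacts_by_model)
shortlist = [feature for feature, _, _ in stability]
```

Here features `b` and `c` have the same average rank, but `b` is listed first because its rank varies less across models. Using rank spread rather than raw impact scores sidesteps the fact that impact values are not on a comparable scale across model types.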

Your thoughts/comments are much appreciated.

P.S. This discussion is motivated by my model validation team, which insists on me benchmarking every model against logistic regression because that’s the only one they understand. Theme of the month is “Why are the variables selected in your model significantly different from those used in our traditional LR model?”.

submitted by /u/furyincarnate

[D] Keras output of simple network dependent on batch size

https://github.com/keras-team/keras/issues/13328

Depending on the other data in the batch, Keras returns different results for the same input. My test shows only very minor differences, but the results should all be identical. Additionally, I have datasets where this error grows considerably. I am working on a test I can release that shows the issue.
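The post does not establish a cause. One general property that can produce tiny, order-dependent discrepancies in numerical code is that floating-point addition is not associative, so a different reduction order can change the last bits of a result. The sketch below only demonstrates that property in plain Python; treating it as a contributor to the linked Keras issue is an assumption, not a diagnosis.

```python
import math

a, b, c = 0.1, 0.2, 0.3

# The same three numbers summed in two different orders.
left = (a + b) + c
right = a + (b + c)

identical = left == right            # exact equality fails here
close = math.isclose(left, right)    # equal within a relative tolerance
```

When comparing outputs across batch sizes, a tolerance-based check like `math.isclose` separates this kind of rounding noise from the larger, growing errors the post also reports, which rounding order alone would not obviously explain.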

EDIT 1: fixed typo.

submitted by /u/idg101