Author: torontoai

[D] Decision tree that can detect phishing links: model is trained (I think..), now what?

Written on December 24, 2019. Posted in Reddit MachineLearning.

Hello, new to ML and also not a very math-oriented person. I am creating a Discord bot that will be able to detect phishing links by using a decision tree (still need to figure out how to link the trained ML model to the bot).

The current accuracy of this program is 90%, which seems pretty good on the surface, but how can I tell if it’s *actually* 90%? I was reading about confusion matrices and training via entropy; maybe either of those is good to use? The accuracy decreases with every run-through of the program. Why?
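One way to look behind a single accuracy number is a confusion matrix, which separates the kinds of mistakes (phishing links missed vs. legitimate links flagged). The following is a minimal sketch in plain Python; the labels and predictions are made up for illustration and are not results from the linked dataset, and the helper names `confusion_counts` and `summarize` are hypothetical.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary classifier."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def summarize(y_true, y_pred):
    """Derive accuracy, precision and recall from the confusion counts."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# Made-up labels for illustration: 1 = phishing, 0 = legitimate.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]
print(confusion_counts(y_true, y_pred))
print(summarize(y_true, y_pred))
```

In this toy case the accuracy is 70%, yet half of the phishing links are missed, which a single accuracy figure would hide. If the train/test split is reshuffled on every run, some run-to-run variation in accuracy is also expected; fixing the random seed or averaging over several splits makes the numbers easier to compare.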

On the top line of my code you can see where I got my dataset which contains approximately 2000 instances. Is that enough? I found this dataset which contains 5000 instances https://data.mendeley.com/datasets/h3cgnj8hft/1 . Can I train the decision tree on more than one dataset? Is that a good idea? Should I combine both datasets into one?
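Combining two datasets only works directly if both describe each link with the same features and the same label encoding. The following is a rough sketch of one way to merge them, assuming each dataset is a list of dictionaries; the feature names (`url_length`, `has_ip`, `num_dots`) and the helper `merge_on_shared_features` are hypothetical and not taken from either dataset.

```python
def merge_on_shared_features(rows_a, rows_b, label_key="label"):
    """Concatenate two datasets, keeping only columns present in both."""
    shared = set(rows_a[0]) & set(rows_b[0])
    if label_key not in shared:
        raise ValueError("both datasets need the same label column")
    columns = sorted(shared)
    seen = set()
    merged = []
    for row in rows_a + rows_b:
        key = tuple(row[c] for c in columns)
        if key not in seen:  # drop exact duplicates across the two sources
            seen.add(key)
            merged.append({c: row[c] for c in columns})
    return merged


# Hypothetical rows; real datasets would be loaded from their files.
rows_a = [
    {"url_length": 54, "has_ip": 0, "label": 1},
    {"url_length": 20, "has_ip": 0, "label": 0},
]
rows_b = [
    {"url_length": 20, "num_dots": 2, "label": 0},
    {"url_length": 75, "num_dots": 5, "label": 1},
]
merged = merge_on_shared_features(rows_a, rows_b)
print(merged)
```

Note the trade-off: features that exist in only one dataset are discarded, so a larger combined dataset can end up with fewer usable columns. Whether to drop rows that become identical after this reduction is a judgment call.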

Ultimately, what should be my next step(s)?

https://pastebin.com/XE0Ss9hq

submitted by /u/North_Bug

[D] NIPS vs. NeurIPS: guest post by Steven Pinker

Written on December 24, 2019. Posted in Reddit MachineLearning.

From Scott Aaronson’s Shtetl-Optimized blog, an open email from Steve Pinker:

I appreciate your frank comments. At the same time, I do not agree with them. Please allow me to explain.

If this were a matter of sexual harassment or other hostile behavior toward women, I would of course support strong measures to combat it. Any member of the Symposium who uttered demeaning comments toward or about women certainly deserves censure.

But that is not what is at issue here. It’s an utterly irrelevant matter: the three-decades-old acronym for the Neural Information Processing Symposium, the pleasingly pronounceable NIPS. To state what should be obvious: nip is not a sexual word. As Chair of the Usage Panel of the American Heritage Dictionary, I can support this claim.

(And as my mother wrote to me: “I don’t get it. I thought Nips was a brand of caramel candy.”) [Indeed, I enjoyed those candies as a kid. –SA] Even if people with an adolescent mindset think of nipples when hearing the sound “nips,” the society should not endorse the idea that the concept of nipples is sexist. Men have nipples too, and women’s nipples evolved as organs of nursing, not sexual gratification. Indeed, many feminists have argued that it’s sexist to conceptualize women’s bodies from the point of view of male sexuality.

If some people make insulting puns that demean women, the society should condemn them for the insults, not concede to their puerility by endorsing their appropriation of an innocent sound. (The Linguistics Society of America and Boston Debate League do not change their names to disavow jejune clichés about cunning linguists and master debaters.) To act as if anything with the remotest connection to sexuality must be censored to protect delicate female sensibilities is insulting to women and reminiscent of prissy Victorian taboos against uncovered piano legs or the phrase “with the naked eye.”

Any harm to the community of computer scientists has been done not by me but by the pressure group and the Symposium’s surrender. As a public figure who hears from a broad range of people outside the academic bubble, I can tell you that this episode has not played well. It’s seen as the latest sign that academia has lost its mind—that it has traded reasoned argument, conceptual rigor, proportionality, and common sense for prudish censoriousness, snowflake sensibility, and virtue signaling. I often hear from intelligent non-leftists, “Why should I be impressed by the scientific consensus on climate change? Everyone knows that academics just fall into line with the politically correct position.” To secure the credibility of the academy, we have to make reasoned distinctions, and stop turning our enterprise into a laughingstock.

To repeat: none of this deprecates the important effort to stamp out harassment and misogyny in science, which I’m well aware of and thoroughly support, but which has nothing to do with the acronym NIPS.

You are welcome to share this note with interested parties.

Best,

Steve

submitted by /u/milaworld

[R] Genetically generated regex. I have trouble understanding part of the paper.

Written on December 24, 2019. Posted in Reddit MachineLearning.

Hello, I’m working on the task of automatically generating regular expressions. I base my work on this paper:

https://esc.fnwi.uva.nl/thesis/centraal/files/f565297164.pdf

However, I have trouble understanding a part on the 19th page: the description of the ‘r’ node in the Enclosing Node part.

I’m not sure what ‘based on the number of capturing groups’ means. Is it exactly the number of capturing groups in the part of the regexp before ‘r’, or the number of matches of the expression in a string, or something else? And what is it used for?
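The two quantities the question distinguishes can be told apart with Python's `re` module. This short sketch does not settle which one the paper means; the pattern and text are made up for illustration.

```python
import re

pattern = re.compile(r"(\d+)-(\d+)")
text = "10-20, 30-40, 50-60"

# Number of capturing groups defined in the pattern itself.
group_count = pattern.groups

# Number of non-overlapping matches of the pattern in a given string.
match_count = len(pattern.findall(text))

# Non-capturing groups, written (?:...), are not counted as capturing groups.
non_capturing = re.compile(r"(?:ab)+(c)").groups

print(group_count, match_count, non_capturing)
```

The first number is a property of the expression alone, while the second depends on the string it is applied to.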

I would be very thankful for any suggestions. This part says:

https://preview.redd.it/lh97wet58r641.png?width=1559&format=png&auto=webp&s=256cc717d5213ec159f166d887399b4da70f062d

The paper is a really good read, by the way, and very well written. However, I don’t have much experience with regex, so I’m not sure what this specific part means.

submitted by /u/Slajni

Demand Planner, Nutrition – Nestlé – North York, ON

Written on December 24, 2019. Posted in Toronto Job Postings.

Maintain a robust post-mortem learning log. Has sufficient knowledge to be able to generate and maintain statistical forecasts, leveraging data analytics, AI …
From Nestle – Wed, 25 Dec 2019 05:49:41 GMT

Data Scientist (1 year contract) – Nestlé – North York, ON

Written on December 24, 2019. Posted in Toronto Job Postings.

In addition, should have a broad technology background with experience in machine learning, deep learning or artificial intelligence. Dec 18, 2019, 7:13:15 PM.
From Nestle – Wed, 25 Dec 2019 05:48:54 GMT

[D] How to differentiate between your idea or implementation being wrong?

Written on December 24, 2019. Posted in Reddit MachineLearning.

As the title says, I’m curious as to what the most efficient way is to figure out whether your idea is junk or your implementation is messed up somewhere, especially if your implementation failed due to strange autograd quirks of the framework you’re using (which has bitten me in the past sometimes). I guess the common-sense ones are:

Test it on toy cases (takes a long time to design)
Get advisor / someone else in the lab to take a look (often they agree with the general idea, but won’t have the inclination/time to study the implementation very deeply, understandably)
Git gud (pretty hard)
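One inexpensive toy check for the implementation side is to compare a gradient against a finite-difference estimate. The following is a minimal sketch in plain Python, assuming a scalar loss over a small parameter list; the toy loss and the helper names are hypothetical, and in a real project the analytic side would come from the framework's autograd.

```python
def numerical_gradient(f, x, eps=1e-6):
    """Central-difference estimate of df/dx_i for each coordinate of x."""
    grad = []
    for i in range(len(x)):
        up = list(x)
        up[i] += eps
        down = list(x)
        down[i] -= eps
        grad.append((f(up) - f(down)) / (2 * eps))
    return grad


def loss(w):
    # Toy loss: squared error of a linear model on a single example.
    x, y = [1.0, 2.0], 3.0
    pred = w[0] * x[0] + w[1] * x[1]
    return (pred - y) ** 2


def analytic_gradient(w):
    # Hand-derived gradient of `loss`; this is the code under test.
    x, y = [1.0, 2.0], 3.0
    pred = w[0] * x[0] + w[1] * x[1]
    return [2 * (pred - y) * x[0], 2 * (pred - y) * x[1]]


w = [0.5, -0.25]
num = numerical_gradient(loss, w)
ana = analytic_gradient(w)
max_diff = max(abs(n - a) for n, a in zip(num, ana))
print(max_diff)
```

A large `max_diff` points at the implementation rather than the idea. A small one does not prove the idea is good, but it removes one class of bugs from the list.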

submitted by /u/tensorflower

[D] Evaluating “A Neural Algorithm of Artistic Style” by Gatys et al.

Written on December 24, 2019. Posted in Reddit MachineLearning.

Paper: https://arxiv.org/abs/1508.06576

This is an exciting paper because it’s the first to introduce artistic style transfer using pre-trained neural networks.

What’s great is how the paper demonstrated how to extract content (e.g., shapes, contours) from an image and how to extract style from multiple layers (e.g., some layers extract fine-grained styles while others extract overall style). Combining these techniques can yield realistic and professional results.

The main issue is the time and compute power required. The paper extracts content features of an image with VGG19 and extracts style features from multiple layers with VGG19. Because it requires extensive compute power for feature extraction and for optimizing style and content loss, it takes 500-1000 iterations just to produce a low-resolution image. It would be ideal if the algorithm could produce results in a few iterations.
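For reference, the style and content losses being optimized can be sketched in a few lines. In the paper, style at a layer is captured by the Gram matrix of that layer's feature maps. The sketch below uses plain Python lists with made-up feature values standing in for VGG19 activations, and the function names are hypothetical.

```python
def gram_matrix(features):
    """features: list of N feature maps, each flattened to M values."""
    n = len(features)
    return [
        [sum(a * b for a, b in zip(features[i], features[j])) for j in range(n)]
        for i in range(n)
    ]


def style_layer_loss(generated, style):
    """Layer style loss: summed squared Gram differences / (4 N^2 M^2)."""
    n, m = len(generated), len(generated[0])
    g, a = gram_matrix(generated), gram_matrix(style)
    total = sum((g[i][j] - a[i][j]) ** 2 for i in range(n) for j in range(n))
    return total / (4 * n**2 * m**2)


def content_loss(generated, content):
    """Half the summed squared difference between feature maps."""
    return 0.5 * sum(
        (x - p) ** 2
        for row_g, row_c in zip(generated, content)
        for x, p in zip(row_g, row_c)
    )


# Made-up 2x2 "feature maps" for illustration only.
reference = [[1.0, 0.0], [0.0, 1.0]]
generated = [[1.0, 1.0], [0.0, 1.0]]
print(style_layer_loss(generated, reference))
print(content_loss(generated, reference))
```

Each optimization step needs a forward and backward pass through the network to evaluate these losses and their gradients with respect to the image pixels, which is where the per-iteration cost comes from.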

Has anybody tried ResNet or any other state-of-the-art network instead of VGG19?
Does anyone know of a transformer network that produces similar results to this paper?
Any recommendations on better style transfer papers?

submitted by /u/hotpot_ai

[D] Causal Relationship Mining papers?

Written on December 23, 2019. Posted in Reddit MachineLearning.

Does anyone have any good papers that would be worth checking out for mining causal relationships? In particular with continuous variables and high dimensionality data sets.

submitted by /u/bearingtheseeds

[D] Yann LeCun: “Some folks still seem confused about what deep learning is.” What does the community really think the definition is?

Written on December 23, 2019. Posted in Reddit MachineLearning.

LeCun tweeted about what the real definition of DL is (https://twitter.com/ylecun/status/1209497021398343680?s=20): “Some folks still seem confused about what deep learning is. Here is a definition: DL is constructing networks of parameterized functional modules & training them from examples using gradient-based optimization”
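To make the quoted definition concrete, here is a deliberately tiny sketch of parameterized modules composed into a network and trained from examples with gradient descent. It is only one reading of the definition, not LeCun's own example; the `Scale` and `Shift` classes and the toy data are hypothetical.

```python
class Scale:
    """Module with one parameter w: y = w * x."""

    def __init__(self, w):
        self.w = w

    def forward(self, x):
        self.x = x  # remember the input for the backward pass
        return self.w * x

    def backward(self, grad_out, lr):
        grad_in = grad_out * self.w       # gradient passed to the previous module
        self.w -= lr * grad_out * self.x  # gradient step on this module's parameter
        return grad_in


class Shift:
    """Module with one parameter b: y = x + b."""

    def __init__(self, b):
        self.b = b

    def forward(self, x):
        return x + self.b

    def backward(self, grad_out, lr):
        self.b -= lr * grad_out
        return grad_out


modules = [Scale(0.0), Shift(0.0)]  # a very small network of modules
examples = [(x, 2.0 * x + 1.0) for x in (-1.0, 0.0, 1.0, 2.0)]

for _ in range(500):
    for x, target in examples:
        out = x
        for m in modules:
            out = m.forward(out)
        grad = 2.0 * (out - target)  # derivative of squared error w.r.t. out
        for m in reversed(modules):
            grad = m.backward(grad, lr=0.05)

print(modules[0].w, modules[1].b)
```

Under this reading, nothing in the definition requires the modules to be neural layers; any differentiable, parameterized function with a forward and backward pass would fit.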

I think he is pointing to Gary Marcus’s debate with Bengio, and the discussion in the thread seems controversial. What do you think about LeCun’s definition of DL?

submitted by /u/meldiwin

[D] Does the opaqueness of most dating app algorithms concern anyone else?

Written on December 23, 2019. Posted in Reddit MachineLearning.

At the risk of sounding like I’m wearing a tinfoil hat, I’d like to vent regarding how messed up I think it is that most dating apps lack transparency when it comes to their match-making algorithms.

Before I jump in, let me just start by saying that a large percentage of dating takes place online these days. Therefore, anyone who wants to argue that we should all just meet in person can kindly frick off because you’re missing the point of this post.

My reasoning is as follows:

How a dating algorithm functions will directly impact one’s chance of successfully finding a mate.
Whether or not you can find a good mate will have a huge impact on your overall quality of life, mental health, financial success, etc.
The ability to alter an algorithm to selectively favor or disfavor certain populations’ chances at successful mating by tweaking a few lines of code is a unique superpower never before unleashed upon the world.
Setting aside any notions of bad actors purposely inhibiting your ability to get laid (which who knows… maybe that could happen), isn’t it at all concerning that this mega-powerful ability has close to zero public oversight?

And yes, I’ll have to admit that part of the reason I am making this post is that I honestly feel like I might have been shadowbanned on Tinder for reasons that are unclear to me. It’s just a hunch of course, but it seems bizarre how much my match rate has decreased over the past couple of years. I’m bothered that I have no insight into why this might be. Maybe I’m only allowed to date people in my economic circle (i.e. poor) with the rest of the undesirables. Maybe I haven’t posted on Reddit enough in the past. Dunno.

I’d love to hear others’ thoughts on this matter.

submitted by /u/QMred
