[N] Interview with Hamel Husain on semantic code search research at GitHub

“We hope that the community can use this dataset to improve developer tools generally, which may include semantic code search. We hope that the state of the art with regards to representation learning of code is advanced because researchers and practitioners now have a common dataset and a forum in which to discuss results. We also hope that the uniqueness of the dataset will inspire the community to uncover new approaches and techniques for code and natural language understanding.”

That’s a quote from one of the authors of CodeSearchNet, a collection of datasets, tools, and benchmarks for representation learning of code. This research on semantic code search has been posted here before as news, but I thought some people here might be interested in the details of what goes into a project like this at a big company. I interviewed Hamel Husain, a machine learning engineer at GitHub, about how the project started and evolved into a wider open source effort to involve the ML research community. I hope there are useful takeaways for people here.

Here’s a link to the interview: https://sourcesort.com/interview/hamel-husain-on-semantic-code-search

And here’s a link to the original paper on arXiv: https://arxiv.org/abs/1909.09436

submitted by /u/Jefro118