Category: Reddit MachineLearning

[D] What’s the prevalence of various languages in text summarization research?

My understanding so far has been that most research on text summarization has been done on English. However, I can’t find any reliable numbers for this. My best idea so far has been to search Google Scholar for “automatic summarization <language>” for a few languages and compare the number of results to get a rough estimate of the proportions. I get 42k for English, 25k for French, 24k for Spanish… More surprisingly, I get 46k for Chinese. I would expect the results to be biased towards English, since my keywords are in English. Is it possible that more summarization research has been done for Chinese than for English? Or am I overlooking something? Can you get more accurate numbers?
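To make the comparison concrete, the hit counts can be turned into shares. This is only a sketch using the four figures quoted above: the shares are relative to these four languages alone, and Google Scholar result counts are rough estimates, so the output is a ballpark at best. The names `hits` and `shares` are made up for illustration.

```python
# Approximate Google Scholar result counts quoted in the post.
hits = {"English": 42_000, "French": 25_000, "Spanish": 24_000, "Chinese": 46_000}

def shares(counts):
    """Return each language's fraction of the summed counts."""
    total = sum(counts.values())
    return {lang: n / total for lang, n in counts.items()}

# Print the languages from largest to smallest share.
for lang, frac in sorted(shares(hits).items(), key=lambda kv: -kv[1]):
    print(f"{lang:8s} {frac:6.1%}")
```

Among these four, Chinese and English each come out at roughly a third, which says nothing about languages that were not queried.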

submitted by /u/Syncrossus

[D] Looking for advice on Human Activity Recognition.

Hi r/MachineLearning. I hope this post is welcome here, because I couldn’t think of a better place and I really believe that you can help me. In the coming year I will start writing my dissertation. The topic I’m interested in is Human Activity Recognition. I talked with my advisor and we agreed on two approaches:

  1. Research-oriented. For example, coming up with different architectures, seeing what kinds of videos are hard for current state-of-the-art methods, etc. From what I saw, the datasets are quite big and current neural networks require a few weeks to train, and I don’t have access to that much computational power. And here comes my first question: given this situation, what do you think I should focus on if I want to go this route?
  2. Doing something practical. For example, recognizing the fine-grained actions of a specific sport/activity and doing something with them. The problem here is that there are not many datasets, so I thought that maybe you know some interesting datasets for this.

I have to mention that I personally prefer the first option, but I’m open to suggestions.

Thanks for the help and sorry if I posted in the wrong place.

submitted by /u/IonutCalofir

[P] Stylegan Music Video

We made a music video using NVIDIA’s styleGAN. You can check it out here: https://youtu.be/bCJXnRFGoSE .

Methodology

We first produced a mel-scaled spectrogram for the piece of music. We tweaked the arguments such that each time-step of the spectrogram corresponds to 16.7ms (duration of a frame @60fps). The frequency dimension of the spectrogram is scaled to match styleGAN’s input dimension.
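As a rough illustration of the timing argument, the hop length (audio samples between spectrogram columns) that gives one column per video frame is just the sample rate divided by the frame rate. The sample rate of 44.1 kHz below is an assumption, since the post does not state it, and the helper names are hypothetical. With a library such as librosa, a value like this would typically be passed as `hop_length` to `librosa.feature.melspectrogram`, with `n_mels` set to the desired frequency dimension.

```python
def hop_length_for_fps(sample_rate: int, fps: int = 60) -> int:
    """Audio samples per video frame, rounded to the nearest integer."""
    return round(sample_rate / fps)

def n_video_frames(n_samples: int, hop_length: int) -> int:
    """Number of whole hops (video frames) that fit in the audio clip."""
    return n_samples // hop_length

sample_rate = 44_100                      # assumed, not stated in the post
hop = hop_length_for_fps(sample_rate)     # samples between spectrogram columns
step_ms = 1000 * hop / sample_rate        # duration of one time-step in ms
print(hop, round(step_ms, 2))
```

At 44.1 kHz this gives a hop of 735 samples, i.e. about 16.67 ms per time-step, matching one frame at 60 fps.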

Then we explored the input space of a styleGAN pre-trained on faces, looking for interesting output images. To do this, we computed the gradient of the mean squared error between styleGAN’s output image and a real image (which we had chosen), with respect to a random input. Then, with steps of gradient descent, we searched for inputs which produce outputs similar to our real image. We wanted “non-realistic, creepy faces”, which we got by using extreme hyper-parameters in this exploration phase, by swapping the colors of the output and by carefully choosing the custom target image. For each generated image we also saved the input vector (512-dimensional) which led to it.
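The search described above can be sketched on a toy problem. In the snippet below a fixed linear map stands in for the generator, so the gradient of the MSE with respect to the input can be written by hand; with a real styleGAN the gradient would come from automatic differentiation instead. All names (`generate`, `search_input`, the matrix `W`, the learning rate) are hypothetical and not taken from the project.

```python
import random

def generate(W, z):
    """Toy stand-in for a generator: a fixed linear map from input z to 'pixels'."""
    return [sum(w * zi for w, zi in zip(row, z)) for row in W]

def mse(a, b):
    """Mean squared error between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def mse_grad_wrt_input(W, z, target):
    """Analytic gradient of mse(generate(W, z), target) with respect to z."""
    residual = [o - t for o, t in zip(generate(W, z), target)]
    n = len(target)
    return [2.0 / n * sum(W[i][j] * residual[i] for i in range(n))
            for j in range(len(z))]

def search_input(W, target, steps=500, lr=0.1, seed=0):
    """Start from a random input and follow the negative gradient."""
    rng = random.Random(seed)
    z = [rng.gauss(0, 1) for _ in range(len(W[0]))]
    for _ in range(steps):
        g = mse_grad_wrt_input(W, z, target)
        z = [zi - lr * gi for zi, gi in zip(z, g)]
    return z

W = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0]]  # 2-D input -> 4 'pixels'
target = generate(W, [0.5, -1.5])                     # a target the toy model can reach
z_found = search_input(W, target)
print(z_found, mse(generate(W, z_found), target))
```

In this toy case the target is reachable and the loss is convex, so the search recovers the input exactly. A real generator gives a non-convex problem, where the result depends on the starting point and hyper-parameters.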

Finally, we made a large spreadsheet in which each row is a beat of the song (175 bpm for most parts). We assigned various generated images we liked to different parts of the song (usually at intervals of 4 beats). We turned this spreadsheet into a large input array with the same dimensions as the mel-scaled spectrogram, by linearly interpolating between the pre-chosen generated images at the intervals dictated by the spreadsheet. We added this input matrix to the spectrogram with some weights and fed it to the pre-trained styleGAN. The outputs are the frames of the video.
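A minimal sketch of that last step, assuming the keyframes are stored as (frame index, input vector) pairs: convert beats to frames, interpolate linearly between consecutive keyframes, and add the result to the spectrogram with weights. The vectors here are 2-dimensional for readability (the real inputs are 512-dimensional), and the weights and function names are placeholders rather than the values used in the video.

```python
def frames_per_beat(bpm: float, fps: float = 60.0) -> float:
    """Video frames per musical beat."""
    return fps * 60.0 / bpm

def lerp(a, b, t):
    """Linear interpolation between equal-length vectors a and b, 0 <= t <= 1."""
    return [(1.0 - t) * x + t * y for x, y in zip(a, b)]

def interpolate_keyframes(keyframes, n_frames):
    """keyframes: sorted list of (frame_index, vector). Returns one vector per frame."""
    out = []
    seg = 0
    for f in range(n_frames):
        # Move to the next segment once frame f has passed the current segment's end.
        while seg < len(keyframes) - 2 and f > keyframes[seg + 1][0]:
            seg += 1
        (f0, v0), (f1, v1) = keyframes[seg], keyframes[seg + 1]
        t = min(max((f - f0) / (f1 - f0), 0.0), 1.0)  # clamp outside the keyframe range
        out.append(lerp(v0, v1, t))
    return out

def mix(latents, spectrogram, w_latent=1.0, w_audio=0.5):
    """Weighted sum of interpolated inputs and spectrogram columns, frame by frame."""
    return [[w_latent * a + w_audio * b for a, b in zip(la, sp)]
            for la, sp in zip(latents, spectrogram)]

fpb = frames_per_beat(175)                      # about 20.57 frames per beat
end = round(4 * fpb)                            # a keyframe every 4 beats
keys = [(0, [0.0, 0.0]), (end, [1.0, 2.0])]
latents = interpolate_keyframes(keys, n_frames=end + 1)
print(end, latents[0], latents[end // 2], latents[end])
```

At 175 bpm and 60 fps a 4-beat interval is about 82 frames, so each chosen image blends into the next over a bit more than a second.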

(For the first few seconds of the song we also used some real footage which we morphed with generated faces)

Discussion

Throughout the project we felt that there must be a better way to do targeted searches of the input space. For styleGAN there is some interpretability to each dimension of the input, but we found it hard to make use of this, especially when the target image was not strictly a face (a skull, for example). What are other ways in which we can answer the question “what inputs of this (differentiable) black box lead to a desired output?”

submitted by /u/kinezodin

[R] Enriching BERT with Knowledge Graph Embeddings for Document Classification

In this paper, we focus on the classification of books using short descriptive texts (cover blurbs) and additional metadata. Building upon BERT, a deep neural language model, we demonstrate how to combine text representations with metadata and knowledge graph embeddings, which encode author information. Compared to the standard BERT approach we achieve considerably better results for the classification task. For a more coarse-grained classification using eight labels we achieve an F1-score of 87.20, while a detailed classification using 343 labels yields an F1-score of 64.70. We make the source code and trained models of our experiments publicly available.

Paper: https://arxiv.org/abs/1909.08402

Code: https://github.com/malteos/pytorch-bert-document-classification

submitted by /u/muwnd

[N] Google starts AI research lab in Bangalore, India

Google Research India will be led by Manish Gupta, a renowned computer scientist and ACM Fellow with a background in deep learning across video analysis and education, compilers and computer systems. We’re also excited to have Professor Milind Tambe join us on a joint appointment from Harvard University as Director of AI for Social Good. Professor Tambe will build a research program around applying AI to tackle big problems in areas like healthcare, agriculture, or education.

The lab in Bangalore will be part of and support Google’s global network of researchers: participating in conferences, publishing research in scientific papers, and collaborating closely with one another. We’re also exploring the potential for partnering with India’s scientific research community and academic institutions to help train top talent and support collaborative programs, tools and resources.

https://blog.google/around-the-globe/google-asia/google-research-india-ai-lab-bangalore/

submitted by /u/hardmaru

[P] How we made landmark recognition in Cloud Mail.ru, and why

With the advent of mobile phones with high-quality cameras, we started taking more and more pictures and videos of bright and memorable moments in our lives. Many of us have photo archives that extend back over decades and comprise thousands of pictures, which makes them increasingly difficult to navigate.

For this purpose, we at Mail.ru Computer Vision Team have created and implemented systems for smart image processing, including landmark recognition. Photos with landmarks are essential because they often capture highlights of our lives (journeys, for example). These can be pictures with some architecture or wilderness in the background. This is why we seek to locate such images using Deep Learning, and make them readily available to users.

https://medium.com/@andrei.boiarov/how-we-made-landmark-recognition-in-cloud-mail-ru-and-why-715b5f72e6d4

submitted by /u/pvl18

[D] Neural Architecture Search

Recently, Neural Architecture Search has been coming back into the research spotlight. Elsken et al. published a survey on this topic (https://arxiv.org/pdf/1808.05377.pdf ), but the field is developing fast and many new works are emerging. For example, the Weight Agnostic Neural Network (WANN) paper https://arxiv.org/abs/1906.04358 demonstrates that neural architectures can be more significant than the weights of the network. You can find a list of papers on this topic at https://www.automl.org/automl/literature-on-neural-architecture-search/ . However, this kind of topic was already being researched in 1990 ( https://pdfs.semanticscholar.org/2118/55f1de279c452858177331860cbc326351ab.pdf ). Has there been significant improvement since then? If so, how much?

Are researchers just making up new Neural Architecture Search methods for publication, or is there really a big difference? Is there any work that focuses on a detailed comparison of Neural Architecture Search methods?

submitted by /u/RTengx

[N] Google swallows DeepMind Health

From their blog:

Over the last three years, DeepMind has built a team to tackle some of healthcare’s most complex problems—developing AI research and mobile tools that are already having a positive impact on patients and care teams. Today, with our healthcare partners, the team is excited to officially join the Google Health family. Under the leadership of Dr. David Feinberg, and alongside other teams at Google, we’ll now be able to tap into global expertise in areas like app development, data security, cloud storage and user-centered design to build products that support care teams and improve patient outcomes.

During my time working in the UK National Health Service (NHS) as a surgeon and researcher, I saw first-hand how technology could help, or hinder, the important work of nurses and doctors. It’s remarkable that many frontline clinicians, even in the world’s most advanced hospitals, are still reliant on clunky desktop systems and pagers that make delivering fast and safe patient care challenging. Thousands of people die in hospitals every year from avoidable conditions like sepsis and acute kidney injury and we believe that better tools could save lives. That’s why I joined DeepMind, and why I will continue this work with Google Health.

We’ve already seen how our mobile medical assistant for clinicians is helping patients and the clinicians looking after them, and we are looking forward to continuing our partnerships with The Royal Free London NHS Foundation Trust, Imperial College Healthcare NHS Trust and Taunton and Somerset NHS Foundation Trust.

On the research side, we’ve seen major advances with Moorfields Eye Hospital NHS Foundation Trust in detecting eye disease from scans as accurately as experts; with University College London Hospitals NHS Foundation Trust on planning cancer radiotherapy treatment; and with the US Department of Veterans Affairs to predict patient deterioration up to 48 hours earlier than currently possible. We see enormous potential in continuing, and scaling, our work with all three partners in the coming years as part of Google Health.

It’s clear that a transition like this takes time. Health data is sensitive, and we gave proper time and care to make sure that we had the full consent and cooperation of our partners. This included giving them the time to ask questions and fully understand our plans and to choose whether to continue our partnerships. As has always been the case, our partners are in full control of all patient data and we will only use patient data to help improve care, under their oversight and instructions.

I know DeepMind is proud of our healthcare work to date. With the expertise and reach of Google behind us, we’ll now be able to develop tools and technology capable of helping millions of patients around the world.

https://www.blog.google/technology/health/deepmind-health-joins-google-health/

submitted by /u/sensetime