

Category: Reddit MachineLearning

[D] How do you use unsupervised Learning methods with time-series data?

I have a question about a problem that I am trying to solve.

I have clinical data (time-series measurements), and I aim to understand patients' problems. Every measurement reports data in a slightly different way, depending on the behavior of the patient and the equipment used to monitor them.

This presents three challenges:

1/ Missing data for some measurements during some periods.

2/ Normalization: we don't have a clear idea of the min/max of the medical values (I assume they are hardly predictable in some cases).

3/ Labeling such data is very costly, so I can only get a small labeled subset.

What do I have?

For the sake of an example, let’s say that I have three measurements (measurement A, measurement B, measurement C).

I have time series of measurements A, B, and C for healthy patients (they recovered and stayed in hospital for only a few days), and I have time series of measurements A, B, and C for patients who struggle with some problems.

That is all the information I have. The idea is to categorize patient problems over time and use this in places where doctors lack the specialized expertise to identify the problems. How can I approach this?

A: t1, t2, t3, <missing>, t5, t6
B: t1, t2, t3, t4, t5, t6
C: t1, t2, t3, X, t4, <missing>, t6

Looking at these time series, I would say that the patient is struggling with problem X.

P.S.: I have more than a hundred measurements.

Suggested approach

Since the three measurements don't report data in the same time window, I averaged over a time window T. I focused only on the time series of sick patients and tried a naive approach: clustering with temporal constraints. Since this is a naive approach to the problem, I started exploring other methods.
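Concretely, the naive approach looks roughly like the sketch below (plain k-means here rather than the temporally constrained variant; the file name and the long-format column names patient_id, timestamp, measurement, value are placeholders, not part of the original data).

    import pandas as pd
    from sklearn.cluster import KMeans
    from sklearn.preprocessing import StandardScaler

    # Assumed long-format export: one row per (patient, timestamp, measurement, value).
    records = pd.read_csv("measurements.csv", parse_dates=["timestamp"])

    features = []
    for patient_id, group in records.groupby("patient_id"):
        # One column per measurement, averaged over a common 1-hour window;
        # windows with no reading become NaN, which makes the gaps explicit.
        wide = (group.pivot_table(index="timestamp", columns="measurement",
                                  values="value")
                     .resample("1H").mean()
                     .ffill())  # naive gap handling: carry the last value forward
        wide.insert(0, "patient_id", patient_id)
        features.append(wide)

    windows = pd.concat(features).dropna()

    # Standardizing per measurement sidesteps the unknown min/max problem
    # better than fixed-range scaling would.
    X = StandardScaler().fit_transform(windows.drop(columns="patient_id"))
    labels = KMeans(n_clusters=5, random_state=0).fit_predict(X)

From there, swapping KMeans for a method that respects time (for example DTW-based clustering or an HMM over the windows) is the natural next step.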

Questions:

1/ How can I leverage the measurements of healthy patients (as a guide) and the little labeled data I have?

2/ What are some unsupervised learning methods I can use to tag/cluster problems (a doctor will later identify them)?
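For question 1, one direction is label propagation from the small labeled subset to the unlabeled windows. A minimal sketch with scikit-learn, reusing the feature matrix X from the sketch above; labeled_idx and known_labels are hypothetical placeholders for the few labels available.

    import numpy as np
    from sklearn.semi_supervised import LabelSpreading

    # Unlabeled samples use the conventional marker -1.
    y = np.full(X.shape[0], -1)
    y[labeled_idx] = known_labels  # hypothetical indices/labels for the small labeled subset

    semi = LabelSpreading(kernel="knn", n_neighbors=10)
    semi.fit(X, y)
    propagated = semi.transduction_  # a label for every window, to be reviewed by a doctor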

I am seeking advice/recommendations on methods to explore. Do you have any suggestions, ideas, or papers to look at? I would be thankful.

submitted by /u/__Julia
[link] [comments]

[P] Drone ML Project

Hello,

I want to work on a machine learning drone project that deals with construction. I am new to machine learning and AI programming, so I am looking for a partner who is able to help me. Please direct message me for further details if you are interested. I am using the Parrot AR Drone 2.0 for this project.

submitted by /u/psat11VMTX
[link] [comments]

Collaborate on an idea for initialization of neural networks [R] [P]

Most weight initialization strategies for neural networks depend on controlling the variance (and mean) of the propagating signals. However, they rely on assumptions that do not always hold in practice.

For example, using ReLU immediately shifts the mean.

I am thinking it might be possible to learn the initialization of the network with an appropriate cost function: one that adjusts the weights so that the input variance transforms smoothly into the output variance.
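Something preliminary along these lines could look like the rough PyTorch sketch below: a per-layer rescaling pass in the spirit of LSUV initialization rather than a learned cost function, just to have something concrete to compare against. The architecture, tolerance, and iteration cap are placeholders.

    import torch
    import torch.nn as nn

    def variance_matching_init(model, x, tol=0.05, max_iters=10):
        # Rescale each Linear layer so its pre-activation variance on the
        # batch x roughly matches the variance of its input.
        with torch.no_grad():
            h = x
            for layer in model:
                if isinstance(layer, nn.Linear):
                    target_var = h.var().item()
                    for _ in range(max_iters):
                        ratio = layer(h).var().item() / (target_var + 1e-8)
                        if abs(ratio - 1.0) < tol:
                            break
                        # Shrink or grow the weights to push the output
                        # variance toward the input variance.
                        layer.weight /= ratio ** 0.5
                h = layer(h)
        return model

    model = nn.Sequential(nn.Linear(128, 256), nn.ReLU(),
                          nn.Linear(256, 64), nn.ReLU(),
                          nn.Linear(64, 10))
    x = torch.randn(512, 128)  # a batch of representative inputs
    variance_matching_init(model, x)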

Does anybody want to collaborate on a quick project to test this, perhaps somebody who has worked on a problem like this before?

I have tried something preliminary, but I need advice from an experienced NN experimenter :)

submitted by /u/fbtek
[link] [comments]

[D] Unzipping dataset in google drive using google colab is slow

I have my dataset sitting in my google drive as a .zip file. To unzip it I use this command:

!unzip -n /content/drive/AudioMNIST/dataset.zip -d /content/drive/AudioMNIST/

But it is incredibly slow: it takes a couple of seconds to extract one file (the total number of files is 30,000). I don't have a deadline, but this speed is frustrating. Things I have tried so far:

  1. Uploading my dataset unzipped to my Google Drive, but the browser just won't handle the huge number of files.
  2. Uploading file by file to a folder in Google Drive using gdrive from my terminal, but apparently it's not supported anymore, and I couldn't figure out how to set up the credentials with the API.
  3. The cloud unzipper "Cloud Convert", which is relatively fast but requires a subscription for more than 25 minutes of conversion time daily. Another one was "ZIP Extractor", which is faster than the Colab command above, but not by much.

Preferably I would like to upload my dataset file by file from my terminal with a Python script or similar. Has anyone been in this position?
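One workaround that is often suggested: copy the archive to the Colab VM's local disk and extract it there, since extracting directly into Drive syncs each of the 30,000 files one by one. A sketch using the paths from the question (Drive usually mounts under /content/drive/My Drive/, so adjust as needed):

    import shutil, zipfile

    # Copy the archive off Drive onto the VM's local disk first.
    shutil.copy("/content/drive/AudioMNIST/dataset.zip", "/content/dataset.zip")

    # Extract locally; this avoids the per-file Drive sync overhead.
    with zipfile.ZipFile("/content/dataset.zip") as zf:
        zf.extractall("/content/AudioMNIST")

The extracted files then live on the VM's ephemeral storage rather than in Drive, so this trades persistence for speed.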

submitted by /u/khawarizmy
[link] [comments]

[D] Machine learning in transportation

I'm interested in applying machine learning to manage traffic congestion for an engineering project, but I'm not sure how to approach it. My current idea is to use this data to train a reinforcement learning agent that determines where new development could best alleviate congestion, but that seems somewhat model-driven. What other angles could I take to narrow down a complex issue like this?

submitted by /u/shadowknife392
[link] [comments]

[N] U.S. government limits exports of artificial intelligence software (Reuters)

This is a mainstream news story, and mainly about software exports for now, but should be relevant to this community. Any thoughts?

Reuters Article:

U.S. government limits exports of artificial intelligence software

WASHINGTON (Reuters) – The Trump administration took measures on Friday to crimp exports of artificial intelligence software as part of a bid to keep sensitive technologies out of the hands of rival powers like China.

Under a new rule which goes into effect on Monday, companies that export certain types of geospatial imagery software from the United States must apply for a license to send it overseas except when it is being shipped to Canada.

“They want to keep American companies from helping the Chinese make better AI products that can help their military,” said James Lewis, a technology expert with the Washington-based Center for Strategic and International Studies think tank.

The rule will likely be welcomed by industry, Lewis said, because it had feared a much broader crackdown on exports of most artificial intelligence hardware and software.

The measure covers software that could be used by sensors, drones, and satellites to automate the process of identifying targets for both military and civilian ends, Lewis said, noting it was a boon for industry, which feared a much broader crackdown on exports of AI hardware and software.

The measure is the first to be finalized by the Commerce Department under a mandate from a 2018 law, which tasked the agency with writing rules to boost oversight of exports of sensitive technology to adversaries like China, for economic and security reasons.

Reuters first reported that the agency was finalizing a set of narrow rules to limit such exports in a boon to U.S. industry that feared a much tougher crackdown on sales abroad.

The rule will go into effect in the United States alone, but U.S. authorities could later submit it to international bodies to try to create a level playing field globally.

It comes amid growing frustration from Republican and Democratic lawmakers over the slow roll-out of rules toughening up export controls, with Senate Minority Leader Chuck Schumer, a Democrat, urging the Commerce Department to speed up the process.

“While the government believes that it is in the national security interests of the United States to immediately implement these controls, it also wants to provide the interested public with an opportunity to comment on the control of new items,” the rule release said.

Reporting by Alexandra Alper; Editing by Alistair Bell

https://www.reuters.com/article/us-usa-artificial-intelligence/u-s-government-limits-exports-of-artificial-intelligence-software-idUSKBN1Z21PT

submitted by /u/sensetime
[link] [comments]

[P] 64,000 pictures of cars, labeled by make, model, year, price, horsepower, body style, etc.

Download it here from my Google Drive. The size is 681MB compressed.

You can visit my GitHub repo here (code is in Python), where I give examples and give a lot more information. Leave a star if you enjoy the dataset!

It’s basically every single picture from the site thecarconnection.com. Picture size is approximately 320×210 but you can also scrape the large version of these pictures if you tweak the scraper. I did a quick classification example using a CNN: Audi vs BMW with CNN.

Complete list of variables included for all pics:

'Make', 'Model', 'Year', 'MSRP', 'Front Wheel Size (in)', 'SAE Net Horsepower @ RPM', 'Displacement', 'Engine Type', 'Width, Max w/o mirrors (in)', 'Height, Overall (in)', 'Length, Overall (in)', 'Gas Mileage', 'Drivetrain', 'Passenger Capacity', 'Passenger Doors', 'Body Style' 
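If you want to reproduce something like the Audi vs BMW example, a minimal sketch with tf.keras could look like the following; the data/audi and data/bmw folder layout is an assumption (the dataset is one flat collection of labeled pictures, so you would first copy the relevant images into per-class folders).

    import tensorflow as tf

    # Assumed layout: data/audi/*.jpg and data/bmw/*.jpg, one folder per class.
    # image_size is (height, width), matching the roughly 320x210 pictures.
    train_ds = tf.keras.utils.image_dataset_from_directory(
        "data", label_mode="binary", image_size=(210, 320), batch_size=32)

    model = tf.keras.Sequential([
        tf.keras.layers.Rescaling(1.0 / 255),
        tf.keras.layers.Conv2D(32, 3, activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Conv2D(64, 3, activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(1, activation="sigmoid"),  # Audi (0) vs BMW (1)
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    model.fit(train_ds, epochs=5)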

submitted by /u/nicolas-gervais
[link] [comments]