
Author: torontoai

[D] Swift for TensorFlow is currently the best System for ML. CMV

After watching the talks and reading some of the docs, it seems that Swift for TensorFlow addresses most, if not all, of the complaints people have about TensorFlow right now. The Swift tooling appears especially helpful with TF’s learning curve, which people often cite as one of the major downsides of TF.

What are your thoughts? Why do/don’t you use it and what would make you change your mind?

submitted by /u/arudomidas

[D] Time series analysis for machine employment support

I have a (physical) machine that can be tuned by adjusting the values of some parameters A_1, …, A_n (n is around 10). This tuning affects some secondary parameters B_1, …, B_m that cannot be tuned by hand. The machine continuously produces an output X, and by looking at X in a time window, it is possible to decide whether the machine was running in a stable or unstable state.

All of this information has been logged for the past ~10 years, amounting to roughly 25M data points.

The tuning of the machine is really complicated, as it can react very sensitively to parameter adjustments, and their influence is not well understood, so specialist interventions are needed to keep the machine running at a reasonable performance. The goal is to train an ML model that can support these interventions and generate some insights into how the parameters are related to stability. For example, we would be interested in statements like ‘If you raise A_1, you need to lower A_2 in order for the machine to remain stable’ or ‘raising A_1 will increase B_1 in a few hours’.

Up until now we ignored the time component and only ran some clustering to find out which settings were used when the machine was running stable and which were used when it was running unstable. Sadly, the settings used varied greatly (it could have been stable with A_1=100 and A_1=300), and usually a single setting could lead to a stable as well as an unstable machine, so the time information is crucial.

I am looking for ideas on how to approach this task. I was thinking about subdimensional motif discovery to find typical patterns, but I’m unsure how to link these patterns together.
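One baseline that keeps the time information, before reaching for motif discovery, is to cluster short windows of parameter trajectories rather than single time points, so that the same setting reached via different histories can land in different clusters. A minimal sketch with scikit-learn, using entirely synthetic data in place of the real logs (the sizes, window width, and cluster count are made up for illustration):

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in for the logged data: 1000 time steps of the
# 10 tunable parameters A_1..A_10.
T, n = 1000, 10
A = rng.normal(size=(T, n))

def sliding_windows(X, width, stride):
    """Stack overlapping windows so each sample carries local temporal context."""
    idx = range(0, len(X) - width + 1, stride)
    return np.stack([X[i:i + width].ravel() for i in idx])

W = sliding_windows(A, width=20, stride=5)   # (n_windows, 20 * n)
W = StandardScaler().fit_transform(W)

# Cluster trajectory windows; each cluster is a candidate "operating regime"
# whose stability statistics can then be inspected.
labels = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(W)
print(W.shape, labels.shape)
```

Windows that fall in the same cluster but differ in stability would then be the interesting cases to inspect for the secondary parameters B_1, …, B_m.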

submitted by /u/mexxfick

[D] Is this idea at all feasible? Should I pursue it?

So I’ll start by saying that I don’t know that much about machine learning so I don’t know how plausible this idea is; that’s why I’m asking.

Essentially, start by downloading the entirety of some hentai site, with artist names and tags. Then feed it into a neural network of some description. Set it up so that you can feed in an artist name and tags and it’ll try to make a manga/doujin in the style of that artist with those tags. (Presumably it would have better results if you stick to tags that actually appear in that artist’s work?)

One question is, would it be best to stick to material in one language so it won’t make speech bubbles in a sort of English-Japanese-Chinese mishmash? (I realize it will have no way of actually knowing the meanings of the words or how they sound out loud but given enough data it should at least be able to correlate written words and phrases to pictures/situations, right?) Similarly, would it be best to stick to material in black and white (or in color)? There are at least 200,000 available mangas/doujins in Japanese in black and white; would that be anywhere close to enough data?

Would it be able to get more out of the training data if the dialogue was all transcribed in text and the pages were panel-by-panel tagged with who’s where, doing what? I realize this would take many man-hours.

How much processing power would I expect this to take? I’m assuming my 2.2 gigahertz, 4 gigabyte RAM laptop would be entirely inadequate.

And finally, if it didn’t end up producing anything coherent, would the results at least be funny?

submitted by /u/Terpomo11

[D] Streamlined ML curriculum to get from zero to research as quickly as possible

As a PhD student in ML, I recently came to the realization that most of the things I learned in college and in required PhD courses weren’t really necessary or useful for research in most areas of ML, and that I learned more relevant things outside the curriculum on my own by reading recent papers. The prerequisites for most papers are usually limited to a small number of easily learnable topics plus the cited recent papers, so I think more emphasis should be put on reading recent papers than on textbooks covering less relevant topics.

I believe it is more efficient to learn whatever you find necessary during your research rather than learning various things beforehand. So, rather than taking various CS & ML courses and then beginning research, I believe it is better to begin research (e.g. reading recent papers, implementing various ideas) as soon as possible. While doing your research you will specialize in some specific field and may find that you lack some required knowledge. Then you can take a course to cover it, or just study it on your own if that works, since that’s what researchers usually do. Meanwhile, you can keep reading recent papers, implement your ideas, and accumulate knowledge of things you cannot learn from textbooks or lectures.

The target audience of this curriculum is assumed to know at least single-variable calculus (if you know more, you can skip the topics you know!). This includes some advanced high school students. Since most researchers tend to have been strong students, I set a fast pace for the curriculum, but it can be slowed down. A sample syllabus is provided for each course (taken from MIT OCW and Stanford).

1st semester: Multivariable Calculus [1], Linear Algebra [2], Elementary Probability & Statistics with an emphasis on ML [3]. (The syllabi should be modified to focus on ML and incorporate Python & NumPy use.)

2nd semester: Classical ML (covering various classical models quickly) [4], a DNN course (focusing on CNNs and Transformers (w/ PyTorch impl.), with a literature review mainly of post-2017 papers at the end) (modified ver. of [5, 6]), and a supplementary CS course (covering various miscellaneous things you absolutely need to know).

After these semesters, you would have an understanding of what to specialize in and can create your own curriculum. For some specializations you need to take a few more courses first, whereas others can be studied just by reading papers and/or GitHub libraries. Check a daily arXiv feed, check recent papers on Twitter/Reddit, do literature searches, implement your ideas, etc.

I am curious whether advanced high school students would be able to complete this curriculum and do research within a year.

Anyway, I hope I can get some feedback on my post. Thank you for reading.

[1] https://ocw.mit.edu/courses/mathematics/18-02sc-multivariable-calculus-fall-2010/index.htm

[2] https://ocw.mit.edu/courses/mathematics/18-06-linear-algebra-spring-2010/

[3] https://ocw.mit.edu/courses/mathematics/18-05-introduction-to-probability-and-statistics-spring-2014/

[4] http://cs229.stanford.edu/

[5] https://cs230.stanford.edu

[6] http://vision.stanford.edu/teaching/cs231n/

submitted by /u/Aran_Komatsuzaki

[P] This K-pop idol does not exist (StyleGAN2)

Hey everyone, I played around with the newly released StyleGAN2 recently and created a model that generates faces of K-pop idols.

Website: http://www.thiskpopidoldoesnotexist.xyz/

How I did it: https://medium.com/@hygzhu/this-k-pop-idol-does-not-exist-df2f095c795d

I’m a beginner at machine learning, so there were probably many things I could have done better, but I was definitely surprised by how quickly StyleGAN2 was able to generate decent-looking pictures. Perhaps it is because these K-pop idols have a mostly homogeneous appearance?

submitted by /u/stresses_too_much

[D] How do you use unsupervised learning methods with time-series data?

I have a question about a problem that I am trying to solve.

I have clinical data (time-series measurements), and I aim to understand patients’ problems. Each measurement reports data in a slightly different way, depending on the behavior of the patient and the equipment used to monitor them.

This presents three challenges:

1/ Missing data for some measurements during some time periods.

2/ A normalization problem: we don’t have a clear idea of the min/max of the medical values (I assume they are hardly predictable in some cases).

3/ Labeling such data is very costly; I can get some labeled data, but only a small subset.

What do I have?

For the sake of an example, let’s say that I have three measurements (measurement A, measurement B, measurement C).

I have time series of measurements A, B, C for healthy patients (they recovered and are staying in the hospital for a few days), and I have time series of measurements A, B, C for patients who struggle with some problems.

I only know that information. The idea is to categorize patient problems over time and use the result in places where specialized doctors lack the expertise to identify problems. How can I approach this?

A: t1, t2, t3, <missing>, t5, t6
B: t1, t2, t3, t4, t5, t6
C: t1, t2, t3, X, t4, <missing>, t6

If I saw these time series, I would say that the patient is struggling with problem X.

P.S.: I have over a hundred measurements.
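The missing values and mismatched reporting schedules above could be handled by aligning everything onto a common time grid before modeling. A minimal sketch with pandas, using two made-up hourly series standing in for measurements A and B (the timestamps, values, and window size are invented for illustration):

```python
import numpy as np
import pandas as pd

# Hypothetical irregular clinical series: each measurement reports on its
# own schedule (B starts 30 minutes offset from A), and A has a gap.
a = pd.Series([1.0, 1.1, 1.2, np.nan, 1.4, 1.5],
              index=pd.date_range("2020-01-01 00:00", periods=6, freq="60min"))
b = pd.Series([2.0, 2.1, 2.2, 2.3, 2.4, 2.5],
              index=pd.date_range("2020-01-01 00:30", periods=6, freq="60min"))

df = pd.concat({"A": a, "B": b}, axis=1)      # align on a shared time index
df = df.resample("120min").mean()             # average onto a common window T
df = df.interpolate(limit_direction="both")   # fill any remaining gaps
print(df)
```

This gives one row per window with every measurement present, which most clustering and semi-supervised methods require as input.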

Suggested approach

Since the three measurements don’t report data in the same time window, I averaged over a time window T. I focused only on the time series of sick patients and tried a naive approach of applying clustering with temporal constraints. Since it’s a naive approach to the problem, I started looking into/exploring other methods.

Questions:

1/ How can I leverage the measurements of healthy patients (as a guide) and the little labeled data I have?

2/ What are some unsupervised learning methods I can use to tag/cluster problems (a doctor will later identify them)?

I am seeking advice/recommendations on methods to explore. Do you have any suggestions, ideas, or papers to explore? I would be thankful.

submitted by /u/__Julia