[D] I started writing a book on practical considerations of ML, keen for feedback on its direction

Written by torontoai on August 30, 2019. Posted in Reddit MachineLearning.

I found I was constantly having the same conversations with people about implementing ML in practice, so I looked for a resource I could point them to. I couldn’t find much on the practical side of implementing ML, so about a year ago I drafted a table of contents and started writing. I ended up shelving it for a bit and have just picked it up again, but I’m torn between just blogging what I’ve already got and trucking on to create a unified resource (i.e. the book).

I’m keen to hear what the reddit ML community thinks: should I continue (maybe it’s been superseded?), and if I do, is there anything you’d like to see covered in the book?

My intention, should I continue and complete it, is to self-publish online through something like LeanPub. I have no burning desire to see the book in print or make money off it; I’d really just like to raise awareness of what we all need to think about when we create ML solutions in the real world.

This is me: https://twitter.com/drkatnz

Here’s how the table of contents looks (about 25% of the content is written already, and subsections aren’t shown; feedback so far has been to include a section on biases, which has been added):

1 Introduction
1.1 Terminology
1.2 How do I get started using machine learning?
2 Do you really need machine learning?
2.1 Data availability
2.2 Liability
2.3 Capability
2.4 Other solutions
2.5 Pre-requisite checklist
3 Team
3.1 Skills
3.2 Common team structures
3.3 Forming a team and getting started
4 Building your first machine learning solution
5 Data collection
5.1 Collecting the data
5.2 Data set size – how much is enough?
5.3 Labeled versus unlabeled data
6 Pre-processing
6.1 Automatically cleaning the data
6.2 Dealing with missing values
6.3 Applying domain knowledge
6.4 Feature cleanup
6.5 Dealing with the minority class
7 Algorithm considerations
7.1 Unsupervised versus supervised
7.2 ‘Good enough’ accuracy
7.3 Storage
7.4 Speed
8 Measuring accuracy
8.1 Metrics
8.2 Minimum required accuracy
8.3 Test set
8.4 Investigating prediction errors
8.5 A/B testing
9 Identifying and mitigating biases
9.1 Biases from data
9.2 Biases from trained models
9.3 Inventor’s bias
9.4 Biases caused by perception of machine learning
10 Getting an algorithm to production
10.1 Infrastructure
10.2 Documentation
10.3 User interface
10.4 Abstaining classifiers
10.5 Runtime environment
11 Managing live algorithms
11.1 Monitoring
11.2 Effect on the real world
11.3 Auditing results
11.4 Updating models
11.5 Technical debt

What do y’all think?

submitted by /u/katnz