[R] Principled Machine Learning for Efficient Collaboration

Written by torontoai on June 19, 2019. Posted in Reddit MachineLearning.

Machine learning projects are often harder than they should be. In principle we are just running software, and the result is a trained ML model. But three months later, will you remember how to rerun that software? The datasets may have changed in the meantime, so you might be unable to replicate the results. A lack of software tools for managing machine learning datasets is the culprit, and it also impedes efforts to share data efficiently with colleagues.

In our search for tools to efficiently manage machine learning projects, these principles are important:

Transparency: Inspecting every part of the ML project
Auditability: Inspecting all intermediate results, and the final result
Reproducibility: Ability to robustly rerun the software and associated datasets from any stage in the project
Scalability: Ability to support ML projects containing any number of people, and to work on multiple projects at a time
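
As a rough illustration of the reproducibility principle, here is a minimal sketch that fingerprints the datasets used in a training run and later reports which of them have changed. It uses only the Python standard library and is not how MLflow or DVC work internally; the function names (file_sha256, write_manifest, changed_datasets) and the manifest layout are assumptions made up for this example.

```python
import hashlib
import json
from pathlib import Path


def file_sha256(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_manifest(manifest_path, dataset_paths, params):
    """Record dataset fingerprints and training parameters for one run.

    The manifest format is hypothetical: a JSON object with a
    "datasets" map (path -> hash) and a "params" map.
    """
    manifest = {
        "datasets": {str(p): file_sha256(p) for p in dataset_paths},
        "params": params,
    }
    Path(manifest_path).write_text(json.dumps(manifest, indent=2, sort_keys=True))
    return manifest


def changed_datasets(manifest_path):
    """List datasets that are missing or no longer match the recorded hash."""
    manifest = json.loads(Path(manifest_path).read_text())
    return [
        path
        for path, recorded in manifest["datasets"].items()
        if not Path(path).exists() or file_sha256(path) != recorded
    ]
```

Calling write_manifest at training time and changed_datasets before a rerun gives an early warning that the inputs have drifted, which is the "three months later" problem described above. Dedicated tools go much further, for example by versioning the data itself.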

The linked article, "Principled Machine Learning" on DEV Community, explains how to implement these principles in ML projects using open source tools such as MLflow and DVC.

submitted by /u/thumbsdrivesmecrazy