[D] Version Control for Data Science — Tracking Machine Learning Models and Datasets with DVC

Unlike typical software projects, ML projects include large files such as datasets, trained models, and label encodings. These can easily reach a few GB each, which makes them impractical to track directly in Git.

The article explains how the DVC (Data Version Control) tool versions large data files, much as Git versions source code, and how it can track all of a project's artifacts. This makes the workflow more productive: we no longer have to keep track by hand of what we did to reach a given state, and we don't lose time reprocessing data and rebuilding models to reproduce that state. Article: Version Control for Data Science — Tracking Machine Learning Models and Datasets
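To make the idea concrete, here is a minimal Python sketch of the general approach: store the large file in a cache keyed by its content hash, and keep only a tiny pointer file that Git can version. This is not DVC's actual implementation or file format. The function names (`track`, `checkout`), the `.ptr.json` pointer format, and the choice of SHA-256 are all assumptions made for illustration.

```python
import hashlib
import json
import shutil
import tempfile
from pathlib import Path


def file_hash(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files never sit fully in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def track(data_file: Path, cache_dir: Path) -> Path:
    """Copy data_file into the cache and write a small pointer file next to it.

    The pointer file (hypothetical format) is what would be committed to Git.
    """
    digest = file_hash(data_file)
    cache_dir.mkdir(parents=True, exist_ok=True)
    cached = cache_dir / digest
    if not cached.exists():  # identical content is stored only once
        shutil.copy2(data_file, cached)
    pointer = data_file.with_name(data_file.name + ".ptr.json")
    meta = {"hash": digest, "size": data_file.stat().st_size, "path": data_file.name}
    pointer.write_text(json.dumps(meta, indent=2))
    return pointer


def checkout(pointer: Path, cache_dir: Path) -> Path:
    """Restore the data file described by a pointer file from the cache."""
    meta = json.loads(pointer.read_text())
    target = pointer.with_name(meta["path"])
    shutil.copy2(cache_dir / meta["hash"], target)
    return target


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        cache = root / "cache"
        data = root / "train.csv"

        data.write_text("a,b\n1,2\n")      # version 1 of the dataset
        pointer = track(data, cache)
        pointer_v1 = pointer.read_text()   # stands in for a Git commit

        data.write_text("a,b\n3,4\n")      # version 2 overwrites the file
        track(data, cache)

        pointer.write_text(pointer_v1)     # stands in for checking out the old commit
        checkout(pointer, cache)
        print(data.read_text())
```

The point of the design is that Git only ever sees the small pointer file, so switching between dataset or model versions comes down to switching pointer files and copying the matching content back out of the cache.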

submitted by /u/thumbsdrivesmecrazy