[D] Legality of Scraping Training Data from Google Images
I think my original post was removed because I didn’t tag it.
I have a project in mind. I want to build an image classifier with novel classes. For example, let's say I want to classify images of different types of bicycles. Google Images is rife with images of each type of bike.
I want to publish a blog post about my project and put my code (including the scraper) on GitHub, but not upload the image files anywhere. I might put up a (free) endpoint hosting the resulting classifier if it works.
Questions:
- Are all images on Google Images fair game for training data, or do I have to limit it to images “labelled for reuse”?
- Do I have to cite the images I use as training data?
- I’ve read about “fair use”. How does that figure in here?
Thanks, and sorry if this has been covered elsewhere.
submitted by /u/am_i_having_fun