[Discussion] What is the minimum amount of cloud computing knowledge necessary for a data scientist?
I am a data scientist with pretty deep domain knowledge and a broad, comprehensive knowledge of statistics and ML algorithms. I can go toe-to-toe with the best on when to use neural networks and when to use logistic regression, why the BIC is sometimes better than the AIC, which optimization algorithm is best suited to a specific optimization problem, etc.
Unfortunately, my tech stack has been all on-prem and ERP tools so far. My knowledge of cloud computing amounts to knowing how to log into a cloud environment and then doing there whatever I would do on an on-prem server (e.g. running SQL queries in Redshift or BigQuery the same way I would run them on any Oracle or SQL Server tool, running Python scripts the same way I run them on my local machine or on a Linux box, treating S3 and GCS as just a bunch of remote folders that you scp to and from, etc.).
I feel that I need to learn more. Most people keep telling me that I am fine, as long as I am sticking to the role of data scientist and not trying to do ML engineer or data engineer work instead, but I have a nagging feeling that there is a minimum amount of cloud computing knowledge that I should have and that I am missing.
Can anybody give me some pointers?