Sourcing data for a job recommendation system [research]
I’m an undergraduate data scientist, about to start work on my dissertation project.
I thought I’d create a system that, given someone’s career history and education, predicts what job they’re likely to get, and at what company. Essentially this is to help focus the efforts of job seekers, and help them get to where they belong.
Originally I planned to do this by scraping data from LinkedIn profiles. A LinkedIn profile shows someone’s current job and employer, as well as their career history and education. So you can see what education and career history (the input) led to their current job (the output, i.e. the thing I’m trying to predict).
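To make the input/output split concrete, here is a rough sketch of how a single profile could be turned into one training example. The `Position` and `Profile` classes, their field names, and the `to_training_example` helper are all hypothetical, made up for illustration; they are not tied to LinkedIn or any particular data source.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Position:
    # Hypothetical fields; a real source may expose different ones.
    title: str
    company: str
    start_year: int


@dataclass
class Profile:
    education: List[str]
    positions: List[Position]  # may arrive in any order


def to_training_example(profile: Profile) -> Optional[Tuple[dict, Tuple[str, str]]]:
    """Split a profile into (features, label).

    The most recently started position becomes the label (the output);
    education and all earlier positions become the features (the input).
    Returns None if the profile lists no positions.
    """
    if not profile.positions:
        return None
    ordered = sorted(profile.positions, key=lambda p: p.start_year)
    *history, current = ordered
    features = {
        "education": list(profile.education),
        "history": [(p.title, p.company) for p in history],
    }
    label = (current.title, current.company)
    return features, label


# Made-up example record, purely illustrative.
example = Profile(
    education=["BSc Computer Science"],
    positions=[
        Position("Data Analyst", "Acme", 2019),
        Position("Intern", "Acme", 2017),
        Position("Data Scientist", "Globex", 2022),
    ],
)
features, label = to_training_example(example)
```

Whatever data source ends up being used, it would need enough structure to fill in something like this: an ordered job history plus education for each person.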
However, this strategy runs into ethical and data protection problems, and there’s a good chance my project proposal won’t pass the ethical review. So I’m looking for a new data source without these issues.
I’m pretty new to machine learning, so it’s hard for me to assess which sources of data are viable for this project. Could someone more experienced suggest how I might obtain the data I’m after without the ethical baggage? Failing that, even a hint or pointer would be greatly appreciated.