[D] Can I calculate new features as part of a pipeline, or should this be done before using the module?

I am just curious about how much of the data-processing workflow I can refactor into a pipeline for feeding data into my model. My source data is used to calculate various features to build a dataset, and this dataset is then processed further before being fed into my models. So the process is basically like this:

Source Data (structured JSON which just has text fields for data parsed from a raw document) —>
Dataset (these fields are used to calculate numerical features, categorical features, and sequence features) —>
Processed Dataset (standard techniques – scaling, encoding, tokenization, padding, etc.)

And then I have my input data for the model. I am wondering whether I can refactor this entire process into a pipeline, or whether the pipeline can only handle the processing done in the second step described above. I am using TF 2.0 Beta, by the way.
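To make the question concrete, here is a minimal, framework-agnostic sketch of the two transformation stages described above, written in plain Python. The field names (`"text"`, `"category"`) and the specific features are illustrative assumptions, not from the original post; in TF 2.0 each stage would typically become a `map` transformation on a `tf.data.Dataset` (stateless feature calculation usually fits there directly, while stateful steps like fitting a vocabulary or scaler generally need a pass over the data first):

```python
def compute_features(record):
    """Stage 1: derive numerical, categorical, and sequence features
    from the text fields of one parsed source record.
    (Field names and features are illustrative assumptions.)"""
    text = record["text"]
    return {
        "length": len(text),             # numerical feature
        "category": record["category"],  # categorical feature
        "tokens": text.lower().split(),  # sequence feature
    }

def process_features(feats, cat_vocab, tok_vocab, max_len, max_length_seen):
    """Stage 2: standard processing -- scaling, encoding, padding.
    Vocabularies are built on the fly here; in practice they would be
    fitted beforehand on the training data."""
    # Scale the numerical feature by a precomputed maximum.
    scaled = feats["length"] / max_length_seen
    # Integer-encode the categorical feature (ids start at 1; 0 is padding).
    encoded = cat_vocab.setdefault(feats["category"], len(cat_vocab) + 1)
    # Tokenize -> integer ids, then pad/truncate the sequence to max_len.
    ids = [tok_vocab.setdefault(t, len(tok_vocab) + 1) for t in feats["tokens"]]
    padded = (ids + [0] * max_len)[:max_len]
    return scaled, encoded, padded

# Usage: run one record through both stages.
record = {"text": "Hello World", "category": "invoice"}
feats = compute_features(record)
scaled, encoded, padded = process_features(
    feats, cat_vocab={}, tok_vocab={}, max_len=4, max_length_seen=20
)
```

Splitting the two stages into separate functions like this mirrors the dataset layout in the post, and keeps the stateless stage one easy to move into the pipeline independently of the stateful stage two.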

Any insights or help will be greatly appreciated.

submitted by /u/that_one_ai_nerd
