
[D] Can I use tf.data to calculate new features as part of a pipeline, or should this be done before using the tf.data module?

I am just curious about how much of the data processing workflow I can refactor into a tf.data pipeline for feeding data into my model. My source data is used to calculate different features to create a dataset, and this dataset is then processed further before being fed into my models. So the process is basically like this:

Source Data (structured JSON which just has text fields for data parsed from a raw document) —>
Dataset (these fields are used to calculate numerical features, categorical features, and sequence features) —>
Processed Dataset (standard techniques – scaling, encoding, tokenization, padding, etc.)
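For reference, here is a minimal sketch of one possible split of the stages above (record contents, field names, and the scaling/vocabulary are made up for illustration): the first step runs in plain Python before building the dataset, since tf.data has no built-in op for parsing arbitrary JSON text fields, while the second step's standard transformations map cleanly onto `Dataset.map`:

```python
import json

import tensorflow as tf

# Hypothetical toy records standing in for the structured JSON source data.
raw_records = [
    '{"price": "12.5", "category": "news", "body": "some text here"}',
    '{"price": "7.0", "category": "sports", "body": "more text"}',
]

# Stage 1: feature calculation in plain Python, before tf.data.
parsed = [json.loads(r) for r in raw_records]
features = {
    "price": [float(p["price"]) for p in parsed],
    "category": [p["category"] for p in parsed],
    "body": [p["body"] for p in parsed],
}

ds = tf.data.Dataset.from_tensor_slices(features)

# Stage 2: standard processing (scaling, encoding, tokenization) as
# TensorFlow ops inside the pipeline via map().
cat_table = tf.lookup.StaticHashTable(
    tf.lookup.KeyValueTensorInitializer(
        keys=tf.constant(["news", "sports"]),
        values=tf.constant([0, 1], dtype=tf.int64),
    ),
    default_value=-1,
)

def preprocess(example):
    return {
        "price": example["price"] / 100.0,                  # toy scaling
        "category": cat_table.lookup(example["category"]),  # integer encoding
        "tokens": tf.strings.split(example["body"]),        # whitespace tokenization
    }

ds = ds.map(preprocess,
            num_parallel_calls=tf.data.experimental.AUTOTUNE)
```

`tf.data.experimental.AUTOTUNE` is used here because the plain `tf.data.AUTOTUNE` alias did not exist yet in TF 2.0.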

And then I have my input data for the model. I am wondering whether I can refactor this entire process into a tf.data pipeline, or whether a tf.data pipeline can only handle the processing done in the second step described above. I am using TF 2.0 Beta, by the way.

Any insights or help will be greatly appreciated.

submitted by /u/that_one_ai_nerd
