[P] Using tf-idf to analyse economic documents
This is a recent analysis I conducted on ECB economic bulletins, which are published as PDFs. I converted them to plain text with pdf2txt, processed the text in Python, and then used tf-idf to rank the terms; the top-ranked terms were then fed into a word cloud. The aim is to pull the key terms out of a document quickly, e.g. tariffs, downturn, debt.
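For anyone who wants to see the ranking step concretely, here is a minimal sketch of tf-idf in plain Python. It is not the code from the analysis: the stop-word list, the function names (`tokenize`, `tfidf`, `top_terms`), the `log(N / df) + 1` idf variant, and the three placeholder sentences standing in for the converted bulletins are all assumptions made for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (an assumption for this sketch).
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "on", "is", "are",
              "for", "by", "as", "has", "have", "with"}


def tokenize(text):
    """Lowercase the text, keep alphabetic tokens, and drop stop words."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS]


def tfidf(documents):
    """Return one {term: score} dict per document.

    tf  = term count / number of tokens in the document
    idf = log(N / df) + 1, where df is the number of documents containing the term
    """
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: count each term once per document it appears in.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid division by zero for empty documents
        scores.append({
            term: (count / total) * (math.log(n_docs / doc_freq[term]) + 1)
            for term, count in counts.items()
        })
    return scores


def top_terms(score, k=5):
    """Return the k highest-scoring (term, score) pairs, ties broken alphabetically."""
    return sorted(score.items(), key=lambda kv: (-kv[1], kv[0]))[:k]


# Placeholder sentences standing in for the text extracted from the PDFs.
bulletins = [
    "Tariffs weigh on trade and tariffs raise import prices.",
    "Government debt rises as the downturn reduces revenue.",
    "Trade growth slows while debt levels remain high.",
]

scores = tfidf(bulletins)
for i, score in enumerate(scores):
    print(i, [term for term, _ in top_terms(score, 3)])
```

Each per-document dict maps terms to weights, which is the shape a word-cloud tool typically expects; with the third-party `wordcloud` package, for example, it could be passed to `WordCloud().generate_from_frequencies(...)`. Terms that appear in every document get the lowest idf, so generic words sink while document-specific ones such as "tariffs" rise.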
Would appreciate your opinions!
submitted by /u/plentyofnodes