Long-form PDF document processing
For long-form PDF document processing, see the company_research folder, which contains:
* document_sourcing.py (.ipynb): example of downloading PDF documents from the web;
* document_processing.py (.ipynb): example of triggering text and quants extraction from PDF documents;
* data_extraction.py (.ipynb): example of extracting data related to specific keyphrases/metrics from documents and restructuring that data into a desired form.
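
To give a rough idea of what the last step involves, here is a minimal sketch of keyphrase-based extraction and restructuring. It is not the code from data_extraction.py: the sample pages, the keyphrase list, and the function names `extract_mentions` and `restructure` are all hypothetical, and it assumes the PDF text has already been extracted into one string per page. It uses only the Python standard library.

```python
import re
from collections import defaultdict

# Hypothetical sample: text already extracted from a PDF, one string per page.
PAGES = [
    "Revenue was 120.5 million in 2022. Net income reached 14.2 million.",
    "Headcount grew to 830 employees. Revenue guidance is 135 million.",
]

# Hypothetical keyphrases/metrics of interest (matched case-insensitively).
KEYPHRASES = ["revenue", "net income", "headcount"]

# Matches integers and simple decimals such as 830 or 120.5.
NUMBER = re.compile(r"\d+(?:\.\d+)?")


def extract_mentions(pages, keyphrases):
    """Return one record per sentence that mentions a keyphrase followed by a number."""
    records = []
    for page_no, text in enumerate(pages, start=1):
        # Naive sentence split: break after ., ! or ? when followed by whitespace.
        for sentence in re.split(r"(?<=[.!?])\s+", text):
            lowered = sentence.lower()
            for phrase in keyphrases:
                if phrase not in lowered:
                    continue
                # Take the first number that appears after the keyphrase.
                tail = sentence[lowered.index(phrase) + len(phrase):]
                match = NUMBER.search(tail)
                if match:
                    records.append({
                        "metric": phrase,
                        "value": float(match.group()),
                        "page": page_no,
                        "sentence": sentence.strip(),
                    })
    return records


def restructure(records):
    """Group flat records into {metric: [(page, value), ...]}."""
    by_metric = defaultdict(list)
    for record in records:
        by_metric[record["metric"]].append((record["page"], record["value"]))
    return dict(by_metric)


if __name__ == "__main__":
    for metric, values in restructure(extract_mentions(PAGES, KEYPHRASES)).items():
        print(metric, values)
```

The flat record list keeps the source sentence and page for traceability, while the grouped form is closer to what a downstream table or report would consume. Real documents would need more robust sentence splitting and unit handling than this sketch provides.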