Project Description
I have a folder that receives multiple PDFs whose file-names carry key information separated by underscores. Typical segments include the PO number, Serial number, Date, Work order, Heat number, Calibration date and a few other identifiers.
I need a Python solution that will:
• Parse each file-name, isolate those data points, and write them into the specific fields of the document’s existing index (cover) page.
• Concatenate the newly updated PDFs, preserving their order, into a single master file.
• Drop that combined file into a fixed OneDrive or SharePoint location and return the cloud link.
The script should be callable from a Power App / Power Automate flow, so clean parameter input and a straightforward command-line or API trigger are important. Use any robust libraries you prefer (PyPDF2, pypdf, reportlab, etc.) as long as the final code is readable and fully commented.
Deliverables
1. Well-structured Python code with dependency list.
2. README that explains setup, variables for source & target folders, and how to hook the script into Power Apps.
3. A short demo (screenshots or video) proving the flow: file-name → field population → merged PDF in OneDrive/SharePoint.
Acceptance criteria will be successful extraction of all underscore-separated values into the correct index-page fields and creation of the merged file in the designated cloud folder without errors.