Getting the current PDF names from a Windows Explorer folder and updating them via Python

This is not a GIS-related post, but since I might reuse it in the future, I decided to record it here.

Lately, I have been downloading GIS and IT-related resources from the web (mainly PDFs). I had hundreds of them, and I wanted to rename many of those documents without manually editing each file. Instead, I used a mapping file (old name to new name) to drive the update.

Introduction

The next script can easily be run via the Python Interactive Terminal at C:\Users\Username\AppData\Local\Programs\ArcGIS\Pro\bin\Python\Scripts\propy.bat

It can also be run using the IDLE for ArcGIS Pro (Run module) or by many other means.

All the PDFs are placed in a folder of choice on a Windows machine.

Script 1

  1. Parses the designated folder to retrieve the list of PDFs.
  2. Ignores any PDFs located in subfolders.
  3. Prints one file name per line, so the output can be used as a plain-text list (each line corresponds to one .pdf file name).

import os

folder = r"C:\Users\Username\Desktop\Items"   # <- adjust if needed

pdfs = [
    file for file in os.listdir(folder)
    if file.lower().endswith(".pdf") and os.path.isfile(os.path.join(folder, file))
]

for file in pdfs:
    print(file)
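
If copying from the terminal is inconvenient, the same list can be written straight to a text file. The following is a minimal sketch and not part of the original workflow: the function names `list_pdfs` and `write_list` and the output file name `pdf-list.txt` are hypothetical.

```python
from pathlib import Path


def list_pdfs(folder):
    """Return the sorted names of the PDFs directly inside folder (subfolders ignored)."""
    return sorted(
        p.name for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() == ".pdf"
    )


def write_list(names, txt_path):
    """Write one file name per line to txt_path."""
    Path(txt_path).write_text("\n".join(names) + "\n", encoding="utf-8")


if __name__ == "__main__":
    folder = r"C:\Users\Username\Desktop\Items"  # <- adjust if needed
    if Path(folder).is_dir():
        # "pdf-list.txt" is a hypothetical output name
        write_list(list_pdfs(folder), Path(folder) / "pdf-list.txt")
```

The resulting text file can then be opened or pasted into Excel in the same way as the printed output.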

Post work after running Script 1

After that, we paste the printed list of PDF names into Microsoft Excel.

We create a mapping between each old file name and the new one that we want. The first row of the Excel file is a header and is not taken into account by Script 2.

We save or export the Excel sheet as .csv (UTF-8). In this case, the CSV is semicolon-separated.
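
The copy-and-paste step can also be skipped by generating the mapping file skeleton directly from the folder. This is only a sketch under assumptions: the helper name `write_mapping_skeleton` is hypothetical, the header text is "Current name;New name", and `utf-8-sig` is used on the assumption that Excel recognizes UTF-8 more reliably when a byte order mark is present.

```python
import csv
import os


def write_mapping_skeleton(folder, csv_path):
    """Create a semicolon-separated CSV with every PDF name and an empty 'New name' column."""
    pdfs = sorted(
        name for name in os.listdir(folder)
        if name.lower().endswith(".pdf") and os.path.isfile(os.path.join(folder, name))
    )
    # utf-8-sig writes a byte order mark (an assumption about what suits Excel best)
    with open(csv_path, "w", newline="", encoding="utf-8-sig") as csv_file:
        writer = csv.writer(csv_file, delimiter=";")
        writer.writerow(["Current name", "New name"])
        for name in pdfs:
            writer.writerow([name, ""])  # the second column is filled in by hand later
    return len(pdfs)
```

The second column is then completed in Excel, and rows left empty keep their current name.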

Finally, we use the second script (Script 2) to rename the PDF files.

Script 2

  1. Parses the mapping CSV (mapping_csv) row by row.
  2. Looks up each listed PDF in the Windows Explorer folder and renames it where a new name is given; missing files and already existing target names are reported and left untouched.

The .csv file looks like the following:

Current name;New name
1009mnging_arcsde_applictn_svrs.pdf;Managing ArcSDE application servers.pdf
ArcGISImageDedicatedUserManual.pdf;ArcGIS Image dedicated user manual.pdf
Chapter10_notes.pdf;Creating and maintaining geographic databases.pdf

The script:

import os
import csv

folder = r"C:\Users\Username\Desktop\Items"
mapping_csv = r"C:\Users\Username\Desktop\Items\Mapping-oldname-newname.csv"   # raw string, so "\U" is not read as an escape

# utf-8-sig also strips the byte order mark that Excel's UTF-8 CSV export adds
with open(mapping_csv, newline='', encoding="utf-8-sig") as file:
    reader = csv.reader(file, delimiter=";")

    # skip the header row (remove this line if the CSV has no header)
    next(reader, None)

    for row in reader:
        # old name in column 1
        old_name = row[0].strip() if len(row) > 0 else ""

        # new name in column 2 (may be empty)
        new_name = row[1].strip() if len(row) > 1 else ""

        if not old_name:
            continue

        # if new name is empty -> leave unchanged
        if not new_name:
            print(f"Skipping (no new name): {old_name}")
            continue

        # ensure .pdf extension
        if not new_name.lower().endswith(".pdf"):
            new_name += ".pdf"

        old_path = os.path.join(folder, old_name)
        new_path = os.path.join(folder, new_name)

        if os.path.exists(old_path):
            if os.path.exists(new_path):
                print(f"Not renaming (target exists): {new_name}")
            else:
                os.rename(old_path, new_path)
                print(f"Renamed: {old_name} -> {new_name}")
        else:
            print(f"File not found: {old_name}")
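
Because `os.rename` changes files immediately, it can help to check the mapping before applying it. The following dry-run sketch is an addition, not part of the original script: the function name `check_mapping` and its return format are hypothetical. It assumes the same semicolon-separated CSV with a header row, and it reports rows whose source file is missing, whose target already exists, or whose target name is used more than once, without renaming anything.

```python
import csv
import os
from collections import Counter


def check_mapping(folder, mapping_csv):
    """Return a list of problem descriptions found in the mapping; rename nothing."""
    with open(mapping_csv, newline="", encoding="utf-8-sig") as csv_file:
        reader = csv.reader(csv_file, delimiter=";")
        next(reader, None)  # skip the header row
        pairs = [
            (row[0].strip(), row[1].strip())
            for row in reader
            if len(row) > 1 and row[0].strip() and row[1].strip()
        ]

    # Mirror the main script: append .pdf when the new name lacks it
    pairs = [
        (old, new if new.lower().endswith(".pdf") else new + ".pdf")
        for old, new in pairs
    ]
    # Compare targets in lower case, since Windows file names are normally case-insensitive
    target_counts = Counter(new.lower() for _, new in pairs)

    problems = []
    for old, new in pairs:
        if not os.path.exists(os.path.join(folder, old)):
            problems.append(f"File not found: {old}")
        if os.path.exists(os.path.join(folder, new)):
            problems.append(f"Target exists: {new}")
        if target_counts[new.lower()] > 1:
            problems.append(f"Duplicate target: {new}")
    return problems
```

An empty list means no problems were detected, for example `if not check_mapping(folder, mapping_csv): ...` before running the rename loop.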

