File Operations | Intermediate | Updated: 2026-04-26

Sort folder files into category subfolders automatically

A Python script that scans a folder and automatically sorts files into nine category subfolders (Images, PDF, Excel, Videos, and so on) based on file extension. Same-name collisions are resolved with a timestamp suffix, subfolder recursion is supported, and you can keep originals via copy mode. A summary TXT is written every run so you can audit what moved where.

Is your Downloads folder a graveyard of mixed extensions? This script walks the target folder, classifies each file by extension into one of nine categories (Images / PDF / Excel / Videos / ...), and routes it into a matching subfolder. Pick `Move` to physically reorganize, or `Copy and keep` to preserve the originals.

What this script can do

  • 9-category classification by extension (Images, PDF, Excel, Word, PowerPoint, Videos, Audio, Text, Archives)
  • Switchable between move and copy-keep modes
  • Automatic timestamp suffix on same-name collisions
  • Subfolder recursion with auto-exclude of already-sorted folders
  • Auto-generated summary TXT per run (category totals + detail log table)
Download file-sorter.pybes

Import the .pybes file into Pybes and the script — along with its config fields — loads automatically.

Config fields

These are the config fields this script uses. Enter values through the Pybes GUI at runtime.

target_dir Folder Required

Target folder

Folder containing the files you want to sort

Default: C:\Users\<your-name>\Downloads

include_sub Checkbox Required

Include subfolders

When ON, files inside subfolders are also picked up

Default: false

file_mode Dropdown Required

Source file handling

`Move` removes the file from its original location. `Copy and keep` preserves the original.

Options: Move, Copy and keep

Default: Move

Code walkthrough

import sys
import json
import os
import shutil
from datetime import datetime

with open(sys.argv[1], encoding="utf-8") as f:
    inputs = json.load(f)

target_dir = inputs["target_dir"]
include_sub = inputs["include_sub"] == "true"
file_mode = inputs["file_mode"]

CATEGORY_MAP = {
    "Images":     {".jpg", ".jpeg", ".png", ".gif", ".bmp", ".tiff", ".tif", ".webp", ".svg", ".ico", ".heic", ".raw"},
    "PDF":        {".pdf"},
    "Excel":      {".xlsx", ".xls", ".xlsm", ".xlsb", ".csv"},
    "Word":       {".docx", ".doc", ".dotx"},
    "PowerPoint": {".pptx", ".ppt", ".potx"},
    "Videos":     {".mp4", ".mov", ".avi", ".mkv", ".wmv", ".flv", ".webm", ".m4v"},
    "Audio":      {".mp3", ".wav", ".aac", ".flac", ".m4a", ".ogg", ".wma"},
    "Text":       {".txt", ".md", ".log"},
    "Archives":   {".zip", ".rar", ".7z", ".tar", ".gz", ".lzh"},
}

SKIP_DIRS = set(CATEGORY_MAP.keys()) | {"Other", "Summary"}

def get_category(ext):
    ext = ext.lower()
    for category, exts in CATEGORY_MAP.items():
        if ext in exts:
            return category
    return "Other"

def resolve_dest(dest_dir, filename):
    dest_path = os.path.join(dest_dir, filename)
    if not os.path.exists(dest_path):
        return dest_path
    name, ext = os.path.splitext(filename)
    ts = datetime.now().strftime("%Y%m%d_%H%M%S")
    return os.path.join(dest_dir, f"{name}_{ts}{ext}")

def collect_files(base_dir, include_sub):
    result = []
    base_abs = os.path.abspath(base_dir)
    if include_sub:
        for root, dirs, files in os.walk(base_dir):
            root_abs = os.path.abspath(root)
            if root_abs == base_abs:
                dirs[:] = [d for d in dirs if d not in SKIP_DIRS]
            for fname in files:
                result.append(os.path.join(root, fname))
    else:
        for fname in os.listdir(base_dir):
            fpath = os.path.join(base_dir, fname)
            if os.path.isfile(fpath):
                result.append(fpath)
    return result

try:
    print(f"Target folder: {target_dir}")
    print(f"Include subfolders: {'Yes' if include_sub else 'No'}")
    print(f"Source file mode: {file_mode}")
    print("---")

    files = collect_files(target_dir, include_sub)
    total = len(files)
    print(f"File count: {total}")

    summary = {}
    action_log = []

    for i, src_path in enumerate(files):
        fname = os.path.basename(src_path)
        _, ext = os.path.splitext(fname)
        category = get_category(ext)

        dest_dir = os.path.join(target_dir, category)
        os.makedirs(dest_dir, exist_ok=True)

        dest_path = resolve_dest(dest_dir, fname)

        if file_mode == "Move":
            shutil.move(src_path, dest_path)
            action = "Move"
        else:
            shutil.copy2(src_path, dest_path)
            action = "Copy"

        summary.setdefault(category, []).append(fname)
        action_log.append((src_path, category, dest_path))

        print(f"  [{i+1}/{total}] {action}: {fname} → {category}/")

    # Generate summary TXT (per-category totals + detail log table)
    ts_now = datetime.now().strftime("%Y%m%d_%H%M%S")
    summary_dir = os.path.join(target_dir, "Summary")
    os.makedirs(summary_dir, exist_ok=True)
    summary_path = os.path.join(summary_dir, f"sort_summary_{ts_now}.txt")

    # Compute table column widths
    col_src      = max((len(src)      for src, _, _      in action_log), default=10)
    col_category = max((len(category) for _, category, _ in action_log), default=8)
    col_dest     = max((len(dest)     for _, _, dest     in action_log), default=10)
    col_src      = max(col_src,      len("Source path"))
    col_category = max(col_category, len("Category"))
    col_dest     = max(col_dest,     len("Destination path"))

    sep    = f"+{'-' * (col_src + 2)}+{'-' * (col_category + 2)}+{'-' * (col_dest + 2)}+"
    header = f"| {'Source path':<{col_src}} | {'Category':<{col_category}} | {'Destination path':<{col_dest}} |"

    with open(summary_path, encoding="utf-8", mode="w") as f:
        f.write("Sort Summary\n")
        f.write(f"Run datetime: {datetime.now().strftime('%Y/%m/%d %H:%M:%S')}\n")
        f.write(f"Target folder: {target_dir}\n")
        f.write(f"Include subfolders: {'Yes' if include_sub else 'No'}\n")
        f.write(f"Source file mode: {file_mode}\n")
        f.write(f"Total: {total}\n")
        f.write("=" * 40 + "\n\n")

        # Per-category totals
        f.write("[Category totals]\n")
        for category in sorted(summary.keys()):
            fnames = summary[category]
            f.write(f"  {category}: {len(fnames)}\n")
        f.write("\n")

        # Detail log (table)
        f.write("[Detail log]\n")
        f.write(sep + "\n")
        f.write(header + "\n")
        f.write(sep + "\n")
        for src, category, dest in action_log:
            f.write(f"| {src:<{col_src}} | {category:<{col_category}} | {dest:<{col_dest}} |\n")
        f.write(sep + "\n")

    print("---")
    print(f"Done. Total: {total}")
    for category, fnames in sorted(summary.items()):
        print(f"  {category}: {len(fnames)}")
    print(f"Summary: {summary_path}")

except Exception as e:
    print(f"Error occurred: {e}", file=sys.stderr)
L1–12

Loads five standard-library modules (sys, json, os, shutil, datetime) and reads the JSON config Pybes passes via sys.argv[1] into the inputs dict. Checkbox values arrive as the strings "true" / "false", so == "true" converts them to bool. The dropdown value flows straight into file_mode as a string and is later compared against "Move" to branch behavior.
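As a rough illustration of that config handoff, the sketch below writes a stand-in JSON file and parses it the same way the script does. The file contents are an assumption inferred from the three keys the script reads; the file Pybes actually generates may look different or carry more fields.

```python
import json
import os
import tempfile

# Hypothetical stand-in for the JSON file Pybes passes as sys.argv[1].
# Only the three keys the script reads are shown; the values are made up.
sample = {
    "target_dir": "C:/Users/me/Downloads",
    "include_sub": "true",
    "file_mode": "Move",
}

with tempfile.TemporaryDirectory() as tmp:
    config_path = os.path.join(tmp, "inputs.json")
    with open(config_path, "w", encoding="utf-8") as f:
        json.dump(sample, f)

    # Same loading pattern as the script, with config_path in place of sys.argv[1].
    with open(config_path, encoding="utf-8") as f:
        inputs = json.load(f)

include_sub = inputs["include_sub"] == "true"  # the string "true" becomes True
file_mode = inputs["file_mode"]                # stays a plain string

print(include_sub, file_mode)
```

Any value other than the exact string "true" (including "True" or a JSON boolean) would evaluate to False under this comparison.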

L14–33

CATEGORY_MAP defines the category → extension set mapping, and SKIP_DIRS lists the folders the script itself creates (Images, PDF, ..., Other, Summary) so they can be excluded from recursion. get_category lower-cases the extension and walks the dict to find a match, falling back to Other if nothing fits. Sets are used because the in check is O(1).

L35–58

resolve_dest is a helper that returns a timestamp-suffixed path on same-name collisions. collect_files recurses with os.walk when include_sub is true, and at the root level it prunes SKIP_DIRS from the subfolder list, so files already sorted by an earlier run are not picked up and sorted again. The dirs[:] = [...] slice assignment is the standard way to prune a subtree from a top-down os.walk.

L60–144

Main loop. Each file is moved (shutil.move) or copied (shutil.copy2) into its category folder. Once the loop finishes, the script writes Summary/sort_summary_YYYYMMDD_HHMMSS.txt containing per-category totals and a source-to-destination table; a run that stops on an error never reaches this step. copy2 is used (not copy) because it also tries to preserve file metadata such as the modified time, which copy does not.

How it works

How extensions are mapped to categories

CATEGORY_MAP defines a category name → extension set table, and get_category lower-cases the extension and walks the dict to find a match. Anything that does not match falls back to Other. The values are sets because in checks are O(1). Adding a new category is a one-line dict entry.
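To make the lookup concrete, here is a small sketch using a trimmed copy of the table (the real script has nine categories). The file names are invented for illustration.

```python
import os

# Trimmed copy of the script's table, for illustration only.
CATEGORY_MAP = {
    "Images": {".jpg", ".jpeg", ".png"},
    "Excel": {".xlsx", ".csv"},
    "Archives": {".zip", ".gz"},
}

def get_category(ext):
    ext = ext.lower()  # ".JPG" and ".jpg" land in the same category
    for category, exts in CATEGORY_MAP.items():
        if ext in exts:
            return category
    return "Other"

# Hypothetical file names covering the interesting cases.
for name in ["Photo.JPG", "backup.tar.gz", "README", "notes.xyz"]:
    _, ext = os.path.splitext(name)
    print(f"{name!r}: ext={ext!r} -> {get_category(ext)}")
```

Note that os.path.splitext only returns the last suffix, so backup.tar.gz is classified by .gz, and a name with no extension yields an empty string and falls through to Other.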

Subfolder traversal without re-sorting already-sorted files

os.walk recurses naturally, but the script's own destination folders (Images/, PDF/, ...) sit right under the target folder. Without filtering, every run with Include subfolders on would walk back into them and process the files it had already placed, renaming each one with a timestamp suffix (or duplicating it in copy mode) because it collides with itself. The line dirs[:] = [d for d in dirs if d not in SKIP_DIRS] rewrites the dirs list in place at the root level, which is how a top-down os.walk is told to skip subtrees.
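The pruning behavior can be checked in isolation. The sketch below builds a throwaway folder tree (the layout and file names are invented) and runs a reduced version of collect_files over it; it returns relative paths only to keep the output short.

```python
import os
import tempfile

SKIP_DIRS = {"Images", "Other", "Summary"}  # shortened for the demo

def collect_files(base_dir):
    """Walk base_dir, skipping SKIP_DIRS folders that sit directly under it."""
    result = []
    base_abs = os.path.abspath(base_dir)
    for root, dirs, files in os.walk(base_dir):
        if os.path.abspath(root) == base_abs:
            # In-place edit: os.walk will not descend into the removed names.
            dirs[:] = [d for d in dirs if d not in SKIP_DIRS]
        for fname in files:
            result.append(os.path.relpath(os.path.join(root, fname), base_dir))
    return sorted(result)

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical layout: one already-sorted folder, one ordinary folder
    # that happens to contain its own nested "Images" folder.
    for rel in ["Images/old.png", "trip/new.png", "trip/Images/deep.png", "top.txt"]:
        path = os.path.join(tmp, *rel.split("/"))
        os.makedirs(os.path.dirname(path), exist_ok=True)
        open(path, "w").close()
    found = collect_files(tmp)

print(found)
```

Images/old.png is skipped because Images sits at the root, while trip/Images/deep.png is still collected because the filter applies only at the top level. Rebinding the name instead (dirs = [...]) would not prune anything, since os.walk keeps using the original list object.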

Resolving same-name collisions

If the destination already has a file with the same name, shutil.move may overwrite it without warning (the exact behavior depends on the platform), and shutil.copy2 replaces an existing destination file. resolve_dest checks with os.path.exists and, on collision, returns a path like original_YYYYMMDD_HHMMSS.ext. Two report.pdf files from different source folders both survive.
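A quick way to see the collision handling is to call resolve_dest twice against a temporary folder. The function body below is copied from the script; the surrounding demo is only a sketch.

```python
import os
import tempfile
from datetime import datetime

def resolve_dest(dest_dir, filename):
    dest_path = os.path.join(dest_dir, filename)
    if not os.path.exists(dest_path):
        return dest_path
    name, ext = os.path.splitext(filename)
    ts = datetime.now().strftime("%Y%m%d_%H%M%S")
    return os.path.join(dest_dir, f"{name}_{ts}{ext}")

with tempfile.TemporaryDirectory() as tmp:
    first = resolve_dest(tmp, "report.pdf")   # nothing there yet: plain name
    open(first, "w").close()                  # simulate the first file arriving
    second = resolve_dest(tmp, "report.pdf")  # collision: timestamp suffix
    first_name = os.path.basename(first)
    second_name = os.path.basename(second)

print(first_name)
print(second_name)
```

The first call keeps the plain name; the second returns the same stem with a date and time inserted before the extension.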

Audit trail via the summary TXT

Every run that completes writes Summary/sort_summary_YYYYMMDD_HHMMSS.txt containing per-category totals and a source → destination table. When you ask yourself five minutes later, "where did that file go?", a Ctrl+F in the summary answers it.

Customization

Add a new category

CATEGORY_MAP accepts one-row additions like "Fonts": {".ttf", ".otf", ".woff", ".woff2"}. SKIP_DIRS is rebuilt from CATEGORY_MAP.keys() automatically, so the new folder is excluded from recursion without a second edit.
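As a minimal sketch, with the table trimmed to two existing rows, the addition would look like this:

```python
CATEGORY_MAP = {
    "Images": {".jpg", ".png"},   # existing rows (trimmed for the example)
    "PDF": {".pdf"},
    "Fonts": {".ttf", ".otf", ".woff", ".woff2"},   # the new row
}

# Unchanged line from the script: derived from the dict's keys.
SKIP_DIRS = set(CATEGORY_MAP.keys()) | {"Other", "Summary"}

print("Fonts" in SKIP_DIRS)
```

This only holds if the row goes inside the dict literal, above the SKIP_DIRS line. Assigning CATEGORY_MAP["Fonts"] = ... further down the script would happen after SKIP_DIRS has already been computed, and the Fonts folder would not be excluded.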

Add a preview (dry-run) mode

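Add an if file_mode == "Preview": branch ahead of the existing move/copy check, and change the existing if file_mode == "Move": to elif. The new branch does print(f"[DRY] {src_path} → {dest_path}") and sets action = "Preview" instead of moving or copying, so the progress line that follows still has a value to print. Add "Preview" to the field's options list to expose it in the UI. Note that os.makedirs(dest_dir, exist_ok=True) runs before this branch, so a preview still creates empty category folders unless you move that call into the move and copy branches.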
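One possible shape for this, pulled into a helper so it can be tried on its own. The function name transfer is hypothetical and not part of the original script, and the dry-run line uses a plain "->" to stay safe on consoles that cannot encode the arrow character.

```python
import os
import shutil
import tempfile

def transfer(src_path, dest_path, file_mode):
    """Hypothetical helper wrapping the script's move/copy branch.

    Returns the action label used in the progress line.
    """
    if file_mode == "Preview":
        print(f"[DRY] {src_path} -> {dest_path}")  # report only, touch nothing
        return "Preview"
    if file_mode == "Move":
        shutil.move(src_path, dest_path)
        return "Move"
    shutil.copy2(src_path, dest_path)
    return "Copy"

with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "a.txt")
    dest = os.path.join(tmp, "Text", "a.txt")
    open(src, "w").close()

    action = transfer(src, dest, "Preview")
    src_still_there = os.path.exists(src)
    dest_created = os.path.exists(dest)

print(action, src_still_there, dest_created)
```

In preview mode the source stays where it is and nothing appears at the destination. The summary TXT would still list the planned destinations, which can double as a plan to review before a real run.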

Sort by modified date instead of extension

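For photo-style year/month buckets, replace the get_category call and build dest_dir with os.path.join(target_dir, datetime.fromtimestamp(os.path.getmtime(src_path)).strftime("%Y/%m")). The category variable is still used by the progress line and the summary, so set it to the bucket name (for example "2024/03") rather than removing it. Useful for camera dumps.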
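A sketch of the date-bucket idea follows. The helper name date_bucket and the sample file are assumptions for illustration; the year and month are joined as separate path components instead of relying on a "/" inside the strftime pattern.

```python
import os
import tempfile
from datetime import datetime

def date_bucket(src_path):
    """Return (year, month) strings taken from the file's modified time."""
    mtime = datetime.fromtimestamp(os.path.getmtime(src_path))
    return mtime.strftime("%Y"), mtime.strftime("%m")

with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "IMG_0001.jpg")  # hypothetical camera file
    open(src, "w").close()

    # Pin the modified time to a known date so the result is predictable.
    stamp = datetime(2024, 3, 15, 12, 0, 0).timestamp()
    os.utime(src, (stamp, stamp))

    year, month = date_bucket(src)
    dest_dir = os.path.join(tmp, year, month)  # replaces target_dir/category
    category = f"{year}/{month}"               # label for the log and summary

print(category)
```

Year folders such as 2024 are not in SKIP_DIRS, so with Include subfolders on, a rerun would walk back into them. Adding the year names to SKIP_DIRS, or running this variant without recursion, would avoid that.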

Troubleshooting

PermissionError WinError 32: file is in use by another process

Excel, a PDF reader, or an image viewer is holding the file open and shutil.move cannot proceed. Close the apps (and the Explorer preview pane, which counts as a holder on Windows) and rerun. Files processed before the failure are already sorted, but the summary TXT is written only after the loop finishes, so the failed run leaves none; use the [n/total] progress lines in the output to see how far it got.

FileNotFoundError WinError 2: cannot find the path

Either the target_dir path is wrong, or backslashes were not escaped properly when typed by hand. Re-pick the folder via the Pybes folder picker to be safe. For network drives, confirm the share is reachable before running.

Files I did not want to move (executables, shortcuts) ended up sorted

By default any unmapped extension is swept into Other. To skip them, return None from get_category for those extensions (if ext in {".exe", ".lnk"}: return None) and add an if category is None: continue guard in the main loop, right after the get_category call.
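Put together, the skip guard might look like the sketch below. SKIP_EXTS is a hypothetical name, the table is trimmed, and the loop is reduced to the classification step.

```python
import os

SKIP_EXTS = {".exe", ".lnk"}  # hypothetical name for the do-not-touch list
CATEGORY_MAP = {"Images": {".jpg", ".png"}, "PDF": {".pdf"}}  # trimmed

def get_category(ext):
    ext = ext.lower()
    if ext in SKIP_EXTS:
        return None  # signal "leave this file alone"
    for category, exts in CATEGORY_MAP.items():
        if ext in exts:
            return category
    return "Other"

sorted_names = []
for fname in ["setup.EXE", "photo.jpg", "Chrome.lnk", "data.bin"]:
    category = get_category(os.path.splitext(fname)[1])
    if category is None:
        continue  # skipped before any folder is created or file is moved
    sorted_names.append((fname, category))

print(sorted_names)
```

Skipped files are still counted in total, so the [n/total] progress numbers will have gaps unless the files are filtered out before the count is taken.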

What if two files collide within the same run?

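Timestamps have one-second resolution, and resolve_dest does not check whether the suffixed path already exists. If a third file with the same name arrives within the same second as the second one, it gets the same suffixed path and can overwrite it. This takes at least three same-name files in one run (or two plus one already sitting in the destination folder). To rule it out, loop inside resolve_dest and append a counter until os.path.exists returns False.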
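One way to write that counter loop is sketched below. The _1, _2 suffix format is an arbitrary choice for this example, not something the original script defines.

```python
import os
import tempfile
from datetime import datetime

def resolve_dest(dest_dir, filename):
    dest_path = os.path.join(dest_dir, filename)
    if not os.path.exists(dest_path):
        return dest_path
    name, ext = os.path.splitext(filename)
    ts = datetime.now().strftime("%Y%m%d_%H%M%S")
    candidate = os.path.join(dest_dir, f"{name}_{ts}{ext}")
    counter = 1
    # Keep trying new names until one is free.
    while os.path.exists(candidate):
        candidate = os.path.join(dest_dir, f"{name}_{ts}_{counter}{ext}")
        counter += 1
    return candidate

with tempfile.TemporaryDirectory() as tmp:
    names = []
    for _ in range(4):  # four same-name files arriving back to back
        path = resolve_dest(tmp, "report.pdf")
        open(path, "w").close()
        names.append(os.path.basename(path))

print(names)
```

All four names come out distinct even when every call lands in the same second. The check and the later move are still two separate steps, so this does not protect against another process writing into the folder at the same time.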

FAQ

Is the original subfolder structure preserved when subfolder recursion is on?

No. With Include subfolders on, only the files inside subfolders are picked up, and they are all flattened into the category folders directly under the target folder. The subfolders themselves are not moved or deleted. To preserve the hierarchy, build dest_dir like os.path.join(target_dir, category, os.path.relpath(os.path.dirname(src_path), target_dir)) instead.
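The path arithmetic for that variant can be tried without touching the disk. The helper name and the sample paths below are hypothetical; os.path.normpath is added to tidy the "." that relpath returns for files sitting directly in the target folder.

```python
import os

def dest_dir_preserving_tree(target_dir, category, src_path):
    """Hypothetical helper: category folder plus the file's original subpath."""
    rel_dir = os.path.relpath(os.path.dirname(src_path), target_dir)
    return os.path.normpath(os.path.join(target_dir, category, rel_dir))

target = "downloads"  # made-up relative path, only used for string building

nested = dest_dir_preserving_tree(
    target, "Images", os.path.join(target, "trip", "day1", "a.jpg")
)
top = dest_dir_preserving_tree(
    target, "Images", os.path.join(target, "b.jpg")
)

print(nested)
print(top)
```

A file from trip/day1 ends up under Images/trip/day1, while a file from the top level goes straight into Images. The existing os.makedirs(dest_dir, exist_ok=True) call creates the intermediate folders as needed.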

Why is the SKIP_DIRS filter only applied at the top level?

Because the destination folders (Images/, PDF/, ...) and Summary/ are always created directly under the target folder. If you happen to have a coincidentally-named folder like Images deep in the tree, that one is treated as a regular subfolder and its files get sorted normally.

Why is .csv grouped under Excel rather than Text?

Intentional — in practice CSV files are opened and edited in Excel far more often than in a text editor, so colocating them with Excel files makes them easier to find. To switch, remove ".csv" from CATEGORY_MAP["Excel"] and add it to "Text".

What happens if I rerun with copy mode in the same folder?

Originals stay put, so the category folders gain another full set of copies (collisions get timestamp suffixes). To avoid mirroring the folder over and over, use Move mode for the first run, or manually delete the originals after the copy run completes.
