jrunner: use bulk copy (-b) for Postgres dests too (COPY)

Extend the -b wiring to jdbc:postgresql: dests, so DB2->PG (and any PG-dest) loads use jrunner's COPY FROM STDIN path instead of batched INSERTs. SQL Server already used -b (SQLServerBulkCopy); DB2 dests stay on INSERT. Update the CLAUDE.md bulk section accordingly.

Validated DB2->PG COPY with real types (dates -> date col, decimals -> numeric, char) and null/empty-string fidelity.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-18 23:14:15 -04:00 · 0ddb636f14
commit 0ddb636f14
parent f4d8cd005d
2 changed files with 7 additions and 6 deletions
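The commit message mentions validating null/empty-string fidelity over COPY. As a rough illustration of why the two stay distinct, here is a minimal sketch of PostgreSQL's default COPY text encoding, where NULL is written as `\N` and an empty string is a zero-length field. Whether jrunner uses the text format or the CSV format is not stated in this commit, so treat the format choice as an assumption; the function names are hypothetical and not part of jrunner or pipekit.

```python
def encode_copy_text_field(value):
    """Encode one value for PostgreSQL's COPY text format (default options).

    Assumption: text format with tab delimiter and \\N as the NULL marker.
    """
    if value is None:
        return r"\N"  # NULL marker, distinct from an empty field
    text = str(value)
    # Escape the backslash first so later escapes are not double-escaped.
    return (
        text.replace("\\", "\\\\")
        .replace("\t", "\\t")
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def encode_copy_text_row(values):
    """Join encoded fields with tabs and terminate the row with a newline."""
    return "\t".join(encode_copy_text_field(v) for v in values) + "\n"
```

With this encoding, `None` and `""` produce different bytes on the wire, and a literal string `\N` is escaped so it cannot be mistaken for NULL.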
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -122,9 +122,9 @@ Watermarks are managed inline on both the module edit form and wizard step 3 (no
 Recreated on every run as `pipekit_staging.{module_name}` (DROP + CREATE, not IF NOT EXISTS). Ephemeral — exists only during the run.
-## Bulk Copy (SQL Server dest)
+## Bulk Copy
-`jrunner.migrate` passes jrunner's `-b` flag when the dest JDBC URL starts with `jdbc:sqlserver:`, so SQL Server loads stream via `SQLServerBulkCopy` (TDS bulk-load) instead of batched INSERTs — dramatically faster on large/wide tables (a 1.27M-row load went ~111 min → ~4 min). DB2/PG dests keep the INSERT path. This is automatic per-dest; no module config. (Requires jrunner with `-b` support.) Note: jrunner only streams the **Postgres source** without buffering it all into memory because it sets `autoCommit(false)` on the source connection in migration mode — a PG-driver requirement for `setFetchSize` to take effect.
+`jrunner.migrate` passes jrunner's `-b` flag when the dest is SQL Server (`jdbc:sqlserver:`) or Postgres (`jdbc:postgresql:`), so loads use the dest's native bulk path instead of batched `INSERT…VALUES` — **SQL Server** via `SQLServerBulkCopy` (TDS bulk-load), **Postgres** via `COPY … FROM STDIN`. Dramatically faster on large/wide tables (a 1.27M-row SQL Server load went ~111 min → ~4 min). DB2 dests keep the INSERT path. Automatic per-dest; no module config. (Requires jrunner with `-b` support.) Note: jrunner only streams the **Postgres source** without buffering it all into memory because it sets `autoCommit(false)` on the source connection in migration mode — a PG-driver requirement for `setFetchSize` to take effect.
 ## Scheduler
--- a/pipekit/jrunner.py
+++ b/pipekit/jrunner.py
@@ -171,10 +171,11 @@ def migrate(
            argv.append("-t")
        if clear:
            argv.append("-c")
-        # SQL Server dest: stream via TDS bulk copy instead of INSERT...VALUES
+        # Use jrunner's native bulk load instead of INSERT...VALUES round trips
-        # round trips (much faster on wide/large tables). jrunner -b is a no-op
+        # (much faster on wide/large tables): SQL Server -> SQLServerBulkCopy,
-        # for non-SQL-Server dests, but only pass it where it applies.
+        # Postgres -> COPY FROM STDIN. Only pass -b where jrunner supports it.
-        if (dest_conn.get("jdbc_url") or "").lower().startswith("jdbc:sqlserver:"):
+        _durl = (dest_conn.get("jdbc_url") or "").lower()
+        if _durl.startswith("jdbc:sqlserver:") or _durl.startswith("jdbc:postgresql:"):
            argv.append("-b")
        proc = subprocess.Popen(argv, stdout=subprocess.PIPE, stderr=subprocess.PIPE,
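The dest check in the hunk above could also be expressed as a small standalone helper, which makes it easy to unit-test without spawning jrunner. This is only a sketch: `BULK_COPY_PREFIXES`, `dest_supports_bulk`, and `bulk_args` are hypothetical names that do not exist in `pipekit/jrunner.py`, and the behavior is intended to match the diff (case-insensitive prefix match, missing URL treated as unsupported).

```python
# Hypothetical helper; the prefixes are the ones this commit passes -b for.
BULK_COPY_PREFIXES = ("jdbc:sqlserver:", "jdbc:postgresql:")


def dest_supports_bulk(dest_conn):
    """Return True when the dest JDBC URL is SQL Server or Postgres."""
    url = (dest_conn.get("jdbc_url") or "").lower()
    # str.startswith accepts a tuple and matches any of its entries.
    return url.startswith(BULK_COPY_PREFIXES)


def bulk_args(dest_conn):
    """Return the extra argv entries for bulk loading, or an empty list."""
    return ["-b"] if dest_supports_bulk(dest_conn) else []
```

Adding another dest later would then mean extending the tuple rather than growing the `or` chain, assuming jrunner's `-b` gains support for that dest.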