From a61e01893270720af093883c5460edd21057281a Mon Sep 17 00:00:00 2001
From: Paul Trowbridge
Date: Thu, 18 Jun 2026 22:50:58 -0400
Subject: [PATCH] docs: document the -b bulk copy path (readme + CLAUDE)

Add the -b flag to the readme/CLAUDE flag lists and describe the bulk
copy migration sub-mode: SQLServerBulkCopy via the BulkSource adapter
(SQL Server dest only), why numeric/string-ish types route through
NVARCHAR, the ~111min -> ~4min win, and the 10k-row live progress
counter.

Co-Authored-By: Claude Opus 4.8
---
 CLAUDE.md | 22 ++++++++++++++++++----
 readme.md |  3 +++
 2 files changed, 21 insertions(+), 4 deletions(-)

diff --git a/CLAUDE.md b/CLAUDE.md
index 37a0244..1509fd2 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -58,9 +58,21 @@ The tool operates in two modes:
 
 **Migration Mode** (original functionality):
 - Activates when destination flags are provided
-- Reads from source, writes to destination with batched INSERTs
+- Reads from source, writes to destination with batched INSERTs (or bulk copy with `-b`)
 - Shows progress counters and timing information
 
+**Bulk Copy** (migration mode, `-b`, SQL Server dest only):
+- Streams the source ResultSet into SQL Server over the TDS bulk-load protocol
+  via `SQLServerBulkCopy` — no per-batch INSERT round trips. Far faster on
+  large/wide tables (a 1.27M-row, ~298-col load went ~111 min → ~4 min).
+- A `BulkSource` adapter (`ISQLServerBulkData`) maps source type names to JDBC
+  types we control. String-ish types (text/varchar/char/bpchar/json/jsonb/uuid
+  **and numeric**) are declared NVARCHAR and read via `getString` so SQL Server
+  converts losslessly — numeric goes this route because PG reports unconstrained
+  numeric as scale 0, which a typed DECIMAL path would round (123.45 → 123).
+- Emits a `\r`-counter every 10k rows for live progress, and prints the final
+  row count. Falls back to the INSERT path for non-SQL-Server dests.
+
 ### Data Flow
 
 **Query Mode:**
@@ -76,9 +88,10 @@ The tool operates in two modes:
 2. Read SQL query from file specified by -sq flag
 3. Connect to source and destination databases via JDBC
 4. Execute source query and fetch results (fetch size: 10,000 rows)
-5. Build batched INSERT statements (250 rows per batch)
-6. Execute batches against destination table specified by -dt flag
-7. Optionally clear target table before insert if -c flag is set
+5. Optionally clear target table before insert if -c flag is set
+6. With `-b` (SQL Server dest): bulk-copy the ResultSet via `SQLServerBulkCopy`.
+   Otherwise: build batched INSERT statements (250 rows per batch) and execute
+   them against the destination table specified by -dt
 
 ### Type Handling
 The tool includes explicit handling for different SQL data types in a switch statement (migration mode). Quoted string types: VARCHAR/NVARCHAR, TEXT/NTEXT, CHAR/NCHAR, CLOB/NCLOB, and the PostgreSQL string-ish types JSON, JSONB, BPCHAR (PG `char(n)`), and UUID. Date/time types (DATE, TIME, TIMESTAMP/DATETIME variants) are also quoted. String types get quote escaping (`'` → `''`) and optional trimming.
@@ -109,6 +122,7 @@ Command-line flags:
 - `-dt` - fully qualified destination table name (migration mode only)
 - `-t` - trim text fields (default: true)
 - `-c` - clear target table before insert (default: true, migration mode only)
+- `-b` - bulk copy into dest via SQLServerBulkCopy (migration mode, SQL Server dest only)
 - `-f` - output format: csv, tsv (query mode only, default: csv)
 
 ## Key Implementation Details
diff --git a/readme.md b/readme.md
index 9dc1654..9a6d8a0 100644
--- a/readme.md
+++ b/readme.md
@@ -187,4 +187,7 @@ jrunner -scu jdbc:postgresql://source:5432/sourcedb \
 **Options:**
 - `-t` - trim text fields (default: true)
 - `-c` - clear target table before insert (default: true)
+- `-b` - bulk copy into the destination (SQL Server dest only); streams via the
+  TDS bulk-load protocol instead of batched INSERTs — far faster on large/wide
+  tables. No-op for non-SQL-Server dests. (migration mode only)
 - `-f` - output format: csv, tsv (query mode only, default: csv)
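
The numeric-through-NVARCHAR rationale in the CLAUDE.md Bulk Copy text can be illustrated with a minimal Java sketch. This is not jrunner's code: `BulkTypeSketch`, `routedThroughNvarchar`, and `asScaleZeroDecimal` are hypothetical names, and `HALF_UP` is an assumed stand-in for whatever rounding a scale-0 DECIMAL conversion would actually apply.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.util.Locale;
import java.util.Set;

// Hypothetical sketch; names are illustrative and not taken from jrunner.
final class BulkTypeSketch {
    // Source type names the patch lists as declared NVARCHAR and read as strings.
    private static final Set<String> AS_NVARCHAR = Set.of(
        "text", "varchar", "char", "bpchar", "json", "jsonb", "uuid", "numeric");

    // True when a column of this source type would take the string route.
    static boolean routedThroughNvarchar(String sourceTypeName) {
        return AS_NVARCHAR.contains(sourceTypeName.toLowerCase(Locale.ROOT));
    }

    // What forcing a value into a scale-0 decimal does to its fractional part.
    // Assumption: HALF_UP; the real conversion's rounding mode is not stated.
    static BigDecimal asScaleZeroDecimal(String raw) {
        return new BigDecimal(raw).setScale(0, RoundingMode.HALF_UP);
    }

    public static void main(String[] args) {
        String raw = "123.45";
        // Typed path with the reported scale of 0: the fraction is lost.
        System.out.println("scale-0 decimal: " + asScaleZeroDecimal(raw).toPlainString());
        // String path: the text is passed through unchanged.
        System.out.println("string route:    " + raw);
        System.out.println("numeric via NVARCHAR? " + routedThroughNvarchar("numeric"));
    }
}
```

Running `main` shows the fractional digits surviving only on the string route, which is the loss the patch describes as 123.45 → 123.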