docs: -b now covers Postgres (COPY) as well as SQL Server
Update readme + CLAUDE: -b is no longer SQL-Server-only. Describe the Postgres COPY FROM STDIN path (CopyManager, text-based, CSV-quoted, empty vs NULL) next to the existing SQL Server SQLServerBulkCopy path; DB2 still falls back to INSERT.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
parent
6fe2bea089
commit
2ced7810d9
39
CLAUDE.md
@@ -61,17 +61,26 @@ The tool operates in two modes:
 - Reads from source, writes to destination with batched INSERTs (or bulk copy with `-b`)
 - Shows progress counters and timing information

-**Bulk Copy** (migration mode, `-b`, SQL Server dest only):
-- Streams the source ResultSet into SQL Server over the TDS bulk-load protocol
-via `SQLServerBulkCopy` — no per-batch INSERT round trips. Far faster on
-large/wide tables (a 1.27M-row, ~298-col load went ~111 min → ~4 min).
-- A `BulkSource` adapter (`ISQLServerBulkData`) maps source type names to JDBC
-types we control. String-ish types (text/varchar/char/bpchar/json/jsonb/uuid
-**and numeric**) are declared NVARCHAR and read via `getString` so SQL Server
-converts losslessly — numeric goes this route because PG reports unconstrained
-numeric as scale 0, which a typed DECIMAL path would round (123.45 → 123).
-- Emits a `\r`-counter every 10k rows for live progress, and prints the final
-row count. Falls back to the INSERT path for non-SQL-Server dests.
+**Bulk Copy** (migration mode, `-b`) — uses the dest's native bulk path; falls
+back to the INSERT path for any other dest (e.g. DB2):
+
+*SQL Server dest* — streams the source ResultSet over the TDS bulk-load
+protocol via `SQLServerBulkCopy` (no per-batch INSERT round trips; a 1.27M-row,
+~298-col load went ~111 min → ~4 min). A `BulkSource` adapter
+(`ISQLServerBulkData`) maps source type names to JDBC types we control:
+string-ish types (text/varchar/char/bpchar/json/jsonb/uuid **and numeric**) are
+declared NVARCHAR and read via `getString` so SQL Server converts losslessly —
+numeric goes this route because PG reports unconstrained numeric as scale 0,
+which a typed DECIMAL path would round (123.45 → 123).
+
+*Postgres dest* — streams via `COPY <table> FROM STDIN WITH (FORMAT csv)` using
+the JDBC `CopyManager`. COPY is text-based, so the server parses each field into
+the column type — no per-type handling. Every non-null value is CSV-quoted
+(empty string stays distinct from NULL, which is an empty unquoted field); rows
+flush in 1000-row buffers.
+
+Both emit a `\r`-counter every 10k rows for live progress and print the final
+row count.

 ### Data Flow

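Review note on the Postgres path in the hunk above: the quoting rule it describes (every non-null value CSV-quoted, NULL sent as an empty unquoted field) can be sketched as a small encoder. This is only an illustration of that rule, not the code from this commit; the class and method names (`CopyCsvRow`, `encodeField`, `encodeRow`) are hypothetical, and it assumes the default CSV quote and escape character (`"`) and the default empty NULL string of `COPY ... WITH (FORMAT csv)`.

```java
import java.util.List;

/** Hypothetical helper (not from the commit): builds one CSV line for COPY ... FROM STDIN. */
final class CopyCsvRow {
    private CopyCsvRow() {}

    /** Encodes a single field under the rule described in CLAUDE.md. */
    static String encodeField(String value) {
        if (value == null) {
            // Empty unquoted field: read back as NULL by COPY in csv format.
            return "";
        }
        // Always quote non-null values and double any embedded quote,
        // so an empty string becomes "" and stays distinct from NULL.
        return "\"" + value.replace("\"", "\"\"") + "\"";
    }

    /** Joins the encoded fields with commas and terminates the row with a newline. */
    static String encodeRow(List<String> values) {
        StringBuilder line = new StringBuilder();
        for (int i = 0; i < values.size(); i++) {
            if (i > 0) {
                line.append(',');
            }
            line.append(encodeField(values.get(i)));
        }
        return line.append('\n').toString();
    }
}
```

Lines built this way would presumably be accumulated into the 1000-row buffer mentioned above and then handed to the `CopyManager` stream; that wiring is omitted here because it depends on code this page does not show.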
@@ -89,9 +98,9 @@ The tool operates in two modes:
 3. Connect to source and destination databases via JDBC
 4. Execute source query and fetch results (fetch size: 10,000 rows)
 5. Optionally clear target table before insert if -c flag is set
-6. With `-b` (SQL Server dest): bulk-copy the ResultSet via `SQLServerBulkCopy`.
-Otherwise: build batched INSERT statements (250 rows per batch) and execute
-them against the destination table specified by -dt
+6. With `-b`: bulk-load via the dest's native path (SQL Server → `SQLServerBulkCopy`,
+Postgres → `COPY FROM STDIN`). Otherwise: build batched INSERT statements
+(250 rows per batch) and execute them against the destination table (-dt)

 ### Type Handling
 The tool includes explicit handling for different SQL data types in a switch statement (migration mode). Quoted string types: VARCHAR/NVARCHAR, TEXT/NTEXT, CHAR/NCHAR, CLOB/NCLOB, and the PostgreSQL string-ish types JSON, JSONB, BPCHAR (PG `char(n)`), and UUID. Date/time types (DATE, TIME, TIMESTAMP/DATETIME variants) are also quoted. String types get quote escaping (`'` → `''`) and optional trimming.
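Review note on the Type Handling context line above: the quote escaping and optional trimming used on the INSERT path can be illustrated with a short helper. This is a sketch under assumptions, not the tool's actual switch statement: the name `SqlLiteral.quote` is hypothetical, trimming is assumed to mean `String.trim()` on both ends, and rendering a Java `null` as the SQL keyword `NULL` is an assumption the documentation does not state.

```java
/** Hypothetical helper (not from the commit): renders a string value as a SQL literal. */
final class SqlLiteral {
    private SqlLiteral() {}

    /**
     * Wraps the value in single quotes and doubles embedded single quotes.
     * When trim is true, surrounding whitespace is removed first (assumed behaviour of -t).
     */
    static String quote(String value, boolean trim) {
        if (value == null) {
            return "NULL"; // assumption: nulls are emitted as the bare SQL keyword
        }
        String text = trim ? value.trim() : value;
        return "'" + text.replace("'", "''") + "'";
    }
}
```

For example, `SqlLiteral.quote("O'Brien ", true)` yields `'O''Brien'`, while passing `false` keeps the trailing space inside the quotes.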
@@ -122,7 +131,7 @@ Command-line flags:
 - `-dt` - fully qualified destination table name (migration mode only)
 - `-t` - trim text fields (default: true)
 - `-c` - clear target table before insert (default: true, migration mode only)
-- `-b` - bulk copy into dest via SQLServerBulkCopy (migration mode, SQL Server dest only)
+- `-b` - bulk load into dest (migration mode): SQL Server via SQLServerBulkCopy, Postgres via COPY
 - `-f` - output format: csv, tsv (query mode only, default: csv)

 ## Key Implementation Details
@@ -187,7 +187,8 @@ jrunner -scu jdbc:postgresql://source:5432/sourcedb \
 **Options:**
 - `-t` - trim text fields (default: true)
 - `-c` - clear target table before insert (default: true)
-- `-b` - bulk copy into the destination (SQL Server dest only); streams via the
-TDS bulk-load protocol instead of batched INSERTs — far faster on large/wide
-tables. No-op for non-SQL-Server dests. (migration mode only)
+- `-b` - bulk load into the destination instead of batched INSERTs — far faster
+on large/wide tables. SQL Server: TDS bulk-load via SQLServerBulkCopy.
+Postgres: COPY FROM STDIN. Other dests (e.g. DB2) fall back to INSERT.
+(migration mode only)
 - `-f` - output format: csv, tsv (query mode only, default: csv)
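Review note on the `-b` option text above: the documented behaviour amounts to a three-way choice of load path. The sketch below shows one way such a choice could be made. It assumes the destination is identified by its JDBC URL prefix (`jdbc:sqlserver:` and `jdbc:postgresql:` are the standard driver prefixes), which may differ from how the tool actually detects the destination. `BulkPath`, `Kind` and `choose` are hypothetical names.

```java
import java.util.Locale;

/** Hypothetical dispatcher (not from the commit): picks the load path for a destination. */
final class BulkPath {
    private BulkPath() {}

    enum Kind { SQLSERVER_BULK_COPY, POSTGRES_COPY, BATCHED_INSERT }

    /**
     * Returns the native bulk path when -b is set and the destination supports one,
     * otherwise the batched INSERT path (the documented fallback, e.g. for DB2).
     */
    static Kind choose(String destJdbcUrl, boolean bulkFlag) {
        if (!bulkFlag) {
            return Kind.BATCHED_INSERT;
        }
        String url = destJdbcUrl.toLowerCase(Locale.ROOT);
        if (url.startsWith("jdbc:sqlserver:")) {
            return Kind.SQLSERVER_BULK_COPY;
        }
        if (url.startsWith("jdbc:postgresql:")) {
            return Kind.POSTGRES_COPY;
        }
        return Kind.BATCHED_INSERT;
    }
}
```

Keeping the fallback as the final return means a destination without a native bulk path still loads with `-b` set, which matches the wording of the updated option text.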