docs: update CLAUDE.md for PG streaming and new quoted types

Reflect the two behavior changes:

(1) Migration mode sets the source connection to autoCommit=false so that PostgreSQL's setFetchSize actually streams (the driver ignores it otherwise), and the doc explains why query mode is excluded.
(2) json/jsonb/bpchar/uuid are now quoted, and the default-emits-unquoted gotcha is documented for future type additions.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
parent 78c832eb1f
commit d9fd651c72
12
CLAUDE.md
@@ -81,7 +81,9 @@ The tool operates in two modes:
 7. Optionally clear target table before insert if -c flag is set
 
 ### Type Handling
-The tool includes explicit handling for different SQL data types in a switch statement (lines 229-312). Supported types include VARCHAR, TEXT, CHAR, CLOB, DATE, TIME, TIMESTAMP, and BIGINT. String types get quote escaping and optional trimming.
+The tool includes explicit handling for different SQL data types in a switch statement (migration mode). Quoted string types: VARCHAR/NVARCHAR, TEXT/NTEXT, CHAR/NCHAR, CLOB/NCLOB, and the PostgreSQL string-ish types JSON, JSONB, BPCHAR (PG `char(n)`), and UUID. Date/time types (DATE, TIME, TIMESTAMP/DATETIME variants) are also quoted. String types get quote escaping (`'` → `''`) and optional trimming.
+
+**Caveat — the `default` case emits values UNQUOTED** (correct for numerics like INT*/NUMERIC, which is why they're not listed). Any *string-typed* column whose JDBC type name isn't in the switch falls here and breaks the generated INSERT with a syntax error (e.g. PostgreSQL `bool` → `'t'`/`'f'` is currently unhandled). When adding a new source type, decide: numeric → leave to default; anything string-like → add a quoted case. A more robust future fix is to flip the default to quote-as-string with an explicit numeric allowlist.
 
 ### Database Drivers
 JDBC drivers are configured in `jrunner/build.gradle`:
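
Review note on the Type Handling hunk above: the quoting rule and the `default` caveat may be easier to follow with a small example. The following is a minimal sketch, not the tool's actual code. The class name `QuoteSketch`, the method `toSqlLiteral`, its parameters, the exact `case` labels, and the `NULL` handling are all assumptions made for illustration; only the quoted type families, the `'` → `''` escaping, the optional trimming, and the unquoted default come from the documented text.

```java
import java.util.Locale;

public class QuoteSketch {

    // Hypothetical helper: turns one column value into SQL literal text.
    // typeName stands in for the JDBC type name the real switch inspects.
    static String toSqlLiteral(String typeName, String value, boolean trim) {
        if (value == null) {
            return "NULL"; // assumption: the real tool's NULL handling is not shown in the diff
        }
        switch (typeName.toUpperCase(Locale.ROOT)) {
            // String-like types: escape embedded quotes, optionally trim, then quote.
            case "VARCHAR": case "NVARCHAR":
            case "TEXT": case "NTEXT":
            case "CHAR": case "NCHAR":
            case "CLOB": case "NCLOB":
            case "JSON": case "JSONB": case "BPCHAR": case "UUID":
                String text = trim ? value.trim() : value;
                return "'" + text.replace("'", "''") + "'";
            // Date/time types are quoted as well.
            case "DATE": case "TIME": case "TIMESTAMP": case "DATETIME":
                return "'" + value + "'";
            // Everything else is emitted unquoted: fine for numerics,
            // wrong for any string-like type missing from the cases above.
            default:
                return value;
        }
    }

    public static void main(String[] args) {
        System.out.println(toSqlLiteral("jsonb", "{\"k\": \"it's\"}", false)); // '{"k": "it''s"}'
        System.out.println(toSqlLiteral("int4", "42", false));                 // 42
        System.out.println(toSqlLiteral("bool", "t", false));                  // t
    }
}
```

The last call shows the gotcha the caveat describes: an unlisted string-like type such as `bool` falls through to the default branch and comes out as a bare `t`, which is not a valid literal in the generated INSERT.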
@@ -140,10 +142,10 @@ Both modes use a streaming architecture with no array storage of result rows:
 - Only holds up to 250 rows worth of SQL text in memory at once
 
 **JDBC Fetch Size:**
-- Both modes set `stmt.setFetchSize(10000)` (line 190)
-- This is a hint to the JDBC driver to fetch 10,000 rows at a time from the database
-- The driver maintains this internal buffer for network efficiency
-- The application code never sees or stores all 10,000 rows - it processes them one at a time via `rs.next()`
+- Both modes set `stmt.setFetchSize(10000)` — a hint to fetch 10,000 rows at a time
+- The application processes rows one at a time via `rs.next()`; the only buffer is the driver's fetch window
+
+**⚠️ PostgreSQL requires autoCommit=false for fetchSize to take effect.** The PG JDBC driver IGNORES `setFetchSize` while autoCommit is true and instead loads the ENTIRE result set into memory (OOMs / GC-thrashes on large source tables). So in **migration mode** the source connection is set to `setAutoCommit(false)` right after connecting, which enables a server-side cursor and makes streaming actually stream. This is done **only in migration mode** — query mode leaves autoCommit at its default because callers run committed DDL/DML through query mode (e.g. external tools), and autoCommit=false would roll those statements back on connection close. (jt400/MSSQL drivers stream regardless, so only PG is affected.)
 
 ### Batch Size (Migration Mode)
 INSERT statements are batched at 250 rows (hardcoded around line 356). Rows are streamed into a SQL string buffer as VALUES clauses. When 250 rows accumulate in the string, it is prepended with "INSERT INTO {table} VALUES" and executed, then the string is cleared.
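
Review note on the JDBC Fetch Size hunk above: the interaction between autoCommit and `setFetchSize` can be sketched roughly as follows. This is an illustration under stated assumptions, not the tool's code. The class `StreamingSketch` and the methods `prepareSource` and `streamRows` are hypothetical names; only `setAutoCommit(false)` in migration mode, `setFetchSize(10000)`, and row-at-a-time iteration via `rs.next()` come from the documented text. All JDBC calls used are standard `java.sql` API.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class StreamingSketch {

    static final int FETCH_SIZE = 10000; // value taken from the documented setFetchSize call

    // Hypothetical helper: called right after connecting to the source database.
    // Only migration mode turns autoCommit off; query mode keeps the driver default
    // so that DDL/DML issued through it stays committed.
    static void prepareSource(Connection source, boolean migrationMode) throws SQLException {
        if (migrationMode) {
            source.setAutoCommit(false); // lets the PostgreSQL driver honor the fetch size
        }
    }

    // Hypothetical helper: iterates the result one row at a time and returns the row count.
    // The only buffering is the driver's fetch window of up to FETCH_SIZE rows.
    static long streamRows(Connection source, String sql) throws SQLException {
        long rows = 0;
        try (Statement stmt = source.createStatement()) {
            stmt.setFetchSize(FETCH_SIZE); // a hint; PostgreSQL ignores it while autoCommit is true
            try (ResultSet rs = stmt.executeQuery(sql)) {
                while (rs.next()) {
                    rows++; // a real migration would append a VALUES clause here
                }
            }
        }
        return rows;
    }
}
```

Keeping the autoCommit decision in one small method makes the asymmetry between the two modes explicit, which is the point the warning paragraph is making.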