jrunner prints an in-place progress counter (a bare number) every batch.
Because pipekit reads that output with universal newlines, each tick became
its own live_log line, so a large load stacked thousands of numbers (a
1.27M-row INSERT left 5,073
progress lines). append_run_live_log now overwrites a trailing bare-number
line with the new tick instead of appending, keeping a single current count.
Real lines (headers, "N rows written", timestamps) aren't bare numbers and
are preserved. No jrunner change.
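A minimal sketch of the overwrite rule, assuming live_log is a newline-joined
string. `append_live_line` is a hypothetical stand-in, not the actual
append_run_live_log, and it assumes a tick only replaces the trailing line
when both are bare numbers:

```python
import re

_BARE_NUMBER = re.compile(r"^\d+$")


def append_live_line(log: str, new_line: str) -> str:
    """Append new_line to log, collapsing consecutive bare-number ticks."""
    lines = log.split("\n") if log else []
    new_line = new_line.strip()
    if lines and _BARE_NUMBER.match(lines[-1]) and _BARE_NUMBER.match(new_line):
        # Trailing line is the previous tick: overwrite it in place.
        lines[-1] = new_line
    else:
        # Headers, "N rows written", timestamps, or the first tick: append.
        lines.append(new_line)
    return "\n".join(lines)
```

Feeding "header", "250", "500", "750", "750 rows written" through it leaves
three lines: the header, the latest count, and the summary.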
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Pass jrunner's -b flag when the dest JDBC URL is jdbc:sqlserver:, so SQL
Server loads stream via TDS bulk copy instead of 250-row INSERT...VALUES
round trips. Non-SQL-Server dests are unchanged. Requires the jrunner -b
support (bulk-copy branch).
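The gating can be sketched as below. `jrunner_bulk_flags` is a hypothetical
helper name; only the -b flag and the jdbc:sqlserver: prefix come from this
change, and the case-insensitive match is an assumption:

```python
def jrunner_bulk_flags(dest_jdbc_url: str) -> list[str]:
    """Return the extra jrunner flags to pass for this destination URL."""
    if dest_jdbc_url.strip().lower().startswith("jdbc:sqlserver:"):
        return ["-b"]  # SQL Server dest: request bulk copy
    return []          # every other dest: unchanged command line
```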
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
wizard_create built the generated source query's column aliases with the
DEST driver's quote_identifier, but that query runs on the SOURCE. A pg->SQL
Server module emitted "AS [col]" (SQL Server brackets) into a Postgres query,
which failed with: syntax error at or near "[". The load maps columns by
position, so the alias is cosmetic — quote it with the source dialect.
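A sketch of the fix under stated assumptions: the two driver classes and
`build_source_select` are illustrative stand-ins, not pipekit's code; only the
quote_identifier method name is taken from the description above.

```python
class PgDriver:
    def quote_identifier(self, name: str) -> str:
        # Postgres: double quotes, embedded quotes doubled.
        return '"' + name.replace('"', '""') + '"'


class MssqlDriver:
    def quote_identifier(self, name: str) -> str:
        # SQL Server: brackets, embedded closing bracket doubled.
        return "[" + name.replace("]", "]]") + "]"


def build_source_select(source_driver, table: str,
                        columns: list[tuple[str, str]]) -> str:
    """Build the source query; aliases use the SOURCE dialect's quoting."""
    q = source_driver.quote_identifier
    select_list = ", ".join(f"{q(src)} AS {q(alias)}" for src, alias in columns)
    return f"SELECT {select_list} FROM {table}"
```

With a Postgres source and a SQL Server dest, the aliases come out as
"order_id" rather than [order_id], so the query parses on the source.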
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Add an ASGI middleware that buffers each request body onto the scope and
replays it downstream, so the HTTPException handler can log the submitted
payload alongside the error. Fields whose name looks secret (password/pwd/
secret/token) are masked. Makes failures like the wizard 500 debuggable
against the actual call content.
Covers raised HTTPExceptions; handlers that *return* an error response are
not body-logged yet.
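A pure-ASGI sketch of the buffer-and-replay idea, not the middleware as
shipped. The class name, the `captured_body` scope key, and the "***" mask are
assumptions made for illustration:

```python
SECRET_MARKERS = ("password", "pwd", "secret", "token")


def mask_secrets(fields: dict) -> dict:
    """Mask values whose field name looks secret."""
    return {
        key: "***" if any(m in key.lower() for m in SECRET_MARKERS) else value
        for key, value in fields.items()
    }


class BodyCaptureMiddleware:
    def __init__(self, app):
        self.app = app

    async def __call__(self, scope, receive, send):
        if scope["type"] != "http":
            await self.app(scope, receive, send)
            return

        # Drain the request body, which may arrive in several chunks.
        chunks = []
        more_body = True
        while more_body:
            message = await receive()
            if message["type"] != "http.request":
                break  # client disconnected before the body completed
            chunks.append(message.get("body", b""))
            more_body = message.get("more_body", False)
        body = b"".join(chunks)
        scope["captured_body"] = body  # hypothetical key read by the handler

        replayed = False

        async def replay():
            # Hand the buffered body downstream once, then defer to the
            # real receive channel (e.g. for http.disconnect).
            nonlocal replayed
            if not replayed:
                replayed = True
                return {"type": "http.request", "body": body, "more_body": False}
            return await receive()

        await self.app(scope, replay, send)
```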
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
run_dest_sql executes DDL via jrunner query mode (executeQuery), which demands
a ResultSet. CREATE SCHEMA/TABLE produce none, so the driver throws, jrunner
logs the trace and exits 0, and _detect_silent_failure flags it as a failure
unless the message is allowlisted. Only PG's wording was listed ("No results
were returned by the query"); SQL Server says "The statement did not return a
result set." — so pg->mssql wizard provisioning died on the first statement.
Add the SQL Server phrasing to the benign list.
Verified against the live SQL Server connection.
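The allowlist check amounts to a substring match. The two messages are the
ones quoted above; the constant and function names are hypothetical, and the
real _detect_silent_failure may be structured differently:

```python
BENIGN_NO_RESULT_MESSAGES = (
    "No results were returned by the query",      # Postgres wording
    "The statement did not return a result set",  # SQL Server wording
)


def is_benign_no_result(output: str) -> bool:
    """True when jrunner output contains an allowlisted 'no ResultSet' message."""
    return any(message in output for message in BENIGN_NO_RESULT_MESSAGES)
```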
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Failures surfaced via HTTPException (e.g. wizard dest-provisioning errors
raised as HTTPException(500, "dest provisioning failed: …")) were turned
into responses by FastAPI and never logged — only the access line showed,
so the real DB error went to the browser and vanished from the journal.
Register a StarletteHTTPException handler that logs 5xx at ERROR (with
exc_info, capturing the chained cause) and 4xx at WARNING, then defers to
the default handler. Also configure pipekit's logger to emit to stderr so
INFO-level records aren't dropped by Python's logging last-resort handler,
which only emits WARNING and above when no handler is configured.
Unhandled (non-HTTPException) errors were already logged by uvicorn.
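A stdlib-only sketch of the two pieces of logging logic. The FastAPI handler
registration is omitted, the logger name "pipekit" and both function names are
assumptions, and the format string is arbitrary:

```python
import logging
import sys


def configure_pipekit_logger(level: int = logging.INFO) -> logging.Logger:
    """Attach a stderr handler so records below WARNING are not lost."""
    logger = logging.getLogger("pipekit")
    if not logger.handlers:  # avoid stacking handlers on repeated calls
        handler = logging.StreamHandler(sys.stderr)
        handler.setFormatter(
            logging.Formatter("%(levelname)s %(name)s: %(message)s"))
        logger.addHandler(handler)
    logger.setLevel(level)
    return logger


def level_for_status(status_code: int) -> int:
    """5xx responses log at ERROR; other HTTPExceptions (4xx) at WARNING."""
    return logging.ERROR if status_code >= 500 else logging.WARNING
```

Inside the exception handler this would be used roughly as
logger.log(level_for_status(exc.status_code), ..., exc_info=...) before
delegating to the default handler.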
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Modules' column listings were write-once at wizard time — no way to add a
column to an established sync (e.g. an RRN watermark column on a history
table) without hand-editing columns_json and ALTERing the dest by hand.
Phase 0 (groundwork):
- columns_json rows get a stable `id` (c1, c2, …): the data-movement
  identity for future schema reconciliation (the load is positional).
- repo.update_module_columns to persist the listing.
- Driver.build_add_column_sql + Driver.column_inventory.
Phase 1 (append a column):
- "+ add column" on the module detail page -> column_form.html.
- POST /modules/{id}/columns: validates the name isn't already in the
  listing or on the table, runs ALTER TABLE … ADD COLUMN (appends at the
  tail, rows preserved), applies COMMENT ON COLUMN where supported, and
  appends to columns_json. Re-renders with an error on conflict/DDL failure.
Append-only and non-destructive; reorder/retype/drop (which can require a
table rebuild) are out of scope for this phase. Verified end-to-end against
the live PG dest on a throwaway module.
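A sketch of the append path, assuming columns_json rows are dicts with "id",
"name" and "type" keys (the key names other than `id` are guesses). The
helper names are hypothetical and the SQL is Postgres-flavoured; the real
Driver.build_add_column_sql signature may differ:

```python
def next_column_id(columns: list[dict]) -> str:
    """Next stable id (c1, c2, ...); never reuses a number already taken."""
    used = []
    for col in columns:
        cid = str(col.get("id", ""))
        if cid.startswith("c") and cid[1:].isdigit():
            used.append(int(cid[1:]))
    return f"c{max(used, default=0) + 1}"


def pg_add_column_sql(table: str, name: str, sql_type: str) -> str:
    """ALTER TABLE statement that appends one column at the tail."""
    return f'ALTER TABLE {table} ADD COLUMN "{name}" {sql_type}'


def append_column(columns: list[dict], name: str, sql_type: str) -> list[dict]:
    """Return a new listing with the column appended; reject duplicates."""
    if any(col["name"].lower() == name.lower() for col in columns):
        raise ValueError(f"column {name!r} is already in the listing")
    new_row = {"id": next_column_id(columns), "name": name, "type": sql_type}
    return columns + [new_row]
```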
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The wizard previously required picking a single source table; modules whose
entry point is arbitrary SQL (CTEs, joins, computed columns) didn't fit. Add a
"write SQL" path alongside "browse a table":
- Driver.introspect_query_columns + _zero_row_wrap discover a query's result
  columns by running it with ~no rows. Generic wrap is a derived table with
  WHERE 1=0; DB2 appends FETCH FIRST 1 ROW ONLY (DB2 for i forbids WITH inside
  a nested table expression).
- /wizard/sql + POST /wizard/sql/columns seed the column-mapping grid; dest
  types default to text (no result-set type metadata over jrunner CSV).
- wizard_step3.html grows a sql_mode branch (array-named inputs, query shown
  verbatim, no column unchecking); wizard_create branches on entry_mode.
Verified end-to-end against a live DB2 for i connection, including a top-level
CTE query.
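The two wrapping strategies can be sketched as follows. This is an
illustration, not the actual _zero_row_wrap: the `dialect` parameter, the
derived-table alias, and the trailing-semicolon handling are assumptions.

```python
def zero_row_wrap(query: str, dialect: str = "generic") -> str:
    """Wrap a user query so it returns column headers with (almost) no rows."""
    q = query.strip().rstrip(";").strip()
    if dialect == "db2":
        # Keep the query top-level so a leading WITH stays legal; cap at 1 row.
        return f"{q} FETCH FIRST 1 ROW ONLY"
    # Generic: derived table filtered down to zero rows.
    return f"SELECT * FROM ({q}) zero_row_probe WHERE 1=0"
```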
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Add dialect-aware DDL hooks to the Driver base (create_schema_sql,
drop_table_if_exists_sql, create_like_table_sql, check_dest_table) and
implement DB2/MSSQL overrides so they can serve as merge destinations,
not just Postgres. runner.py now dispatches staging table creation
through the dest driver instead of hardcoding PG syntax.
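A sketch of the dispatch pattern. The hook names match the list above, but
the signatures and the SQL each override emits are assumptions; the
"DROP TABLE IF EXISTS" default assumes a dialect that supports it (Postgres,
SQL Server 2016+):

```python
class Driver:
    def drop_table_if_exists_sql(self, table: str) -> str:
        return f"DROP TABLE IF EXISTS {table}"

    def create_like_table_sql(self, staging: str, dest: str) -> str:
        raise NotImplementedError


class PgDriver(Driver):
    def create_like_table_sql(self, staging: str, dest: str) -> str:
        return f"CREATE TABLE {staging} (LIKE {dest})"


class MssqlDriver(Driver):
    def create_like_table_sql(self, staging: str, dest: str) -> str:
        # No LIKE clause in T-SQL: copy the shape with an empty SELECT INTO.
        return f"SELECT * INTO {staging} FROM {dest} WHERE 1 = 0"


def staging_ddl(driver: Driver, staging: str, dest: str) -> list[str]:
    """Statements the runner would execute to rebuild the staging table."""
    return [
        driver.drop_table_if_exists_sql(staging),
        driver.create_like_table_sql(staging, dest),
    ]
```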
Also untrack pipekit.db (runtime SQLite state) and add it to .gitignore.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- deploy.sh: set /etc/pipekit to root:pipekit 0775 and secrets.env to
pipekit:pipekit 0640 so group members can run 'pipekit secrets set'
without sudo
- cli.py secrets set: drop os.chown() on temp file — non-root users
can't chown to the pipekit service user, and os.replace() preserves
the target file's ownership anyway
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
- New pipekit/web/auth.py: itsdangerous-signed cookie, 8hr expiry,
  auto-generates signing secret in settings table on first use
- GET/POST /login and POST /logout routes (public, no auth dependency)
- All other web routes protected via router-level require_web_auth dep
- Starlette middleware injects request.state.current_user for templates
- Topbar shows logged-in username + logout button when session active
- Reuses existing api_user/api_pass credentials and api_auth_enabled flag
- Add itsdangerous>=2.1 to requirements.txt
- Enable api_auth_enabled in config.yaml
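To illustrate what a signed, expiring session cookie checks, here is a
stdlib-only sketch using HMAC. It is a stand-in for the idea, not the
itsdangerous-based code in auth.py; the function names and cookie layout are
invented for the example, and only the 8-hour expiry comes from this change:

```python
import hashlib
import hmac
import time

MAX_AGE_S = 8 * 3600  # 8hr expiry


def _signature(payload: str, secret: bytes) -> str:
    return hmac.new(secret, payload.encode(), hashlib.sha256).hexdigest()


def sign_session(username: str, secret: bytes, now=None) -> str:
    """Cookie value of the form '<username>:<issued_at>:<hex signature>'."""
    issued_at = int(time.time() if now is None else now)
    payload = f"{username}:{issued_at}"
    return f"{payload}:{_signature(payload, secret)}"


def verify_session(cookie: str, secret: bytes, now=None):
    """Return the username, or None if malformed, tampered with, or expired."""
    try:
        payload, signature = cookie.rsplit(":", 1)
        username, issued = payload.rsplit(":", 1)
        issued_at = int(issued)
    except ValueError:
        return None
    if not hmac.compare_digest(signature, _signature(payload, secret)):
        return None
    current = time.time() if now is None else now
    if current - issued_at > MAX_AGE_S:
        return None
    return username
```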
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
- Size all table columns to fit content (em-based) rather than loose percentages
- Add white-space:nowrap to groups header and last-run cells
- Sticky topbar and panel header so New Module button stays visible while scrolling
- Scope min-width:0 to label.field inputs so they don't blow past two-col grid borders
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
- Convert all timestamps to local time for display and scheduling
- deploy.sh: detect JAVA_HOME and inject into systemd unit at deploy time
- repo: add duration_s to get_group_run query
The group run detail page was crashing because get_group_run returned no
duration_s field, unlike the list queries. Fixes 500 on /group-runs/{id}.
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
The pipekit system user has no java on its PATH. deploy.sh now detects
JAVA_HOME by searching common locations and injects Environment= lines
into the installed unit file, making deploys portable across machines.
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Scheduler now evaluates cron expressions against local time instead of
UTC, so schedules fire at the user's local clock time. All timestamp
displays in templates use a new `localtime` Jinja filter that converts
UTC strings from SQLite to the server's local timezone. Updated CLAUDE.md
to reflect the systemd service setup.
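A sketch of what such a filter can look like, assuming SQLite stores
timestamps as "YYYY-MM-DD HH:MM:SS" in UTC. The `tz` parameter is added here
only to make the example deterministic; the real filter may differ:

```python
from datetime import datetime, timezone, tzinfo
from typing import Optional


def localtime(value: Optional[str], tz: Optional[tzinfo] = None) -> str:
    """Convert a UTC timestamp string to local time for display."""
    if not value:
        return ""
    utc = datetime.strptime(value, "%Y-%m-%d %H:%M:%S").replace(
        tzinfo=timezone.utc)
    # astimezone(None) converts to the server's local timezone.
    return utc.astimezone(tz).strftime("%Y-%m-%d %H:%M:%S")
```

It would be registered on the Jinja environment via
env.filters["localtime"] = localtime and used as {{ run.started_at | localtime }}.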
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
SQLite needs write access to the repo directory to create journal files
alongside pipekit.db. Fixed by setting group pipekit + g+w on the
directory itself only (not recursive).
Driver registration now matches existing rows by kind before falling
back to name, so re-deploys update the correct row regardless of what
name was used at initial registration.
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Avoids stripping write access from the developer. The service only needs
to own pipekit.db (runtime writes) and .venv (created as pipekit).
Source code stays owned by whoever ran deploy.sh.
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
deploy.sh now prints each step with what it's doing, adds the invoking
user to the pipekit group automatically, uses --home-dir /nonexistent
for the system user, and passes --no-cache-dir to pip to suppress the
home directory warning.
cli.py: removed the kind-based early-exit in drivers register that was
short-circuiting before the upsert logic, so re-running deploy now
correctly updates existing driver rows rather than printing "already
registered". Also removed the now-unused --force flag.
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Scheduling: cron-based group runs via a daemon thread (scheduler.py)
started at API startup. Schedules managed inline on the group edit form.
last_fired_at persisted before run to prevent double-fire on restart.
Requires croniter (added to requirements.txt); DB migration adds
last_fired_at column to schedule table.
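The persist-before-run ordering can be sketched like this. `fire_if_due`, the
dict-shaped schedule, and the `save`/`run` callables are hypothetical;
computing `due_at` from the cron expression (croniter in the real code) is
left out:

```python
def fire_if_due(schedule: dict, due_at: str, save, run) -> bool:
    """Fire a schedule at most once per due time.

    due_at is an ISO-style timestamp string, so plain string comparison
    orders it correctly against last_fired_at.
    """
    last = schedule.get("last_fired_at")
    if last is not None and last >= due_at:
        return False  # already fired for this slot, e.g. before a restart
    schedule["last_fired_at"] = due_at
    save(schedule)  # persist first: a crash mid-run must not refire
    run(schedule)
    return True
```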
Deploy: deploy.sh now creates the pipekit system user, chowns the repo,
builds the venv as pipekit, and installs/enables the systemd unit.
systemd/pipekit.service is now a production-ready unit (User=pipekit
uncommented). pipekit secrets set preserves existing file permissions
instead of resetting to 0600. Driver registration is now idempotent
(upsert via get_driver_by_name + update_driver).
Docs: CLAUDE.md and SPEC.md updated to reflect groups, scheduling,
scheduler-in-API-process architecture, TUI deferred (not dropped),
stop-on-failure tradeoff, jrunner as prerequisite, and deploy flow.
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Groups allow multiple modules to be run sequentially in a defined order.
Adds full CRUD (repo, engine orchestrator, web routes, templates) for grp,
group_member, and group_run tables that were previously schema-only. Module
index now shows group membership badges per module. Wizard default dest name
now sanitizes source column names with spaces or special characters to valid
identifiers rather than failing at CREATE TABLE time.
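One way to do the name sanitizing, as a sketch. The exact rules
(lowercasing, underscore replacement, the "col" fallback) are assumptions,
not necessarily what the wizard does:

```python
import re


def sanitize_identifier(name: str) -> str:
    """Turn an arbitrary source column name into a plain SQL identifier."""
    # Collapse every run of non-identifier characters into one underscore.
    cleaned = re.sub(r"[^0-9A-Za-z_]+", "_", name.strip()).strip("_").lower()
    if not cleaned:
        return "col"
    if cleaned[0].isdigit():
        cleaned = "_" + cleaned  # identifiers may not start with a digit
    return cleaned
```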
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Module detail and index Run/Dry run buttons no longer redirect to the
run page. The status cell (index) and recent runs panel (detail) poll
every 3s while running and stop automatically when idle. force_poll
ensures polling starts immediately after clicking Run despite the race
between the HTTP response and the background task setting running=1.
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Supports iSeries schema names that contain dots (e.g. CMS.CUSLG).
Strips surrounding double quotes on input so users don't need to worry
about quoting — the driver's quote_identifier handles that when building SQL.
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
create_run now sets started_at on INSERT. list_runs computes duration_s
via julianday arithmetic. Both the module detail and runs page show
duration formatted as Xs or Xm Ys. A Jinja2 filter handles formatting.
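A sketch of both halves: the julianday arithmetic and the formatting filter.
The function names are hypothetical and the None handling is an assumption:

```python
import sqlite3


def duration_seconds(started_at, finished_at):
    """Elapsed seconds between two SQLite timestamps via julianday()."""
    conn = sqlite3.connect(":memory:")
    try:
        row = conn.execute(
            "SELECT (julianday(?) - julianday(?)) * 86400.0",
            (finished_at, started_at),
        ).fetchone()
    finally:
        conn.close()
    return row[0]  # None when either timestamp is NULL


def format_duration(seconds) -> str:
    """Format as 'Xs' or 'Xm Ys'; empty string when unknown."""
    if seconds is None:
        return ""
    minutes, secs = divmod(int(round(seconds)), 60)
    return f"{minutes}m {secs}s" if minutes else f"{secs}s"
```

The rounding matters because julianday() returns fractional days as floats,
so the product is only approximately a whole number of seconds.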
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Watermarks, merge strategy, merge key, and source query are now edited
together in one form on both the module edit page and wizard step 3.
A client-side placeholder warning fires when {name} tokens in the query
don't match the watermark rows on the page. The wizard now shows an
editable source query textarea pre-populated from column picks so WHERE
clauses can be added before module creation. Watermarks submitted via
wm_* arrays are processed by _save_inline_watermarks() in both
module_update and wizard_create.
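The placeholder check itself runs client-side; the same comparison is
sketched here in Python for illustration, with a hypothetical function name
and the assumption that tokens are word characters inside braces:

```python
import re


def placeholder_mismatches(query: str, watermark_names: list[str]):
    """Return (tokens with no watermark row, watermark rows never used)."""
    tokens = set(re.findall(r"\{(\w+)\}", query))
    names = set(watermark_names)
    return sorted(tokens - names), sorted(names - tokens)
```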
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Web POST /modules/{id}/run now returns immediately (BackgroundTasks)
instead of blocking until the run completes. jrunner.migrate() switches
from subprocess.run to Popen so stdout lines are read as they arrive and
appended to run_log.live_log via repo.append_run_live_log(). The run
detail page embeds an HTMX fragment that polls /runs/{id}/live every 2s
while status=running, showing current status, row count, and live output;
polling stops automatically once the run finishes.
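A sketch of the Popen-based streaming loop. `stream_lines` and its `on_line`
callback are hypothetical; in pipekit the callback role is played by
repo.append_run_live_log:

```python
import subprocess


def stream_lines(cmd: list[str], on_line) -> int:
    """Run cmd, passing each output line to on_line as it arrives."""
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # interleave stderr with stdout
        text=True,                 # decode and use universal newlines
        bufsize=1,                 # line-buffered on our side
    )
    assert proc.stdout is not None
    with proc.stdout:
        for line in proc.stdout:
            on_line(line.rstrip("\n"))
    return proc.wait()
```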
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
- Wire watermark WHERE clause into GL20000 source query ({dex_row_id} placeholder was present but query had no WHERE clause)
- Fix watermark resolver connection for GL20000 (was pointing at AS400, should be postgres dest)
- Resolve watermarks live on dry runs and module detail page load instead of using defaults
- Use status='dry_run' (not 'success') for dry runs so they can be filtered from recent runs UI
- Add exclude_status param to repo.list_runs; module detail excludes dry_run rows
- Expand run_log CHECK constraint to include 'dry_run'; backfill 16 historical records
- Delete SPEC_v1_archive.md (obsolete v1 design doc)
- Update SPEC.md and CLAUDE.md to reflect current engine flow and status values
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Adds a /wizard/schemas JSON endpoint and a live-filtered schema picker
panel on step 2. Clicking a row fills the schema input; the datalist
also powers browser autocomplete. MSSQL refetches when database or
linked_server qualifiers change. CSS fixes prevent picker tables and
two-col grid items from overflowing their containers.
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Base provides a no-op default; drivers opt in by overriding. MSSQL
scopes the lookup to a linked server / database when those qualifiers
are supplied.
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Previously the existing-dest check fired on submit and surfaced as a raw
JSON 400. Now step 3 introspects the default dest up front and renders a
yellow banner listing existing columns; submit-time mismatches render
wizard_error.html (409) with missing vs. existing side-by-side and a back
link that re-plumbs the form qvals.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
New text input + "jump to columns" button skip the full table listing
when you already know what you want. Typing "schema.table" and tabbing
out auto-splits into the schema qualifier + table name. Jump button
stays disabled until the table field has a value.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
If the dest table already exists, introspect its columns and verify the
wizard's picks line up. Missing columns surface a specific error message
naming what's missing instead of the opaque "column X does not exist"
from a failed COMMENT. On match, skip CREATE + COMMENT so existing
schema and comments aren't touched; staging still gets provisioned.
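The line-up check reduces to a set difference. A sketch with a hypothetical
helper name, assuming column names compare case-insensitively:

```python
def missing_dest_columns(picks: list[str], existing: list[str]) -> list[str]:
    """Wizard-picked columns that the existing dest table lacks, in pick order."""
    have = {name.lower() for name in existing}
    return [name for name in picks if name.lower() not in have]
```

An empty result means the existing table covers the picks and CREATE +
COMMENT can be skipped.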
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The tracked launcher now checks for .venv/bin/python3 under the repo and
uses it if present, else falls back to system python3. Works pre-deploy
(no venv) and post-deploy (venv exists) without being modified. Deploy
no longer regenerates the file, so `git pull` on a deployed box won't
conflict with the launcher.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
cmd_serve now reads api_host from Config with a 127.0.0.1 safe default,
matching the existing api_port pattern. --host/--port CLI flags still
override. Local config is bumped to bind 0.0.0.0:8200.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Without -H, sudo keeps HOME pointed at the invoking user, so pip running
as root tries to write to /home/<user>/.cache/pip and disables caching
with a warning. -H resets HOME to /root while -E preserves the rest.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
deploy.sh is the idempotent rollout path: venv + deps, launcher,
/etc/pipekit/secrets.env skeleton (mode 0600), schema init, and
auto-register of every JDBC driver shipped with jrunner. systemd
unit is a template, not auto-installed — user copies it when ready
to cut over.
`pipekit secrets {list,set,unset}` manages /etc/pipekit/secrets.env
with atomic 0600 writes so passwords don't need sudoedit. Prompted
input by default; positional value allowed for scripting.
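An atomic 0600 write generally follows the temp-file-then-rename pattern. A
sketch, not the actual cli.py code; the function name and temp-file prefix
are invented:

```python
import os
import tempfile


def write_secrets_atomically(path: str, content: str) -> None:
    """Replace path with content so readers never see a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    # Same directory as the target, so os.replace stays on one filesystem.
    fd, tmp = tempfile.mkstemp(dir=directory, prefix=".secrets-")
    try:
        with os.fdopen(fd, "w") as fh:
            fh.write(content)
            fh.flush()
            os.fsync(fh.fileno())
        os.chmod(tmp, 0o600)  # mkstemp already uses 0600; stated explicitly
        os.replace(tmp, path)
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
```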
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Registers a driver-table row from the CLI. Kind is validated against
the code-level driver registry; JDBC class names default from a
built-in table (db2, pg, mssql). Refuses to double-register a kind
unless --force is passed.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Staging table drift caused silent data loss when dest grew columns but
staging kept the old shape. Fix on three fronts:
- Runner now DROP+CREATEs staging each run instead of CREATE IF NOT
  EXISTS, so any drift self-heals.
- Wizard create drop+creates staging right after dest is provisioned,
  surfacing DDL errors at create time.
- Module edit drops the (old-name) staging table and re-applies
  COMMENT ON TABLE when dest_description changed.
jrunner's query mode uses executeQuery() which raises
"No results were returned by the query" after DDL/DML succeeds; the
stack-trace detector now allowlists that exception so normal
CREATE/TRUNCATE/INSERT runs aren't flagged as failures.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Modules get a full edit form (name, connections, tables, source query,
merge config, description, enabled); reachable via Edit button on the
detail page and the source-query panel.
jrunner catches SQLException and calls System.exit(0) at every failure
site, so pipekit was marking runs success when the migrate phase had
actually errored. query() and migrate() now scan stdout+stderr for a
Java stack-trace signature and raise JrunnerError. runner.py also
captures the failed jrunner output onto run_log so the stack trace is
visible on the run detail page.
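A sketch of stack-trace detection on combined output. The regex is one
plausible "signature" (a Java frame line such as "at pkg.Class.method(File.java:123)");
the pattern pipekit actually uses is not shown here, and
`raise_on_stack_trace` is a hypothetical name:

```python
import re


class JrunnerError(RuntimeError):
    """jrunner exited 0 but its output shows a Java exception."""


# An indented "at <qualified.method>(<source>)" stack frame line.
_STACK_FRAME = re.compile(r"^\s+at [\w.$<>]+\([^)]*\)\s*$", re.MULTILINE)


def raise_on_stack_trace(stdout: str, stderr: str) -> None:
    """Raise JrunnerError if either stream contains a Java stack frame."""
    combined = f"{stdout}\n{stderr}"
    if _STACK_FRAME.search(combined):
        raise JrunnerError(combined.strip())
```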
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Delete button lives in module-detail header, refuses to delete a
running module, and clears run_log history first since it doesn't
cascade from module. Wizard now returns 409 on duplicate name before
touching the destination, so a failed resubmit doesn't redundantly
rerun CREATE TABLE / COMMENT ON on the dest.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Orchestration layer around the jrunner Java JDBC CLI, replacing the
previous shell-based sync system in .archive/pre-rewrite. Includes
the FastAPI + Jinja web frontend, per-driver adapters (DB2, MSSQL,
PG), wizard-driven module creation with editable dest types and
source-sourced table/column descriptions, watermark/hook CRUD,
and the engine that runs modules end-to-end.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>