diff --git a/pf_spec.md b/pf_spec.md new file mode 100644 index 0000000..c7a2db2 --- /dev/null +++ b/pf_spec.md @@ -0,0 +1,538 @@ +# Pivot Forecast — Application Spec + +## Overview + +A web application for building named forecast scenarios against any PostgreSQL table. An admin configures a source table, generates a baseline, and opens it for users to make adjustments. Users interact with a pivot table to select slices of data and apply forecast operations. All changes are incremental (append-only), fully audited, and reversible. + +--- + +## Tech Stack + +- **Backend:** Node.js / Express +- **Database:** PostgreSQL — isolated `pf` schema, installs into any existing DB +- **Frontend:** Vanilla JS + AG Grid (pivot mode) +- **Pattern:** Follows fc_webapp (shell) + pivot_forecast (operations) + +--- + +## Database Schema: `pf` + +Everything lives in the `pf` schema. Install via sequential SQL scripts. + +### `pf.source` +Registered source tables available for forecasting. + +```sql +CREATE TABLE pf.source ( + id serial PRIMARY KEY, + schema text NOT NULL, + tname text NOT NULL, + label text, -- friendly display name + status text DEFAULT 'active', -- active | archived + created_at timestamptz DEFAULT now(), + created_by text, + UNIQUE (schema, tname) +); +``` + +### `pf.col_meta` +Column configuration for each registered source table. Determines how the app treats each column. + +```sql +CREATE TABLE pf.col_meta ( + id serial PRIMARY KEY, + source_id integer REFERENCES pf.source(id), + cname text NOT NULL, -- column name in source table + label text, -- friendly display name + role text NOT NULL, -- 'dimension' | 'value' | 'units' | 'date' | 'ignore' + is_key boolean DEFAULT false, -- true = part of natural key (used in WHERE slice) + opos integer, -- ordinal position (for ordering) + UNIQUE (source_id, cname) +); +``` + +**Roles:** +- `dimension` — categorical field (customer, part, channel, rep, geography, etc.) 
— appears as pivot rows/cols, used in WHERE filters +- `value` — the money/revenue field to scale +- `units` — the quantity field to scale +- `date` — the date field used for baseline date range selection +- `ignore` — exclude from forecast table + +### `pf.version` +Named forecast scenarios. One forecast table (`pf.fc_{tname}_{version_id}`) is created per version. + +```sql +CREATE TABLE pf.version ( + id serial PRIMARY KEY, + source_id integer REFERENCES pf.source(id), + name text NOT NULL, + description text, + status text DEFAULT 'open', -- open | closed + exclude_iters jsonb DEFAULT '["reference"]', -- iter values excluded from all operations + created_at timestamptz DEFAULT now(), + created_by text, + closed_at timestamptz, + closed_by text, + UNIQUE (source_id, name) +); +``` + +**`exclude_iters`:** jsonb array of `iter` values that are excluded from operation WHERE clauses. Defaults to `["reference"]`. Reference rows are still returned by `get_data` (visible in pivot) but are never touched by scale/recode/clone. Additional iters can be added to lock them from further adjustment. + +**Forecast table naming:** `pf.fc_{tname}_{version_id}` — e.g., `pf.fc_sales_3`. One table per version, physically isolated. Contains both operational rows and reference rows. + +Creating a version → `CREATE TABLE pf.fc_{tname}_{version_id} (...)` +Deleting a version → `DROP TABLE pf.fc_{tname}_{version_id}` + delete from `pf.version` + delete from `pf.log` + +### `pf.log` +Audit log. Every write operation gets one entry here. + +```sql +CREATE TABLE pf.log ( + id bigserial PRIMARY KEY, + version_id integer REFERENCES pf.version(id), + pf_user text NOT NULL, + stamp timestamptz DEFAULT now(), + operation text NOT NULL, -- 'baseline' | 'reference' | 'scale' | 'recode' | 'clone' + slice jsonb, -- the WHERE conditions that defined the selection + params jsonb, -- operation parameters (increments, new values, scale factor, etc.) 
+ note text -- user-provided comment +); +``` + +### `pf.fc_{tname}_{version_id}` (dynamic, one per version) +Created when a version is created. Mirrors source table dimension/value/units/date columns plus forecast metadata. Contains both operational rows (`iter = 'baseline' | 'scale' | 'recode' | 'clone'`) and reference rows (`iter = 'reference'`). + +```sql +-- Example: source table "sales", version id 3 → pf.fc_sales_3 +CREATE TABLE pf.fc_sales_3 ( + id bigserial PRIMARY KEY, + + -- mirrored from source (role = dimension | value | units | date only): + customer text, + channel text, + part text, + geography text, + order_date date, + units numeric, + value numeric, + + -- forecast metadata: + iter text, -- 'baseline' | 'reference' | 'scale' | 'recode' | 'clone' + logid bigint REFERENCES pf.log(id), + pf_user text, + created_at timestamptz DEFAULT now() +); +``` + +Note: no `version_id` column on the forecast table — it's implied by the table itself. + +### `pf.sql` +Generated SQL stored per source and operation. Built once when col_meta is finalized, fetched at request time. + +```sql +CREATE TABLE pf.sql ( + id serial PRIMARY KEY, + source_id integer REFERENCES pf.source(id), + operation text NOT NULL, -- 'baseline' | 'reference' | 'scale' | 'recode' | 'clone' | 'get_data' | 'undo' + sql text NOT NULL, + generated_at timestamptz DEFAULT now(), + UNIQUE (source_id, operation) +); +``` + +**Column names are baked in at generation time.** Runtime substitution tokens: + +| Token | Resolved from | +|-------|--------------| +| `{{fc_table}}` | `pf.fc_{tname}_{version_id}` — derived at request time | +| `{{where_clause}}` | built from `slice` JSON by `build_where()` in JS | +| `{{exclude_clause}}` | built from `version.exclude_iters` — e.g. 
`AND iter NOT IN ('reference')` | +| `{{logid}}` | newly inserted `pf.log` id | +| `{{pf_user}}` | from request body | +| `{{date_from}}` / `{{date_to}}` | baseline/reference date range | +| `{{value_incr}}` / `{{units_incr}}` | scale operation increments | +| `{{pct}}` | scale mode: absolute or percentage | +| `{{set_clause}}` | recode/clone dimension overrides | +| `{{scale_factor}}` | clone multiplier | + +**Request-time flow:** +1. Fetch SQL from `pf.sql` for `source_id` + `operation` +2. Fetch `version.exclude_iters`, build `{{exclude_clause}}` +3. Build `{{where_clause}}` from `slice` JSON via `build_where()` +4. Substitute all tokens +5. Execute — single round trip + +**WHERE clause safety:** `build_where()` validates every key in the slice against col_meta (only `role = 'dimension'` columns are permitted). Values are sanitized (escaped single quotes). No parameterization — consistent with existing projects, debuggable in Postgres logs. + +--- + +## Setup / Install Scripts + +``` +setup_sql/ + 01_schema.sql -- CREATE SCHEMA pf; create all metadata tables (source, col_meta, version, log, sql) +``` + +Source registration, col_meta configuration, SQL generation, version creation, and forecast table DDL all happen via API. 
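
The request-time flow and `build_where()` safety rules described above can be sketched in Node.js. This is illustrative only — the function names, the shape of the dimension-column list (read from `pf.col_meta`), and the token map are assumptions, not part of the spec:

```javascript
// Illustrative sketch of build_where() and token substitution.
// Two safety rules from the spec: slice keys must be registered
// dimension columns, and values get single quotes escaped
// (no parameterization, by design).
function buildWhere(slice, dimensionCols) {
  const parts = [];
  for (const [col, val] of Object.entries(slice)) {
    if (!dimensionCols.includes(col)) {
      throw new Error(`slice key is not a registered dimension: ${col}`);
    }
    const safe = String(val).replace(/'/g, "''");
    parts.push(`${col} = '${safe}'`);
  }
  return parts.join(' AND ');
}

// Substitute {{token}} placeholders in SQL fetched from pf.sql.
// Unknown tokens are left intact so a missing value fails loudly
// in Postgres rather than silently producing bad SQL.
function substituteTokens(sql, tokens) {
  return sql.replace(/\{\{(\w+)\}\}/g, (match, name) =>
    name in tokens ? String(tokens[name]) : match
  );
}
```

For example, `buildWhere({ channel: 'WHS', geography: 'WEST' }, dims)` yields `channel = 'WHS' AND geography = 'WEST'`, ready to drop into `{{where_clause}}`.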
+ +--- + +## API Routes + +### DB Browser + +| Method | Route | Description | +|--------|-------|-------------| +| GET | `/api/tables` | List all tables in the DB with row counts | +| GET | `/api/tables/:schema/:tname/preview` | Preview columns + sample rows | + +### Source Management + +| Method | Route | Description | +|--------|-------|-------------| +| GET | `/api/sources` | List registered sources | +| POST | `/api/sources` | Register a source table | +| GET | `/api/sources/:id/cols` | Get col_meta for a source | +| PUT | `/api/sources/:id/cols` | Save col_meta configuration | +| POST | `/api/sources/:id/generate-sql` | Generate/regenerate all operation SQL into `pf.sql` | +| GET | `/api/sources/:id/sql` | View generated SQL for a source (inspection/debug) | +| DELETE | `/api/sources/:id` | Deregister a source (does not affect existing forecast tables) | + +### Forecast Versions + +| Method | Route | Description | +|--------|-------|-------------| +| GET | `/api/sources/:id/versions` | List versions for a source | +| POST | `/api/sources/:id/versions` | Create a new version (CREATE TABLE for forecast table) | +| PUT | `/api/versions/:id` | Update version (name, description, exclude_iters) | +| POST | `/api/versions/:id/close` | Close a version (blocks further edits) | +| POST | `/api/versions/:id/reopen` | Reopen a closed version | +| DELETE | `/api/versions/:id` | Delete a version (DROP TABLE + delete log entries) | + +### Baseline & Reference Data + +| Method | Route | Description | +|--------|-------|-------------| +| POST | `/api/versions/:id/baseline` | Load baseline from source table for a date range | +| POST | `/api/versions/:id/reference` | Load reference rows from source table for a date range | + +**Baseline request body:** +```json +{ + "date_from": "2024-01-01", + "date_to": "2024-12-31", + "pf_user": "admin", + "note": "restated actuals", + "replay": false +} +``` + +`replay` controls behavior when incremental rows exist: + +- `replay: false` 
(default) — delete existing `iter = 'baseline'` rows only, re-insert new baseline, leave all incremental rows (`scale`, `recode`, `clone`) untouched +- `replay: true` — delete all rows, re-insert new baseline, then re-execute each log entry in chronological order against the new baseline, reconstructing all adjustments + +The UI presents this as a choice when the admin re-baselines and incremental rows exist: +> "This version has N adjustments. Rebuild baseline only, or replay all adjustments against the new baseline?" + +**v1 note:** `replay: true` returns `501 Not Implemented` until the replay engine is built. The flag is designed into the API now so the request shape doesn't change later. + +**Reference request body:** same shape without `replay`. Reference loads are additive — multiple reference periods can be loaded independently under separate log entries. Each is undoable via its logid. + +### Forecast Data + +| Method | Route | Description | +|--------|-------|-------------| +| GET | `/api/versions/:id/data` | Return all rows for this version (all iters including reference) | + +Returns flat array. AG Grid pivot runs client-side on this data. + +### Forecast Operations + +All operations share a common request envelope: + +```json +{ + "pf_user": "paul.trowbridge", + "note": "optional comment", + "slice": { + "channel": "WHS", + "geography": "WEST" + } +} +``` + +`slice` keys must be `role = 'dimension'` columns per col_meta. Stored in `pf.log` as the implicit link to affected rows. + +#### Scale +`POST /api/versions/:id/scale` + +```json +{ + "pf_user": "paul.trowbridge", + "note": "10% volume lift Q3 West", + "slice": { "channel": "WHS", "geography": "WEST" }, + "value_incr": null, + "units_incr": 5000, + "pct": false +} +``` + +- `value_incr` / `units_incr` — absolute amounts to add (positive or negative). Either can be null. 
+- `pct: true` — treat as percentage of current slice total instead of absolute +- Excludes `exclude_iters` rows from the source selection +- Distributes increment proportionally across rows in the slice +- Inserts rows tagged `iter = 'scale'` + +#### Recode +`POST /api/versions/:id/recode` + +```json +{ + "pf_user": "paul.trowbridge", + "note": "Part discontinued, replaced by new SKU", + "slice": { "part": "OLD-SKU-001" }, + "set": { "part": "NEW-SKU-002" } +} +``` + +- `set` — one or more dimension fields to replace (can swap multiple at once) +- Inserts negative rows to zero out the original slice +- Inserts positive rows with replaced dimension values +- Both sets of rows share the same `logid` — undone together +- Inserts rows tagged `iter = 'recode'` + +#### Clone +`POST /api/versions/:id/clone` + +```json +{ + "pf_user": "paul.trowbridge", + "note": "New customer win, similar profile to existing", + "slice": { "customer": "EXISTING CO", "channel": "DIR" }, + "set": { "customer": "NEW CO" }, + "scale": 0.75 +} +``` + +- `set` — dimension values to override on cloned rows +- `scale` — optional multiplier on value/units (default 1.0) +- Does not offset original slice +- Inserts rows tagged `iter = 'clone'` + +### Audit & Undo + +| Method | Route | Description | +|--------|-------|-------------| +| GET | `/api/versions/:id/log` | List all log entries for a version, newest first | +| DELETE | `/api/log/:logid` | Undo: delete all forecast rows with this logid, then delete log entry | + +--- + +## Frontend (Web UI) + +### Navigation (sidebar) + +1. **Sources** — browse DB tables, register sources, configure col_meta, generate SQL +2. **Versions** — list forecast versions per source, create/close/reopen/delete +3. **Forecast** — main working view (pivot + operation panel) +4. 
**Log** — change history with undo + +### Sources View + +- Left: DB table browser (like fc_webapp) — all tables with row counts, preview on click +- Right: Registered sources list — click to open col_meta editor +- Col_meta editor: AG Grid editable table — set role per column, toggle is_key, set label +- "Generate SQL" button — triggers generate-sql route, shows confirmation +- Must generate SQL before versions can be created against this source + +### Versions View + +- List of versions for selected source — name, status (open/closed), created date, row count +- Create version form — name, description, exclude_iters (defaults to `["reference"]`) +- Per-version actions: open forecast, load baseline, load reference, close, reopen, delete + +### Forecast View + +**Layout:** +``` +┌──────────────────────────────────────────────────────────┐ +│ [Source: sales] [Version: FY2024 v1 — open] [Refresh] │ +├────────────────────────┬─────────────────────────────────┤ +│ │ │ +│ Pivot Grid │ Operation Panel │ +│ (AG Grid pivot mode) │ (active when slice selected) │ +│ │ │ +│ │ Slice: │ +│ │ channel = WHS │ +│ │ geography = WEST │ +│ │ │ +│ │ [ Scale ] [ Recode ] [ Clone ] │ +│ │ │ +│ │ ... operation form ... │ +│ │ │ +│ │ [ Submit ] │ +│ │ │ +└────────────────────────┴─────────────────────────────────┘ +``` + +**Interaction flow:** +1. Select cells in pivot — selected dimension values populate Operation Panel as slice +2. Pick operation tab, fill in parameters +3. Submit → POST to API → response shows rows affected +4. Grid refreshes (re-fetch `get_data`) + +**Reference rows** shown in pivot (for context) but visually distinguished (e.g., muted color). Operations never affect them. + +### Log View + +AG Grid list of log entries — user, timestamp, operation, slice, note, rows affected. +"Undo" button per row → `DELETE /api/log/:logid` → grid and pivot refresh. + +--- + +## Forecast SQL Patterns + +Column names baked in at generation time. Tokens substituted at request time. 
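
Every pattern below uses `{{exclude_clause}}`, derived from `version.exclude_iters`. A minimal sketch of that derivation (the function name is an assumption):

```javascript
// Build {{exclude_clause}} from version.exclude_iters (a jsonb array,
// e.g. ["reference"]). An empty array yields an empty string so the
// token disappears cleanly from the generated SQL.
function buildExcludeClause(excludeIters) {
  if (!Array.isArray(excludeIters) || excludeIters.length === 0) return '';
  const quoted = excludeIters
    .map((it) => `'${String(it).replace(/'/g, "''")}'`)
    .join(', ');
  return `AND iter NOT IN (${quoted})`;
}
```

With the default `["reference"]` this resolves to `AND iter NOT IN ('reference')`, matching the token table above; adding a locked iter to the array extends the NOT IN list.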
+ +### Baseline / Reference Load + +```sql +WITH ilog AS ( + INSERT INTO pf.log (version_id, pf_user, operation, slice, params, note) + VALUES ({{version_id}}, '{{pf_user}}', '{{operation}}', NULL, '{{params}}'::jsonb, '{{note}}') + RETURNING id +) +INSERT INTO {{fc_table}} ( + {dimension_cols}, {value_col}, {units_col}, {date_col}, + iter, logid, pf_user, created_at +) +SELECT + {dimension_cols}, {value_col}, {units_col}, {date_col}, + '{{operation}}', (SELECT id FROM ilog), '{{pf_user}}', now() +FROM + {schema}.{tname} +WHERE + {date_col} BETWEEN '{{date_from}}' AND '{{date_to}}' +``` + +Baseline route also deletes existing `iter = 'baseline'` rows before inserting. + +### Scale + +```sql +WITH ilog AS ( + INSERT INTO pf.log (version_id, pf_user, operation, slice, params, note) + VALUES ({{version_id}}, '{{pf_user}}', 'scale', '{{slice}}'::jsonb, '{{params}}'::jsonb, '{{note}}') + RETURNING id +) +,base AS ( + SELECT + {dimension_cols}, {date_col}, + {value_col}, {units_col}, + sum({value_col}) OVER () AS total_value, + sum({units_col}) OVER () AS total_units + FROM {{fc_table}} + WHERE {{where_clause}} + {{exclude_clause}} +) +INSERT INTO {{fc_table}} ( + {dimension_cols}, {date_col}, {value_col}, {units_col}, + iter, logid, pf_user, created_at +) +SELECT + {dimension_cols}, {date_col}, + round(({value_col} / NULLIF(total_value, 0)) * {{value_incr}}, 2), + round(({units_col} / NULLIF(total_units, 0)) * {{units_incr}}, 5), + 'scale', (SELECT id FROM ilog), '{{pf_user}}', now() +FROM base +``` + +`{{value_incr}}` / `{{units_incr}}` are pre-computed in JS when `pct: true` (multiply slice total by pct). 
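
That pre-computation can be sketched as a pure function. Two assumptions here: increments in pct mode are fractions of the slice totals (0.10 = +10%), and `totals` comes from a separate SUM query over the same `{{where_clause}}` / `{{exclude_clause}}`:

```javascript
// Convert pct-mode increments into the absolute {{value_incr}} /
// {{units_incr}} amounts substituted into the scale SQL.
// totals = { value, units } summed over the current slice (assumed
// fetched with the same where/exclude clauses as the scale insert).
function resolveIncrements({ value_incr, units_incr, pct }, totals) {
  if (!pct) return { value_incr, units_incr }; // already absolute
  return {
    value_incr: value_incr == null ? null : totals.value * value_incr,
    units_incr: units_incr == null ? null : totals.units * units_incr,
  };
}
```

A null increment stays null in either mode, so the scale SQL can skip the corresponding column untouched.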
+ +### Recode + +```sql +WITH ilog AS ( + INSERT INTO pf.log (version_id, pf_user, operation, slice, params, note) + VALUES ({{version_id}}, '{{pf_user}}', 'recode', '{{slice}}'::jsonb, '{{params}}'::jsonb, '{{note}}') + RETURNING id +) +,src AS ( + SELECT {dimension_cols}, {date_col}, {value_col}, {units_col} + FROM {{fc_table}} + WHERE {{where_clause}} + {{exclude_clause}} +) +,negatives AS ( + INSERT INTO {{fc_table}} ({dimension_cols}, {date_col}, {value_col}, {units_col}, iter, logid, pf_user, created_at) + SELECT {dimension_cols}, {date_col}, -{value_col}, -{units_col}, 'recode', (SELECT id FROM ilog), '{{pf_user}}', now() + FROM src +) +INSERT INTO {{fc_table}} ({dimension_cols}, {date_col}, {value_col}, {units_col}, iter, logid, pf_user, created_at) +SELECT {{set_clause}}, {date_col}, {value_col}, {units_col}, 'recode', (SELECT id FROM ilog), '{{pf_user}}', now() +FROM src +``` + +`{{set_clause}}` replaces the listed dimension columns with new values, passes others through unchanged. + +### Clone + +```sql +WITH ilog AS ( + INSERT INTO pf.log (version_id, pf_user, operation, slice, params, note) + VALUES ({{version_id}}, '{{pf_user}}', 'clone', '{{slice}}'::jsonb, '{{params}}'::jsonb, '{{note}}') + RETURNING id +) +INSERT INTO {{fc_table}} ({dimension_cols}, {date_col}, {value_col}, {units_col}, iter, logid, pf_user, created_at) +SELECT + {{set_clause}}, {date_col}, + round({value_col} * {{scale_factor}}, 2), + round({units_col} * {{scale_factor}}, 5), + 'clone', (SELECT id FROM ilog), '{{pf_user}}', now() +FROM {{fc_table}} +WHERE {{where_clause}} +{{exclude_clause}} +``` + +### Undo + +```sql +DELETE FROM {{fc_table}} WHERE logid = {{logid}}; +DELETE FROM pf.log WHERE id = {{logid}}; +``` + +--- + +## Admin Setup Flow (end-to-end) + +1. Open **Sources** view → browse DB tables → register source table +2. Open col_meta editor → assign roles to columns, mark is_key dimensions, set labels +3. Click **Generate SQL** → app writes operation SQL to `pf.sql` +4. 
Open **Versions** view → create a named version (sets `exclude_iters`, creates forecast table) +5. Load **Baseline** → pick date range → inserts `iter = 'baseline'` rows +6. Optionally load **Reference** → pick prior year date range → inserts `iter = 'reference'` rows +7. Open **Forecast** view → share with users + +## User Forecast Flow (end-to-end) + +1. Open **Forecast** view → select version +2. Pivot loads — explore data, identify slice to adjust +3. Select cells → Operation Panel populates with slice +4. Choose operation → fill in parameters → Submit +5. Grid refreshes — adjustment visible immediately +6. Repeat as needed +7. Admin closes version when forecasting is complete + +--- + +## Open Questions / Future Scope + +- **Baseline replay** — re-execute change log against a restated baseline (`replay: true`); v1 returns 501 +- **Timing shifts** — redistribute value/units across date buckets (deferred) +- **Approval workflow** — user submits, admin approves before changes are visible to others (deferred) +- **Territory filtering** — restrict what a user can see/edit by dimension value (deferred) +- **Export** — download forecast as CSV or push results to a reporting table +- **Version comparison** — side-by-side view of two versions (facilitated by isolated tables via UNION) +- **Multi-DB sources** — currently assumes same DB; cross-DB would need connection config per source