pf_app/pf_spec.md
Paul Trowbridge 101cb27604 Update CLAUDE.md and spec: units optional, dim_group/dim_period, delete todo.md
- units role is now optional; spec and CLAUDE.md reflect conditionality in SQL patterns
- pf.col_meta gains dim_group and dim_period_col fields (documented in both files)
- pf.dim_period calendar table added to schema docs
- pf.source default_layout column added to spec DDL
- Forecast table metadata columns corrected to pf_iter/pf_logid/pf_created_at throughout spec
- SQL patterns updated with correct CTE structure and RETURNING * to match generated code
- Project status updated to 2026-06-12; stale Arrow IPC open question removed
- todo.md deleted; open items retained in CLAUDE.md known issues

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-12 23:27:05 -04:00


Pivot Forecast — Application Spec

Overview

A web application for building named forecast scenarios against any PostgreSQL table. The core workflow is: load known historical actuals as a baseline, shift those dates forward by a specified interval into the forecast period to establish a no-change starting point, then apply incremental adjustments (scale, recode, clone) to build the plan. An admin configures a source table, generates a baseline, and opens it for users to make adjustments. Users interact with a pivot table to select slices of data and apply forecast operations. All changes are incremental (append-only), fully audited, and reversible.


Tech Stack

  • Backend: Node.js / Express
  • Database: PostgreSQL — isolated pf schema, installs into any existing DB
  • Frontend: React + Vite + Tailwind CSS; Perspective (forecast pivot)
  • Pattern: Follows fc_webapp (shell) + pivot_forecast (operations)

Database Schema: pf

Everything lives in the pf schema. Install via sequential SQL scripts.

pf.source

Registered source tables available for forecasting.

CREATE TABLE pf.source (
    id              serial PRIMARY KEY,
    schema          text NOT NULL,
    tname           text NOT NULL,
    label           text,                   -- friendly display name
    status          text DEFAULT 'active',  -- active | archived
    default_layout  jsonb,                  -- Perspective view config used as per-source default
    created_at      timestamptz DEFAULT now(),
    created_by      text,
    UNIQUE (schema, tname)
);

pf.col_meta

Column configuration for each registered source table. Determines how the app treats each column.

CREATE TABLE pf.col_meta (
    id              serial PRIMARY KEY,
    source_id       integer REFERENCES pf.source(id),
    cname           text NOT NULL,          -- column name in source table
    label           text,                   -- friendly display name
    role            text NOT NULL,          -- 'dimension' | 'value' | 'units' | 'date' | 'filter' | 'ignore'
    is_key          boolean DEFAULT false,  -- true = part of natural key (used in WHERE slice)
    opos            integer,                -- ordinal position (for ordering)
    dim_group       text,                   -- groups functionally dependent columns (see below)
    dim_period_col  text,                   -- maps this dimension to a pf.dim_period column
    UNIQUE (source_id, cname)
);

Roles:

  • dimension — categorical field (customer, part, channel, rep, geography, etc.) — appears as pivot rows/cols, used in WHERE filters for operations
  • value — the money/revenue field to scale (required — SQL generation fails without it)
  • units — the quantity field to scale (optional — if absent, units columns are omitted from the forecast table and all SQL patterns)
  • date — the primary date field; used for baseline/reference date range and stored in the forecast table (required)
  • filter — columns available as filter conditions in the Baseline Workbench (e.g. order status, ship date, open flag); used in baseline WHERE clauses but not stored in the forecast table
  • ignore — exclude from forecast table entirely

dim_group — a free-text group name linking a date column to its derived dimension siblings. When the date column has is_key = true and a dim_group value, the SQL generator looks for dimension columns in the same group that also have a dim_period_col value. Those columns are sourced from pf.dim_period on baseline/reference load (via a JOIN on drange @> date) rather than copied raw from the source table. This allows fiscal year, quarter, and month columns to be stored in the forecast table with calendar-correct values even if those columns don't exist in the source.

dim_period_col — names the column in pf.dim_period to use as the value for this dimension on load. Only meaningful when the column is in a dim_group whose date key has is_key = true. Example: cal_year, fisc_quarter, fisc_label.

pf.dim_period

Calendar lookup table. One row per month from 2018-01-01 through 2035-12-01. Keyed on sdat (month start date). Used to derive fiscal/calendar period columns at baseline load time when dim_group / dim_period_col are configured on col_meta.

Populated by setup_sql/gen_dim_period.sql (safe to re-run; ON CONFLICT DO NOTHING). Fiscal year start month is configurable at the top of that script (default: June, i.e. fiscal month 1 = June).
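The fiscal-month mapping implied by a configurable start month can be sketched in JavaScript as follows. This is an illustration only, not the contents of gen_dim_period.sql: the function name fiscalPeriod is hypothetical, and grouping fiscal quarters as consecutive blocks of three fiscal months is an assumption.

```javascript
// Hypothetical helper: maps a calendar month (1-12) to a fiscal month and quarter,
// given the calendar month in which the fiscal year starts (default 6 = June).
// Assumption: fiscal quarters are consecutive blocks of three fiscal months.
function fiscalPeriod(calMonth, fiscStartMonth = 6) {
  const fiscMonth = ((calMonth - fiscStartMonth + 12) % 12) + 1;
  const fiscQuarter = Math.ceil(fiscMonth / 3);
  return { fiscMonth, fiscQuarter };
}

console.log(fiscalPeriod(6)); // { fiscMonth: 1, fiscQuarter: 1 }
console.log(fiscalPeriod(5)); // { fiscMonth: 12, fiscQuarter: 4 }
```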

Key columns: sdat, edat, drange (GiST-indexed daterange), cal_year, cal_quarter, cal_month, cal_month_abbr, cal_month_name, cal_label, fisc_year, fisc_quarter, fisc_quarter_label, fisc_month, fisc_month_abbr, fisc_month_name, fisc_label, period_key.

The baseline/reference SQL JOINs this table when hasDimPeriod is true: JOIN pf.dim_period dp ON dp.drange @> (s.{date_col} + '{{date_offset}}'::interval)::date.

pf.version

Named forecast scenarios. One forecast table (pf.fc_{tname}_{version_id}) is created per version.

CREATE TABLE pf.version (
    id              serial PRIMARY KEY,
    source_id       integer REFERENCES pf.source(id),
    name            text NOT NULL,
    description     text,
    status          text DEFAULT 'open',        -- open | closed
    exclude_iters   jsonb DEFAULT '["reference"]', -- iter values excluded from all operations
    created_at      timestamptz DEFAULT now(),
    created_by      text,
    closed_at       timestamptz,
    closed_by       text,
    UNIQUE (source_id, name)
);

exclude_iters: jsonb array of iter values that are excluded from operation WHERE clauses. Defaults to ["reference"]. Reference rows are still returned by get_data (visible in pivot) but are never touched by scale/recode/clone. Additional iters can be added to lock them from further adjustment.
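A minimal sketch of how exclude_iters might be turned into the exclusion fragment used by operations. The function name buildExcludeClause is hypothetical, and returning an empty string for an empty list is an assumption.

```javascript
// Hypothetical helper: turns version.exclude_iters (a JSON array of strings)
// into a SQL fragment. Single quotes in values are doubled.
// Assumption: an empty or missing list produces no clause at all.
function buildExcludeClause(excludeIters) {
  if (!Array.isArray(excludeIters) || excludeIters.length === 0) return "";
  const list = excludeIters
    .map((v) => `'${String(v).replace(/'/g, "''")}'`)
    .join(", ");
  return `AND pf_iter NOT IN (${list})`;
}

console.log(buildExcludeClause(["reference"]));
// AND pf_iter NOT IN ('reference')
```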

Forecast table naming: pf.fc_{tname}_{version_id} — e.g., pf.fc_sales_3. One table per version, physically isolated. Contains both operational rows and reference rows.

Creating a version → CREATE TABLE pf.fc_{tname}_{version_id} (...)

Deleting a version → DROP TABLE pf.fc_{tname}_{version_id} + delete from pf.version + delete from pf.log

pf.log

Audit log. Every write operation gets one entry here.

CREATE TABLE pf.log (
    id          bigserial PRIMARY KEY,
    version_id  integer REFERENCES pf.version(id),
    pf_user     text NOT NULL,
    stamp       timestamptz DEFAULT now(),
    operation   text NOT NULL,  -- 'baseline' | 'reference' | 'scale' | 'recode' | 'clone'
    slice       jsonb,          -- the WHERE conditions that defined the selection
    params      jsonb,          -- operation parameters (increments, new values, scale factor, etc.)
    note        text            -- user-provided comment
);

pf.fc_{tname}_{version_id} (dynamic, one per version)

Created when a version is created. Mirrors source table dimension/value/date columns (and units if configured) plus any dim_period_col-derived dimension columns, plus forecast metadata. Contains both operational rows (pf_iter = 'baseline' | 'scale' | 'recode' | 'clone') and reference rows (pf_iter = 'reference').

-- Example: source table "sales", version id 3 → pf.fc_sales_3
CREATE TABLE pf.fc_sales_3 (
    id              bigserial PRIMARY KEY,

    -- mirrored from source (role = dimension | value | units | date only):
    customer        text,
    channel         text,
    part            text,
    geography       text,
    order_date      date,
    value           numeric,
    units           numeric,    -- omitted if no 'units' role in col_meta

    -- forecast metadata:
    pf_iter         text,       -- 'baseline' | 'reference' | 'scale' | 'recode' | 'clone'
    pf_logid        bigint REFERENCES pf.log(id),
    pf_user         text,
    pf_created_at   timestamptz DEFAULT now()
);

Note: no version_id column on the forecast table — it's implied by the table itself. The units column is only present when a column with role = 'units' exists in col_meta.

pf.sql

Generated SQL stored per source and operation. Built once when col_meta is finalized, fetched at request time.

CREATE TABLE pf.sql (
    id           serial PRIMARY KEY,
    source_id    integer REFERENCES pf.source(id),
    operation    text NOT NULL,  -- 'baseline' | 'reference' | 'scale' | 'recode' | 'clone' | 'get_data' | 'undo'
    sql          text NOT NULL,
    generated_at timestamptz DEFAULT now(),
    UNIQUE (source_id, operation)
);

Column names are baked in at generation time. Runtime substitution tokens:

| Token | Resolved from |
| --- | --- |
| {{fc_table}} | pf.fc_{tname}_{version_id}, derived at request time |
| {{where_clause}} | built from slice JSON by build_where() in JS |
| {{exclude_clause}} | built from version.exclude_iters, e.g. AND pf_iter NOT IN ('reference') |
| {{logid}} | newly inserted pf.log id |
| {{pf_user}} | from request body |
| {{date_from}} / {{date_to}} | baseline/reference date range (source period) |
| {{date_offset}} | PostgreSQL interval string to shift dates into the forecast period, e.g. 1 year, 6 months, 2 years 3 months (baseline only; defaults to 0 days, i.e. no shift) |
| {{value_incr}} / {{units_incr}} | scale operation increments |
| {{pct}} | scale mode: absolute or percentage |
| {{set_clause}} | recode/clone dimension overrides |
| {{scale_factor}} | clone multiplier |

Request-time flow:

  1. Fetch SQL from pf.sql for source_id + operation
  2. Fetch version.exclude_iters, build {{exclude_clause}}
  3. Build {{where_clause}} from slice JSON via build_where()
  4. Substitute all tokens
  5. Execute — single round trip
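The substitution step in the flow above could look roughly like this. It is a sketch under assumptions: substituteTokens is a hypothetical name, and failing on an unresolved token is a design choice made here, not something the spec states.

```javascript
// Hypothetical helper: replaces every {{token}} in a stored SQL template.
// Assumption: an unknown token is an error, so a half-substituted statement
// never reaches the database.
function substituteTokens(sqlTemplate, tokens) {
  return sqlTemplate.replace(/\{\{(\w+)\}\}/g, (match, name) => {
    if (!Object.prototype.hasOwnProperty.call(tokens, name)) {
      throw new Error(`Unresolved token: ${name}`);
    }
    return String(tokens[name]);
  });
}

const template = "SELECT * FROM {{fc_table}} WHERE {{where_clause}} {{exclude_clause}}";
const sql = substituteTokens(template, {
  fc_table: "pf.fc_sales_3",
  where_clause: "channel = 'WHS'",
  exclude_clause: "AND pf_iter NOT IN ('reference')",
});
console.log(sql);
// SELECT * FROM pf.fc_sales_3 WHERE channel = 'WHS' AND pf_iter NOT IN ('reference')
```

Single-brace placeholders such as {dimension_cols} are untouched by this pattern, which matches the split between generation-time and request-time substitution.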

WHERE clause safety: build_where() validates every key in the slice against col_meta (only role = 'dimension' columns are permitted). Values are sanitized (escaped single quotes). No parameterization — consistent with existing projects, debuggable in Postgres logs.
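As an illustration of the validation and escaping just described, a build_where()-style function might be sketched as below. The signature, the colMeta row shape ({ cname, role }), and the rejection of an empty slice are assumptions; the actual build_where() may differ.

```javascript
// Sketch of a build_where()-style function.
// colMeta: array of { cname, role } rows as stored in pf.col_meta (assumed shape).
function buildWhere(slice, colMeta) {
  const dims = new Set(
    colMeta.filter((c) => c.role === "dimension").map((c) => c.cname)
  );
  const parts = Object.entries(slice).map(([col, val]) => {
    // Only columns registered with role = 'dimension' may appear in a slice.
    if (!dims.has(col)) throw new Error(`Not a dimension column: ${col}`);
    const escaped = String(val).replace(/'/g, "''"); // double single quotes
    return `${col} = '${escaped}'`;
  });
  // Assumption: an empty slice is rejected rather than matching every row.
  if (parts.length === 0) throw new Error("Slice must contain at least one key");
  return parts.join(" AND ");
}

const meta = [
  { cname: "channel", role: "dimension" },
  { cname: "geography", role: "dimension" },
  { cname: "value", role: "value" },
];
console.log(buildWhere({ channel: "WHS", geography: "WEST" }, meta));
// channel = 'WHS' AND geography = 'WEST'
```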


Setup / Install Scripts

setup_sql/
  01_schema.sql   -- CREATE SCHEMA pf; create all metadata tables (source, col_meta, version, log, sql)

Source registration, col_meta configuration, SQL generation, version creation, and forecast table DDL all happen via API.


API Routes

DB Browser

| Method | Route | Description |
| --- | --- | --- |
| GET | /api/tables | List all tables in the DB with row counts |
| GET | /api/tables/:schema/:tname/preview | Preview columns + sample rows |

Source Management

| Method | Route | Description |
| --- | --- | --- |
| GET | /api/sources | List registered sources |
| POST | /api/sources | Register a source table |
| GET | /api/sources/:id/cols | Get col_meta for a source |
| PUT | /api/sources/:id/cols | Save col_meta configuration |
| POST | /api/sources/:id/generate-sql | Generate/regenerate all operation SQL into pf.sql |
| GET | /api/sources/:id/sql | View generated SQL for a source (inspection/debug) |
| DELETE | /api/sources/:id | Deregister a source (does not affect existing forecast tables) |

Forecast Versions

| Method | Route | Description |
| --- | --- | --- |
| GET | /api/sources/:id/versions | List versions for a source |
| POST | /api/sources/:id/versions | Create a new version (CREATE TABLE for forecast table) |
| PUT | /api/versions/:id | Update version (name, description, exclude_iters) |
| POST | /api/versions/:id/close | Close a version (blocks further edits) |
| POST | /api/versions/:id/reopen | Reopen a closed version |
| DELETE | /api/versions/:id | Delete a version (DROP TABLE + delete log entries) |

Baseline & Reference Data

| Method | Route | Description |
| --- | --- | --- |
| POST | /api/versions/:id/baseline | Load one baseline segment (additive; does not clear existing baseline rows) |
| DELETE | /api/versions/:id/baseline | Clear all baseline rows and baseline log entries for this version |
| POST | /api/versions/:id/reference | Load reference rows from source table for a date range (additive) |

Baseline load request body:

{
  "date_offset":  "1 year",
  "filters": [
    [
      { "col": "order_date",   "op": "BETWEEN", "values": ["2024-01-01", "2024-12-31"] },
      { "col": "order_status", "op": "IN",      "values": ["OPEN", "PENDING"] }
    ],
    [
      { "col": "order_status", "op": "IS NULL" }
    ]
  ],
  "pf_user":  "admin",
  "note":     "FY2024 actuals + open orders projected to FY2025",
  "replay":   false
}

The example above generates: (order_date BETWEEN '2024-01-01' AND '2024-12-31' AND order_status IN ('OPEN','PENDING')) OR (order_status IS NULL)

  • date_offset — PostgreSQL interval string applied to the primary role = 'date' column at insert time. Examples: "1 year", "6 months", "2 years 3 months". Defaults to "0 days". Applied to the stored date value only — filter columns are never shifted.
  • filters — an array of groups. Conditions within a group are AND-ed; groups are OR-ed together. Each group is an array of one or more condition objects:
    • col — must be role = 'date' or role = 'filter' in col_meta
    • op — one of =, !=, IN, NOT IN, BETWEEN, IS NULL, IS NOT NULL
    • values — array of strings; two elements for BETWEEN; multiple for IN/NOT IN; omitted for IS NULL/IS NOT NULL
    • Backward compatibility: a flat array of condition objects (non-nested) is treated as a single group (all AND).
  • At least one group with at least one condition is required.
  • raw_where — optional string. When present, bypasses filters entirely and injects the value verbatim as the WHERE clause body. Admin-only — rejected with 403 if the requesting pf_user is not in the admin list. Not validated against col_meta. Caller is responsible for correctness and SQL safety. Stored as-is in pf.log.params for audit. Cannot be combined with filters — if both are present the request is rejected with 400.
  • Baseline loads are additive: existing pf_iter = 'baseline' rows are not touched. Each load is its own log entry and is independently undoable.
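The structured filters format described above can be turned into a WHERE clause body along these lines. This is a sketch only: buildFilterClause and quote are hypothetical names, and allowedCols stands in for the set of col_meta columns with role = 'date' or role = 'filter'.

```javascript
// Quote a value as a SQL string literal, doubling embedded single quotes.
function quote(v) {
  return `'${String(v).replace(/'/g, "''")}'`;
}

// Hypothetical builder: conditions in a group are AND-ed, groups are OR-ed.
// allowedCols: Set of column names permitted by col_meta (assumed input).
function buildFilterClause(filters, allowedCols) {
  // Backward compatibility: a flat array of conditions is one AND group.
  const groups = Array.isArray(filters[0]) ? filters : [filters];
  const clauses = groups.map((group) => {
    if (group.length === 0) throw new Error("Empty filter group");
    const conds = group.map(({ col, op, values = [] }) => {
      if (!allowedCols.has(col)) throw new Error(`Column not allowed: ${col}`);
      switch (op) {
        case "=":
        case "!=":
          return `${col} ${op} ${quote(values[0])}`;
        case "IN":
        case "NOT IN":
          return `${col} ${op} (${values.map(quote).join(",")})`;
        case "BETWEEN":
          return `${col} BETWEEN ${quote(values[0])} AND ${quote(values[1])}`;
        case "IS NULL":
        case "IS NOT NULL":
          return `${col} ${op}`;
        default:
          throw new Error(`Unsupported operator: ${op}`);
      }
    });
    return `(${conds.join(" AND ")})`;
  });
  return clauses.join(" OR ");
}

const allowed = new Set(["order_date", "order_status"]);
const clause = buildFilterClause(
  [
    [
      { col: "order_date", op: "BETWEEN", values: ["2024-01-01", "2024-12-31"] },
      { col: "order_status", op: "IN", values: ["OPEN", "PENDING"] },
    ],
    [{ col: "order_status", op: "IS NULL" }],
  ],
  allowed
);
console.log(clause);
// (order_date BETWEEN '2024-01-01' AND '2024-12-31' AND order_status IN ('OPEN','PENDING')) OR (order_status IS NULL)
```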

replay controls behavior when incremental rows exist (applies to Clear + reload, not individual segments):

  • replay: false (default) — after clearing, re-load baseline segments, leave incremental rows untouched
  • replay: true — after clearing, re-load baseline, then re-execute each incremental log entry in chronological order

v1 note: replay: true returns 501 Not Implemented until the replay engine is built.

Clear baseline (DELETE /api/versions/:id/baseline): deletes all rows where pf_iter = 'baseline' and all operation = 'baseline' log entries. Irreversible (no undo). Returns { rows_deleted, log_entries_deleted }.

Reference request body: same shape as baseline load without replay. Reference dates land verbatim (no offset). Additive — multiple reference loads stack independently, each undoable by logid.

Forecast Data

| Method | Route | Description |
| --- | --- | --- |
| GET | /api/versions/:id/data | Stream all rows for this version as an Arrow IPC binary |

Transport format — Apache Arrow IPC stream

The endpoint returns Content-Type: application/vnd.apache.arrow.stream (binary). JSON is not used for this route. The client fetches the response as arrayBuffer() and passes it directly to worker.table(buffer) — Perspective's native ingestion path with no JS deserialization overhead.

Arrow's columnar layout with dictionary encoding on string dimension columns keeps payload size manageable at scale (typically 50 to 150 MB for 1M rows depending on string cardinality), compared to several times that for equivalent JSON.

Server-side streaming (cursor-based)

For datasets that may reach 1M+ rows, the server must not buffer the full query result in memory before writing the response. Instead:

  1. Open a PostgreSQL cursor over the SELECT * FROM {{fc_table}} query
  2. Fetch rows in batches (target: 10 000 rows per batch)
  3. For each batch, append a serialized Arrow record batch to the HTTP response using chunked transfer encoding
  4. Close the cursor and end the response when all batches are written

This means the first bytes of the Arrow stream reach the client while the server is still reading from the database, and Node.js heap stays bounded regardless of dataset size.
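The batch loop at the heart of the steps above might be sketched as an async generator. Arrow serialization and the HTTP response are deliberately left out. The cursor interface used here (an async read(n) that resolves to an empty array when exhausted, plus an async close()) is an assumption, and fakeCursor is an in-memory stand-in so the sketch runs without a database.

```javascript
// Yields row batches from a cursor without ever holding the full result set.
// Assumption: cursor.read(n) resolves to up to n rows, and to [] when done.
async function* readBatches(cursor, batchSize = 10000) {
  try {
    while (true) {
      const rows = await cursor.read(batchSize);
      if (rows.length === 0) break;
      yield rows; // the caller would serialize this batch and write it out
    }
  } finally {
    await cursor.close(); // always release the cursor, even on error
  }
}

// In-memory stand-in for a database cursor, used only for demonstration.
function fakeCursor(total) {
  let sent = 0;
  return {
    async read(n) {
      const count = Math.min(n, total - sent);
      const rows = Array.from({ length: count }, (_, i) => ({ id: sent + i + 1 }));
      sent += count;
      return rows;
    },
    async close() {},
  };
}

async function demo() {
  const sizes = [];
  for await (const batch of readBatches(fakeCursor(25000))) {
    sizes.push(batch.length);
  }
  return sizes;
}

demo().then((sizes) => console.log(sizes)); // [ 10000, 10000, 5000 ]
```

Because only one batch is alive at a time, memory use is governed by batchSize rather than by the size of the forecast table.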

Client-side loading

  • Moderate datasets (< ~500k rows): accumulate the full arrayBuffer() then call worker.table(buffer) once. Perspective becomes interactive after the stream completes.
  • Large datasets (≥ ~500k rows): process Arrow record batches incrementally — call worker.table(firstBatch) to create the table, then pspTable.update(batch) for each subsequent batch. Perspective is interactive and browseable while remaining batches are still arriving.

The client detects which path to use by checking the X-Row-Count response header (see below).

Row-count pre-check

Before opening the cursor, the server runs SELECT COUNT(*) FROM {{fc_table}}. The result is attached as the X-Row-Count response header so the client can choose its loading strategy. If the count exceeds 500 000, the UI displays a non-blocking notice ("Loading large dataset — pivot will become interactive as data arrives") rather than a blank screen.
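On the client, the choice between the two loading paths could be reduced to a small function like the one below. The name chooseLoadStrategy and the fallback to the buffered path when the header is missing or unparseable are assumptions.

```javascript
const LARGE_DATASET_ROWS = 500000;

// Hypothetical helper: decides the loading path from the X-Row-Count header value.
// Assumption: a missing or unparseable header falls back to the buffered path.
function chooseLoadStrategy(rowCountHeader) {
  const rowCount = Number.parseInt(rowCountHeader ?? "", 10);
  if (Number.isNaN(rowCount) || rowCount < LARGE_DATASET_ROWS) {
    return { mode: "buffered", showBanner: false };
  }
  return { mode: "incremental", showBanner: true };
}

console.log(chooseLoadStrategy("120000"));  // { mode: 'buffered', showBanner: false }
console.log(chooseLoadStrategy("1200000")); // { mode: 'incremental', showBanner: true }
```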

Forecast Operations

All operations share a common request envelope:

{
  "pf_user": "paul.trowbridge",
  "note":    "optional comment",
  "slice": {
    "channel":   "WHS",
    "geography": "WEST"
  }
}

slice keys must be role = 'dimension' columns per col_meta. Stored in pf.log as the implicit link to affected rows.

Scale

POST /api/versions/:id/scale

{
  "pf_user":    "paul.trowbridge",
  "note":       "10% volume lift Q3 West",
  "slice":      { "channel": "WHS", "geography": "WEST" },
  "value_incr": null,
  "units_incr": 5000,
  "pct":        false
}
  • value_incr / units_incr — absolute amounts to add (positive or negative). Either can be null.
  • pct: true — treat as percentage of current slice total instead of absolute
  • Excludes exclude_iters rows from the source selection
  • Distributes increment proportionally across rows in the slice
  • Inserts rows tagged pf_iter = 'scale'
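The proportional distribution can be illustrated in plain JavaScript on an in-memory slice. This sketch mirrors the arithmetic only. The function name scaleRows is hypothetical, and it assumes a percentage is given in percent points (10 means 10%), which the spec does not state.

```javascript
// Distributes an increment across rows in proportion to each row's share
// of the slice total. Returns the new rows to insert, not modified originals.
// Assumption: when pct is true, valueIncr is in percent points (10 = 10%).
function scaleRows(rows, valueIncr, pct = false) {
  const total = rows.reduce((sum, r) => sum + r.value, 0);
  const incr = pct ? (total * valueIncr) / 100 : valueIncr;
  return rows.map((r) => ({
    ...r,
    // A zero total yields null, analogous to dividing by NULLIF(total, 0) in SQL.
    value: total === 0 ? null : Math.round(((r.value * incr) / total) * 100) / 100,
    pf_iter: "scale",
  }));
}

const slice = [
  { customer: "A", value: 600 },
  { customer: "B", value: 300 },
  { customer: "C", value: 100 },
];
console.log(scaleRows(slice, 500).map((r) => r.value));      // [ 300, 150, 50 ]
console.log(scaleRows(slice, 10, true).map((r) => r.value)); // [ 60, 30, 10 ]
```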

Recode

POST /api/versions/:id/recode

{
  "pf_user": "paul.trowbridge",
  "note":    "Part discontinued, replaced by new SKU",
  "slice":   { "part": "OLD-SKU-001" },
  "set":     { "part": "NEW-SKU-002" }
}
  • set — one or more dimension fields to replace (can swap multiple at once)
  • Inserts negative rows to zero out the original slice
  • Inserts positive rows with replaced dimension values
  • Both sets of rows share the same logid — undone together
  • Inserts rows tagged pf_iter = 'recode'
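The offset-and-replace behaviour can be illustrated on in-memory rows. This is a sketch of the row arithmetic, not the generated SQL; recodeRows is a hypothetical name.

```javascript
// For each row in the slice, emit a negative row that cancels it and a
// positive row carrying the replacement dimension values.
function recodeRows(rows, set) {
  const negatives = rows.map((r) => ({
    ...r,
    value: -r.value,
    // units is optional, so it is only negated when present
    ...(r.units !== undefined ? { units: -r.units } : {}),
    pf_iter: "recode",
  }));
  const positives = rows.map((r) => ({ ...r, ...set, pf_iter: "recode" }));
  return [...negatives, ...positives];
}

const out = recodeRows(
  [{ part: "OLD-SKU-001", value: 100, units: 4 }],
  { part: "NEW-SKU-002" }
);
console.log(out);
```

Summing value over the original row and the negative row for OLD-SKU-001 gives zero, while NEW-SKU-002 carries the full amount.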

Clone

POST /api/versions/:id/clone

{
  "pf_user": "paul.trowbridge",
  "note":    "New customer win, similar profile to existing",
  "slice":   { "customer": "EXISTING CO", "channel": "DIR" },
  "set":     { "customer": "NEW CO" },
  "scale":   0.75
}
  • set — dimension values to override on cloned rows
  • scale — optional multiplier on value/units (default 1.0)
  • Does not offset original slice
  • Inserts rows tagged pf_iter = 'clone'
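A comparable in-memory sketch for clone, again showing only the row arithmetic; cloneRows is a hypothetical name.

```javascript
// Copies each row in the slice with overridden dimensions and scaled amounts.
// The original rows are left in place (no offsetting negative rows).
function cloneRows(rows, set, scale = 1.0) {
  return rows.map((r) => ({
    ...r,
    ...set,
    value: r.value * scale,
    ...(r.units !== undefined ? { units: r.units * scale } : {}),
    pf_iter: "clone",
  }));
}

const cloned = cloneRows(
  [{ customer: "EXISTING CO", channel: "DIR", value: 200, units: 8 }],
  { customer: "NEW CO" },
  0.75
);
console.log(cloned);
// [ { customer: 'NEW CO', channel: 'DIR', value: 150, units: 6, pf_iter: 'clone' } ]
```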

Audit & Undo

| Method | Route | Description |
| --- | --- | --- |
| GET | /api/versions/:id/log | List all log entries for a version, newest first |
| DELETE | /api/log/:logid | Undo: delete all forecast rows with this logid, then delete log entry |

Frontend (Web UI)

Navigation (sidebar)

Three-step collapsible sidebar (200 px expanded / 48 px collapsed, state persisted to localStorage):

  1. ① Setup — browse DB tables, register sources, configure col_meta, generate SQL. One-time admin task.
  2. ② Baseline — create/manage versions, load baseline segments, timeline preview. One-time per version.
  3. ③ Forecast — main working view: Perspective pivot + operation panel. Primary ongoing use.

Setup View (① Setup)

  • Left panel: DB table browser — all tables with row counts; click a table to open a preview modal (column list + sample rows)
  • Right panel: Registered sources list; click a source to open col_meta editor below
  • Col_meta editor: inline table — role dropdown per column, is_key checkbox, label text input, ordinal position
  • "Save" button — upserts col_meta; "Generate SQL" button — triggers generate-sql route, shows confirmation
  • "Register source" button available in the table preview modal
  • New columns default to role dimension on registration
  • Must generate SQL before a version can be created against this source

Baseline View (② Baseline)

Source and version selectors at top. Version management inline: create new version (explains that a forecast table will be created), Close / Reopen / Delete buttons. Delete drops the forecast table and removes all version records.

Baseline Workbench

A dedicated view for constructing the baseline for the selected version. The baseline is built from one or more segments; each segment is an independent query against the source table that appends rows with pf_iter = 'baseline'. Segments are additive; clearing is explicit.

Layout:

┌─────────────────────────────────────────────────────────────┐
│  Baseline — [Version name]              [Clear Baseline]     │
├─────────────────────────────────────────────────────────────┤
│  Segments loaded (from log):                                 │
│  ┌──────┬────────────────┬──────────┬───────┬──────────┐    │
│  │  ID  │  Description   │  Rows    │  By   │  [Undo]  │    │
│  └──────┴────────────────┴──────────┴───────┴──────────┘    │
├─────────────────────────────────────────────────────────────┤
│  Add Segment                                                 │
│                                                              │
│  Description  [_______________________________________]      │
│                                                              │
│  Date range   [date_from] to [date_to]  on [date col ▾]     │
│  Date offset  [0] years  [0] months                         │
│                                                              │
│  Additional filters:                                         │
│  [ + Add filter ]                                            │
│  ┌──────────────────┬──────────┬──────────────┬───────┐     │
│  │  Column          │  Op      │  Value(s)    │  [ x ]│     │
│  └──────────────────┴──────────┴──────────────┴───────┘     │
│                                                              │
│  Preview: [projected month chips]                            │
│                                                              │
│  Note  [___________]          [Load Segment]                 │
└─────────────────────────────────────────────────────────────┘

Segments list — shows all operation = 'baseline' log entries for this version, newest first. Each has an Undo button. Undo removes only that segment's rows (by logid), leaving other segments intact.

Clear Baseline: deletes all pf_iter = 'baseline' rows and all operation = 'baseline' log entries for this version. Prompts for confirmation. Used when starting over from scratch.

Add Segment form:

  • Description — free text label stored as the log note, shown in the segments list
  • Date offset — years + months spinners; shifts the primary role = 'date' column forward on insert
  • Filters — one or more filter groups that define what rows to pull. Conditions within a group are AND-ed; groups are OR-ed. There is no separate "date range" section — period selection is just a filter like any other:
    • Each group has a header row ("Group 1", "Group 2 — OR", …) and a + Add condition link
    • Within a group: Column (any role = 'date' or role = 'filter'), Operator (=, !=, IN, NOT IN, BETWEEN, IS NULL, IS NOT NULL), Value(s)
    • Value inputs: BETWEEN → two date/text inputs; IN/NOT IN → comma-separated list; =/!= → single input; omitted for IS NULL/IS NOT NULL
    • + Add OR group button appends a new empty group below, joined by an "OR" separator label
    • Groups with more than one condition render an "AND" badge between rows to make the logic explicit
    • A group can be removed with × on its header (not available when only one group remains)
    • At least one group with at least one condition is required to load a segment
  • Manual WHERE clause (admin only) — a toggle link ("Switch to manual SQL") that replaces the filter builder with a plain textarea. The admin types a raw PostgreSQL WHERE clause body (no WHERE keyword). Switching back to the builder clears the textarea. When active, the filter builder is hidden and the structured filters field is not sent; raw_where is sent instead. A prominent warning banner reads: "Raw SQL is not validated. You are responsible for correctness and security."
  • Timeline preview — rendered when any condition in any group is a BETWEEN or = on a role = 'date' column. Shows a horizontal bar (number-line style) for the source period and, if offset > 0, a second bar below for the projected period. Each bar shows start date on the left, end date on the right, duration in the centre. The two bars share the same visual width so the shift is immediately apparent. Not shown in manual WHERE mode or when no date condition is present.
  • Note — optional free text
  • Load Segment — submits; appends rows, does not clear existing baseline rows
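The years and months spinners in the form above need to become the date_offset interval string sent to the API. One way to do that, as a sketch with a hypothetical function name:

```javascript
// Converts the two spinner values into a PostgreSQL interval string.
// Falls back to "0 days" when both are zero, matching the documented default.
function toIntervalString(years, months) {
  const parts = [];
  if (years > 0) parts.push(`${years} ${years === 1 ? "year" : "years"}`);
  if (months > 0) parts.push(`${months} ${months === 1 ? "month" : "months"}`);
  return parts.length > 0 ? parts.join(" ") : "0 days";
}

console.log(toIntervalString(1, 0)); // 1 year
console.log(toIntervalString(2, 3)); // 2 years 3 months
console.log(toIntervalString(0, 0)); // 0 days
```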

Example — three-segment baseline:

| # | Description | Filter logic | Offset |
| --- | --- | --- | --- |
| 1 | All orders taken 6/1/25–3/31/26 | order_date BETWEEN 2025-06-01 AND 2026-03-31 | 0 |
| 2 | Open or unshipped orders (status missing or explicit) | (status IN ('OPEN','PENDING')) OR (status IS NULL) | 0 |
| 3 | Prior year book-and-ship 4/1/25–5/31/25 | order_date BETWEEN 2025-04-01 AND 2025-05-31 AND ship_date BETWEEN 2025-04-01 AND 2025-05-31 | 0 |

Segment 2 uses two OR groups; segment 3 has two AND conditions in one group. Any combination is valid as long as at least one group with at least one condition is present.

Forecast View

Layout:

┌─────────────────────────────────────────────────────────────────┐
│  [Version label]  [Refresh]  [Save layout]  [Reset layout]       │
├──────────────────────────────────────┬──────────────────────────┤
│                                      │                           │
│  Perspective Viewer                  │  Operation Panel          │
│  (interactive pivot web component)   │  (active when slice set)  │
│                                      │                           │
│                                      │  Slice:                   │
│                                      │    channel = WHS          │
│                                      │    geography = WEST       │
│                                      │                           │
│                                      │  [ Scale ] [ Recode ]     │
│                                      │  [ Clone ]                │
│                                      │                           │
│                                      │  ... operation form ...   │
│                                      │                           │
│                                      │  [ Submit ]               │
│                                      │                           │
└──────────────────────────────────────┴──────────────────────────┘

Pivot control: Perspective 4.4.0, loaded from CDN at runtime. Data is fetched from GET /api/versions/:id/data as an Arrow IPC binary stream and loaded into an in-browser Perspective worker — Perspective's native ingestion path. Supports grouping, splitting, filtering, sorting, and charting interactively. Layout (group_by, split_by, filters, plugin) is saved per version to localStorage via Save layout / Reset layout buttons.

Large-dataset loading sequence:

  1. Client issues GET /api/versions/:id/data
  2. Server responds with X-Row-Count header and begins streaming Arrow record batches
  3. If X-Row-Count ≥ 500 000, UI shows a non-blocking loading banner; otherwise no indicator
  4. Client calls worker.table(firstBatch) on the first batch to make the pivot interactive immediately
  5. Each subsequent batch is applied with pspTable.update(batch) as it arrives
  6. Banner clears when the stream closes

Interaction flow:

  1. Click a cell or row in the pivot — the perspective-click event fires
  2. detail.config.filter from the event is parsed: only == filters on role = dimension columns are extracted as the slice
  3. Slice populates the Operation Panel — pick operation tab, fill in parameters
  4. Submit → POST to API → new rows returned via RETURNING * are streamed directly into the Perspective table (pspTable.update(rows)) — no full reload needed
  5. For recode, both the negative offset rows and positive replacement rows are returned and streamed
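Step 2 of the flow above, extracting the slice from the click event's filter list, might look like this. It assumes filters arrive as [column, operator, value] triples and that the set of dimension column names is already available from col_meta; extractSlice is a hypothetical name.

```javascript
// Keeps only equality filters on dimension columns; everything else is ignored.
// Assumption: each filter is a [column, operator, value] triple.
function extractSlice(filters, dimensionCols) {
  const slice = {};
  for (const [col, op, value] of filters ?? []) {
    if (op === "==" && dimensionCols.has(col)) slice[col] = value;
  }
  return slice;
}

const dims = new Set(["channel", "geography"]);
const clicked = [
  ["channel", "==", "WHS"],
  ["geography", "==", "WEST"],
  ["value", ">", 100],                // not an equality filter
  ["order_date", "==", "2025-01-01"], // not a dimension column
];
console.log(extractSlice(clicked, dims));
// { channel: 'WHS', geography: 'WEST' }
```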

Pivot default layout: built from col_meta — first two dimension columns as group_by, date column as split_by. User can rearrange in Perspective settings panel and save.
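A sketch of deriving that default layout from col_meta rows. Ordering dimensions by opos to decide which are "first" is an assumption, as is the function name buildDefaultLayout.

```javascript
// Builds a default pivot layout: first two dimensions as row groups,
// the date column as the column split.
// Assumption: "first" means lowest opos; rows without opos sort as 0.
function buildDefaultLayout(colMeta) {
  const ordered = [...colMeta].sort((a, b) => (a.opos ?? 0) - (b.opos ?? 0));
  const dims = ordered.filter((c) => c.role === "dimension").map((c) => c.cname);
  const dateCol = ordered.find((c) => c.role === "date");
  return {
    group_by: dims.slice(0, 2),
    split_by: dateCol ? [dateCol.cname] : [],
  };
}

const layout = buildDefaultLayout([
  { cname: "part", role: "dimension", opos: 3 },
  { cname: "customer", role: "dimension", opos: 1 },
  { cname: "channel", role: "dimension", opos: 2 },
  { cname: "order_date", role: "date", opos: 4 },
  { cname: "value", role: "value", opos: 5 },
]);
console.log(layout);
// { group_by: [ 'customer', 'channel' ], split_by: [ 'order_date' ] }
```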

Reference rows (pf_iter = 'reference') are visible in the pivot for comparison context. Operations never affect them (enforced by exclude_iters in the version).

Log View

AG Grid list of log entries — user, timestamp, operation, slice, note, rows affected. "Undo" button per row → DELETE /api/log/:logid → grid and pivot refresh (full reload of Perspective table).


Forecast SQL Patterns

Column names baked in at generation time. Tokens substituted at request time. Metadata columns are pf_iter, pf_logid, pf_user, pf_created_at.

Units conditionality: {units_col} appears in INSERT column lists and SELECT expressions only when a units role is configured in col_meta. The SQL generator omits it entirely otherwise — no placeholder column, no zero-fill.

dim_period JOIN: when any dimension column has dim_period_col set (and its group's date key has is_key = true), the FROM clause becomes {schema}.{tname} s JOIN pf.dim_period dp ON dp.drange @> (s.{date_col} + '{{date_offset}}'::interval)::date. Those dimension columns are selected as dp.{dim_period_col} AS {col} instead of s.{col}.
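The generation-time choice between s.{col} and dp.{dim_period_col} could be sketched as below. This is not the actual generator: buildSource and the input shape are hypothetical, the source table is always aliased as s for simplicity, and the caller is assumed to have already checked that the group's date key has is_key = true.

```javascript
// Builds the dimension select list and FROM clause for baseline SQL.
// dimCols: [{ cname, dimPeriodCol }] where dimPeriodCol is null for
// ordinary dimensions (assumed shape).
function buildSource({ schema, tname, dateCol, dimCols }) {
  const hasDimPeriod = dimCols.some((c) => c.dimPeriodCol);
  const select = dimCols
    .map((c) => (c.dimPeriodCol ? `dp.${c.dimPeriodCol} AS ${c.cname}` : `s.${c.cname}`))
    .join(", ");
  let from = `${schema}.${tname} s`;
  if (hasDimPeriod) {
    // {{date_offset}} stays a token here; it is substituted at request time.
    from += ` JOIN pf.dim_period dp ON dp.drange @> (s.${dateCol} + '{{date_offset}}'::interval)::date`;
  }
  return { select, from };
}

const src = buildSource({
  schema: "public",
  tname: "sales",
  dateCol: "order_date",
  dimCols: [
    { cname: "customer", dimPeriodCol: null },
    { cname: "fisc_year", dimPeriodCol: "fisc_year" },
  ],
});
console.log(src.select); // s.customer, dp.fisc_year AS fisc_year
console.log(src.from);
```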

Baseline Load (one segment)

WITH ilog AS (
    INSERT INTO pf.log (version_id, pf_user, operation, slice, params, note)
    VALUES ({{version_id}}, '{{pf_user}}', 'baseline', NULL, '{{params}}'::jsonb, '{{note}}')
    RETURNING id
)
,ins AS (
    INSERT INTO {{fc_table}} (
        {dimension_cols}, {date_col}, {value_col} [, {units_col}],
        pf_iter, pf_logid, pf_user, pf_created_at
    )
    SELECT
        {dimension_cols},
        ({date_col} + '{{date_offset}}'::interval)::date,
        {value_col} [, {units_col}],
        'baseline', (SELECT id FROM ilog), '{{pf_user}}', now()
    FROM
        {schema}.{tname}  -- or with dim_period JOIN (see above)
    WHERE
        {{filter_clause}}
    RETURNING *
)
SELECT count(*) AS rows_affected FROM ins

Baseline loads are additive — no DELETE before INSERT. Each segment appends independently.

Token details:

  • {{date_offset}} — PostgreSQL interval string (e.g. 1 year); defaults to 0 days; applied only to the primary role = 'date' column on insert
  • {{filter_clause}} — built from filters or raw_where at request time (not baked into stored SQL since conditions vary per segment).
    • Structured path (filters): each group becomes a parenthesized AND block; groups are joined with OR. Every column is validated against col_meta (role = 'date' or role = 'filter'). Values are escaped (single quotes doubled). Supported operators: =, !=, IN, NOT IN, BETWEEN, IS NULL, IS NOT NULL.
    • Raw path (raw_where): the string is injected verbatim. No col_meta validation. Admin-only.

Clear Baseline

Two queries, run in a transaction:

DELETE FROM {{fc_table}} WHERE pf_iter = 'baseline';
DELETE FROM pf.log WHERE version_id = {{version_id}} AND operation = 'baseline';

Reference Load

WITH ilog AS (
    INSERT INTO pf.log (version_id, pf_user, operation, slice, params, note)
    VALUES ({{version_id}}, '{{pf_user}}', 'reference', NULL, '{{params}}'::jsonb, '{{note}}')
    RETURNING id
)
,ins AS (
    INSERT INTO {{fc_table}} (
        {dimension_cols}, {date_col}, {value_col} [, {units_col}],
        pf_iter, pf_logid, pf_user, pf_created_at
    )
    SELECT
        {dimension_cols}, {date_col}, {value_col} [, {units_col}],
        'reference', (SELECT id FROM ilog), '{{pf_user}}', now()
    FROM
        {schema}.{tname}  -- or with dim_period JOIN (see above)
    WHERE
        {{filter_clause}}
    RETURNING *
)
SELECT count(*) AS rows_affected FROM ins

No date offset applied — reference rows land at their original dates for prior-period comparison. Same dim_period JOIN logic applies as baseline.

Scale

WITH ilog AS (
    INSERT INTO pf.log (version_id, pf_user, operation, slice, params, note)
    VALUES ({{version_id}}, '{{pf_user}}', 'scale', '{{slice}}'::jsonb, '{{params}}'::jsonb, '{{note}}')
    RETURNING id
)
,base AS (
    SELECT
        {dimension_cols}, {date_col},
        {value_col} [, {units_col}],
        sum({value_col}) OVER () AS total_value
        [, sum({units_col}) OVER () AS total_units]
    FROM {{fc_table}}
    WHERE {{where_clause}}
    {{exclude_clause}}
)
,ins AS (
    INSERT INTO {{fc_table}} (
        {dimension_cols}, {date_col}, {value_col} [, {units_col}],
        pf_iter, pf_logid, pf_user, pf_created_at
    )
    SELECT
        {dimension_cols}, {date_col},
        round(({value_col} / NULLIF(total_value, 0)) * {{value_incr}}, 2)
        [, round(({units_col} / NULLIF(total_units, 0)) * {{units_incr}}, 5)],
        'scale', (SELECT id FROM ilog), '{{pf_user}}', now()
    FROM base
    RETURNING *
)
SELECT * FROM ins

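{{value_incr}} and {{units_incr}} are the total increments to distribute across the slice; each row receives a share proportional to its current value (or units). When pct: true they are pre-computed in JS by multiplying the slice total by pct. If the slice total is zero, NULLIF makes the share NULL instead of raising a division-by-zero error. Units expressions are omitted when no units column is configured.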
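
As an illustration of the arithmetic, the sketch below pre-computes an increment and then mirrors the proportional split performed by the `ins` CTE. It assumes pct is a fraction (0.10 for +10%), which this section does not state; `computeIncrement` and `allocate` are hypothetical names, not functions from the codebase.

```javascript
// Assumption: when isPct is true, `amount` is a fraction of the slice total
// (0.10 = +10%). Otherwise it is already an absolute increment.
function computeIncrement(sliceTotal, amount, isPct) {
  return isPct ? sliceTotal * amount : amount;
}

// Mirrors round((value / NULLIF(total, 0)) * incr, decimals) for each row.
// Note: Math.round and PostgreSQL round() differ on exact negative halves.
function allocate(values, incr, decimals = 2) {
  const total = values.reduce((a, b) => a + b, 0);
  if (total === 0) return values.map(() => null); // NULLIF(total, 0) yields NULL
  const f = 10 ** decimals;
  return values.map((v) => Math.round((v / total) * incr * f) / f);
}

// A slice of three rows totalling 1000, scaled up by 10%.
const rows = [600, 300, 100];
const incr = computeIncrement(1000, 0.1, true);
console.log(allocate(rows, incr));
```

Because each share is rounded independently, the inserted rows may not sum exactly to the requested increment when the split is uneven.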

Recode

WITH ilog AS (
    INSERT INTO pf.log (version_id, pf_user, operation, slice, params, note)
    VALUES ({{version_id}}, '{{pf_user}}', 'recode', '{{slice}}'::jsonb, '{{params}}'::jsonb, '{{note}}')
    RETURNING id
)
,src AS (
    SELECT {dimension_cols}, {date_col}, {value_col} [, {units_col}]
    FROM {{fc_table}}
    WHERE {{where_clause}}
    {{exclude_clause}}
)
,neg AS (
    INSERT INTO {{fc_table}} ({dimension_cols}, {date_col}, {value_col} [, {units_col}], pf_iter, pf_logid, pf_user, pf_created_at)
    SELECT {dimension_cols}, {date_col}, -{value_col} [, -{units_col}], 'recode', (SELECT id FROM ilog), '{{pf_user}}', now()
    FROM src
    RETURNING *
)
,ins AS (
    INSERT INTO {{fc_table}} ({dimension_cols}, {date_col}, {value_col} [, {units_col}], pf_iter, pf_logid, pf_user, pf_created_at)
    SELECT {{set_clause}}, {date_col}, {value_col} [, {units_col}], 'recode', (SELECT id FROM ilog), '{{pf_user}}', now()
    FROM src
    RETURNING *
)
SELECT * FROM neg UNION ALL SELECT * FROM ins

{{set_clause}} replaces the listed dimension columns with new values, passes others through unchanged. Both the negative (zero-out) and positive (replacement) rows share the same pf_logid and are undone together.
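
One way {{set_clause}} might be assembled is sketched below. The inputs are assumptions made for the example: `dimensionCols` is the ordered list of dimension columns and `setMap` maps a column name to its replacement value; `buildSetClause` is a hypothetical name.

```javascript
// Emit one expression per dimension column, in the same order as the
// INSERT column list: a quoted literal for replaced columns, the bare
// column name for pass-through columns.
function buildSetClause(dimensionCols, setMap) {
  for (const col of Object.keys(setMap)) {
    if (!dimensionCols.includes(col)) throw new Error(`not a dimension column: ${col}`);
  }
  return dimensionCols
    .map((col) =>
      Object.prototype.hasOwnProperty.call(setMap, col)
        ? `'${String(setMap[col]).replace(/'/g, "''")}' AS ${col}`
        : col
    )
    .join(", ");
}

console.log(buildSetClause(["region", "product", "rep"], { product: "Widget B" }));
// region, 'Widget B' AS product, rep
```

Keeping the output in dimension order matters because the INSERT matches SELECT expressions to target columns by position, not by alias.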

Clone

WITH ilog AS (
    INSERT INTO pf.log (version_id, pf_user, operation, slice, params, note)
    VALUES ({{version_id}}, '{{pf_user}}', 'clone', '{{slice}}'::jsonb, '{{params}}'::jsonb, '{{note}}')
    RETURNING id
)
,ins AS (
    INSERT INTO {{fc_table}} ({dimension_cols}, {date_col}, {value_col} [, {units_col}], pf_iter, pf_logid, pf_user, pf_created_at)
    SELECT
        {{set_clause}}, {date_col},
        round({value_col} * {{scale_factor}}, 2)
        [, round({units_col} * {{scale_factor}}, 5)],
        'clone', (SELECT id FROM ilog), '{{pf_user}}', now()
    FROM {{fc_table}}
    WHERE {{where_clause}}
    {{exclude_clause}}
    RETURNING *
)
SELECT * FROM ins
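
The {{token}} placeholders in the patterns above are filled in at request time. A minimal sketch of that substitution step follows; `renderSql` and `escapeText` are hypothetical helpers, and the real implementation may differ.

```javascript
// Replace every {{token}} in a SQL template. Single-brace placeholders such
// as {date_col} are left untouched. A missing token throws rather than
// leaving a placeholder in the statement.
function renderSql(template, tokens) {
  return template.replace(/\{\{(\w+)\}\}/g, (match, name) => {
    if (!Object.prototype.hasOwnProperty.call(tokens, name)) {
      throw new Error(`missing token: ${name}`);
    }
    return String(tokens[name]);
  });
}

// Escape text that lands inside a quoted literal such as '{{note}}'.
function escapeText(s) {
  return String(s).replace(/'/g, "''");
}

const sql = renderSql(
  "SELECT round(amount * {{scale_factor}}, 2), '{{pf_user}}' FROM t",
  { scale_factor: 1.05, pf_user: escapeText("o'neil") }
);
console.log(sql);
// SELECT round(amount * 1.05, 2), 'o''neil' FROM t
```

Numeric tokens such as {{scale_factor}} are inserted without quotes, so they would need to be validated as numbers before rendering.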

Undo

Two queries, run sequentially rather than in a single CTE because of FK ordering: the forecast rows are deleted first, then the pf.log entry they point to.

DELETE FROM {{fc_table}} WHERE pf_logid = {{logid}};
DELETE FROM pf.log WHERE id = {{logid}};
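
A sketch of how the two statements might be issued from the backend is shown below. Wrapping them in a transaction is an assumption (the spec only requires that they run in order), as is the `undoOperation` name. The `client` argument is any object with an async `query(text, params)` method, which is the shape of a node-postgres client.

```javascript
// fcTable must come from trusted version metadata, never from user input,
// because identifiers cannot be passed as bind parameters.
async function undoOperation(client, fcTable, logid) {
  await client.query("BEGIN");
  try {
    // Forecast rows first, then the log entry.
    const res = await client.query(`DELETE FROM ${fcTable} WHERE pf_logid = $1`, [logid]);
    await client.query("DELETE FROM pf.log WHERE id = $1", [logid]);
    await client.query("COMMIT");
    return res.rowCount; // number of forecast rows removed
  } catch (err) {
    await client.query("ROLLBACK");
    throw err;
  }
}
```

With the transaction, a failure in the second DELETE restores the forecast rows, so the log and the forecast table cannot end up out of step.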

Admin Setup Flow (end-to-end)

  1. Open Sources view → browse DB tables → register source table
  2. Open col_meta editor → assign roles to columns (dimension, value, units, date, filter, ignore), mark is_key dimensions, set labels
  3. Click Generate SQL → app writes operation SQL to pf.sql
  4. Open Versions view → create a named version (sets exclude_iters, creates forecast table)
  5. Open Baseline Workbench → build the baseline from one or more segments:
    • Each segment specifies a date range (on any date/filter column), date offset, and optional additional filter conditions
    • Add segments until the baseline is complete; each is independently undoable
    • Use "Clear Baseline" to start over if needed
  6. Optionally load Reference → pick prior period date range → inserts pf_iter = 'reference' rows at their original dates (for comparison in the pivot)
  7. Open Forecast view → share with users

User Forecast Flow (end-to-end)

  1. Open Forecast view → select version
  2. Pivot loads — explore data, identify slice to adjust
  3. Select cells → Operation Panel populates with slice
  4. Choose operation → fill in parameters → Submit
  5. Grid refreshes — adjustment visible immediately
  6. Repeat as needed
  7. Admin closes version when forecasting is complete

Open Questions / Future Scope

  • Baseline replay — re-execute change log against a restated baseline (replay: true); v1 returns 501
  • Approval workflow — user submits, admin approves before changes are visible to others (deferred)
  • Territory filtering — restrict what a user can see/edit by dimension value (deferred)
  • Export — download forecast as CSV or push results to a reporting table
  • Version comparison — side-by-side view of two versions (facilitated by isolated tables via UNION)
  • Col meta / version schema drift — if col_meta roles are changed after a version's forecast table is already created, the generated SQL and the table DDL go out of sync. UI should detect this: compare col_meta against the forecast table's actual columns via information_schema, warn the user, and offer to rebuild the version (drop + recreate table, preserving the version record and log). Workaround: delete and recreate the version manually.
  • Multi-connection support — currently one DB via .env. Full vision: pf.connection table (host, port, dbname, user, password as env-var ref), connection_id on pf.source, per-connection pg pools at runtime. pf schema stays on a "home" connection; source data can live anywhere. Connections UI in Setup. Safe to defer while in dev — requires clean reinstall when added since it changes the source schema.
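
The schema drift check proposed above could look roughly like the sketch below. It is not implemented in the app. It assumes that the forecast table holds the columns whose role is dimension, date, value, or units, plus the four pf_ metadata columns; the `colMeta` shape and the `detectDrift` name are also assumptions.

```javascript
// Query that would supply `actualColumns` (run with schema and table name).
const COLUMNS_SQL =
  "SELECT column_name FROM information_schema.columns " +
  "WHERE table_schema = $1 AND table_name = $2";

const PF_COLS = ["pf_iter", "pf_logid", "pf_user", "pf_created_at"];
// Assumption: only these roles are materialized in the forecast table.
const STORED_ROLES = new Set(["dimension", "date", "value", "units"]);

// Compare the columns col_meta implies against the table's actual columns.
function detectDrift(colMeta, actualColumns) {
  const expected = new Set(
    colMeta.filter((c) => STORED_ROLES.has(c.role)).map((c) => c.cname)
  );
  const actual = new Set(actualColumns.filter((c) => !PF_COLS.includes(c)));
  const missing = [...expected].filter((c) => !actual.has(c));
  const extra = [...actual].filter((c) => !expected.has(c));
  return { drifted: missing.length > 0 || extra.length > 0, missing, extra };
}
```

A non-empty `missing` or `extra` list would be the trigger for the warning and the rebuild offer.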

Project Status — 2026-06-12

What's working

  • Full backend: source registration, col_meta, SQL generation, versions, baseline segments, reference load, scale, recode, clone, undo
  • units column is optional — sources without a units column register and generate SQL correctly
  • dim_group / dim_period_col on col_meta: baseline/reference load JOINs pf.dim_period to derive fiscal/calendar period columns rather than copying them raw from the source
  • pf.dim_period calendar table (2018-2035): populated by setup_sql/gen_dim_period.sql, configurable fiscal year start
  • React + Vite + Tailwind CSS frontend in ui/, built output to public/app/, served by Express
  • Data transport: Arrow IPC binary stream (GET /api/versions/:id/data); server accumulates all rows into one record batch; client hands buffer directly to Perspective WASM
  • 3-step collapsible sidebar (Setup / Baseline / Forecast)
  • Setup view: DB table browser with preview modal, source registration, col_meta editor (dim_group/dim_period_col fields included), SQL generation
  • Baseline view: version management (create/close/reopen/delete), multi-segment baseline workbench, canvas timeline, filter builder
  • Perspective pivot in Forecast view: loads all version rows, interactive group/split/filter/chart, layout saved per version to localStorage
  • Slice extraction from perspective-click event feeds operation panel directly
  • Incremental row streaming: operation results (RETURNING *) applied to Perspective table via pspTable.update() — no full reload
  • Status bar: shows current source · version · baseline row count · status

Known issues / next focus

  • Forecast view — operation panel SQL generation complete; UI wiring to API still needed
  • Load progress bar — jittery at high throughput; throttle to ~10 updates/sec
  • Default pivot layout — per-source configurable layout not yet implemented; currently hardcodes first 2 dimensions
  • No "current version" persistence — source/version selection resets on page reload
  • Perspective slice limitation — computed date columns (Month, YearDate) from split_by don't map back to raw rows; only native dimension columns work for slice extraction
  • Col_meta / version schema drift — if col_meta changes after a version's forecast table is created, SQL and DDL go out of sync. Workaround: delete and recreate the version.
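
For the load progress bar item, throttling to roughly 10 updates per second could be sketched as below. This is one possible approach rather than the planned fix; `makeThrottled` is a hypothetical helper, and the injectable `now` parameter exists only to make the behavior easy to test.

```javascript
// Wrap fn so it runs at most once per intervalMs. The most recent skipped
// value is kept so flush() can deliver the final state (e.g. 100%).
function makeThrottled(fn, intervalMs = 100, now = Date.now) {
  let last = -Infinity;
  let pending = null;
  const throttled = (value) => {
    const t = now();
    if (t - last >= intervalMs) {
      last = t;
      pending = null;
      fn(value);
    } else {
      pending = { value };
    }
  };
  throttled.flush = () => {
    if (pending) {
      fn(pending.value);
      pending = null;
    }
  };
  return throttled;
}
```

The caller would invoke the throttled function on every chunk received and call `flush()` once the stream ends, so the bar always lands on its final value.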