Compare commits

24 Commits

Author SHA1 Message Date
24675feb49 Mappings: clear regex filter after Save all
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-15 20:33:54 -04:00
dc32060c42 Add global Remap page for bulk output value replacement
- SQL: search_mapping_outputs(search) — distinct (col, val, count) groups
         get_mappings_by_output_field(col, val) — individual mappings
         remap_output_field(col, from, to) — bulk UPDATE via jsonb_set
- API: GET /mappings/outputs?search=, GET /mappings/outputs/:col/:val,
       POST /mappings/remap-field
- UI: Remap page — search output values, click to select, edit the
  replacement value, see all affected mappings, apply globally
- Nav: Remap added between Mappings and Records

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-15 20:22:52 -04:00
bda59c7675 Docs: update Pivot spec section and add Perspective technical reference
SPEC.md: rewrite Pivot page description to cover named layouts, depth
control, selection mode, inspector filtering, and layout persistence.

docs/perspective-pivot.md: new file documenting all discovered Perspective
v4.4.0 APIs — viewer/plugin/view methods, selection modes, set_depth
mechanism, perspective-click event shape, full state save/restore pattern,
and common pitfalls.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-15 08:58:36 -04:00
420bc1bbe8 Pivot: round numbers in inspector to 2 decimals with adjustable precision
formatVal now rounds numeric values using toLocaleString with
configurable decimal places (default 2, range 0-8). Adds -/+ controls
in the inspector header to adjust on the fly.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-15 08:54:34 -04:00
8d3cc24094 Pivot: skip inspector query when no group_by hierarchy is active
Without group_by there are no coordinate filters, so the view query
would return the full dataset and hang. Early-return on click if
config.group_by is empty.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-15 08:52:09 -04:00
6e9cdd82ea Pivot: widen detail pane from w-80 to w-96
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-15 08:43:47 -04:00
ed07dde492 Pivot: default settings panel to hidden on fresh load
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-15 08:42:07 -04:00
7c07434049 Pivot: save/restore edit mode and expand depth in named layouts
- Default selection mode is now SELECT_REGION
- plugin.save()/restore() used to capture and apply edit mode
- expand_depth tracked in ref and included in layout config
- applyExpandDepth helper restores depth on layout recall and page load
- Save button overwrites active layout in place (no re-typing name)
- captureConfig() helper shared by save-over and save-as flows

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-15 08:40:24 -04:00
b88795b015 Clean up expand depth control into proper toolbar UI
Replace debug test buttons with a minimal 'depth: 0 1 2 3' control
in the pivot toolbar right side.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-15 08:05:47 -04:00
3a172e2456 Find working expand depth control: view.set_depth + plugin.draw
After testing plugin_config.expand_depth (no effect) and view.set_depth
+ flush() (no effect), confirmed that view.set_depth(d) followed by
plugin.draw(view) correctly collapses/expands all rows to depth d.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-15 08:05:10 -04:00
0b8c2935d7 Add expand_depth test buttons to Pivot toolbar
Temporary UI for testing programmatic row expansion control via
plugin_config.expand_depth in Perspective viewer.restore().

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-15 07:55:37 -04:00
3723778cbb Pivot: named layouts saved in DB per source
- pivot_layouts table (source_name, layout_name, config JSONB)
- list/save/delete SQL functions and API routes
- Pivot toolbar above viewer: layout chips, save-as inline input,
  delete per layout, reset to default
- Applying a named layout also updates localStorage working state
- Layouts reload on source change

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-15 07:31:46 -04:00
23fa14f22c Pivot: move save layout button to top-left
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-15 07:28:52 -04:00
c98efe58d1 Pivot: show all row metrics in inspector, highlight clicked cell
Always display all non-null metric columns from the clicked row.
When a specific cell can be identified (split_by in use, cell mode),
highlight that row in blue/bold. Fixes row mode showing only one value.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 23:02:04 -04:00
ec0cc73f31 SPEC: add Pivot and Log pages, update file structure
Document the Perspective-based pivot viewer, cell inspector
behavior, layout persistence, and row matching approach.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 22:54:40 -04:00
fb9ff8720a Pivot: use event filters for row matching, skip computed columns
Replace __ROW_PATH__ zip approach with direct application of
perspective-click event filters against raw rows. Fields not
present in the raw data (Perspective computed columns like Month,
YearDate) are skipped. Also removes debug console.log calls.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 22:51:48 -04:00
1587d06967 Pivot: add debug logging for cell click investigation
Temporary logs to inspect perspective-click event detail.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 22:41:55 -04:00
f7d73ad821 Pivot: clean up click inspector upper pane display
Show row path prominently, filter to non-null metric values,
use group_by › split_by as section header.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 22:31:56 -04:00
1631dbd2cc Pivot: fix slice filtering by zipping __ROW_PATH__ with group_by columns
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 22:14:45 -04:00
7ec571635a Pivot: improve filterRows normalization for pivoted cells
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 22:07:29 -04:00
e3ceb70fc6 Pivot: row select default, click inspector with underlying rows
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 22:03:58 -04:00
ebd88a2df8 Source setup UX, Pivot page, and import/view fixes
- Fix stale import_records in sources.sql that referenced deleted generate_constraint_key
- Auto-transform after import, auto-generate view after create
- New source form matches existing source layout (In view, Seq, type dropdown)
- Sample data table (50 rows) shown below field config in both new and existing source views
- Import sample CSV on create (checked by default)
- Sortable column headers on field table
- Choose CSV styled as a button showing filename
- + button in sidebar opens new source form
- Records tab shows error message when view cast fails instead of blank
- Pivot page with Perspective viewer, per-source saved layouts

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 21:31:44 -04:00
d495ef2fc5 Records filters, global picklist, autocomplete, and rule reprocess
- Records tab: regex filter bar (postgres ~*), add/remove filters, debounced,
  ANDed together; get_view_data gains p_filters JSONB param
- Global picklist: sources.global_picklist flag (default true) controls whether
  a source's mapped output values feed the cross-source autocomplete suggestion pool;
  toggle on Sources page; get_global_output_values() SQL function
- Mappings: replace native datalist with custom AutocompleteInput component —
  Alt+Down opens, Tab cycles, Enter selects, arrow keys navigate, Escape closes
- Rules: auto-reprocess source records when a rule is created or updated
- preview_rule: fix BIGINT/INT return type mismatch
- Stale get_import_log removed from sources.sql
- TSV export: fetch with auth headers instead of plain <a href> (fixes 401)
- + column button: more visible styling

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 16:28:26 -04:00
d63d70cd52 Import log, constraint key overhaul, and dedup improvements
- Rename dedup_key/dedup_fields → constraint_key/constraint_fields everywhere
  (schema, functions, routes, UI, migration script, docs)
- Change constraint_key from MD5 TEXT hash to readable JSONB object
- Drop unique constraint on (source_name, constraint_key); dedup is now
  enforced at import time via CTE, allowing intra-file duplicate rows
- Add import_id FK (ON DELETE CASCADE) so deleting a log entry removes its records
- Add info JSONB to import_log with inserted_keys and excluded_keys arrays
- Add get_import_log, get_all_import_logs, delete_import SQL functions
- Auto-apply transformations immediately after import
- Import UI: expandable key detail, checkbox selection, delete with confirm,
  import ID column, transform result display
- New Log page: global import log across all sources

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-13 23:44:30 -04:00
21 changed files with 1967 additions and 362 deletions

@ -19,7 +19,7 @@ Dataflow is a simple data transformation tool for importing, cleaning, and stand
### Database Schema (`database/schema.sql`)
**5 simple tables:**
- `sources` - Source definitions with `dedup_fields` array
- `sources` - Source definitions with `constraint_fields` array
- `records` - Imported data with `data` (raw) and `transformed` (enriched) JSONB columns
- `rules` - Regex extraction rules with `field`, `pattern`, `output_field`
- `mappings` - Input/output value mappings
@ -123,9 +123,11 @@ records.data → apply_transformations() →
```
### Deduplication
- Hash is MD5 of concatenated values from `dedup_fields`
- Unique constraint on `(source_name, dedup_key)` prevents duplicates
- Import function catches unique violations and counts them
- `constraint_key` is a JSONB object of the constraint field values (readable, no hashing)
- Dedup is enforced at import time via CTE — no unique DB constraint
- Intra-file duplicate rows are allowed (bank may send identical rows); they all insert
- On re-import, all rows whose constraint_key already exists in the DB are skipped
- Deleting an import log entry cascades to all records from that batch (import_id FK)
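The dedup rules above can be sketched in memory. This is only an illustration of the described behavior, not the SQL function itself; `constraintKey` and `planImport` are hypothetical names, and values are coerced to text on the assumption that the key is built from text values as in the import function.

```javascript
// In-memory stand-in for import-time dedup (hypothetical helpers).

// Canonical string form of a constraint key: field names sorted,
// values coerced to text, null when the field is missing.
function constraintKey(record, constraintFields) {
  const key = {};
  for (const f of [...constraintFields].sort()) {
    const v = record[f];
    key[f] = v == null ? null : String(v);
  }
  return JSON.stringify(key);
}

// Split incoming rows into rows to insert and rows to skip.
// `existingKeys` is a Set of keys already stored for the source.
function planImport(rows, constraintFields, existingKeys) {
  const inserted = [];
  const skipped = [];
  for (const row of rows) {
    const key = constraintKey(row, constraintFields);
    // Only keys already in the database cause a skip; duplicates
    // inside the same file are all kept.
    (existingKeys.has(key) ? skipped : inserted).push(row);
  }
  return { inserted, skipped };
}

const fields = ['date', 'amount'];
const file = [
  { date: '2026-04-01', amount: '12.50', memo: 'coffee' },
  { date: '2026-04-01', amount: '12.50', memo: 'coffee' }, // identical row, same file
  { date: '2026-04-02', amount: '40.00', memo: 'fuel' },
];

const firstRun = planImport(file, fields, new Set());
const stored = new Set(firstRun.inserted.map(r => constraintKey(r, fields)));
const secondRun = planImport(file, fields, stored);

console.log(firstRun.inserted.length, firstRun.skipped.length);   // 3 0
console.log(secondRun.inserted.length, secondRun.skipped.length); // 0 3
```

The first import inserts all three rows, including the intra-file duplicate; re-importing the same file skips all three.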
### Error Handling
- API routes use `try/catch` and pass errors to `next(err)`
@ -184,7 +186,7 @@ The simplification makes it easy to understand, modify, and maintain.
- Check for SQL errors in logs
**All records marked as duplicates:**
- Verify `dedup_fields` match actual field names in data
- Verify `constraint_fields` match actual field names in data
- Check if data was already imported
- Use different source name for testing

44
SPEC.md

@ -61,6 +61,8 @@ ui/
Rules.jsx — rule CRUD with live pattern preview
Mappings.jsx — mapping table with TSV import/export
Records.jsx — paginated, sortable view of transformed records
Pivot.jsx — interactive pivot table with cell inspector
Log.jsx — global import log across all sources
public/ — compiled UI (output of npm run build in ui/)
```
@ -71,10 +73,10 @@ public/ — compiled UI (output of npm run build in ui/)
Five tables in the `dataflow` schema:
### `sources`
Defines a data source. The `dedup_fields` array specifies which fields make a record unique. `config` (JSONB) holds the output schema (`fields` array) used to generate the typed view.
Defines a data source. The `constraint_fields` array specifies which fields make a record unique. `config` (JSONB) holds the output schema (`fields` array) used to generate the typed view.
### `records`
Stores every imported record. `data` holds the raw import. `transformed` holds the enriched record after rules and mappings are applied. `dedup_key` is an MD5 hash of the dedup fields — a unique constraint on `(source_name, dedup_key)` prevents duplicate imports.
Stores every imported record. `data` holds the raw import. `transformed` holds the enriched record after rules and mappings are applied. `constraint_key` is a JSONB object of the constraint field values used to detect duplicates at import time. `import_id` references the `import_log` row; deleting a log entry cascades to its records.
### `rules`
Regex transformation rules. Each rule reads from `field`, applies `pattern` with optional `flags`, and writes to `output_field`. `function_type` is either `extract` (regexp_matches) or `replace` (regexp_replace). `sequence` controls the order rules are applied. `retain` keeps the raw extracted value in `output_field` even when a mapping overrides it.
@ -83,7 +85,7 @@ Regex transformation rules. Each rule reads from `field`, applies `pattern` with
Maps an extracted value to a standardized output object. `input_value` is JSONB (matches the extracted value exactly, including arrays from multi-capture-group patterns). `output` is a JSONB object that can contain multiple fields (e.g., `{"vendor": "Walmart", "category": "Groceries"}`).
### `import_log`
Audit trail. One row per import call, recording how many records were inserted versus skipped as duplicates.
Audit trail. One row per import call, recording how many records were inserted versus skipped as duplicates. `info` (JSONB) stores the full `inserted_keys` and `excluded_keys` arrays. Deleting a log row cascades to its records via the `import_id` FK.
---
@ -92,8 +94,11 @@ Audit trail. One row per import call, recording how many records were inserted v
### Import
```
CSV file → parse in Node.js → import_records(source, data)
→ generate_dedup_key() per record → INSERT with unique constraint
→ count inserted vs duplicates → log to import_log
→ build JSONB constraint_key per record
→ compare against existing records (CTE — no unique constraint)
→ INSERT new records, skip duplicates
→ log to import_log (with inserted_keys / excluded_keys)
→ apply_transformations() runs automatically on new records
```
### Transform
@ -145,7 +150,7 @@ All routes are under `/api`. Every route requires HTTP Basic Auth. The `GET /hea
| GET | /api/sources | List all sources |
| POST | /api/sources | Create source |
| GET | /api/sources/:name | Get source |
| PUT | /api/sources/:name | Update source (dedup_fields, config) |
| PUT | /api/sources/:name | Update source (constraint_fields, config) |
| DELETE | /api/sources/:name | Delete source and all data |
| POST | /api/sources/suggest | Suggest source config from CSV upload |
| POST | /api/sources/:name/import | Import CSV records |
@ -203,15 +208,36 @@ Built with React + Vite + Tailwind CSS. Compiled output goes to `public/`. The s
**Pages:**
- **Sources** — View and edit source configuration. Shows all known field names and their origins (raw data, schema, rules, mappings). Checkboxes control which fields are dedup keys and which appear in the output view. Supports CSV upload to auto-detect fields.
- **Sources** — View and edit source configuration. Shows all known field names and their origins (raw data, schema, rules, mappings). Checkboxes control which fields are constraint fields and which appear in the output view. Supports CSV upload to auto-detect fields.
- **Import** — Upload a CSV to import records into the selected source. Shows import log with inserted/duplicate counts per import.
- **Import** — Upload a CSV to import records into the selected source. Transformations run automatically on new records. Shows import log with inserted/duplicate counts, expandable key detail, checkbox selection, and delete with confirmation.
- **Rules** — Create and manage regex rules. Live preview fires automatically (debounced 500ms) as pattern/field/flags are edited, showing match results against real records. Rules can be enabled/disabled by toggle.
- **Mappings** — Tabular mapping editor. Shows all extracted values from transformed records with record counts and sample raw data. Rows are yellow (unmapped), white (mapped), or blue (edited but unsaved). Supports TSV export and import. Columns can be added dynamically.
- **Records** — Paginated table showing the `dfv.{source}` view. Server-side sorting (column validated against `information_schema.columns`, interpolated with `quote_ident`). Dates are formatted `YYYY-MM-DD` for correct lexicographic sort.
- **Records** — Paginated table showing the `dfv.{source}` view. Server-side sorting (column validated against `information_schema.columns`, interpolated with `quote_ident`). Dates are formatted `YYYY-MM-DD` for correct lexicographic sort. Regex filters can be added per column. If the view cast fails (e.g. a field typed as `date` contains text), the error is shown inline rather than a blank page.
- **Pivot** — Interactive pivot/crosstab powered by [Perspective](https://perspective.finos.org/) (`@perspective-dev` v4.4.0, loaded from CDN at runtime). Loads all rows from the source view into an in-browser Perspective worker and renders a `<perspective-viewer>` web component. Supports grouping, splitting, filtering, sorting, and charting interactively.
**Toolbar (above the viewer):**
- Named layouts — saved per source in the `pivot_layouts` DB table. Each chip recalls the full viewer state including group_by, split_by, filters, expressions, selection mode, and expand depth. A blue **Save** button overwrites the active layout in place; **+ Save as…** saves to a new name. The × on each chip deletes it.
- **depth: 0 1 2 3** — collapses or expands all grouped rows to the specified hierarchy level. Implemented via `view.set_depth(d)` + `plugin.draw(view)` (the only working mechanism found in v4.4.0 — `plugin_config.expand_depth` and `viewer.flush()` alone have no effect).
- The Perspective built-in **selection mode button** (Read-Only / Select Row / Select Column / Select Region) defaults to **Select Region** on fresh load, set directly via `plugin.restore({ edit_mode: 'SELECT_REGION' })` after the viewer loads.
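A minimal sketch of the depth control described in the toolbar notes, assuming `view` and `plugin` are the Perspective view and active plugin objects already obtained from the viewer. The helper name `applyExpandDepth` matches the commit log, but this body is an illustration, not the project's code; the stubs exist only so the snippet runs on its own.

```javascript
// Collapse or expand all grouped rows to `depth`.
// Assumes view.set_depth and plugin.draw behave as described above.
async function applyExpandDepth(view, plugin, depth) {
  if (!Number.isInteger(depth) || depth < 0) {
    throw new RangeError(`depth must be a non-negative integer, got ${depth}`);
  }
  await view.set_depth(depth); // set the target hierarchy level
  await plugin.draw(view);     // redraw so the new depth takes effect
  return depth;
}

// Stub objects standing in for the real Perspective view and plugin.
const calls = [];
const stubView = { set_depth: async d => { calls.push(['set_depth', d]); } };
const stubPlugin = { draw: async v => { calls.push(['draw', v === stubView]); } };

applyExpandDepth(stubView, stubPlugin, 2).then(() => console.log(calls));
```

Awaiting both calls keeps the order explicit: the depth is set first, then the plugin redraws against the same view.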
**Cell inspector (right panel):**
- Opens when a cell is clicked and a `group_by` hierarchy is active. If there is no `group_by`, the click is ignored — without coordinate filters the query would return the full dataset.
- Row filtering uses a temporary Perspective view (`table.view({ filter: eventFilters, expressions: config.expressions })`) so that computed/expression columns in `split_by` are evaluated correctly. Falls back to JS-side filtering if the view query fails.
- Shows cell coordinates (group_by and split_by values), the clicked metric with value, any user-set filters, and a table of matching raw rows.
- Number formatting rounds to 2 decimal places by default; a -/+ control in the inspector header adjusts precision (0-8).
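The JS-side fallback filter could look roughly like the sketch below. It assumes filters arrive as `[column, operator, value]` triples and that a column missing from the raw rows is a computed column to skip; `filterRows` is named after the commit log, but the body and the limited operator support are illustrative assumptions.

```javascript
// Apply click-event filters to raw rows, ignoring filters on columns
// that do not exist in the raw data (e.g. computed columns).
function filterRows(rows, filters) {
  if (rows.length === 0) return [];
  const rawFields = new Set(Object.keys(rows[0]));
  // Keep only filters that reference a raw column.
  const usable = filters.filter(([col]) => rawFields.has(col));
  return rows.filter(row =>
    usable.every(([col, op, value]) => {
      const same = String(row[col]) === String(value);
      if (op === '==') return same;
      if (op === '!=') return !same;
      return true; // operators not modeled in this sketch are treated as matching
    })
  );
}

const rawRows = [
  { vendor: 'Walmart', amount: 10 },
  { vendor: 'Target', amount: 5 },
];
const eventFilters = [
  ['vendor', '==', 'Walmart'],
  ['Month', '==', '2026-04'], // not a raw column, so it is skipped
];

console.log(filterRows(rawRows, eventFilters)); // one row: the Walmart record
```

Comparing values as strings sidesteps type differences between the event payload and the raw rows, at the cost of treating `10` and `'10'` as equal.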
**Layout persistence:**
- `localStorage` key `psp_layout_{source}` saves the last viewer state on each named layout save.
- Named layouts store `{ ...viewer.save(), plugin_config: plugin.save(), expand_depth }` as JSONB in `pivot_layouts`. On recall, viewer config, plugin config (edit mode), and expand depth are all restored independently.
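A hedged sketch of the capture and recall pattern described above. `captureConfig` is named after the commit log; `applyConfig` and its `applyDepth` callback are hypothetical, and the stub viewer and plugin only mimic the `save()`/`restore()` calls so the example runs standalone.

```javascript
// Capture viewer state, plugin state, and expand depth as one layout object.
async function captureConfig(viewer, plugin, expandDepth) {
  const viewerState = await viewer.save();
  const pluginState = await plugin.save();
  return { ...viewerState, plugin_config: pluginState, expand_depth: expandDepth };
}

// Restore the three parts independently, in the order listed above.
async function applyConfig(viewer, plugin, config, applyDepth) {
  const { plugin_config, expand_depth, ...viewerState } = config;
  await viewer.restore(viewerState);
  if (plugin_config) await plugin.restore(plugin_config);
  if (expand_depth != null) await applyDepth(expand_depth);
}

// Stubs standing in for the real viewer and plugin objects.
const restoredState = {};
const stubViewer = {
  save: async () => ({ group_by: ['vendor'], split_by: [] }),
  restore: async c => { restoredState.viewer = c; },
};
const stubLayoutPlugin = {
  save: () => ({ edit_mode: 'SELECT_REGION' }),
  restore: async c => { restoredState.plugin = c; },
};

captureConfig(stubViewer, stubLayoutPlugin, 2)
  .then(cfg => applyConfig(stubViewer, stubLayoutPlugin, cfg, async d => { restoredState.depth = d; }))
  .then(() => console.log(restoredState));
```

Splitting `plugin_config` and `expand_depth` out before calling `viewer.restore` keeps each part under separate control on recall.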
See `docs/perspective-pivot.md` for the full technical reference on controlling Perspective programmatically.
- **Log** — Global import log across all sources. Same expandable key detail and delete capability as the Import page, plus a source name column.
---

@ -39,6 +39,59 @@ module.exports = (pool) => {
}
});
// Get global output values (for autocomplete across all global_picklist=true sources)
router.get('/global-values', async (req, res, next) => {
try {
const result = await pool.query(`SELECT * FROM get_global_output_values()`);
const map = {};
for (const { col, val } of result.rows) {
if (!map[col]) map[col] = [];
map[col].push(val);
}
res.json(map);
} catch (err) {
next(err);
}
});
// Search output field values across all mappings (for global remap)
router.get('/outputs', async (req, res, next) => {
try {
const { search = '' } = req.query;
const result = await pool.query(`SELECT * FROM search_mapping_outputs(${lit(search)})`);
res.json(result.rows);
} catch (err) {
next(err);
}
});
// Get individual mappings for a specific output field value
router.get('/outputs/:col/:val', async (req, res, next) => {
try {
const result = await pool.query(
`SELECT * FROM get_mappings_by_output_field(${lit(req.params.col)}, ${lit(req.params.val)})`
);
res.json(result.rows);
} catch (err) {
next(err);
}
});
// Remap a field value globally across all mappings
router.post('/remap-field', async (req, res, next) => {
try {
const { col, from_val, to_val } = req.body;
if (!col || from_val == null || to_val == null)
return res.status(400).json({ error: 'col, from_val, and to_val are required' });
const result = await pool.query(
`SELECT remap_output_field(${lit(col)}, ${lit(from_val)}, ${lit(to_val)}) AS updated`
);
res.json({ updated: result.rows[0].updated });
} catch (err) {
next(err);
}
});
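For illustration, a client could prepare the remap request as below. The `/api/mappings` prefix and the Basic Auth header are assumptions drawn from the spec's statement that all routes live under `/api` and require HTTP Basic Auth; `buildRemapRequest` is a hypothetical helper, not part of the codebase.

```javascript
// Build the URL and fetch options for POST /mappings/remap-field.
// Mirrors the route's own validation so bad input fails before the request.
function buildRemapRequest(baseUrl, user, password, { col, from_val, to_val }) {
  if (!col || from_val == null || to_val == null) {
    throw new Error('col, from_val, and to_val are required');
  }
  const token = Buffer.from(`${user}:${password}`).toString('base64');
  return {
    url: `${baseUrl}/api/mappings/remap-field`, // assumed mount point
    options: {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        Authorization: `Basic ${token}`,
      },
      body: JSON.stringify({ col, from_val, to_val }),
    },
  };
}

const remapRequest = buildRemapRequest('http://localhost:3000', 'user', 'secret', {
  col: 'vendor',
  from_val: 'Wal-Mart',
  to_val: 'Walmart',
});
console.log(remapRequest.url, remapRequest.options.method);
```

Passing `remapRequest.url` and `remapRequest.options` to `fetch` would then yield the route's `{ updated: <count> }` response; the host, credentials, and values shown are placeholders.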
// Get unmapped values
router.get('/source/:source_name/unmapped', async (req, res, next) => {
try {

@ -73,7 +73,9 @@ module.exports = (pool) => {
const result = await pool.query(
`SELECT * FROM create_rule(${lit(source_name)}, ${lit(name)}, ${lit(field)}, ${lit(pattern)}, ${lit(output_field)}, ${lit(function_type || 'extract')}, ${lit(flags || '')}, ${lit(replace_value || '')}, ${lit(enabled !== false)}, ${lit(retain === true)}, ${lit(sequence || 0)})`
);
res.status(201).json(result.rows[0]);
const rule = result.rows[0];
await pool.query(`SELECT reprocess_records(${lit(source_name)})`);
res.status(201).json(rule);
} catch (err) {
if (err.code === '23505') return res.status(409).json({ error: 'Rule already exists for this source' });
if (err.code === '23503') return res.status(404).json({ error: 'Source not found' });
@ -93,7 +95,9 @@ module.exports = (pool) => {
`SELECT * FROM update_rule(${lit(parseInt(req.params.id))}, ${n(name)}, ${n(field)}, ${n(pattern)}, ${n(output_field)}, ${n(function_type)}, ${n(flags)}, ${n(replace_value)}, ${n(enabled)}, ${n(retain)}, ${n(sequence)})`
);
if (result.rows.length === 0) return res.status(404).json({ error: 'Rule not found' });
res.json(result.rows[0]);
const rule = result.rows[0];
await pool.query(`SELECT reprocess_records(${lit(rule.source_name)})`);
res.json(rule);
} catch (err) {
next(err);
}

@ -52,19 +52,22 @@ module.exports = (pool) => {
const records = parse(req.file.buffer, { columns: true, skip_empty_lines: true, trim: true });
if (records.length === 0) return res.status(400).json({ error: 'CSV file is empty' });
const ISO_DATE_RE = /^\d{4}-\d{2}-\d{2}(T[\d:.Z+-]+)?$/;
const sample = records[0];
const sampleRows = records.slice(0, 50);
const fields = Object.keys(sample).map(key => {
const val = sample[key];
const vals = sampleRows.map(r => r[key]).filter(v => v !== '' && v != null);
let type = 'text';
if (!isNaN(parseFloat(val)) && isFinite(val) && val.charAt(0) !== '0') {
if (vals.length > 0 && vals.every(v => !isNaN(parseFloat(v)) && isFinite(v) && String(v).charAt(0) !== '0')) {
type = 'numeric';
} else if (Date.parse(val) > Date.parse('1950-01-01') && Date.parse(val) < Date.parse('2050-01-01')) {
} else if (vals.length > 0 && vals.every(v => ISO_DATE_RE.test(String(v)))) {
type = 'date';
}
return { name: key, type };
});
res.json({ name: '', dedup_fields: [], fields });
res.json({ name: '', constraint_fields: [], fields, sampleRows });
} catch (err) {
next(err);
}
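The revised type inference can be exercised in isolation. The sketch below lifts the same checks into a hypothetical `inferFieldType` helper that takes the sampled values of one column; the sample inputs are made up.

```javascript
const ISO_DATE_RE = /^\d{4}-\d{2}-\d{2}(T[\d:.Z+-]+)?$/;

// Infer 'numeric', 'date', or 'text' from the non-empty sample values of a column.
function inferFieldType(values) {
  const vals = values.filter(v => v !== '' && v != null);
  if (vals.length === 0) return 'text';
  // Leading-zero values (e.g. zip codes, account numbers) stay text.
  const isNumeric = v =>
    !isNaN(parseFloat(v)) && isFinite(v) && String(v).charAt(0) !== '0';
  if (vals.every(isNumeric)) return 'numeric';
  if (vals.every(v => ISO_DATE_RE.test(String(v)))) return 'date';
  return 'text';
}

console.log(inferFieldType(['12.50', '40', '']));          // numeric
console.log(inferFieldType(['2026-04-14', '2026-04-15'])); // date
console.log(inferFieldType(['00123', '00456']));           // text
console.log(inferFieldType(['12.50', 'n/a']));             // text
```

Because every sampled value must pass, a single stray value such as `n/a` keeps the column as `text`. A side effect of the leading-zero rule is that a column containing `0.5` is also classified as `text`.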
@ -73,12 +76,12 @@ module.exports = (pool) => {
// Create source
router.post('/', async (req, res, next) => {
try {
const { name, dedup_fields, config } = req.body;
if (!name || !dedup_fields || !Array.isArray(dedup_fields)) {
return res.status(400).json({ error: 'Missing required fields: name, dedup_fields (array)' });
const { name, constraint_fields, config, global_picklist } = req.body;
if (!name || !constraint_fields || !Array.isArray(constraint_fields)) {
return res.status(400).json({ error: 'Missing required fields: name, constraint_fields (array)' });
}
const result = await pool.query(
`SELECT * FROM create_source(${lit(name)}, ${arr(dedup_fields)}, ${lit(config || {})})`
`SELECT * FROM create_source(${lit(name)}, ${arr(constraint_fields)}, ${lit(config || {})}, ${lit(global_picklist !== false)})`
);
res.status(201).json(result.rows[0]);
} catch (err) {
@ -90,9 +93,10 @@ module.exports = (pool) => {
// Update source
router.put('/:name', async (req, res, next) => {
try {
const { dedup_fields, config } = req.body;
const { constraint_fields, config, global_picklist } = req.body;
const gpVal = global_picklist !== undefined ? lit(global_picklist) : 'NULL';
const result = await pool.query(
`SELECT * FROM update_source(${lit(req.params.name)}, ${dedup_fields ? arr(dedup_fields) : 'NULL'}, ${config ? lit(config) : 'NULL'})`
`SELECT * FROM update_source(${lit(req.params.name)}, ${constraint_fields ? arr(constraint_fields) : 'NULL'}, ${config ? lit(config) : 'NULL'}, ${gpVal})`
);
if (result.rows.length === 0) return res.status(404).json({ error: 'Source not found' });
res.json(result.rows[0]);
@ -122,6 +126,8 @@ module.exports = (pool) => {
);
const importData = importResult.rows[0].result;
if (!importData.success) return res.json(importData);
const transformResult = await pool.query(
`SELECT apply_transformations(${lit(req.params.name)}) as result`
);
@ -210,9 +216,13 @@ module.exports = (pool) => {
// Get view data (paginated, sortable)
router.get('/:name/view-data', async (req, res, next) => {
try {
const { limit = 100, offset = 0, sort_col, sort_dir } = req.query;
const { limit = 100, offset = 0, sort_col, sort_dir, filters } = req.query;
let parsedFilters = null;
if (filters) {
try { parsedFilters = JSON.parse(filters); } catch { /* ignore bad JSON */ }
}
const result = await pool.query(
`SELECT get_view_data(${lit(req.params.name)}, ${lit(parseInt(limit))}, ${lit(parseInt(offset))}, ${lit(sort_col || null)}, ${lit(sort_dir || 'asc')}) as result`
`SELECT get_view_data(${lit(req.params.name)}, ${lit(parseInt(limit))}, ${lit(parseInt(offset))}, ${lit(sort_col || null)}, ${lit(sort_dir || 'asc')}, ${parsedFilters ? lit(parsedFilters) : 'NULL'}) as result`
);
res.json(result.rows[0].result);
} catch (err) {
@ -220,5 +230,32 @@ module.exports = (pool) => {
}
});
// Pivot layouts
router.get('/:name/layouts', async (req, res, next) => {
try {
const result = await pool.query(`SELECT * FROM list_pivot_layouts(${lit(req.params.name)})`);
res.json(result.rows);
} catch (err) { next(err); }
});
router.post('/:name/layouts', async (req, res, next) => {
try {
const { layout_name, config } = req.body;
if (!layout_name || !config) return res.status(400).json({ error: 'layout_name and config required' });
const result = await pool.query(
`SELECT * FROM save_pivot_layout(${lit(req.params.name)}, ${lit(layout_name)}, ${lit(config)})`
);
res.json(result.rows[0]);
} catch (err) { next(err); }
});
router.delete('/:name/layouts/:id', async (req, res, next) => {
try {
const result = await pool.query(`SELECT * FROM delete_pivot_layout(${lit(parseInt(req.params.id))})`);
if (result.rows.length === 0) return res.status(404).json({ error: 'Layout not found' });
res.json({ success: true });
} catch (err) { next(err); }
});
return router;
};

@ -14,17 +14,16 @@ CREATE OR REPLACE FUNCTION import_records(
p_data JSONB -- Array of records
) RETURNS JSON AS $$
DECLARE
v_dedup_fields TEXT[];
v_inserted INTEGER;
v_duplicates INTEGER;
v_log_id INTEGER;
v_constraint_fields TEXT[];
v_inserted INTEGER;
v_duplicates INTEGER;
v_log_id INTEGER;
BEGIN
-- Get dedup fields for this source
SELECT dedup_fields INTO v_dedup_fields
SELECT constraint_fields INTO v_constraint_fields
FROM dataflow.sources
WHERE name = p_source_name;
IF v_dedup_fields IS NULL THEN
IF v_constraint_fields IS NULL THEN
RETURN json_build_object(
'success', false,
'error', 'Source not found: ' || p_source_name
@ -32,52 +31,49 @@ BEGIN
END IF;
WITH
-- All incoming records with their dedup keys and readable field values
-- All incoming records with their constraint keys
pending AS (
SELECT
rec.value AS data,
rec.value AS data,
rec.ordinality AS seq,
dataflow.generate_dedup_key(rec.value, v_dedup_fields) AS dedup_key,
(SELECT jsonb_object_agg(f, rec.value->>f)
FROM unnest(v_dedup_fields) AS f) AS dedup_values
FROM unnest(v_constraint_fields) AS f) AS constraint_key
FROM jsonb_array_elements(p_data) WITH ORDINALITY AS rec
),
-- Keys already in the database (excluded) with their readable values
-- Keys already in the database (excluded)
existing AS (
SELECT DISTINCT ON (r.dedup_key) r.dedup_key,
(SELECT jsonb_object_agg(f, r.data->>f)
FROM unnest(v_dedup_fields) AS f) AS dedup_values
SELECT DISTINCT r.constraint_key
FROM dataflow.records r
INNER JOIN pending p ON p.dedup_key = r.dedup_key
INNER JOIN pending p ON p.constraint_key = r.constraint_key
WHERE r.source_name = p_source_name
),
-- Keys that are new
new_keys AS (
SELECT p.dedup_key, p.dedup_values FROM pending p
WHERE NOT EXISTS (SELECT 1 FROM existing e WHERE e.dedup_key = p.dedup_key)
-- Rows whose constraint key is not yet in the database
new_records AS (
SELECT p.data, p.constraint_key, p.seq
FROM pending p
WHERE NOT EXISTS (SELECT 1 FROM existing e WHERE e.constraint_key = p.constraint_key)
),
-- Write the log entry with readable field values instead of hashes
-- Write the log entry
log_entry AS (
INSERT INTO dataflow.import_log (source_name, records_imported, records_duplicate, info)
VALUES (
p_source_name,
(SELECT count(*) FROM new_keys),
(SELECT count(*) FROM existing),
(SELECT count(*) FROM new_records),
(SELECT count(*) FROM pending) - (SELECT count(*) FROM new_records),
jsonb_build_object(
'total', jsonb_array_length(p_data),
'inserted_keys', (SELECT jsonb_agg(dedup_values) FROM new_keys),
'excluded_keys', (SELECT jsonb_agg(dedup_values) FROM existing)
'inserted_keys', (SELECT jsonb_agg(constraint_key ORDER BY constraint_key) FROM new_records),
'excluded_keys', (SELECT jsonb_agg(constraint_key) FROM existing)
)
)
RETURNING id, records_imported, records_duplicate
),
-- Insert only new records
-- Insert new records
inserted AS (
INSERT INTO dataflow.records (source_name, data, dedup_key, import_id)
SELECT p_source_name, p.data, p.dedup_key, (SELECT id FROM log_entry)
FROM pending p
INNER JOIN new_keys nk ON nk.dedup_key = p.dedup_key
ORDER BY p.seq
INSERT INTO dataflow.records (source_name, data, constraint_key, import_id)
SELECT p_source_name, nr.data, nr.constraint_key, (SELECT id FROM log_entry)
FROM new_records nr
ORDER BY nr.seq
RETURNING id
)
SELECT le.id, le.records_imported, le.records_duplicate

@ -19,14 +19,14 @@ CREATE EXTENSION IF NOT EXISTS dblink;
\echo ''
\echo '=== 1. Sources ==='
INSERT INTO dataflow.sources (name, dedup_fields, config)
INSERT INTO dataflow.sources (name, constraint_fields, config)
SELECT
srce AS name,
-- Strip {} wrappers from constraint paths → dedup field names
-- Strip {} wrappers from constraint paths → constraint field names
ARRAY(
SELECT regexp_replace(c, '^\{|\}$', '', 'g')
FROM jsonb_array_elements_text(defn->'constraint') AS c
) AS dedup_fields,
) AS constraint_fields,
-- Build config.fields from the first schema (index 0 = "mapped" for dcard, "default" for others)
jsonb_build_object('fields',
(SELECT jsonb_agg(
@ -44,7 +44,7 @@ FROM dblink(:'tps_conn',
) AS t(srce TEXT, defn JSONB)
ON CONFLICT (name) DO NOTHING;
SELECT name, dedup_fields, jsonb_array_length(config->'fields') AS field_count
SELECT name, constraint_fields, jsonb_array_length(config->'fields') AS field_count
FROM dataflow.sources ORDER BY name;
\echo ''
@ -95,11 +95,11 @@ FROM dataflow.mappings GROUP BY source_name, rule_name ORDER BY source_name, rul
\echo '=== 4. Records ==='
\echo ' (13 000+ rows — may take a moment)'
INSERT INTO dataflow.records (source_name, data, dedup_key, transformed, imported_at, transformed_at)
INSERT INTO dataflow.records (source_name, data, constraint_key, transformed, imported_at, transformed_at)
SELECT
t.srce AS source_name,
t.rec AS data,
dataflow.generate_dedup_key(t.rec, s.dedup_fields) AS dedup_key,
(SELECT jsonb_object_agg(f, t.rec->>f) FROM unnest(s.constraint_fields) AS f) AS constraint_key,
t.allj AS transformed,
CURRENT_TIMESTAMP AS imported_at,
CASE WHEN t.allj IS NOT NULL THEN CURRENT_TIMESTAMP END AS transformed_at
@ -107,7 +107,7 @@ FROM dblink(:'tps_conn',
'SELECT srce, rec, allj FROM tps.trans'
) AS t(srce TEXT, rec JSONB, allj JSONB)
JOIN dataflow.sources s ON s.name = t.srce
ON CONFLICT (source_name, dedup_key) DO NOTHING;
ON CONFLICT (source_name, constraint_key) DO NOTHING;
SELECT source_name, COUNT(*) AS records, COUNT(transformed) AS transformed
FROM dataflow.records GROUP BY source_name ORDER BY source_name;


@ -206,3 +206,56 @@ BEGIN
ORDER BY count(*) DESC;
END;
$$ LANGUAGE plpgsql;
-- ── Global picklist ───────────────────────────────────────────────────────────
CREATE OR REPLACE FUNCTION get_global_output_values()
RETURNS TABLE (col TEXT, val TEXT) AS $$
SELECT DISTINCT e.key AS col, e.value AS val
FROM dataflow.mappings m
JOIN dataflow.sources s ON s.name = m.source_name
CROSS JOIN LATERAL jsonb_each_text(m.output) AS e(key, value)
WHERE s.global_picklist = true
AND e.value IS NOT NULL
AND e.value <> ''
ORDER BY e.key, e.value;
$$ LANGUAGE sql STABLE;
-- ── Remap output field values ─────────────────────────────────────────────────
-- Search for distinct (field, value) pairs across all mapping outputs
CREATE OR REPLACE FUNCTION search_mapping_outputs(p_search TEXT)
RETURNS TABLE (col TEXT, val TEXT, mapping_count BIGINT) AS $$
SELECT e.key AS col, e.value AS val, COUNT(*) AS mapping_count
FROM dataflow.mappings m
CROSS JOIN LATERAL jsonb_each_text(m.output) AS e(key, value)
WHERE e.value ILIKE '%' || p_search || '%'
AND e.value IS NOT NULL
AND e.value <> ''
GROUP BY e.key, e.value
ORDER BY e.key, e.value;
$$ LANGUAGE sql STABLE;
-- Get individual mappings matching a specific output field value
CREATE OR REPLACE FUNCTION get_mappings_by_output_field(p_col TEXT, p_val TEXT)
RETURNS TABLE (id INT, source_name TEXT, rule_name TEXT, input_value JSONB, output JSONB) AS $$
SELECT m.id, m.source_name, m.rule_name, m.input_value, m.output
FROM dataflow.mappings m
WHERE m.output->>(p_col) = p_val
ORDER BY m.source_name, m.rule_name, m.input_value::text;
$$ LANGUAGE sql STABLE;
-- Replace a specific field value across all matching mappings
CREATE OR REPLACE FUNCTION remap_output_field(p_col TEXT, p_from_val TEXT, p_to_val TEXT)
RETURNS INTEGER AS $$
DECLARE
updated_count INTEGER;
BEGIN
UPDATE dataflow.mappings
SET output = jsonb_set(output, ARRAY[p_col], to_jsonb(p_to_val))
WHERE output->>(p_col) = p_from_val;
GET DIAGNOSTICS updated_count = ROW_COUNT;
RETURN updated_count;
END;
$$ LANGUAGE plpgsql;


@ -85,7 +85,7 @@ CREATE OR REPLACE FUNCTION preview_rule(
p_replace_value TEXT DEFAULT '',
p_limit INT DEFAULT 20
)
RETURNS TABLE (id BIGINT, raw_value TEXT, extracted_value JSONB) AS $$
RETURNS TABLE (id INT, raw_value TEXT, extracted_value JSONB) AS $$
BEGIN
IF p_function_type = 'replace' THEN
RETURN QUERY


@ -17,19 +17,20 @@ RETURNS dataflow.sources AS $$
SELECT * FROM dataflow.sources WHERE name = p_name;
$$ LANGUAGE sql STABLE;
CREATE OR REPLACE FUNCTION create_source(p_name TEXT, p_dedup_fields TEXT[], p_config JSONB DEFAULT '{}')
CREATE OR REPLACE FUNCTION create_source(p_name TEXT, p_constraint_fields TEXT[], p_config JSONB DEFAULT '{}', p_global_picklist BOOLEAN DEFAULT true)
RETURNS dataflow.sources AS $$
INSERT INTO dataflow.sources (name, dedup_fields, config)
VALUES (p_name, p_dedup_fields, p_config)
INSERT INTO dataflow.sources (name, constraint_fields, config, global_picklist)
VALUES (p_name, p_constraint_fields, p_config, p_global_picklist)
RETURNING *;
$$ LANGUAGE sql;
CREATE OR REPLACE FUNCTION update_source(p_name TEXT, p_dedup_fields TEXT[] DEFAULT NULL, p_config JSONB DEFAULT NULL)
CREATE OR REPLACE FUNCTION update_source(p_name TEXT, p_constraint_fields TEXT[] DEFAULT NULL, p_config JSONB DEFAULT NULL, p_global_picklist BOOLEAN DEFAULT NULL)
RETURNS dataflow.sources AS $$
UPDATE dataflow.sources
SET dedup_fields = COALESCE(p_dedup_fields, dedup_fields),
config = COALESCE(p_config, config),
updated_at = CURRENT_TIMESTAMP
SET constraint_fields = COALESCE(p_constraint_fields, constraint_fields),
config = COALESCE(p_config, config),
global_picklist = COALESCE(p_global_picklist, global_picklist),
updated_at = CURRENT_TIMESTAMP
WHERE name = p_name
RETURNING *;
$$ LANGUAGE sql;
@ -41,13 +42,6 @@ $$ LANGUAGE sql;
-- ── Import log ────────────────────────────────────────────────────────────────
CREATE OR REPLACE FUNCTION get_import_log(p_source_name TEXT)
RETURNS SETOF dataflow.import_log AS $$
SELECT * FROM dataflow.import_log
WHERE source_name = p_source_name
ORDER BY imported_at DESC;
$$ LANGUAGE sql STABLE;
-- ── Stats ─────────────────────────────────────────────────────────────────────
CREATE OR REPLACE FUNCTION get_source_stats(p_source_name TEXT)
@ -87,16 +81,21 @@ $$ LANGUAGE sql STABLE;
CREATE OR REPLACE FUNCTION get_view_data(
p_source_name TEXT,
p_limit INT DEFAULT 100,
p_offset INT DEFAULT 0,
p_sort_col TEXT DEFAULT NULL,
p_sort_dir TEXT DEFAULT 'asc'
p_limit INT DEFAULT 100,
p_offset INT DEFAULT 0,
p_sort_col TEXT DEFAULT NULL,
p_sort_dir TEXT DEFAULT 'asc',
p_filters JSONB DEFAULT NULL -- [{col, pattern}, ...] — postgres regex (~*)
)
RETURNS JSON AS $$
DECLARE
v_exists BOOLEAN;
v_order TEXT := '';
v_rows JSON;
v_exists BOOLEAN;
v_where TEXT := '';
v_order TEXT := '';
v_rows JSON;
v_filter JSONB;
v_col TEXT;
v_pattern TEXT;
BEGIN
SELECT EXISTS (
SELECT 1 FROM information_schema.views
@ -107,6 +106,24 @@ BEGIN
RETURN json_build_object('exists', FALSE, 'rows', '[]'::json);
END IF;
-- Build WHERE from filters (validate each column exists in the view)
IF p_filters IS NOT NULL THEN
FOR v_filter IN SELECT value FROM jsonb_array_elements(p_filters) LOOP
v_col := v_filter->>'col';
v_pattern := v_filter->>'pattern';
IF v_pattern IS NOT NULL AND v_pattern <> '' AND EXISTS (
SELECT 1 FROM information_schema.columns
WHERE table_schema = 'dfv'
AND table_name = p_source_name
AND column_name = v_col
) THEN
v_where := v_where ||
CASE WHEN v_where = '' THEN ' WHERE ' ELSE ' AND ' END ||
quote_ident(v_col) || '::text ~* ' || quote_literal(v_pattern);
END IF;
END LOOP;
END IF;
IF p_sort_col IS NOT NULL AND EXISTS (
SELECT 1 FROM information_schema.columns
WHERE table_schema = 'dfv'
@ -118,156 +135,15 @@ BEGIN
|| ' NULLS LAST';
END IF;
-- Subquery applies ORDER BY + LIMIT first, then json_agg collects in that order.
-- json_agg on the outer query preserves column order (json not jsonb).
EXECUTE format(
'SELECT COALESCE(json_agg(row_to_json(t)), ''[]''::json) FROM (SELECT * FROM dfv.%I%s LIMIT %s OFFSET %s) t',
p_source_name, v_order, p_limit, p_offset
'SELECT COALESCE(json_agg(row_to_json(t)), ''[]''::json) FROM (SELECT * FROM dfv.%I%s%s LIMIT %s OFFSET %s) t',
p_source_name, v_where, v_order, p_limit, p_offset
) INTO v_rows;
RETURN json_build_object('exists', TRUE, 'rows', v_rows);
END;
$$ LANGUAGE plpgsql STABLE;
-- ── Import (deduplication) ────────────────────────────────────────────────────
CREATE OR REPLACE FUNCTION import_records(p_source_name TEXT, p_data JSONB)
RETURNS JSON AS $$
DECLARE
v_dedup_fields TEXT[];
v_record JSONB;
v_dedup_key TEXT;
v_inserted INTEGER := 0;
v_duplicates INTEGER := 0;
v_log_id INTEGER;
BEGIN
SELECT dedup_fields INTO v_dedup_fields
FROM dataflow.sources WHERE name = p_source_name;
IF v_dedup_fields IS NULL THEN
RETURN json_build_object('success', false, 'error', 'Source not found: ' || p_source_name);
END IF;
FOR v_record IN SELECT * FROM jsonb_array_elements(p_data) LOOP
v_dedup_key := dataflow.generate_dedup_key(v_record, v_dedup_fields);
BEGIN
INSERT INTO dataflow.records (source_name, data, dedup_key)
VALUES (p_source_name, v_record, v_dedup_key);
v_inserted := v_inserted + 1;
EXCEPTION WHEN unique_violation THEN
v_duplicates := v_duplicates + 1;
END;
END LOOP;
INSERT INTO dataflow.import_log (source_name, records_imported, records_duplicate)
VALUES (p_source_name, v_inserted, v_duplicates)
RETURNING id INTO v_log_id;
RETURN json_build_object('success', true, 'imported', v_inserted, 'duplicates', v_duplicates, 'log_id', v_log_id);
END;
$$ LANGUAGE plpgsql;
-- ── Transformations ───────────────────────────────────────────────────────────
CREATE OR REPLACE FUNCTION dataflow.jsonb_merge(a JSONB, b JSONB)
RETURNS JSONB AS $$
SELECT COALESCE(a, '{}') || COALESCE(b, '{}')
$$ LANGUAGE sql IMMUTABLE;
DROP AGGREGATE IF EXISTS dataflow.jsonb_concat_obj(JSONB);
CREATE AGGREGATE dataflow.jsonb_concat_obj(JSONB) (
sfunc = dataflow.jsonb_merge,
stype = JSONB,
initcond = '{}'
);
DROP FUNCTION IF EXISTS apply_transformations(TEXT, INTEGER[]);
CREATE OR REPLACE FUNCTION apply_transformations(
p_source_name TEXT,
p_record_ids INTEGER[] DEFAULT NULL,
p_overwrite BOOLEAN DEFAULT FALSE
) RETURNS JSON AS $$
WITH
qualifying AS (
SELECT id, data FROM dataflow.records
WHERE source_name = p_source_name
AND (p_overwrite OR transformed IS NULL)
AND (p_record_ids IS NULL OR id = ANY(p_record_ids))
),
rx AS (
SELECT
q.id,
r.name AS rule_name,
r.sequence,
r.output_field,
r.retain,
r.function_type,
COALESCE(mt.rn, rp.rn, 1) AS result_number,
CASE WHEN array_length(mt.mt, 1) = 1 THEN to_jsonb(mt.mt[1]) ELSE to_jsonb(mt.mt) END AS match_val,
to_jsonb(rp.rp) AS replace_val
FROM dataflow.rules r
INNER JOIN qualifying q ON q.data ? r.field
LEFT JOIN LATERAL regexp_matches(q.data ->> r.field, r.pattern, r.flags)
WITH ORDINALITY AS mt(mt, rn) ON r.function_type = 'extract'
LEFT JOIN LATERAL regexp_replace(q.data ->> r.field, r.pattern, r.replace_value, r.flags)
WITH ORDINALITY AS rp(rp, rn) ON r.function_type = 'replace'
WHERE r.source_name = p_source_name AND r.enabled = true
),
agg_matches AS (
SELECT
id, rule_name, sequence, output_field, retain, function_type,
CASE function_type
WHEN 'replace' THEN jsonb_agg(replace_val) -> 0
ELSE
CASE WHEN max(result_number) = 1
THEN jsonb_agg(match_val ORDER BY result_number) -> 0
ELSE jsonb_agg(match_val ORDER BY result_number)
END
END AS extracted
FROM rx
GROUP BY id, rule_name, sequence, output_field, retain, function_type
),
linked AS (
SELECT
a.id, a.sequence, a.output_field, a.retain, a.extracted, m.output AS mapped
FROM agg_matches a
LEFT JOIN dataflow.mappings m ON
m.source_name = p_source_name
AND m.rule_name = a.rule_name
AND m.input_value = a.extracted
WHERE a.extracted IS NOT NULL
),
rule_output AS (
SELECT
id, sequence,
CASE
WHEN mapped IS NOT NULL THEN
mapped || CASE WHEN retain THEN jsonb_build_object(output_field, extracted) ELSE '{}'::jsonb END
ELSE jsonb_build_object(output_field, extracted)
END AS output
FROM linked
),
record_additions AS (
SELECT id, dataflow.jsonb_concat_obj(output ORDER BY sequence) AS additions
FROM rule_output GROUP BY id
),
updated AS (
UPDATE dataflow.records rec
SET transformed = rec.data || COALESCE(ra.additions, '{}'::jsonb),
transformed_at = CURRENT_TIMESTAMP
FROM qualifying q
LEFT JOIN record_additions ra ON ra.id = q.id
WHERE rec.id = q.id
RETURNING rec.id
)
SELECT json_build_object('success', true, 'transformed', count(*)) FROM updated
$$ LANGUAGE sql;
CREATE OR REPLACE FUNCTION reprocess_records(p_source_name TEXT)
RETURNS JSON AS $$
SELECT dataflow.apply_transformations(p_source_name, NULL, TRUE)
$$ LANGUAGE sql;
-- ── View generation ───────────────────────────────────────────────────────────
CREATE OR REPLACE FUNCTION generate_source_view(p_source_name TEXT)
@ -320,3 +196,28 @@ BEGIN
RETURN json_build_object('success', true, 'view', v_view, 'sql', v_sql);
END;
$$ LANGUAGE plpgsql;
-- List saved pivot layouts for a source
CREATE OR REPLACE FUNCTION list_pivot_layouts(p_source_name TEXT)
RETURNS TABLE(id INT, source_name TEXT, layout_name TEXT, config JSONB, created_at TIMESTAMPTZ) AS $$
SELECT id, source_name, layout_name, config, created_at
FROM dataflow.pivot_layouts
WHERE source_name = p_source_name
ORDER BY layout_name;
$$ LANGUAGE sql;
-- Save (upsert) a named pivot layout
CREATE OR REPLACE FUNCTION save_pivot_layout(p_source_name TEXT, p_layout_name TEXT, p_config JSONB)
RETURNS TABLE(id INT, source_name TEXT, layout_name TEXT, config JSONB, created_at TIMESTAMPTZ) AS $$
INSERT INTO dataflow.pivot_layouts (source_name, layout_name, config)
VALUES (p_source_name, p_layout_name, p_config)
ON CONFLICT (source_name, layout_name) DO UPDATE
SET config = EXCLUDED.config
RETURNING id, source_name, layout_name, config, created_at;
$$ LANGUAGE sql;
-- Delete a named pivot layout
CREATE OR REPLACE FUNCTION delete_pivot_layout(p_id INT)
RETURNS TABLE(id INT) AS $$
DELETE FROM dataflow.pivot_layouts WHERE id = p_id RETURNING id;
$$ LANGUAGE sql;


@ -15,14 +15,15 @@ SET search_path TO dataflow, public;
------------------------------------------------------
CREATE TABLE sources (
name TEXT PRIMARY KEY,
dedup_fields TEXT[] NOT NULL, -- Fields used for deduplication (e.g., ['date', 'amount', 'description'])
constraint_fields TEXT[] NOT NULL, -- Fields that uniquely identify a record (e.g., ['date', 'amount', 'description'])
config JSONB DEFAULT '{}'::jsonb,
global_picklist BOOLEAN NOT NULL DEFAULT true, -- Contribute output values to global autocomplete suggestions
created_at TIMESTAMPTZ DEFAULT CURRENT_TIMESTAMP,
updated_at TIMESTAMPTZ DEFAULT CURRENT_TIMESTAMP
);
COMMENT ON TABLE sources IS 'Data source definitions';
COMMENT ON COLUMN sources.dedup_fields IS 'Array of field names used to identify duplicate records';
COMMENT ON COLUMN sources.constraint_fields IS 'Array of field names that uniquely identify a record';
COMMENT ON COLUMN sources.config IS 'Additional source configuration (optional)';
------------------------------------------------------
@ -35,7 +36,7 @@ CREATE TABLE records (
-- Data
data JSONB NOT NULL, -- Original imported data
dedup_key TEXT NOT NULL, -- Hash of dedup fields for fast lookup
constraint_key JSONB, -- Fields that uniquely identify this record (set on import)
transformed JSONB, -- Data after transformations applied
-- Metadata
@ -43,18 +44,17 @@ CREATE TABLE records (
imported_at TIMESTAMPTZ DEFAULT CURRENT_TIMESTAMP,
transformed_at TIMESTAMPTZ,
-- Constraints
UNIQUE(source_name, dedup_key) -- Prevent duplicates
);
COMMENT ON TABLE records IS 'Imported records with raw and transformed data';
COMMENT ON COLUMN records.data IS 'Original data as imported';
COMMENT ON COLUMN records.dedup_key IS 'Hash of deduplication fields for fast duplicate detection';
COMMENT ON COLUMN records.constraint_key IS 'JSONB object of constraint field values — uniquely identifies this record within its source';
COMMENT ON COLUMN records.transformed IS 'Data after applying transformation rules';
-- Indexes
CREATE INDEX idx_records_source ON records(source_name);
CREATE INDEX idx_records_dedup ON records(source_name, dedup_key);
CREATE INDEX idx_records_constraint ON records USING gin(constraint_key);
CREATE INDEX idx_records_data ON records USING gin(data);
CREATE INDEX idx_records_transformed ON records USING gin(transformed);
@ -139,33 +139,22 @@ COMMENT ON COLUMN import_log.info IS 'Import details: inserted_keys and excluded
CREATE INDEX idx_import_log_source ON import_log(source_name);
CREATE INDEX idx_import_log_timestamp ON import_log(imported_at);
------------------------------------------------------
-- Helper function: Generate dedup key
------------------------------------------------------
CREATE OR REPLACE FUNCTION generate_dedup_key(
data JSONB,
dedup_fields TEXT[]
) RETURNS TEXT AS $$
DECLARE
field TEXT;
values TEXT := '';
BEGIN
-- Concatenate values from dedup fields
FOREACH field IN ARRAY dedup_fields LOOP
values := values || COALESCE(data->>field, '') || '|';
END LOOP;
-- Return MD5 hash of concatenated values
RETURN md5(values);
END;
$$ LANGUAGE plpgsql IMMUTABLE;
CREATE TABLE pivot_layouts (
id SERIAL PRIMARY KEY,
source_name TEXT NOT NULL REFERENCES sources(name) ON DELETE CASCADE,
layout_name TEXT NOT NULL,
config JSONB NOT NULL,
created_at TIMESTAMPTZ DEFAULT CURRENT_TIMESTAMP,
UNIQUE (source_name, layout_name)
);
COMMENT ON FUNCTION generate_dedup_key IS 'Generate hash key from specified fields for deduplication';
CREATE INDEX idx_pivot_layouts_source ON pivot_layouts(source_name);
------------------------------------------------------
-- Summary
------------------------------------------------------
-- Tables: 5 (sources, records, rules, mappings, import_log)
-- Tables: 6 (sources, records, rules, mappings, import_log, pivot_layouts)
-- Simple, clear structure
-- JSONB for flexibility
-- Deduplication via hash key

docs/perspective-pivot.md Normal file

@ -0,0 +1,285 @@
# Perspective Pivot — Technical Reference
Version tested: `@perspective-dev` v4.4.0 (client, viewer, viewer-datagrid, viewer-d3fc), loaded from CDN.
This document captures everything learned about controlling Perspective programmatically. The official docs are incomplete for some of these APIs — treat this as a ground-truth supplement.
---
## Loading from CDN
```js
const [{ default: perspective }] = await Promise.all([
import('https://cdn.jsdelivr.net/npm/@perspective-dev/client@4.4.0/dist/cdn/perspective.js'),
import('https://cdn.jsdelivr.net/npm/@perspective-dev/viewer@4.4.0/dist/cdn/perspective-viewer.js'),
import('https://cdn.jsdelivr.net/npm/@perspective-dev/viewer-datagrid@4.4.0/dist/cdn/perspective-viewer-datagrid.js'),
import('https://cdn.jsdelivr.net/npm/@perspective-dev/viewer-d3fc@4.4.0/dist/cdn/perspective-viewer-d3fc.js'),
])
```
Stylesheet:
```html
<link rel="stylesheet" crossorigin="anonymous"
href="https://cdn.jsdelivr.net/npm/@perspective-dev/viewer/dist/css/themes.css" />
```
---
## Core Objects
```
perspective — the module default export
.worker() — creates a Web Worker instance
worker
.table(rows, opts) — creates a named Table; returns the Table object
.open_table(name) — re-opens a previously created named table
table
.view(config) — creates a View (filtered/grouped projection)
.update(rows) — incremental row upsert/insert
view
.to_json() — returns rows as array of objects
.set_depth(n) — sets expansion depth for all grouped rows (see below)
.delete() — frees the view; always call when done
viewer (the <perspective-viewer> DOM element)
.load(worker) — attaches the worker to the viewer
.save() — returns full viewer config as plain object
.restore(config) — applies a config object to the viewer
.flush() — forces viewer to synchronize (limited effect on plugin state)
.getPlugin() — returns the active plugin element (e.g. datagrid)
.getView() — returns the current View object
.toggleConfig() — shows/hides the settings panel
plugin (datagrid element, from viewer.getPlugin())
.save() — returns plugin-specific state: { columns, scroll_lock, edit_mode }
.restore(config) — applies plugin-specific state
.draw(view) — redraws the plugin against the given View
```
---
## viewer.save() — Config Shape
```js
{
table: "source_name",
plugin: "datagrid", // or "d3_y_bar", etc.
plugin_config: { ... }, // NOT reliably populated — use plugin.save() instead
group_by: ["field1"],
split_by: ["field2"],
columns: ["Amount"],
filter: [["field", "op", "value"]],
sort: [["field", "asc"]],
expressions: { "ExprName": "// formula\n..." },
settings: false, // whether the config panel is open
}
```
**Important:** `plugin_config` in `viewer.save()` is NOT reliably populated in v4.4.0. Use `plugin.save()` separately to capture plugin state.
---
## plugin.save() — Plugin State Shape (datagrid)
```js
{
columns: {}, // per-column formatting overrides
scroll_lock: false,
edit_mode: "SELECT_REGION" // see valid values below
}
```
---
## Selection Modes (edit_mode)
Valid values for the datagrid plugin's `edit_mode` field:
| Value | Button label | Behavior |
|---|---|---|
| `READ_ONLY` | Read-Only | No selection highlight |
| `SELECT_ROW` | Select Row | Highlights full rows |
| `SELECT_COLUMN` | Select Column | Highlights full columns |
| `SELECT_REGION` | Select Region | Highlights clicked cell region |
| `EDIT` | Edit | Enables cell editing |
The built-in button in the viewer toolbar cycles through these in order.
**Setting the default:**
```js
// After viewer.restore(...), set it directly on the plugin:
const plugin = await viewer.getPlugin()
await plugin.restore({ edit_mode: 'SELECT_REGION' })
```
Setting via `viewer.restore({ plugin_config: { edit_mode: ... } })` does NOT reliably work in v4.4.0.
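When an `edit_mode` value comes from a stored layout rather than a literal, it may be worth checking it against the table above before handing it to `plugin.restore()`. The following is a minimal sketch; `EDIT_MODES` and `normalizeEditMode` are hypothetical names, and the `SELECT_REGION` fallback is an assumption taken from the example above, not a Perspective default.

```javascript
// Valid datagrid edit_mode values, copied from the table above.
const EDIT_MODES = ['READ_ONLY', 'SELECT_ROW', 'SELECT_COLUMN', 'SELECT_REGION', 'EDIT']

// Hypothetical helper: return the value if it is a known mode, else the fallback.
// The fallback choice is an assumption for illustration.
function normalizeEditMode(value, fallback = 'SELECT_REGION') {
  return EDIT_MODES.includes(value) ? value : fallback
}

console.log(normalizeEditMode('SELECT_ROW'))  // known mode, returned unchanged
console.log(normalizeEditMode('select_row'))  // unknown (case differs), falls back
```

The result can then be passed as `plugin.restore({ edit_mode: normalizeEditMode(saved.edit_mode) })`, where `saved` stands for whatever object the layout was loaded into.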
---
## Expand/Collapse Row Depth
Controls how many levels of the `group_by` hierarchy are expanded. This is the only working mechanism found in v4.4.0:
```js
const view = await viewer.getView()
await view.set_depth(depth) // 0 = collapse all, 1 = expand one level, etc.
const plugin = await viewer.getPlugin()
await plugin.draw(view) // required — viewer does not redraw automatically
```
**What does NOT work:**
- `viewer.restore({ plugin_config: { expand_depth: d } })` — silently ignored
- `view.set_depth(d)` alone — view state changes but display doesn't update
- `view.set_depth(d)` + `viewer.flush()` — still no visual update
- `plugin.restore({ expand_depth: d })` — "Unknown" field, ignored
**The `plugin.draw(view)` call is required** to make the datagrid re-render after `set_depth`.
---
## Saving and Restoring Full State
To capture complete state (viewer + plugin + expand depth):
```js
async function captureConfig(viewer, expandDepth) {
const plugin = await viewer.getPlugin()
const [viewerConfig, pluginConfig] = await Promise.all([viewer.save(), plugin.save()])
return { ...viewerConfig, plugin_config: pluginConfig, expand_depth: expandDepth }
}
```
To restore:
```js
async function restoreConfig(viewer, config, applyDepth) {
await viewer.restore(config)
if (config.plugin_config) {
const plugin = await viewer.getPlugin()
await plugin.restore(config.plugin_config)
}
if (config.expand_depth != null) {
await applyDepth(viewer, config.expand_depth)
}
await viewer.flush()
}
async function applyDepth(viewer, depth) {
const view = await viewer.getView()
await view.set_depth(depth)
const plugin = await viewer.getPlugin()
await plugin.draw(view)
}
```
---
## The perspective-click Event
Fires when the user clicks a cell. The event detail:
```js
viewer.addEventListener('perspective-click', async (e) => {
const { row, column_names, config } = e.detail
// row — aggregated values for the clicked cell (keyed by "split|metric" format)
// column_names — array of metric column names clicked
// config — { filter: [[field, op, value], ...] }
// filter includes:
// - group_by coordinate filters (field == value, one per group_by level)
// - split_by coordinate filters (field == value, one per split_by field)
// - user-set filters (any op)
})
```
`__ROW_PATH__` in `row` contains the group_by path as an array.
**The `config.filter` array is the reliable way to get cell coordinates.** Do not try to zip `__ROW_PATH__` with `group_by` — the filter approach handles all cases including partial paths.
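As a rough illustration of reading coordinates from the filter array, the sketch below collapses the `==` entries into a plain object. `coordinatesFromFilter` is a hypothetical helper and the field names are made up. It assumes coordinate filters always use the `==` operator, as described above, which also means a user-set `==` filter is indistinguishable from a coordinate and is included.

```javascript
// Sketch: turn a click event's config.filter into a { field: value } map.
// Assumption: coordinate filters use "=="; any other operator is treated
// as a user-set filter and left out.
function coordinatesFromFilter(filter) {
  const coords = {}
  for (const [field, op, value] of filter || []) {
    if (op === '==') coords[field] = value
  }
  return coords
}

// Hand-written filter array with hypothetical field names:
const exampleFilter = [
  ['Category', '==', 'Groceries'], // group_by coordinate
  ['Year', '==', 2025],            // split_by coordinate
  ['Amount', '>', 100],            // user-set filter, not a coordinate
]
console.log(coordinatesFromFilter(exampleFilter))
```

A partial path (a click on a parent group row) simply yields fewer keys, which is why this tends to be more robust than pairing `__ROW_PATH__` with `group_by` by position.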
---
## Filtering Rows for a Clicked Cell
The click event's `filter` array can be applied to the underlying table via a new View, which correctly evaluates expression/computed columns (unlike filtering raw JS rows):
```js
const config = await viewer.save()
const view = await table.view({
filter: eventFilters,
expressions: config.expressions || {},
})
const rows = await view.to_json()
await view.delete()
// Strip expression columns from results (they're computed, not source fields)
const exprNames = new Set(Object.keys(config.expressions || {}))
const clean = rows.map(r =>
Object.fromEntries(Object.entries(r).filter(([k]) => !exprNames.has(k)))
)
```
**Why not filter raw JS rows?** Expression columns (computed in Perspective) don't exist in the source data. `filterRowsByConfig` on raw rows will skip those filters, returning all rows for the group rather than the specific cell.
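The failure mode can be reproduced without Perspective. The sketch below uses a hypothetical `naiveFilterRows` (a stand-in, not the project's `filterRowsByConfig`) that skips any filter whose column is absent from the raw rows. The field names and the `Size` expression column are invented for illustration.

```javascript
// Hypothetical raw-row filter: it can only test columns present in the rows,
// so a filter on an expression column is silently skipped.
function naiveFilterRows(rows, filter) {
  return rows.filter(row =>
    filter.every(([field, op, value]) => {
      if (!(field in row)) return true // unknown (expression) column: skipped
      // Only "==" is handled in this sketch; other operators pass through.
      return op === '==' ? row[field] === value : true
    })
  )
}

const rawRows = [
  { Category: 'Groceries', Amount: 12 },
  { Category: 'Groceries', Amount: 250 },
  { Category: 'Fuel', Amount: 40 },
]
// "Size" stands in for an expression column computed inside Perspective.
const clickFilter = [['Category', '==', 'Groceries'], ['Size', '==', 'Large']]
const matched = naiveFilterRows(rawRows, clickFilter)
console.log(matched.length) // 2: both Groceries rows, not only the "Large" cell
```

Because the `Size` filter never runs, the whole `Groceries` group comes back. Evaluating the same filter through a View avoids this, since the expression is computed before the filter is applied.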
**Guard against no group_by:** Without `group_by`, the filter array has no coordinate filters and the view query returns the entire table (slow). Check first:
```js
const config = await viewer.save()
if ((config.group_by || []).length === 0) return // no hierarchy — skip inspector
```
---
## Viewer Methods (full list, v4.4.0)
From `Object.getOwnPropertyNames(Object.getPrototypeOf(viewer))`:
`constructor`, `__destroy_into_raw`, `free`, `__get_model`, `connectedCallback`, `copy`, `delete`, `download`, `eject`, `export`, `flush`, `getAllPlugins`, `getClient`, `getEditPort`, `getPlugin`, `getRenderStats`, `getSelection`, `getTable`, `getView`, `getViewConfig`, `load`, `openColumnSettings`, `reset`, `resetError`, `resetThemes`, `resize`, `restore`, `restyleElement`, `save`, `setAutoPause`, `setAutoSize`, `setSelection`, `setThrottle`, `toggleColumnSettings`, `toggleConfig`
## Plugin Methods (datagrid, full list)
From `Object.getOwnPropertyNames(Object.getPrototypeOf(plugin))`:
`constructor`, `connectedCallback`, `disconnectedCallback`, `activate`, `name`, `category`, `select_mode`, `min_config_columns`, `config_column_names`, `group_rollups`, `priority`, `can_render_column_styles`, `column_style_controls`, `draw`, `update`, `render`, `resize`, `clear`, `save`, `restore`, `restyle`, `delete`
## View Methods (full list)
From `Object.getOwnPropertyNames(Object.getPrototypeOf(view))` — includes `set_depth`, `expand`, `collapse`, `to_json`, `to_csv`, `to_arrow`, `schema`, `num_rows`, `num_columns`, `delete`, and others.
---
## settings Panel
The `settings` key in `viewer.restore()` controls whether the config panel (gear icon) is open:
```js
// Hide on load:
await viewer.restore({ table: "name", settings: false, plugin_config: DEFAULT_PLUGIN_CONFIG })
// Toggle programmatically:
viewer.toggleConfig()
```
The settings state is saved by `viewer.save()` and restored on `viewer.restore()`, so it persists across layout saves automatically.
---
## Incremental Updates
To update the table data without a full reload:
```js
table.update(newRows) // appends rows; upserts by key only if an `index` column was specified at table creation
```
The viewer re-renders automatically after `table.update()`.
---
## Common Pitfalls
- **`plugin_config` in `viewer.restore()` is unreliable.** Always set plugin state via `plugin.restore()` separately after `viewer.restore()`.
- **`view.set_depth()` requires `plugin.draw(view)`.** The viewer won't redraw automatically.
- **Expression columns don't exist in raw data.** Filter via a Perspective View (`table.view({ filter, expressions })`), not against raw JS rows.
- **Always `await view.delete()`** after using a temporary view, or you'll leak worker memory.
- **Named tables:** `worker.table(rows, { name: 'foo' })` — the name is used by the viewer's `table` config key. Re-open with `worker.open_table('foo')`.
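One way to make the `view.delete()` rule hard to forget is to wrap temporary views in a helper that frees them in a `finally` block. This is a sketch only: `withTempView` is a hypothetical name, and it assumes nothing beyond the `table.view(config)` and `view.delete()` calls listed under Core Objects.

```javascript
// Sketch: run a callback against a temporary View and always free it.
// `table` is any object exposing an async view(config) whose result has an
// async delete(), matching the Table/View shape described above.
async function withTempView(table, viewConfig, fn) {
  const view = await table.view(viewConfig)
  try {
    return await fn(view)
  } finally {
    await view.delete() // runs even if fn throws
  }
}
```

With this in place, the inspector query could be written as `const rows = await withTempView(table, { filter, expressions }, v => v.to_json())`, and the view is released whether or not `to_json()` succeeds.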


@ -42,7 +42,7 @@ curl -X POST http://localhost:3000/api/sources \
-H "Content-Type: application/json" \
-d '{
"name": "bank_transactions",
"dedup_fields": ["date", "description", "amount"]
"constraint_fields": ["date", "description", "amount"]
}'
```
@ -303,7 +303,7 @@ curl -X POST http://localhost:3000/api/records/search \
**Import fails:**
- Verify source exists: `curl http://localhost:3000/api/sources`
- Check CSV format matches expectations
- Ensure dedup_fields match CSV column names
- Ensure constraint_fields match CSV column names
**Transformations not working:**
- Check rules exist: `curl http://localhost:3000/api/rules/source/bank_transactions`


@ -8,13 +8,17 @@ import Rules from './pages/Rules'
import Mappings from './pages/Mappings'
import Records from './pages/Records'
import Log from './pages/Log'
import Pivot from './pages/Pivot'
import Remap from './pages/Remap'
const NAV = [
{ to: '/sources', label: 'Sources' },
{ to: '/import', label: 'Import' },
{ to: '/rules', label: 'Rules' },
{ to: '/mappings', label: 'Mappings' },
{ to: '/remap', label: 'Remap' },
{ to: '/records', label: 'Records' },
{ to: '/pivot', label: 'Pivot' },
{ to: '/log', label: 'Log' },
]
@ -77,7 +81,7 @@ export default function App() {
<div className="px-3 py-3 border-b border-gray-200">
<div className="flex items-center justify-between mb-1">
<label className="text-xs text-gray-500">Source</label>
<NavLink to="/sources" className="text-xs text-blue-400 hover:text-blue-600 leading-none" title="New source" onClick={() => setSidebarOpen(false)}>+</NavLink>
<NavLink to="/sources?new=1" className="text-xs text-blue-400 hover:text-blue-600 leading-none" title="New source" onClick={() => setSidebarOpen(false)}>+</NavLink>
</div>
<select
className="w-full text-sm border border-gray-200 rounded px-2 py-1 bg-white focus:outline-none focus:border-blue-400"
@ -142,7 +146,9 @@ export default function App() {
<Route path="/import" element={<Import source={source} />} />
<Route path="/rules" element={<Rules source={source} />} />
<Route path="/mappings" element={<Mappings source={source} />} />
<Route path="/remap" element={<Remap />} />
<Route path="/records" element={<Records source={source} />} />
<Route path="/pivot" element={<Pivot source={source} />} />
<Route path="/log" element={<Log />} />
</Routes>
</div>


@ -10,6 +10,11 @@ export function clearCredentials() {
_credentials = null
}
export function authHeaders() {
if (!_credentials) return {}
return { 'Authorization': `Basic ${btoa(`${_credentials.user}:${_credentials.pass}`)}` }
}
async function request(method, path, body, isFormData = false) {
const opts = { method, headers: {} }
@ -65,9 +70,10 @@ export const api = {
reprocess: (name) => request('POST', `/sources/${name}/reprocess`),
generateView: (name) => request('POST', `/sources/${name}/view`),
getFields: (name) => request('GET', `/sources/${name}/fields`),
getViewData: (name, limit = 100, offset = 0, sortCol = null, sortDir = 'asc') => {
getViewData: (name, limit = 100, offset = 0, sortCol = null, sortDir = 'asc', filters = null) => {
const params = new URLSearchParams({ limit, offset })
if (sortCol) { params.set('sort_col', sortCol); params.set('sort_dir', sortDir) }
if (filters && filters.length > 0) params.set('filters', JSON.stringify(filters))
return request('GET', `/sources/${name}/view-data?${params}`)
},
@ -81,6 +87,7 @@ export const api = {
request('GET', `/rules/preview?source=${encodeURIComponent(source)}&field=${encodeURIComponent(field)}&pattern=${encodeURIComponent(pattern)}&flags=${encodeURIComponent(flags || '')}&function_type=${function_type}&replace_value=${encodeURIComponent(replace_value)}&limit=${limit}`),
// Mappings
getGlobalValues: () => request('GET', '/mappings/global-values'),
getMappings: (source, rule) => request('GET', `/mappings/source/${source}${rule ? `?rule_name=${rule}` : ''}`),
getMappingCounts: (source, rule) => request('GET', `/mappings/source/${source}/counts${rule ? `?rule_name=${rule}` : ''}`),
getUnmapped: (source, rule) => request('GET', `/mappings/source/${source}/unmapped${rule ? `?rule_name=${rule}` : ''}`),
@ -96,6 +103,16 @@ export const api = {
updateMapping: (id, body) => request('PUT', `/mappings/${id}`, body),
deleteMapping: (id) => request('DELETE', `/mappings/${id}`),
// Global remap
searchMappingOutputs: (search) => request('GET', `/mappings/outputs?search=${encodeURIComponent(search)}`),
getMappingsByOutputField: (col, val) => request('GET', `/mappings/outputs/${encodeURIComponent(col)}/${encodeURIComponent(val)}`),
remapOutputField: (col, from_val, to_val) => request('POST', '/mappings/remap-field', { col, from_val, to_val }),
// Pivot layouts
getPivotLayouts: (source) => request('GET', `/sources/${source}/layouts`),
savePivotLayout: (source, layout_name, config) => request('POST', `/sources/${source}/layouts`, { layout_name, config }),
deletePivotLayout: (source, id) => request('DELETE', `/sources/${source}/layouts/${id}`),
// Records
getRecords: (source, limit = 100, offset = 0) =>
request('GET', `/records/source/${source}?limit=${limit}&offset=${offset}`),


@ -196,8 +196,24 @@ export default function Import({ source }) {
{error && <p className="text-sm text-red-500 mb-3">{error}</p>}
{result && (
<div className="bg-white border border-gray-200 rounded p-4 mb-4 text-sm">
{result.imported !== undefined ? (
<div className={`border rounded p-4 mb-4 text-sm ${result.success === false ? 'bg-red-50 border-red-200' : 'bg-white border-gray-200'}`}>
{result.success === false ? (
<>
<p className="text-red-600 font-medium mb-2">{result.error}</p>
{result.duplicate_rows && (
<div>
<p className="text-xs text-red-500 mb-1">Offending rows:</p>
<div className="max-h-48 overflow-y-auto bg-white rounded border border-red-100 p-2 font-mono text-xs text-red-700 space-y-0.5">
{result.duplicate_rows.map((row, i) => (
<div key={i}>
{Object.entries(row).map(([f, v]) => `${f}: ${v}`).join(' · ')}
</div>
))}
</div>
</div>
)}
</>
) : result.imported !== undefined ? (
<>
<span className="text-green-600 font-medium">{result.imported} imported</span>
<span className="text-gray-400 mx-2">·</span>


@@ -1,5 +1,86 @@
import { useState, useEffect } from 'react'
import { api } from '../api'
import { useState, useEffect, useRef } from 'react'
import { api, authHeaders } from '../api'
function AutocompleteInput({ value, onChange, onEnter, suggestions = [], className, placeholder }) {
const [open, setOpen] = useState(false)
const [highlighted, setHighlighted] = useState(0)
const inputRef = useRef()
const listRef = useRef()
const filtered = value
? suggestions.filter(s => s.toLowerCase().includes(value.toLowerCase()))
: suggestions
function openList() {
setOpen(true)
setHighlighted(0)
}
function select(val) {
onChange(val)
setOpen(false)
inputRef.current?.focus()
}
function handleKeyDown(e) {
if (e.altKey && e.key === 'ArrowDown') {
e.preventDefault()
openList()
return
}
if (open && filtered.length > 0) {
if (e.key === 'Tab') {
e.preventDefault()
setHighlighted(h => (h + 1) % filtered.length)
return
}
if (e.key === 'ArrowDown') { e.preventDefault(); setHighlighted(h => Math.min(h + 1, filtered.length - 1)); return }
if (e.key === 'ArrowUp') { e.preventDefault(); setHighlighted(h => Math.max(h - 1, 0)); return }
if (e.key === 'Enter') { e.preventDefault(); select(filtered[highlighted]); return }
if (e.key === 'Escape') { setOpen(false); return }
}
if (e.key === 'Enter') onEnter?.()
}
// Scroll highlighted item into view
useEffect(() => {
if (!open || !listRef.current) return
const item = listRef.current.children[highlighted]
item?.scrollIntoView({ block: 'nearest' })
}, [highlighted, open])
return (
<div className="relative">
<input
ref={inputRef}
className={className}
value={value}
placeholder={placeholder}
onChange={e => { onChange(e.target.value); if (!open && e.target.value) openList() }}
onKeyDown={handleKeyDown}
onBlur={e => { if (!listRef.current?.contains(e.relatedTarget)) setOpen(false) }}
/>
{open && filtered.length > 0 && (
<div
ref={listRef}
className="absolute z-50 left-0 top-full mt-0.5 bg-white border border-gray-200 rounded shadow-lg max-h-48 overflow-y-auto min-w-full"
>
{filtered.map((s, i) => (
<div
key={s}
className={`px-2 py-1 text-xs cursor-pointer whitespace-nowrap ${
i === highlighted ? 'bg-blue-50 text-blue-700' : 'text-gray-700 hover:bg-gray-50'
}`}
onMouseDown={e => { e.preventDefault(); select(s) }}
>
{s}
</div>
))}
</div>
)}
</div>
)
}
function valueKey(v) {
return Array.isArray(v) ? JSON.stringify(v) : String(v)
@@ -35,9 +116,16 @@ export default function Mappings({ source }) {
const [loading, setLoading] = useState(false)
const [importing, setImporting] = useState(false)
const [sortBy, setSortBy] = useState(null)
const [globalValues, setGlobalValues] = useState({})
const [selected, setSelected] = useState(new Set())
const [bulkDraft, setBulkDraft] = useState({})
const [cursorKey, setCursorKey] = useState(null)
const [rowFilter, setRowFilter] = useState('')
const rowRefs = useRef({})
useEffect(() => {
if (!source) return
api.getGlobalValues().then(setGlobalValues).catch(() => {})
api.getRules(source).then(r => setRules(r)).catch(() => {})
}, [source])
@@ -52,12 +140,28 @@ export default function Mappings({ source }) {
setAllValues(a)
setDrafts({})
setExtraCols([])
setSelected(new Set())
setBulkDraft({})
setCursorKey(null)
setRowFilter('')
})
.catch(() => {})
.finally(() => setLoading(false))
}, [source, selectedRule])
// Derive output columns and datalist suggestions from mapped rows
// Auto-select all rows matching the regex filter when it changes
useEffect(() => {
if (!rowFilter) return
let re = null
try { re = new RegExp(rowFilter, 'i') } catch { return }
const tabF = filter === 'unmapped' ? allValues.filter(r => !r.is_mapped)
: filter === 'mapped' ? allValues.filter(r => r.is_mapped)
: allValues
const matches = tabF.filter(r => re.test(displayValue(r.extracted_value)))
setSelected(new Set(matches.map(r => valueKey(r.extracted_value))))
}, [rowFilter, filter, allValues])
// Derive output columns and datalist suggestions from mapped rows + global pool
const existingCols = []
const valuesByCol = {}
allValues.forEach(row => {
@@ -68,17 +172,31 @@ export default function Mappings({ source }) {
valuesByCol[k].add(String(v))
})
})
// Merge global picklist values into suggestions
Object.entries(globalValues).forEach(([k, vals]) => {
if (!valuesByCol[k]) valuesByCol[k] = new Set()
vals.forEach(v => valuesByCol[k].add(v))
})
const cols = [...existingCols, ...extraCols]
const unmappedCount = allValues.filter(r => !r.is_mapped).length
const mappedCount = allValues.filter(r => r.is_mapped).length
const filteredRows = filter === 'unmapped'
const tabFiltered = filter === 'unmapped'
? allValues.filter(r => !r.is_mapped)
: filter === 'mapped'
? allValues.filter(r => r.is_mapped)
: allValues
let rowFilterRe = null
let rowFilterError = false
if (rowFilter) {
try { rowFilterRe = new RegExp(rowFilter, 'i') } catch { rowFilterError = true }
}
const filteredRows = rowFilterRe
? tabFiltered.filter(r => rowFilterRe.test(displayValue(r.extracted_value)))
: tabFiltered
function toggleSort(col) {
setSortBy(s => {
if (s?.col === col) return { col, dir: s.dir === 'asc' ? 'desc' : 'asc' }
@@ -108,7 +226,12 @@ export default function Mappings({ source }) {
function setCellValue(extractedValue, col, value) {
const k = valueKey(extractedValue)
setDrafts(d => ({ ...d, [k]: { ...(d[k] || {}), [col]: value } }))
const targets = selected.has(k) && selected.size > 1 ? [...selected] : [k]
setDrafts(d => {
const next = { ...d }
for (const sk of targets) next[sk] = { ...(next[sk] || {}), [col]: value }
return next
})
}
async function saveRow(row) {
@@ -159,6 +282,35 @@ export default function Mappings({ source }) {
return drafts[k] && Object.keys(drafts[k]).length > 0
})
await Promise.all(dirty.map(row => saveRow(row)))
setRowFilter('')
}
async function applyBulk() {
const output = Object.fromEntries(
Object.entries(bulkDraft).filter(([, v]) => v.trim())
)
if (Object.keys(output).length === 0) return
const rows = sortedRows(filteredRows).filter(r => selected.has(valueKey(r.extracted_value)))
await Promise.all(rows.map(async row => {
const k = valueKey(row.extracted_value)
const merged = { ...(row.is_mapped ? row.output : {}), ...output }
setSaving(s => ({ ...s, [k]: true }))
try {
if (row.is_mapped && row.mapping_id) {
const updated = await api.updateMapping(row.mapping_id, { output: merged })
setAllValues(av => av.map(x => valueKey(x.extracted_value) === k ? { ...x, output: updated.output } : x))
} else {
const created = await api.createMapping({ source_name: source, rule_name: row.rule_name, input_value: row.extracted_value, output: merged })
setAllValues(av => av.map(x => valueKey(x.extracted_value) === k ? { ...x, is_mapped: true, mapping_id: created.id, output: merged } : x))
}
} catch (err) {
alert(err.message)
} finally {
setSaving(s => ({ ...s, [k]: false }))
}
}))
setSelected(new Set())
setBulkDraft({})
}
async function deleteRow(row) {
@@ -230,6 +382,24 @@ export default function Mappings({ source }) {
</div>
)}
{selectedRule && (
<div className="relative">
<input
className={`text-xs font-mono border rounded px-2 py-1.5 w-44 focus:outline-none focus:border-blue-400 ${
rowFilterError ? 'border-red-400 bg-red-50' : rowFilter ? 'border-blue-300' : 'border-gray-200'
}`}
placeholder="filter regex…"
value={rowFilter}
onChange={e => setRowFilter(e.target.value)}
/>
{rowFilter && !rowFilterError && (
<span className="absolute right-2 top-1/2 -translate-y-1/2 text-xs text-gray-400">
{filteredRows.length}
</span>
)}
</div>
)}
{dirtyCount > 0 && (
<button
onClick={saveAllPending}
@@ -241,13 +411,26 @@ export default function Mappings({ source }) {
<div className="ml-auto flex items-center gap-2">
{selectedRule && (
<a
href={api.exportMappingsUrl(source, selectedRule)}
download
<button
onClick={async () => {
try {
const url = api.exportMappingsUrl(source, selectedRule)
const res = await fetch(url, { headers: authHeaders() })
if (!res.ok) throw new Error('Export failed')
const blob = await res.blob()
const a = document.createElement('a')
a.href = URL.createObjectURL(blob)
a.download = `mappings_${source}.tsv`
a.click()
URL.revokeObjectURL(a.href)
} catch (err) {
alert(err.message)
}
}}
className="text-sm px-3 py-1.5 border border-gray-200 rounded hover:bg-gray-50 text-gray-600"
>
Export TSV
</a>
</button>
)}
<label className={`text-sm px-3 py-1.5 border border-gray-200 rounded cursor-pointer hover:bg-gray-50 text-gray-600 ${importing ? 'opacity-50 pointer-events-none' : ''}`}>
{importing ? 'Importing…' : 'Import TSV'}
@@ -269,16 +452,49 @@ export default function Mappings({ source }) {
)}
{selectedRule && !loading && allValues.length > 0 && (
<div className="overflow-x-auto">
{cols.map(col => (
<datalist key={col} id={`dl-${col}`}>
{[...(valuesByCol[col] || [])].sort().map(v => (
<option key={v} value={v} />
{/* Bulk assign bar */}
{selected.size > 0 && (
<div className="flex items-center gap-2 mb-2 p-2 bg-blue-50 border border-blue-200 rounded flex-wrap">
<span className="text-xs text-blue-700 font-medium whitespace-nowrap">{selected.size} selected</span>
{cols.map(col => (
<AutocompleteInput
key={col}
className="border border-blue-300 rounded px-2 py-1 text-xs min-w-24 focus:outline-none focus:border-blue-500 bg-white"
placeholder={col}
value={bulkDraft[col] || ''}
onChange={v => setBulkDraft(d => ({ ...d, [col]: v }))}
suggestions={[...(valuesByCol[col] || [])].sort()}
/>
))}
</datalist>
))}
<button
onClick={applyBulk}
disabled={Object.values(bulkDraft).every(v => !v.trim())}
className="text-xs bg-blue-600 text-white px-3 py-1 rounded hover:bg-blue-700 disabled:opacity-40 whitespace-nowrap"
>
Apply to {selected.size}
</button>
<button
onClick={() => { setSelected(new Set()); setBulkDraft({}) }}
className="text-xs text-blue-400 hover:text-blue-600"
>
cancel
</button>
</div>
)}
<table className="w-full text-xs bg-white border border-gray-200 rounded">
<thead>
<tr className="text-left text-gray-400 border-b border-gray-100 bg-gray-50">
<th className="px-2 py-2 w-6">
<input
type="checkbox"
className="cursor-pointer"
checked={displayRows.length > 0 && displayRows.every(r => selected.has(valueKey(r.extracted_value)))}
onChange={e => {
if (e.target.checked) setSelected(new Set(displayRows.map(r => valueKey(r.extracted_value))))
else setSelected(new Set())
}}
/>
</th>
<SortHeader col="input_value" label="input_value" sortBy={sortBy} onSort={toggleSort} />
<SortHeader col="count" label="count" sortBy={sortBy} onSort={toggleSort} className="text-right" />
{existingCols.map(col => (
@@ -297,7 +513,7 @@ export default function Mappings({ source }) {
<th className="px-2 py-2">
<button
onClick={() => setExtraCols(ec => [...ec, ''])}
className="text-gray-300 hover:text-gray-500"
className="text-gray-400 hover:text-gray-700 font-medium"
title="Add column"
>+</button>
</th>
@@ -308,9 +524,29 @@ export default function Mappings({ source }) {
<tbody>
{displayRows.map(row => {
const k = valueKey(row.extracted_value)
const rowIdx = displayRows.indexOf(row)
const isSaving = saving[k]
const isSelected = selected.has(k)
const hasDraft = !!(drafts[k] && Object.keys(drafts[k]).length > 0)
const rowBg = hasDraft ? 'bg-blue-50' : row.is_mapped ? '' : 'bg-yellow-50'
const rowBg = isSelected ? 'bg-blue-50' : hasDraft ? 'bg-blue-50' : row.is_mapped ? '' : 'bg-yellow-50'
function handleRowClick(e) {
if (e.target.closest('input,button,a,select')) return
setSelected(s => { const n = new Set(s); n.has(k) ? n.delete(k) : n.add(k); return n })
setCursorKey(k)
}
function handleRowKeyDown(e) {
if (!e.shiftKey || (e.key !== 'ArrowDown' && e.key !== 'ArrowUp')) return
e.preventDefault()
const delta = e.key === 'ArrowDown' ? 1 : -1
const curIdx = cursorKey ? displayRows.findIndex(r => valueKey(r.extracted_value) === cursorKey) : rowIdx
const nextIdx = Math.max(0, Math.min(displayRows.length - 1, curIdx + delta))
const nextKey = valueKey(displayRows[nextIdx].extracted_value)
setSelected(s => new Set([...s, nextKey]))
setCursorKey(nextKey)
rowRefs.current[nextKey]?.focus()
}
const samples = row.sample
? (Array.isArray(row.sample) ? row.sample : [row.sample])
: []
@@ -323,19 +559,37 @@ export default function Mappings({ source }) {
return (
<>
<tr key={k} className={`border-t border-gray-50 hover:bg-gray-50 ${rowBg}`}>
<tr
key={k}
ref={el => rowRefs.current[k] = el}
tabIndex={0}
className={`border-t border-gray-50 hover:bg-gray-50 cursor-pointer outline-none ${rowBg}`}
onClick={handleRowClick}
onKeyDown={handleRowKeyDown}
>
<td className="px-2 py-1.5">
<input
type="checkbox"
className="cursor-pointer"
checked={isSelected}
onChange={() => {
setSelected(s => { const n = new Set(s); n.has(k) ? n.delete(k) : n.add(k); return n })
setCursorKey(k)
}}
/>
</td>
<td className="px-3 py-1.5 font-mono text-gray-800 whitespace-nowrap">{displayValue(row.extracted_value)}</td>
<td className="px-3 py-1.5 text-right text-gray-400">{row.record_count}</td>
{cols.map(col => (
<td key={col} className="px-3 py-1.5">
<input
list={`dl-${col}`}
<AutocompleteInput
className={`border rounded px-2 py-1 w-full min-w-24 focus:outline-none focus:border-blue-400 ${
hasDraft ? 'border-blue-300' : row.is_mapped ? 'border-gray-200' : 'border-yellow-300'
}`}
value={cellVal(col)}
onChange={e => setCellValue(row.extracted_value, col, e.target.value)}
onKeyDown={e => e.key === 'Enter' && saveRow(row)}
onChange={v => setCellValue(row.extracted_value, col, v)}
onEnter={() => saveRow(row)}
suggestions={[...(valuesByCol[col] || [])].sort()}
/>
</td>
))}
@@ -373,7 +627,7 @@ export default function Mappings({ source }) {
const sampleCols = [...new Set(samples.flatMap(r => Object.keys(r)))]
return (
<tr key={`${k}-sample`} className="border-t border-gray-50 bg-gray-50">
<td colSpan={2 + cols.length + 4} className="px-3 py-2">
<td colSpan={3 + cols.length + 4} className="px-3 py-2">
<table className="w-full text-xs border border-gray-100 rounded bg-white">
<thead>
<tr className="bg-gray-50 border-b border-gray-100">
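
The row filter in this file compiles user input with `new RegExp(rowFilter, 'i')` inside a `try`/`catch`, so a half-typed pattern never throws during render. The same idea can be sketched as a standalone function. `compileFilter` and `filterRows` are hypothetical names for illustration, and the `getText` accessor stands in for `displayValue(r.extracted_value)`.

```javascript
// Compile a user-typed pattern without throwing; mirrors the try/catch
// around `new RegExp(rowFilter, 'i')` in the diff.
function compileFilter(pattern) {
  if (!pattern) return { re: null, error: false }
  try {
    return { re: new RegExp(pattern, 'i'), error: false }
  } catch {
    // Invalid pattern (e.g. an unclosed group): report it, do not filter
    return { re: null, error: true }
  }
}

// Hypothetical helper: apply the compiled filter to rows, using a
// caller-supplied accessor to get the text to test for each row.
function filterRows(rows, pattern, getText) {
  const { re, error } = compileFilter(pattern)
  const matched = re ? rows.filter(r => re.test(getText(r))) : rows
  return { rows: matched, error }
}
```

An invalid pattern leaves the row list unfiltered and only sets the error flag, which matches how the component shows a red border while keeping the table visible.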

503
ui/src/pages/Pivot.jsx Normal file

@@ -0,0 +1,503 @@
import { useEffect, useRef, useState, useCallback } from 'react'
import { api } from '../api'
async function fetchAllRows(source) {
const res = await api.getViewData(source, 100000, 0)
return res.rows || []
}
let perspectivePromise = null
function loadPerspective() {
if (perspectivePromise) return perspectivePromise
perspectivePromise = (async () => {
if (!document.getElementById('psp-theme')) {
const link = document.createElement('link')
link.id = 'psp-theme'
link.rel = 'stylesheet'
link.crossOrigin = 'anonymous'
link.href = 'https://cdn.jsdelivr.net/npm/@perspective-dev/viewer/dist/css/themes.css'
document.head.appendChild(link)
}
const [{ default: perspective }] = await Promise.all([
import(/* @vite-ignore */ 'https://cdn.jsdelivr.net/npm/@perspective-dev/client@4.4.0/dist/cdn/perspective.js'),
import(/* @vite-ignore */ 'https://cdn.jsdelivr.net/npm/@perspective-dev/viewer@4.4.0/dist/cdn/perspective-viewer.js'),
import(/* @vite-ignore */ 'https://cdn.jsdelivr.net/npm/@perspective-dev/viewer-datagrid@4.4.0/dist/cdn/perspective-viewer-datagrid.js'),
import(/* @vite-ignore */ 'https://cdn.jsdelivr.net/npm/@perspective-dev/viewer-d3fc@4.4.0/dist/cdn/perspective-viewer-d3fc.js'),
])
return perspective
})()
return perspectivePromise
}
function formatVal(v, decimals = 2) {
if (v == null) return null
if (typeof v === 'number') {
if (v > 1e11 && v < 2e12) {
const d = new Date(v)
if (!isNaN(d)) return d.toISOString().slice(0, 10)
}
return v.toLocaleString(undefined, { minimumFractionDigits: decimals, maximumFractionDigits: decimals })
}
return String(v)
}
function normalize(v) {
if (v == null) return null
if (typeof v === 'number' && v > 1e11 && v < 2e12) return new Date(v).toISOString().slice(0, 10)
return String(v).trim()
}
function filterRowsByConfig(allRows, filters) {
if (!filters || filters.length === 0) return allRows
const knownFields = allRows.length > 0 ? new Set(Object.keys(allRows[0])) : new Set()
const applicable = filters.filter(([field]) => knownFields.has(field))
if (applicable.length === 0) return allRows
return allRows.filter(row =>
applicable.every(([field, op, value]) => {
const rawVal = row[field]
if (rawVal == null) return op === '!=' || op === 'not contains'
const a = normalize(rawVal)
const b = value != null ? String(value).trim() : ''
const aNum = parseFloat(a), bNum = parseFloat(b)
const numeric = !isNaN(aNum) && !isNaN(bNum)
switch (op) {
case '==': return a === b
case '!=': return a !== b
case '>': return numeric ? aNum > bNum : a > b
case '>=': return numeric ? aNum >= bNum : a >= b
case '<': return numeric ? aNum < bNum : a < b
case '<=': return numeric ? aNum <= bNum : a <= b
case 'contains': return a.toLowerCase().includes(b.toLowerCase())
case 'not contains': return !a.toLowerCase().includes(b.toLowerCase())
default: return true
}
})
)
}
const LAYOUT_KEY = (source) => `psp_layout_${source}`
const DEFAULT_PLUGIN_CONFIG = { edit_mode: 'SELECT_REGION' }
export default function Pivot({ source }) {
const viewerRef = useRef()
const workerRef = useRef()
const tableRef = useRef()
const allRowsRef = useRef([])
const expandDepthRef = useRef(null)
const [status, setStatus] = useState('idle')
const [error, setError] = useState('')
const [inspectedRows, setInspectedRows] = useState(null)
const [clickDetail, setClickDetail] = useState(null)
const [decimals, setDecimals] = useState(2)
// Named layouts
const [layouts, setLayouts] = useState([])
const [activeLayoutId, setActiveLayoutId] = useState(null)
const [saveAsName, setSaveAsName] = useState('')
const [showSaveAs, setShowSaveAs] = useState(false)
const [layoutMsg, setLayoutMsg] = useState('')
const flashMsg = (msg) => {
setLayoutMsg(msg)
setTimeout(() => setLayoutMsg(''), 2000)
}
const loadLayouts = useCallback(async () => {
if (!source) return
try {
const rows = await api.getPivotLayouts(source)
setLayouts(rows)
} catch {}
}, [source])
useEffect(() => {
if (!source) return
let cancelled = false
setInspectedRows(null)
setClickDetail(null)
setActiveLayoutId(null)
setShowSaveAs(false)
allRowsRef.current = []
loadLayouts()
async function init() {
setStatus('loading')
setError('')
try {
const [perspective, rows] = await Promise.all([
loadPerspective(),
fetchAllRows(source),
])
if (cancelled) return
if (!rows.length) { setStatus('noview'); return }
allRowsRef.current = rows
if (workerRef.current) { try { workerRef.current.terminate() } catch {} }
const worker = await perspective.worker()
if (cancelled) { worker.terminate(); return }
workerRef.current = worker
const table = await worker.table(rows, { name: source })
if (cancelled) return
tableRef.current = table
const viewer = viewerRef.current
viewer.addEventListener('perspective-click', async (e) => {
const detail = e.detail || {}
const { row, column_names } = detail
if (!row) return
const eventFilters = (detail.config || {}).filter || []
const config = await viewer.save()
// Without a group_by hierarchy there are no coordinate filters, so the
// query would return the entire dataset; skip the inspector in that case.
const hasHierarchy = (config.group_by || []).length > 0
if (!hasHierarchy) return
setClickDetail({ row, config, column_names, eventFilters })
// Use a Perspective view with the event filters + expressions so computed
// columns (split_by) are evaluated and filtered correctly
try {
const view = await tableRef.current.view({
filter: eventFilters,
expressions: config.expressions || {},
})
const data = await view.to_json()
await view.delete()
// Strip expression columns; only show raw source columns
const exprNames = new Set(Object.keys(config.expressions || {}))
const cleaned = data.map(r =>
Object.fromEntries(Object.entries(r).filter(([k]) => !exprNames.has(k)))
)
setInspectedRows(cleaned)
} catch {
setInspectedRows(filterRowsByConfig(allRowsRef.current, eventFilters))
}
})
await viewer.load(worker)
const plugin = await viewer.getPlugin()
const savedLayout = localStorage.getItem(LAYOUT_KEY(source))
if (savedLayout) {
const parsed = JSON.parse(savedLayout)
await viewer.restore(parsed)
await plugin.restore(parsed.plugin_config || DEFAULT_PLUGIN_CONFIG)
if (parsed.expand_depth != null) await applyExpandDepth(viewer, parsed.expand_depth)
} else {
await viewer.restore({ table: source, settings: false, plugin_config: DEFAULT_PLUGIN_CONFIG })
await plugin.restore(DEFAULT_PLUGIN_CONFIG)
}
await viewer.flush()
setStatus('ready')
} catch (err) {
if (!cancelled) { setStatus('error'); setError(err.message) }
}
}
init()
return () => { cancelled = true }
}, [source])
async function applyExpandDepth(viewer, depth) {
if (depth == null) return
const view = await viewer.getView()
await view.set_depth(depth)
const plugin = await viewer.getPlugin()
await plugin.draw(view)
expandDepthRef.current = depth
}
async function applyLayout(layout) {
const viewer = viewerRef.current
if (!viewer) return
await viewer.restore(layout.config)
if (layout.config.plugin_config) {
const plugin = await viewer.getPlugin()
await plugin.restore(layout.config.plugin_config)
}
await applyExpandDepth(viewer, layout.config.expand_depth ?? null)
setActiveLayoutId(layout.id)
// also persist to localStorage so it survives refresh
localStorage.setItem(LAYOUT_KEY(source), JSON.stringify(layout.config))
}
async function captureConfig() {
const viewer = viewerRef.current
if (!viewer) return null
const plugin = await viewer.getPlugin()
const [viewerConfig, pluginConfig] = await Promise.all([viewer.save(), plugin.save()])
return { ...viewerConfig, plugin_config: pluginConfig, expand_depth: expandDepthRef.current }
}
async function handleSaveOver() {
const layout = layouts.find(l => l.id === activeLayoutId)
if (!layout) return
const config = await captureConfig()
if (!config) return
try {
const saved = await api.savePivotLayout(source, layout.layout_name, config)
localStorage.setItem(LAYOUT_KEY(source), JSON.stringify(config))
await loadLayouts()
setActiveLayoutId(saved.id)
flashMsg('Saved!')
} catch (err) {
flashMsg(err.message)
}
}
async function handleSaveAs() {
const name = saveAsName.trim()
if (!name) return
const config = await captureConfig()
if (!config) return
try {
const saved = await api.savePivotLayout(source, name, config)
localStorage.setItem(LAYOUT_KEY(source), JSON.stringify(config))
await loadLayouts()
setActiveLayoutId(saved.id)
setShowSaveAs(false)
setSaveAsName('')
flashMsg('Saved!')
} catch (err) {
flashMsg(err.message)
}
}
async function handleDelete(layout, e) {
e.stopPropagation()
try {
await api.deletePivotLayout(source, layout.id)
if (activeLayoutId === layout.id) setActiveLayoutId(null)
await loadLayouts()
flashMsg('Deleted')
} catch (err) {
flashMsg(err.message)
}
}
function handleResetToDefault() {
const viewer = viewerRef.current
if (!viewer) return
localStorage.removeItem(LAYOUT_KEY(source))
setActiveLayoutId(null)
viewer.restore({ table: source, settings: true, plugin_config: DEFAULT_PLUGIN_CONFIG })
}
if (!source) return <div className="p-6 text-sm text-gray-400">Select a source first.</div>
const cols = inspectedRows?.length ? Object.keys(inspectedRows[0]) : []
const groupBy = clickDetail?.config?.group_by || []
const splitBy = clickDetail?.config?.split_by || []
const coordFields = new Set([...groupBy, ...splitBy])
const coordMap = Object.fromEntries(
(clickDetail?.eventFilters || [])
.filter(([f, op]) => coordFields.has(f) && op === '==')
.map(([f, , v]) => [f, v])
)
const cellCoords = [...groupBy, ...splitBy].map(f => coordMap[f]).filter(Boolean)
const splitVals = splitBy.map(f => coordMap[f]).filter(Boolean)
const metrics = clickDetail?.column_names || []
const cellKey = splitVals.length > 0 && metrics.length > 0
? [...splitVals, ...metrics].join('|')
: null
return (
<div className="w-full h-full flex flex-col">
{/* Layout toolbar */}
<div className="flex items-center gap-2 px-3 py-1.5 bg-white border-b border-gray-200 flex-shrink-0">
<span className="text-xs text-gray-400 uppercase tracking-wide mr-1">Layouts</span>
{layouts.map(l => (
<div key={l.id}
onClick={() => applyLayout(l)}
className={`flex items-center gap-1 text-xs rounded px-2 py-0.5 cursor-pointer border transition-colors
${activeLayoutId === l.id
? 'bg-blue-50 border-blue-300 text-blue-700'
: 'bg-white border-gray-200 text-gray-600 hover:border-gray-400'}`}>
{l.layout_name}
<button
onClick={(e) => handleDelete(l, e)}
className="text-gray-300 hover:text-red-400 leading-none ml-0.5 text-sm">×</button>
</div>
))}
{activeLayoutId !== null && !showSaveAs && (
<button onClick={handleSaveOver}
className="text-xs text-blue-500 hover:text-blue-700 border border-blue-200 rounded px-2 py-0.5">
Save
</button>
)}
{showSaveAs ? (
<div className="flex items-center gap-1">
<input
autoFocus
value={saveAsName}
onChange={e => setSaveAsName(e.target.value)}
onKeyDown={e => { if (e.key === 'Enter') handleSaveAs(); if (e.key === 'Escape') { setShowSaveAs(false); setSaveAsName('') } }}
placeholder="Layout name…"
className="text-xs border border-gray-300 rounded px-2 py-0.5 w-36 focus:outline-none focus:border-blue-400"
/>
<button onClick={handleSaveAs} className="text-xs text-blue-600 hover:text-blue-800 px-1">Save</button>
<button onClick={() => { setShowSaveAs(false); setSaveAsName('') }} className="text-xs text-gray-400 hover:text-gray-600 px-1">Cancel</button>
</div>
) : (
<button
onClick={() => setShowSaveAs(true)}
className="text-xs text-gray-400 hover:text-gray-600 border border-dashed border-gray-200 rounded px-2 py-0.5">
+ Save as
</button>
)}
{activeLayoutId !== null && (
<button onClick={handleResetToDefault}
className="text-xs text-gray-300 hover:text-gray-500 ml-1">
reset
</button>
)}
{layoutMsg && <span className="text-xs text-green-600 ml-1">{layoutMsg}</span>}
<div className="ml-auto flex items-center gap-1">
<span className="text-xs text-gray-400">depth:</span>
{[0, 1, 2, 3].map(d => (
<button key={d} onClick={async () => {
const v = viewerRef.current; if (!v) return
const view = await v.getView()
await view.set_depth(d)
const p = await v.getPlugin()
await p.draw(view)
expandDepthRef.current = d
}} className="text-xs border border-gray-200 rounded px-1.5 py-0.5 text-gray-500 hover:border-gray-400">
{d}
</button>
))}
</div>
</div>
{/* Pivot + inspector */}
<div className="relative flex-1 flex min-h-0">
<div className="relative flex-1">
{status === 'loading' && (
<div className="absolute inset-0 flex items-center justify-center z-10 bg-gray-50">
<p className="text-sm text-gray-400">Loading</p>
</div>
)}
{status === 'error' && (
<div className="absolute inset-0 flex items-center justify-center z-10 bg-gray-50">
<p className="text-sm text-red-500">Error: {error}</p>
</div>
)}
{status === 'noview' && (
<div className="absolute inset-0 flex items-center justify-center z-10 bg-gray-50">
<p className="text-sm text-gray-400">No view data. Generate a view and transform records first.</p>
</div>
)}
<perspective-viewer
ref={viewerRef}
style={{ position: 'absolute', top: 0, left: 0, right: 0, bottom: 0 }}
/>
</div>
{inspectedRows && clickDetail && (
<div className="w-96 border-l border-gray-200 bg-white flex flex-col overflow-hidden flex-shrink-0">
<div className="flex items-center justify-between px-3 py-2 border-b border-gray-100">
<span className="text-xs font-semibold text-gray-600 uppercase tracking-wide">
{inspectedRows.length} row{inspectedRows.length !== 1 ? 's' : ''}
</span>
<div className="flex items-center gap-2">
<div className="flex items-center gap-0.5">
<button onClick={() => setDecimals(d => Math.max(0, d - 1))}
className="text-xs text-gray-400 hover:text-gray-600 w-4 text-center">-</button>
<span className="text-xs text-gray-400 w-4 text-center">{decimals}</span>
<button onClick={() => setDecimals(d => Math.min(8, d + 1))}
className="text-xs text-gray-400 hover:text-gray-600 w-4 text-center">+</button>
</div>
<button onClick={() => { setInspectedRows(null); setClickDetail(null) }}
className="text-gray-300 hover:text-gray-500 leading-none text-lg">×</button>
</div>
</div>
<div className="flex-1 overflow-y-auto">
{/* Cell coordinates */}
<div className="px-3 py-2 border-b border-gray-100">
<div className="text-xs text-gray-400 uppercase tracking-wide mb-1">
{[...groupBy, ...splitBy].join(' ') || clickDetail.column_names?.join(', ') || 'Cell'}
</div>
{cellCoords.length > 0 && (
<div className="text-xs text-gray-700 font-mono font-semibold">
{cellCoords.join(' ')}
</div>
)}
{Object.entries(clickDetail.row)
.filter(([k, v]) => k !== '__ROW_PATH__' && v != null)
.map(([k, v]) => {
const isSelected = cellKey != null && k === cellKey
return (
<div key={k} className={`flex justify-between py-0.5 gap-2 ${isSelected ? 'font-semibold' : ''}`}>
<span className={`text-xs font-mono shrink-0 ${isSelected ? 'text-gray-700' : 'text-gray-400'}`}>{k}</span>
<span className={`text-xs font-mono text-right ${isSelected ? 'text-blue-600' : 'text-gray-700'}`}>{formatVal(v, decimals)}</span>
</div>
)
})}
</div>
{/* User-set filters */}
{(() => {
const userFilters = (clickDetail.eventFilters || []).filter(([f]) => !coordFields.has(f))
return userFilters.length > 0 ? (
<div className="px-3 py-2 border-b border-gray-100">
<div className="text-xs text-gray-400 uppercase tracking-wide mb-1">Filters</div>
{userFilters.map((f, i) => (
<div key={i} className="text-xs text-gray-500 py-0.5 font-mono">{f.join(' ')}</div>
))}
</div>
) : null
})()}
{/* Underlying rows */}
{inspectedRows.length > 0 && (
<div className="overflow-auto">
<table className="w-full text-xs">
<thead>
<tr className="text-left text-gray-400 border-b border-gray-100 bg-gray-50 sticky top-0">
{cols.map(c => (
<th key={c} className="px-2 py-1 font-medium whitespace-nowrap">{c}</th>
))}
</tr>
</thead>
<tbody>
{inspectedRows.map((row, i) => (
<tr key={i} className="border-t border-gray-50 hover:bg-gray-50">
{cols.map(c => {
const f = formatVal(row[c], decimals)
return (
<td key={c} className="px-2 py-1 font-mono whitespace-nowrap text-gray-700 max-w-40 truncate">
{f == null ? <span className="text-gray-300"></span> : f}
</td>
)
})}
</tr>
))}
</tbody>
</table>
</div>
)}
</div>
</div>
)}
</div>
</div>
)
}
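
`formatVal` in this file treats any number strictly between `1e11` and `2e12` as epoch milliseconds and renders it as a date; every other number is rounded to the chosen precision. A minimal sketch of that heuristic follows. `formatValSketch` is a name chosen for illustration, and the explicit `locale` parameter is an addition for reproducible output (the original passes `undefined` to use the browser locale).

```javascript
// Same heuristic as formatVal in the diff, with an explicit locale parameter
// (an addition for this sketch) so the formatted output is reproducible.
function formatValSketch(v, decimals = 2, locale = 'en-US') {
  if (v == null) return null
  if (typeof v === 'number') {
    // Numbers strictly between 1e11 and 2e12 are assumed to be epoch milliseconds
    if (v > 1e11 && v < 2e12) {
      const d = new Date(v)
      if (!isNaN(d)) return d.toISOString().slice(0, 10)
    }
    // Everything else is rounded to a fixed number of fraction digits
    return v.toLocaleString(locale, { minimumFractionDigits: decimals, maximumFractionDigits: decimals })
  }
  return String(v)
}
```

The window covers timestamps from roughly 1973 to 2033. One consequence of the design is that a genuine measure of that magnitude (for example, a total of 150000000000) is also rendered as a date, so sources with very large amounts may need a column-aware check instead.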


@@ -1,4 +1,4 @@
import { useState, useEffect } from 'react'
import { useState, useEffect, useRef } from 'react'
import { api } from '../api'
const DATE_RE = /^\d{4}-\d{2}-\d{2}(T[\d:.Z+-]+)?$/
@@ -20,47 +20,79 @@ function formatVal(val) {
export default function Records({ source }) {
const [rows, setRows] = useState([])
const [cols, setCols] = useState([])
const [exists, setExists] = useState(null)
const [offset, setOffset] = useState(0)
const [loading, setLoading] = useState(false)
const [viewError, setViewError] = useState(null)
const [sort, setSort] = useState({ col: null, dir: 'asc' })
const [filters, setFilters] = useState([])
const debounceRef = useRef(null)
const LIMIT = 100
useEffect(() => {
if (!source) return
setOffset(0)
setSort({ col: null, dir: 'asc' })
load(0, null, 'asc')
setFilters([])
setViewError(null)
load(0, null, 'asc', [])
}, [source])
async function load(off, col, dir) {
async function load(off, col, dir, filt) {
setLoading(true)
try {
const res = await api.getViewData(source, LIMIT, off, col, dir)
const active = (filt || []).filter(f => f.col && f.pattern)
const res = await api.getViewData(source, LIMIT, off, col, dir, active)
setExists(res.exists)
setRows(res.rows)
if (res.rows.length > 0 && cols.length === 0) setCols(Object.keys(res.rows[0]))
else if (res.rows.length > 0) setCols(Object.keys(res.rows[0]))
} catch (err) {
console.error(err)
setViewError(err.message)
} finally {
setLoading(false)
}
}
function triggerLoad(off, col, dir, filt) {
clearTimeout(debounceRef.current)
debounceRef.current = setTimeout(() => load(off, col, dir, filt), 350)
}
function toggleSort(col) {
const next = sort.col === col
? { col, dir: sort.dir === 'asc' ? 'desc' : 'asc' }
: { col, dir: 'asc' }
setSort(next)
setOffset(0)
load(0, next.col, next.dir)
load(0, next.col, next.dir, filters)
}
function prev() { const o = Math.max(0, offset - LIMIT); setOffset(o); load(o, sort.col, sort.dir) }
function next() { const o = offset + LIMIT; setOffset(o); load(o, sort.col, sort.dir) }
function addFilter() {
setFilters(f => [...f, { col: cols[0] || '', pattern: '' }])
}
function removeFilter(i) {
const next = filters.filter((_, idx) => idx !== i)
setFilters(next)
setOffset(0)
load(0, sort.col, sort.dir, next)
}
function updateFilter(i, key, val) {
const next = filters.map((f, idx) => idx === i ? { ...f, [key]: val } : f)
setFilters(next)
setOffset(0)
triggerLoad(0, sort.col, sort.dir, next)
}
function prev() { const o = Math.max(0, offset - LIMIT); setOffset(o); load(o, sort.col, sort.dir, filters) }
function next() { const o = offset + LIMIT; setOffset(o); load(o, sort.col, sort.dir, filters) }
if (!source) return <div className="p-6 text-sm text-gray-400">Select a source first.</div>
const cols = rows.length > 0 ? Object.keys(rows[0]) : []
const displayCols = rows.length > 0 ? Object.keys(rows[0]) : cols
return (
<div className="p-6">
@ -71,8 +103,54 @@ export default function Records({ source }) {
)}
</div>
{/* Filter bar */}
{exists !== false && displayCols.length > 0 && (
<div className="mb-4 flex flex-wrap gap-2 items-center">
{filters.map((f, i) => (
<div key={i} className="flex items-center gap-1 bg-white border border-gray-200 rounded px-2 py-1">
<select
className="text-xs text-gray-600 border-0 focus:outline-none bg-transparent"
value={f.col}
onChange={e => updateFilter(i, 'col', e.target.value)}
>
{displayCols.map(c => <option key={c} value={c}>{c}</option>)}
</select>
<span className="text-xs text-gray-300 mx-0.5">~*</span>
<input
className="text-xs font-mono border-0 focus:outline-none w-36 bg-transparent"
placeholder="regex…"
value={f.pattern}
onChange={e => updateFilter(i, 'pattern', e.target.value)}
/>
<button
onClick={() => removeFilter(i)}
className="text-gray-300 hover:text-gray-500 ml-1 leading-none"
>×</button>
</div>
))}
<button
onClick={addFilter}
className="text-xs text-gray-400 hover:text-gray-600 border border-dashed border-gray-200 rounded px-2 py-1"
>
+ filter
</button>
{filters.length > 0 && (
<button
onClick={() => { setFilters([]); setOffset(0); load(0, sort.col, sort.dir, []) }}
className="text-xs text-gray-400 hover:text-red-500"
>
clear
</button>
)}
</div>
)}
{loading && <p className="text-sm text-gray-400">Loading</p>}
{!loading && viewError && (
<p className="text-sm text-red-500">View error: {viewError}. Check field types in Sources.</p>
)}
{!loading && exists === false && (
<p className="text-sm text-gray-400">
No view generated yet. Go to <span className="font-medium text-gray-600">Sources</span>, check fields as <span className="font-medium text-gray-600">In view</span>, then click <span className="font-medium text-gray-600">Generate view</span>.
@ -80,7 +158,9 @@ export default function Records({ source }) {
)}
{!loading && exists && rows.length === 0 && (
<p className="text-sm text-gray-400">View exists but no transformed records yet. Import data and run a transform first.</p>
<p className="text-sm text-gray-400">
{filters.some(f => f.col && f.pattern) ? 'No records match the current filters.' : 'View exists but no transformed records yet. Import data and run a transform first.'}
</p>
)}
{!loading && exists && rows.length > 0 && (
@ -89,7 +169,7 @@ export default function Records({ source }) {
<table className="w-full text-sm">
<thead>
<tr className="text-left text-xs text-gray-400 border-b border-gray-100 bg-gray-50">
{cols.map(col => {
{displayCols.map(col => {
const active = sort.col === col
return (
<th
@ -109,7 +189,7 @@ export default function Records({ source }) {
<tbody>
{rows.map((row, i) => (
<tr key={i} className="border-t border-gray-50 hover:bg-gray-50">
{cols.map((col, j) => {
{displayCols.map((col, j) => {
const formatted = formatVal(row[col])
return (
<td key={j} className="px-3 py-2 text-xs text-gray-600 whitespace-nowrap max-w-48 truncate">
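
The Records changes above thread a filter list through every `load` call, send only filters that have both a column and a pattern, and debounce pattern edits by 350 ms. Below is a minimal standalone sketch of those pieces in plain JavaScript, outside React. The names `activeFilters`, `nextSort` and `makeDebounced` are hypothetical and do not appear in the diff.

```javascript
// Hypothetical helpers mirroring logic in the Records diff; not part of the commit.

// Keep only filters that have both a column and a non-empty pattern,
// as load() does before calling api.getViewData.
function activeFilters(filters) {
  return (filters || []).filter(f => f.col && f.pattern)
}

// Compute the next sort state the way toggleSort() does:
// clicking the active column flips direction, any other column starts ascending.
function nextSort(sort, col) {
  return sort.col === col
    ? { col, dir: sort.dir === 'asc' ? 'desc' : 'asc' }
    : { col, dir: 'asc' }
}

// Debounce wrapper equivalent in spirit to triggerLoad(): each call cancels
// the pending timer, so only the last call within `ms` actually runs.
function makeDebounced(fn, ms) {
  let timer = null
  return (...args) => {
    clearTimeout(timer)
    timer = setTimeout(() => fn(...args), ms)
  }
}
```

In the component itself the timer handle lives in a `useRef`, so it survives re-renders; the closure variable here plays the same role.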

ui/src/pages/Remap.jsx Normal file

@ -0,0 +1,214 @@
import { useState, useRef } from 'react'
import { api } from '../api'
export default function Remap() {
const [search, setSearch] = useState('')
const [results, setResults] = useState(null)
const [searching, setSearching] = useState(false)
const [selected, setSelected] = useState(null) // { col, val }
const [matches, setMatches] = useState(null) // individual mappings
const [loadingMatches, setLoadingMatches] = useState(false)
const [toVal, setToVal] = useState('')
const [applying, setApplying] = useState(false)
const [msg, setMsg] = useState(null) // { text, ok }
const searchRef = useRef()
async function handleSearch(e) {
e.preventDefault()
const q = search.trim()
if (!q) return
setSearching(true)
setResults(null)
setSelected(null)
setMatches(null)
setMsg(null)
try {
const rows = await api.searchMappingOutputs(q)
setResults(rows)
} catch (err) {
setMsg({ text: err.message, ok: false })
} finally {
setSearching(false)
}
}
async function handleSelect(row) {
setSelected(row)
setToVal(row.val)
setMatches(null)
setMsg(null)
setLoadingMatches(true)
try {
const rows = await api.getMappingsByOutputField(row.col, row.val)
setMatches(rows)
} catch (err) {
setMsg({ text: err.message, ok: false })
} finally {
setLoadingMatches(false)
}
}
async function handleApply() {
if (!selected || !toVal.trim() || toVal.trim() === selected.val) return
setApplying(true)
setMsg(null)
try {
const { updated } = await api.remapOutputField(selected.col, selected.val, toVal.trim())
setMsg({ text: `Updated ${updated} mapping${updated !== 1 ? 's' : ''}.`, ok: true })
// Refresh match list to show new values
const rows = await api.getMappingsByOutputField(selected.col, toVal.trim())
setMatches(rows)
setSelected({ ...selected, val: toVal.trim() })
// Re-run search to refresh counts
const refreshed = await api.searchMappingOutputs(search.trim())
setResults(refreshed)
} catch (err) {
setMsg({ text: err.message, ok: false })
} finally {
setApplying(false)
}
}
return (
<div className="p-6 max-w-4xl">
<h1 className="text-base font-semibold text-gray-800 mb-4">Remap Output Values</h1>
{/* Search */}
<form onSubmit={handleSearch} className="flex items-center gap-2 mb-5">
<input
ref={searchRef}
value={search}
onChange={e => setSearch(e.target.value)}
placeholder="Search output values…"
className="text-sm border border-gray-300 rounded px-3 py-1.5 w-72 focus:outline-none focus:border-blue-400"
/>
<button type="submit" disabled={searching}
className="text-sm bg-blue-600 text-white rounded px-3 py-1.5 hover:bg-blue-700 disabled:opacity-50">
{searching ? 'Searching…' : 'Search'}
</button>
</form>
{/* Search results */}
{results !== null && (
<div className="mb-6">
{results.length === 0 ? (
<p className="text-sm text-gray-400">No matching output values found.</p>
) : (
<>
<div className="text-xs text-gray-400 uppercase tracking-wide mb-1">
{results.length} result{results.length !== 1 ? 's' : ''}, click one to remap
</div>
<table className="w-full text-sm border border-gray-200 rounded overflow-hidden">
<thead>
<tr className="bg-gray-50 text-left text-xs text-gray-400 uppercase tracking-wide">
<th className="px-3 py-2">Field</th>
<th className="px-3 py-2">Value</th>
<th className="px-3 py-2 text-right">Mappings</th>
</tr>
</thead>
<tbody>
{results.map((r, i) => {
const isActive = selected?.col === r.col && selected?.val === r.val
return (
<tr key={i}
onClick={() => handleSelect(r)}
className={`border-t border-gray-100 cursor-pointer transition-colors
${isActive ? 'bg-blue-50' : 'hover:bg-gray-50'}`}>
<td className="px-3 py-2 font-mono text-gray-500">{r.col}</td>
<td className="px-3 py-2 font-mono text-gray-800">{r.val}</td>
<td className="px-3 py-2 text-right text-gray-400">{r.mapping_count}</td>
</tr>
)
})}
</tbody>
</table>
</>
)}
</div>
)}
{/* Remap panel */}
{selected && (
<div className="border border-gray-200 rounded p-4 mb-6 bg-white">
<div className="text-xs text-gray-400 uppercase tracking-wide mb-3">
Remap <span className="font-mono text-gray-600">{selected.col}</span>
</div>
<div className="flex items-center gap-3 mb-4">
<div className="flex-1">
<div className="text-xs text-gray-400 mb-1">From</div>
<div className="text-sm font-mono bg-gray-50 border border-gray-200 rounded px-3 py-1.5 text-gray-700">
{selected.val}
</div>
</div>
<div className="text-gray-300 mt-4"></div>
<div className="flex-1">
<div className="text-xs text-gray-400 mb-1">To</div>
<input
value={toVal}
onChange={e => setToVal(e.target.value)}
onKeyDown={e => e.key === 'Enter' && handleApply()}
className="w-full text-sm font-mono border border-gray-300 rounded px-3 py-1.5 focus:outline-none focus:border-blue-400"
/>
</div>
<div className="mt-4">
<button
onClick={handleApply}
disabled={applying || !toVal.trim() || toVal.trim() === selected.val}
className="text-sm bg-blue-600 text-white rounded px-3 py-1.5 hover:bg-blue-700 disabled:opacity-40 whitespace-nowrap">
{applying ? 'Applying…' : `Apply to all ${matches?.length ?? '…'}`}
</button>
</div>
</div>
{msg && (
<div className={`text-sm mb-3 ${msg.ok ? 'text-green-600' : 'text-red-500'}`}>
{msg.text}
</div>
)}
{/* Affected mappings */}
{loadingMatches ? (
<p className="text-xs text-gray-400">Loading</p>
) : matches && matches.length > 0 && (
<div>
<div className="text-xs text-gray-400 uppercase tracking-wide mb-1">
Affected mappings
</div>
<table className="w-full text-xs border border-gray-100 rounded overflow-hidden">
<thead>
<tr className="bg-gray-50 text-left text-gray-400">
<th className="px-2 py-1">Source</th>
<th className="px-2 py-1">Rule</th>
<th className="px-2 py-1">Input</th>
<th className="px-2 py-1">Output</th>
</tr>
</thead>
<tbody>
{matches.map(m => (
<tr key={m.id} className="border-t border-gray-50">
<td className="px-2 py-1 font-mono text-gray-500">{m.source_name}</td>
<td className="px-2 py-1 font-mono text-gray-500">{m.rule_name}</td>
<td className="px-2 py-1 font-mono text-gray-700">
{typeof m.input_value === 'string' ? m.input_value : JSON.stringify(m.input_value)}
</td>
<td className="px-2 py-1 font-mono text-gray-700">
{Object.entries(m.output).map(([k, v]) => (
<span key={k} className={k === selected.col ? 'text-blue-600 font-semibold' : ''}>
{k}: {v}{' '}
</span>
))}
</td>
</tr>
))}
</tbody>
</table>
</div>
)}
</div>
)}
</div>
)
}
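
For orientation, the bulk remap that this page triggers can be modeled in memory. The sketch below is an assumption about the semantics: replace one output field's value wherever it equals the old value, and report how many mappings changed. The real update happens server-side. The function names and sample shapes here are hypothetical stand-ins, not the API client or SQL from the commit.

```javascript
// Hypothetical in-memory model of a bulk remap; the actual update runs on the server.
function remapOutputField(mappings, col, from, to) {
  let updated = 0
  const next = mappings.map(m => {
    // Only touch mappings whose output has exactly the old value in this field.
    if (m.output && m.output[col] === from) {
      updated += 1
      return { ...m, output: { ...m.output, [col]: to } }
    }
    return m
  })
  return { mappings: next, updated }
}

// Same pluralization rule as the page's status message.
function updatedMessage(updated) {
  return `Updated ${updated} mapping${updated !== 1 ? 's' : ''}.`
}
```

The sketch returns new objects instead of mutating its input, which is what makes it easy to show "affected mappings" before and after, as the page does.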


@ -1,12 +1,42 @@
import { useState, useEffect, useRef } from 'react'
import { useSearchParams } from 'react-router-dom'
import { api } from '../api'
const FIELD_TYPES = ['text', 'numeric', 'date']
function SampleTable({ rows }) {
if (!rows || rows.length === 0) return null
const cols = Object.keys(rows[0])
return (
<div className="overflow-auto border border-gray-100 rounded bg-gray-50 max-h-36">
<table className="text-xs w-full">
<thead>
<tr className="text-left text-gray-400 border-b border-gray-100 bg-gray-50 sticky top-0">
{cols.map(c => <th key={c} className="px-2 py-1 font-medium whitespace-nowrap">{c}</th>)}
</tr>
</thead>
<tbody>
{rows.map((row, i) => (
<tr key={i} className="border-t border-gray-100">
{cols.map(c => (
<td key={c} className="px-2 py-1 whitespace-nowrap text-gray-600 max-w-32 truncate font-mono">
{row[c] == null ? <span className="text-gray-300"></span> : String(row[c])}
</td>
))}
</tr>
))}
</tbody>
</table>
</div>
)
}
export default function Sources({ source, sources, setSources, setSource }) {
const [dedup, setDedup] = useState('')
const [constraintFields, setConstraintFields] = useState('')
const [globalPicklist, setGlobalPicklist] = useState(true)
const [schemaFields, setSchemaFields] = useState([])
const [stats, setStats] = useState(null)
const [sampleRows, setSampleRows] = useState([])
const [saving, setSaving] = useState(false)
const [reprocessing, setReprocessing] = useState(false)
const [generating, setGenerating] = useState(false)
@ -14,25 +44,38 @@ export default function Sources({ source, sources, setSources, setSource }) {
const [error, setError] = useState('')
const [viewName, setViewName] = useState('')
const [availableFields, setAvailableFields] = useState([])
const [fieldSort, setFieldSort] = useState({ col: 'key', dir: 'asc' })
const [creating, setCreating] = useState(false)
const [form, setForm] = useState({ name: '', dedup_fields: '', fields: [], schema: [] })
const [form, setForm] = useState({ name: '', constraint_fields: '', fields: [], schema: [], importSample: true })
const [createError, setCreateError] = useState('')
const [createLoading, setCreateLoading] = useState(false)
const [csvFileName, setCsvFileName] = useState('')
const fileRef = useRef()
const [searchParams, setSearchParams] = useSearchParams()
const sourceObj = sources.find(s => s.name === source)
useEffect(() => {
if (searchParams.get('new') === '1') {
setCreating(true)
setSearchParams({})
}
}, [searchParams])
useEffect(() => {
if (!sourceObj) return
setDedup(sourceObj.dedup_fields?.join(', ') || '')
setConstraintFields(sourceObj.constraint_fields?.join(', ') || '')
setGlobalPicklist(sourceObj.global_picklist !== false)
setSchemaFields((sourceObj.config?.fields || []).map((f, i) => ({ seq: i + 1, ...f })))
setViewName(sourceObj.config?.fields?.length ? `dfv.${sourceObj.name}` : '')
setResult('')
setError('')
setStats(null)
setAvailableFields([])
setSampleRows([])
api.getStats(sourceObj.name).then(setStats).catch(() => {})
api.getFields(sourceObj.name).then(setAvailableFields).catch(() => {})
api.getRecords(sourceObj.name, 50).then(rows => setSampleRows(rows.map(r => r.data).filter(Boolean))).catch(() => {})
}, [source, sourceObj?.name])
async function handleSave(e) {
@ -40,10 +83,14 @@ export default function Sources({ source, sources, setSources, setSource }) {
setSaving(true)
setError('')
try {
const dedup_fields = dedup.split(',').map(s => s.trim()).filter(Boolean)
const constraint_fields = constraintFields.split(',').map(s => s.trim()).filter(Boolean)
const fields = [...schemaFields.filter(f => f.name)].sort((a, b) => (a.seq ?? 0) - (b.seq ?? 0))
const config = { ...(sourceObj.config || {}), fields }
await api.updateSource(sourceObj.name, { dedup_fields, config })
await api.updateSource(sourceObj.name, { constraint_fields, config, global_picklist: globalPicklist })
if (fields.length > 0) {
const res = await api.generateView(sourceObj.name)
if (res.success) setViewName(res.view)
}
const updated = await api.getSources()
setSources(updated)
setResult('Saved.')
@ -59,10 +106,10 @@ export default function Sources({ source, sources, setSources, setSource }) {
setResult('')
setError('')
try {
const dedup_fields = dedup.split(',').map(s => s.trim()).filter(Boolean)
const constraint_fields = constraintFields.split(',').map(s => s.trim()).filter(Boolean)
const fields = [...schemaFields.filter(f => f.name)].sort((a, b) => (a.seq ?? 0) - (b.seq ?? 0))
const config = { ...(sourceObj.config || {}), fields }
await api.updateSource(sourceObj.name, { dedup_fields, config })
await api.updateSource(sourceObj.name, { constraint_fields, config, global_picklist: globalPicklist })
const res = await api.generateView(sourceObj.name)
if (res.success) {
setViewName(res.view)
@ -109,13 +156,15 @@ export default function Sources({ source, sources, setSources, setSource }) {
async function handleSuggest(e) {
const file = e.target.files[0]
if (!file) return
setCsvFileName(file.name)
try {
const suggestion = await api.suggestSource(file)
setForm(f => ({
...f,
fields: suggestion.fields,
dedup_fields: '',
schema: suggestion.fields.map(f => ({ name: f.name, type: f.type }))
constraint_fields: '',
schema: suggestion.fields.map(f => ({ name: f.name, type: f.type, seq: suggestion.fields.indexOf(f) + 1 })),
sampleRows: suggestion.sampleRows || []
}))
} catch (err) {
setCreateError(err.message)
@ -125,19 +174,25 @@ export default function Sources({ source, sources, setSources, setSource }) {
async function handleCreate(e) {
e.preventDefault()
setCreateError('')
const dedupArr = form.dedup_fields.split(',').map(s => s.trim()).filter(Boolean)
if (!form.name || dedupArr.length === 0) {
setCreateError('Name and at least one dedup field required')
const constraintArr = form.constraint_fields.split(',').map(s => s.trim()).filter(Boolean)
if (!form.name || constraintArr.length === 0) {
setCreateError('Name and at least one constraint field required')
return
}
setCreateLoading(true)
try {
const config = form.schema.length > 0 ? { fields: form.schema } : {}
await api.createSource({ name: form.name, dedup_fields: dedupArr, config })
await api.createSource({ name: form.name, constraint_fields: constraintArr, config, global_picklist: form.global_picklist !== false })
if (form.schema.length > 0) {
await api.generateView(form.name)
}
if (form.importSample && fileRef.current?.files[0]) {
await api.importCSV(form.name, fileRef.current.files[0])
}
const updated = await api.getSources()
setSources(updated)
setSource(form.name)
setForm({ name: '', dedup_fields: '', fields: [], schema: [] })
setForm({ name: '', constraint_fields: '', fields: [], schema: [], importSample: true })
setCreating(false)
} catch (err) {
setCreateError(err.message)
@ -147,7 +202,7 @@ export default function Sources({ source, sources, setSources, setSource }) {
}
return (
<div className="p-6 max-w-2xl">
<div className="p-6 max-w-5xl">
<div className="flex items-center justify-between mb-6">
<h1 className="text-xl font-semibold text-gray-800">
{sourceObj ? sourceObj.name : 'Sources'}
@ -183,18 +238,45 @@ export default function Sources({ source, sources, setSources, setSource }) {
<table className="w-full text-xs">
<thead>
<tr className="text-left text-gray-400 border-b border-gray-100">
<th className="pb-1 font-medium">Key</th>
<th className="pb-1 font-medium">Origin</th>
<th className="pb-1 font-medium">Type</th>
<th className="pb-1 font-medium text-center">Dedup</th>
<th className="pb-1 font-medium text-center">In view</th>
<th className="pb-1 font-medium text-center">Seq</th>
{[
{ col: 'key', label: 'Key' },
{ col: 'origin', label: 'Origin' },
{ col: 'type', label: 'Type' },
{ col: 'constraint', label: 'Constraint', center: true },
{ col: 'inview', label: 'In view', center: true },
{ col: 'seq', label: 'Seq', center: true },
].map(({ col, label, center }) => (
<th
key={col}
onClick={() => setFieldSort(s => ({ col, dir: s.col === col && s.dir === 'asc' ? 'desc' : 'asc' }))}
className={`pb-1 font-medium cursor-pointer select-none hover:text-gray-600 ${center ? 'text-center' : ''}`}
>
{label}
<span className="ml-1 text-gray-300">
{fieldSort.col === col ? (fieldSort.dir === 'asc' ? '▲' : '▼') : '⇅'}
</span>
</th>
))}
</tr>
</thead>
<tbody>
{availableFields.map(f => {
{[...availableFields].sort((a, b) => {
const constraintList = constraintFields.split(',').map(s => s.trim())
const aSchema = schemaFields.find(sf => sf.name === a.key)
const bSchema = schemaFields.find(sf => sf.name === b.key)
let av, bv
if (fieldSort.col === 'key') { av = a.key; bv = b.key }
else if (fieldSort.col === 'origin') { av = a.origins.join(','); bv = b.origins.join(',') }
else if (fieldSort.col === 'type') { av = aSchema?.type || ''; bv = bSchema?.type || '' }
else if (fieldSort.col === 'constraint') { av = constraintList.includes(a.key) ? 0 : 1; bv = constraintList.includes(b.key) ? 0 : 1 }
else if (fieldSort.col === 'inview') { av = aSchema ? 0 : 1; bv = bSchema ? 0 : 1 }
else if (fieldSort.col === 'seq') { av = aSchema?.seq ?? 999; bv = bSchema?.seq ?? 999 }
if (av < bv) return fieldSort.dir === 'asc' ? -1 : 1
if (av > bv) return fieldSort.dir === 'asc' ? 1 : -1
return 0
}).map(f => {
const isRaw = f.origins.includes('raw')
const dedupChecked = dedup.split(',').map(s => s.trim()).includes(f.key)
const constraintChecked = constraintFields.split(',').map(s => s.trim()).includes(f.key)
const schemaEntry = schemaFields.find(sf => sf.name === f.key)
const inView = !!schemaEntry
return (
@ -228,13 +310,13 @@ export default function Sources({ source, sources, setSources, setSource }) {
{isRaw && (
<input
type="checkbox"
checked={dedupChecked}
checked={constraintChecked}
onChange={e => {
const current = dedup.split(',').map(s => s.trim()).filter(Boolean)
const current = constraintFields.split(',').map(s => s.trim()).filter(Boolean)
const next = e.target.checked
? [...current, f.key]
: current.filter(k => k !== f.key)
setDedup(next.join(', '))
setConstraintFields(next.join(', '))
}}
/>
)}
@ -273,7 +355,11 @@ export default function Sources({ source, sources, setSources, setSource }) {
</tbody>
</table>
<div className="flex items-center gap-3 pt-1">
<div className="flex items-center gap-3 pt-1 flex-wrap">
<label className="flex items-center gap-1.5 text-xs text-gray-500 cursor-pointer">
<input type="checkbox" checked={globalPicklist} onChange={e => setGlobalPicklist(e.target.checked)} />
Global picklist
</label>
<form onSubmit={handleSave}>
<button type="submit" disabled={saving}
className="text-sm bg-blue-600 text-white px-3 py-1.5 rounded hover:bg-blue-700 disabled:opacity-50">
@ -295,17 +381,24 @@ export default function Sources({ source, sources, setSources, setSource }) {
</>
)}
</div>
<SampleTable rows={sampleRows} />
</div>
)}
{/* Save button when no fields loaded yet */}
{availableFields.length === 0 && (
<form onSubmit={handleSave}>
<button type="submit" disabled={saving}
className="text-sm bg-blue-600 text-white px-3 py-1.5 rounded hover:bg-blue-700 disabled:opacity-50">
{saving ? 'Saving…' : 'Save'}
</button>
</form>
<div className="flex items-center gap-3">
<label className="flex items-center gap-1.5 text-xs text-gray-500 cursor-pointer">
<input type="checkbox" checked={globalPicklist} onChange={e => setGlobalPicklist(e.target.checked)} />
Global picklist
</label>
<form onSubmit={handleSave}>
<button type="submit" disabled={saving}
className="text-sm bg-blue-600 text-white px-3 py-1.5 rounded hover:bg-blue-700 disabled:opacity-50">
{saving ? 'Saving…' : 'Save'}
</button>
</form>
</div>
)}
{/* Reprocess */}
@ -337,8 +430,14 @@ export default function Sources({ source, sources, setSources, setSource }) {
<h2 className="text-sm font-semibold text-gray-700 mb-3">New source</h2>
<div className="mb-4">
<label className="text-xs text-gray-500 block mb-1">Upload a CSV to auto-detect fields</label>
<input type="file" accept=".csv" ref={fileRef} onChange={handleSuggest} className="text-sm text-gray-600" />
<input type="file" accept=".csv" ref={fileRef} onChange={handleSuggest} className="hidden" />
<button
type="button"
onClick={() => fileRef.current?.click()}
className="text-sm border border-gray-300 rounded px-3 py-1.5 text-gray-600 hover:bg-gray-50 hover:border-gray-400"
>
{csvFileName || 'Choose CSV…'}
</button>
</div>
<form onSubmit={handleCreate} className="space-y-3">
@ -353,53 +452,123 @@ export default function Sources({ source, sources, setSources, setSource }) {
</div>
{form.fields.length > 0 && (
<div>
<label className="text-xs text-gray-500 block mb-1">Detected fields: check to use as dedup keys</label>
<div className="pt-2 border-t border-gray-100 space-y-2">
<table className="w-full text-xs">
<thead>
<tr className="text-left text-gray-400 border-b border-gray-100">
<th className="pb-1 font-medium">Field</th>
<th className="pb-1 font-medium">Key</th>
<th className="pb-1 font-medium">Type</th>
<th className="pb-1 font-medium text-center">Dedup</th>
<th className="pb-1 font-medium text-center">Constraint</th>
<th className="pb-1 font-medium text-center">In view</th>
<th className="pb-1 font-medium text-center">Seq</th>
</tr>
</thead>
<tbody>
{form.fields.map(f => (
<tr key={f.name} className="border-t border-gray-50">
<td className="py-1 font-mono text-gray-700">{f.name}</td>
<td className="py-1 text-gray-400">{f.type}</td>
<td className="py-1 text-center">
<input
type="checkbox"
checked={form.dedup_fields.split(',').map(s => s.trim()).includes(f.name)}
onChange={e => {
const current = form.dedup_fields.split(',').map(s => s.trim()).filter(Boolean)
const next = e.target.checked
? [...current, f.name]
: current.filter(n => n !== f.name)
setForm(ff => ({ ...ff, dedup_fields: next.join(', ') }))
}}
/>
</td>
</tr>
))}
{form.fields.map(f => {
const schemaEntry = form.schema.find(s => s.name === f.name)
const inView = !!schemaEntry
const currentType = schemaEntry?.type || f.type
return (
<tr key={f.name} className="border-t border-gray-50">
<td className="py-1 font-mono text-gray-700">{f.name}</td>
<td className="py-1">
{inView && (
<select
className="border border-gray-200 rounded px-1 py-0.5 text-xs focus:outline-none focus:border-blue-400"
value={currentType}
onChange={e => setForm(ff => ({
...ff,
schema: ff.schema.map(s => s.name === f.name ? { ...s, type: e.target.value } : s)
}))}
>
{FIELD_TYPES.map(t => <option key={t} value={t}>{t}</option>)}
</select>
)}
</td>
<td className="py-1 text-center">
<input
type="checkbox"
checked={form.constraint_fields.split(',').map(s => s.trim()).includes(f.name)}
onChange={e => {
const current = form.constraint_fields.split(',').map(s => s.trim()).filter(Boolean)
const next = e.target.checked
? [...current, f.name]
: current.filter(n => n !== f.name)
setForm(ff => ({ ...ff, constraint_fields: next.join(', ') }))
}}
/>
</td>
<td className="py-1 text-center">
<input
type="checkbox"
checked={inView}
onChange={e => {
if (e.target.checked) {
const nextSeq = form.schema.length > 0
? Math.max(...form.schema.map(s => s.seq ?? 0)) + 1
: 1
setForm(ff => ({ ...ff, schema: [...ff.schema, { name: f.name, type: f.type, seq: nextSeq }] }))
} else {
setForm(ff => ({ ...ff, schema: ff.schema.filter(s => s.name !== f.name) }))
}
}}
/>
</td>
<td className="py-1 text-center">
{inView && (
<input
type="number"
className="w-12 border border-gray-200 rounded px-1 py-0.5 text-xs text-center focus:outline-none focus:border-blue-400"
value={schemaEntry.seq ?? ''}
onChange={e => setForm(ff => ({
...ff,
schema: ff.schema.map(s => s.name === f.name ? { ...s, seq: parseInt(e.target.value) || 0 } : s)
}))}
/>
)}
</td>
</tr>
)
})}
</tbody>
</table>
<SampleTable rows={form.sampleRows || []} />
</div>
)}
{form.fields.length === 0 && (
<div>
<label className="text-xs text-gray-500 block mb-1">Dedup fields (comma-separated)</label>
<label className="text-xs text-gray-500 block mb-1">Constraint fields (comma-separated)</label>
<input
className="w-full border border-gray-200 rounded px-3 py-1.5 text-sm focus:outline-none focus:border-blue-400"
value={form.dedup_fields}
onChange={e => setForm(f => ({ ...f, dedup_fields: e.target.value }))}
value={form.constraint_fields}
onChange={e => setForm(f => ({ ...f, constraint_fields: e.target.value }))}
placeholder="e.g. date, amount, description"
/>
</div>
)}
<div className="flex gap-4">
<label className="flex items-center gap-1.5 text-xs text-gray-500 cursor-pointer">
<input
type="checkbox"
checked={form.global_picklist !== false}
onChange={e => setForm(f => ({ ...f, global_picklist: e.target.checked }))}
/>
Global picklist
</label>
{form.fields.length > 0 && (
<label className="flex items-center gap-1.5 text-xs text-gray-500 cursor-pointer">
<input
type="checkbox"
checked={form.importSample !== false}
onChange={e => setForm(f => ({ ...f, importSample: e.target.checked }))}
/>
Import sample data
</label>
)}
</div>
{createError && <p className="text-xs text-red-500">{createError}</p>}
<div className="flex gap-2">
@ -408,7 +577,7 @@ export default function Sources({ source, sources, setSources, setSource }) {
{createLoading ? 'Creating…' : 'Create'}
</button>
<button type="button"
onClick={() => { setCreating(false); setCreateError(''); setForm({ name: '', dedup_fields: '', fields: [], schema: [] }) }}
onClick={() => { setCreating(false); setCreateError(''); setForm({ name: '', constraint_fields: '', fields: [], schema: [], importSample: true }) }}
className="text-sm text-gray-500 px-3 py-1.5 rounded hover:bg-gray-100">
Cancel
</button>