Compare commits

No commits in common. "24675feb496c50fd552b1b0419e5b414322018f4" and "b2a5e3c92a0f1bd25fe5005bed5d517b54acd02a" have entirely different histories.

21 changed files with 362 additions and 1967 deletions

@@ -19,7 +19,7 @@ Dataflow is a simple data transformation tool for importing, cleaning, and stand
 ### Database Schema (`database/schema.sql`)
 **5 simple tables:**
-- `sources` - Source definitions with `constraint_fields` array
+- `sources` - Source definitions with `dedup_fields` array
 - `records` - Imported data with `data` (raw) and `transformed` (enriched) JSONB columns
 - `rules` - Regex extraction rules with `field`, `pattern`, `output_field`
 - `mappings` - Input/output value mappings
@@ -123,11 +123,9 @@ records.data → apply_transformations() →
``` ```
 ### Deduplication
-- `constraint_key` is a JSONB object of the constraint field values (readable, no hashing)
-- Dedup is enforced at import time via CTE — no unique DB constraint
-- Intra-file duplicate rows are allowed (bank may send identical rows); they all insert
-- On re-import, all rows whose constraint_key already exists in the DB are skipped
-- Deleting an import log entry cascades to all records from that batch (import_id FK)
+- Hash is MD5 of concatenated values from `dedup_fields`
+- Unique constraint on `(source_name, dedup_key)` prevents duplicates
+- Import function catches unique violations and counts them
 ### Error Handling
 - API routes use `try/catch` and pass errors to `next(err)`
@@ -186,7 +184,7 @@ The simplification makes it easy to understand, modify, and maintain.
 - Check for SQL errors in logs
 **All records marked as duplicates:**
-- Verify `constraint_fields` match actual field names in data
+- Verify `dedup_fields` match actual field names in data
 - Check if data was already imported
 - Use different source name for testing

44
SPEC.md
@@ -61,8 +61,6 @@ ui/
 Rules.jsx — rule CRUD with live pattern preview
 Mappings.jsx — mapping table with TSV import/export
 Records.jsx — paginated, sortable view of transformed records
-Pivot.jsx — interactive pivot table with cell inspector
-Log.jsx — global import log across all sources
 public/ — compiled UI (output of npm run build in ui/)
``` ```
@@ -73,10 +71,10 @@ public/ — compiled UI (output of npm run build in ui/)
 Five tables in the `dataflow` schema:
 ### `sources`
-Defines a data source. The `constraint_fields` array specifies which fields make a record unique. `config` (JSONB) holds the output schema (`fields` array) used to generate the typed view.
+Defines a data source. The `dedup_fields` array specifies which fields make a record unique. `config` (JSONB) holds the output schema (`fields` array) used to generate the typed view.
 ### `records`
-Stores every imported record. `data` holds the raw import. `transformed` holds the enriched record after rules and mappings are applied. `constraint_key` is a JSONB object of the constraint field values used to detect duplicates at import time. `import_id` references the `import_log` row; deleting a log entry cascades to its records.
+Stores every imported record. `data` holds the raw import. `transformed` holds the enriched record after rules and mappings are applied. `dedup_key` is an MD5 hash of the dedup fields — a unique constraint on `(source_name, dedup_key)` prevents duplicate imports.
 ### `rules`
 Regex transformation rules. Each rule reads from `field`, applies `pattern` with optional `flags`, and writes to `output_field`. `function_type` is either `extract` (regexp_matches) or `replace` (regexp_replace). `sequence` controls the order rules are applied. `retain` keeps the raw extracted value in `output_field` even when a mapping overrides it.
@@ -85,7 +83,7 @@ Regex transformation rules. Each rule reads from `field`, applies `pattern` with
 Maps an extracted value to a standardized output object. `input_value` is JSONB (matches the extracted value exactly, including arrays from multi-capture-group patterns). `output` is a JSONB object that can contain multiple fields (e.g., `{"vendor": "Walmart", "category": "Groceries"}`).
 ### `import_log`
-Audit trail. One row per import call, recording how many records were inserted versus skipped as duplicates. `info` (JSONB) stores the full `inserted_keys` and `excluded_keys` arrays. Deleting a log row cascades to its records via the `import_id` FK.
+Audit trail. One row per import call, recording how many records were inserted versus skipped as duplicates.
 ---
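
The `rules` and `mappings` descriptions in the hunks above can be illustrated with a small JavaScript sketch of one `extract` rule followed by a mapping lookup. The functions `applyRule` and `applyMapping`, the sample data, and the way the results are merged are all hypothetical; the project does this work in SQL (`regexp_matches` / `regexp_replace`), and PostgreSQL regex flags are not identical to JavaScript's. The choice to return a scalar for a single capture group is also an assumption.

```javascript
// Hypothetical sketch: apply one rule to a raw record.
function applyRule(rule, data) {
  const input = data[rule.field];
  if (typeof input !== 'string') return undefined;
  if (rule.function_type === 'replace') {
    return input.replace(new RegExp(rule.pattern, rule.flags), rule.replace_value);
  }
  // 'extract': drop the global flag so match() returns capture groups.
  const m = input.match(new RegExp(rule.pattern, rule.flags.replace('g', '')));
  if (!m) return undefined;
  const groups = m.length > 1 ? m.slice(1) : [m[0]];
  // ASSUMPTION: one group yields a scalar, several groups yield an array.
  return groups.length === 1 ? groups[0] : groups;
}

// Hypothetical sketch: exact match on the JSON form of the extracted value.
function applyMapping(mappings, extracted) {
  const wanted = JSON.stringify(extracted);
  const hit = mappings.find(mp => JSON.stringify(mp.input_value) === wanted);
  return hit ? hit.output : {};
}

const rule = {
  field: 'description', pattern: '^(\\w+)', flags: 'i',
  function_type: 'extract', output_field: 'vendor_raw',
};
const mappings = [
  { input_value: 'WALMART', output: { vendor: 'Walmart', category: 'Groceries' } },
];
const data = { description: 'WALMART #1234 DALLAS TX' };

const extracted = applyRule(rule, data);
const transformed = {
  ...data,
  [rule.output_field]: extracted,
  ...applyMapping(mappings, extracted),
};
console.log(transformed);
```

Because the lookup compares whole JSON values, a two-group pattern would need a mapping whose `input_value` is the matching two-element array.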
@@ -94,11 +92,8 @@ Audit trail. One row per import call, recording how many records were inserted v
 ### Import
 ```
 CSV file → parse in Node.js → import_records(source, data)
-→ build JSONB constraint_key per record
-→ compare against existing records (CTE — no unique constraint)
-→ INSERT new records, skip duplicates
-→ log to import_log (with inserted_keys / excluded_keys)
-→ apply_transformations() runs automatically on new records
+→ generate_dedup_key() per record → INSERT with unique constraint
+→ count inserted vs duplicates → log to import_log
 ```
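
The removed (left-hand) import flow, where duplicates are detected by comparing against existing rows rather than by a unique constraint, can be modelled in memory roughly as follows. `partitionImport` and the `existingKeys` set are hypothetical stand-ins for the `pending`, `existing`, and `new_records` CTEs; this is a sketch of the described behaviour, not the project's SQL.

```javascript
// Hypothetical in-memory model of the CTE-based import.
function partitionImport(incoming, fields, existingKeys) {
  // Serialise the constraint key so it can be compared by value.
  const keyOf = rec =>
    JSON.stringify(fields.map(f => [f, rec[f] == null ? null : String(rec[f])]));

  const inserted = [];
  const excludedKeys = new Set();
  for (const rec of incoming) {
    const k = keyOf(rec);
    if (existingKeys.has(k)) {
      excludedKeys.add(k); // already in the database: skipped
    } else {
      inserted.push(rec);  // new key: inserted, even if repeated within this file
    }
  }
  // Keys become "existing" only after the whole batch has been partitioned.
  for (const rec of inserted) existingKeys.add(keyOf(rec));
  return {
    inserted,
    duplicates: incoming.length - inserted.length,
    excludedKeys: [...excludedKeys],
  };
}

const db = new Set();
const file = [
  { date: '2024-01-15', amount: '5.00' },
  { date: '2024-01-15', amount: '5.00' }, // identical row within the same file
  { date: '2024-01-16', amount: '9.99' },
];
const first = partitionImport(file, ['date', 'amount'], db);
const second = partitionImport(file, ['date', 'amount'], db);
console.log(first.inserted.length, first.duplicates);   // 3 0
console.log(second.inserted.length, second.duplicates); // 0 3
```

With a unique constraint on the key, as in the right-hand flow, the second identical row in the first import could not be stored; that is the behavioural difference between the two sides of this hunk.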
 ### Transform
@@ -150,7 +145,7 @@ All routes are under `/api`. Every route requires HTTP Basic Auth. The `GET /hea
 | GET | /api/sources | List all sources |
 | POST | /api/sources | Create source |
 | GET | /api/sources/:name | Get source |
-| PUT | /api/sources/:name | Update source (constraint_fields, config) |
+| PUT | /api/sources/:name | Update source (dedup_fields, config) |
 | DELETE | /api/sources/:name | Delete source and all data |
 | POST | /api/sources/suggest | Suggest source config from CSV upload |
 | POST | /api/sources/:name/import | Import CSV records |
@@ -208,36 +203,15 @@ Built with React + Vite + Tailwind CSS. Compiled output goes to `public/`. The s
 **Pages:**
-- **Sources** — View and edit source configuration. Shows all known field names and their origins (raw data, schema, rules, mappings). Checkboxes control which fields are constraint fields and which appear in the output view. Supports CSV upload to auto-detect fields.
-- **Import** — Upload a CSV to import records into the selected source. Transformations run automatically on new records. Shows import log with inserted/duplicate counts, expandable key detail, checkbox selection, and delete with confirmation.
+- **Sources** — View and edit source configuration. Shows all known field names and their origins (raw data, schema, rules, mappings). Checkboxes control which fields are dedup keys and which appear in the output view. Supports CSV upload to auto-detect fields.
+- **Import** — Upload a CSV to import records into the selected source. Shows import log with inserted/duplicate counts per import.
 - **Rules** — Create and manage regex rules. Live preview fires automatically (debounced 500ms) as pattern/field/flags are edited, showing match results against real records. Rules can be enabled/disabled by toggle.
 - **Mappings** — Tabular mapping editor. Shows all extracted values from transformed records with record counts and sample raw data. Rows are yellow (unmapped), white (mapped), or blue (edited but unsaved). Supports TSV export and import. Columns can be added dynamically.
-- **Records** — Paginated table showing the `dfv.{source}` view. Server-side sorting (column validated against `information_schema.columns`, interpolated with `quote_ident`). Dates are formatted `YYYY-MM-DD` for correct lexicographic sort. Regex filters can be added per column. If the view cast fails (e.g. a field typed as `date` contains text), the error is shown inline rather than a blank page.
-- **Pivot** — Interactive pivot/crosstab powered by [Perspective](https://perspective.finos.org/) (`@perspective-dev` v4.4.0, loaded from CDN at runtime). Loads all rows from the source view into an in-browser Perspective worker and renders a `<perspective-viewer>` web component. Supports grouping, splitting, filtering, sorting, and charting interactively.
-**Toolbar (above the viewer):**
-- Named layouts — saved per source in the `pivot_layouts` DB table. Each chip recalls the full viewer state including group_by, split_by, filters, expressions, selection mode, and expand depth. A blue **Save** button overwrites the active layout in place; **+ Save as…** saves to a new name. The × on each chip deletes it.
-- **depth: 0 1 2 3** — collapses or expands all grouped rows to the specified hierarchy level. Implemented via `view.set_depth(d)` + `plugin.draw(view)` (the only working mechanism found in v4.4.0 — `plugin_config.expand_depth` and `viewer.flush()` alone have no effect).
-- The Perspective built-in **selection mode button** (Read-Only / Select Row / Select Column / Select Region) defaults to **Select Region** on fresh load, set directly via `plugin.restore({ edit_mode: 'SELECT_REGION' })` after the viewer loads.
-**Cell inspector (right panel):**
-- Opens when a cell is clicked and a `group_by` hierarchy is active. If there is no `group_by`, the click is ignored — without coordinate filters the query would return the full dataset.
-- Row filtering uses a temporary Perspective view (`table.view({ filter: eventFilters, expressions: config.expressions })`) so that computed/expression columns in `split_by` are evaluated correctly. Falls back to JS-side filtering if the view query fails.
-- Shows cell coordinates (group_by split_by values), the clicked metric with value, any user-set filters, and a table of matching raw rows.
-- Number formatting rounds to 2 decimal places by default; a -/+ control in the inspector header adjusts precision (0-8).
-**Layout persistence:**
-- `localStorage` key `psp_layout_{source}` saves the last viewer state on each named layout save.
-- Named layouts store `{ ...viewer.save(), plugin_config: plugin.save(), expand_depth }` as JSONB in `pivot_layouts`. On recall, viewer config, plugin config (edit mode), and expand depth are all restored independently.
-See `docs/perspective-pivot.md` for the full technical reference on controlling Perspective programmatically.
-- **Log** — Global import log across all sources. Same expandable key detail and delete capability as the Import page, plus a source name column.
+- **Records** — Paginated table showing the `dfv.{source}` view. Server-side sorting (column validated against `information_schema.columns`, interpolated with `quote_ident`). Dates are formatted `YYYY-MM-DD` for correct lexicographic sort.
 ---

@@ -39,59 +39,6 @@ module.exports = (pool) => {
     }
   });
-  // Get global output values (for autocomplete across all global_picklist=true sources)
-  router.get('/global-values', async (req, res, next) => {
-    try {
-      const result = await pool.query(`SELECT * FROM get_global_output_values()`);
-      const map = {};
-      for (const { col, val } of result.rows) {
-        if (!map[col]) map[col] = [];
-        map[col].push(val);
-      }
-      res.json(map);
-    } catch (err) {
-      next(err);
-    }
-  });
-  // Search output field values across all mappings (for global remap)
-  router.get('/outputs', async (req, res, next) => {
-    try {
-      const { search = '' } = req.query;
-      const result = await pool.query(`SELECT * FROM search_mapping_outputs(${lit(search)})`);
-      res.json(result.rows);
-    } catch (err) {
-      next(err);
-    }
-  });
-  // Get individual mappings for a specific output field value
-  router.get('/outputs/:col/:val', async (req, res, next) => {
-    try {
-      const result = await pool.query(
-        `SELECT * FROM get_mappings_by_output_field(${lit(req.params.col)}, ${lit(req.params.val)})`
-      );
-      res.json(result.rows);
-    } catch (err) {
-      next(err);
-    }
-  });
-  // Remap a field value globally across all mappings
-  router.post('/remap-field', async (req, res, next) => {
-    try {
-      const { col, from_val, to_val } = req.body;
-      if (!col || from_val == null || to_val == null)
-        return res.status(400).json({ error: 'col, from_val, and to_val are required' });
-      const result = await pool.query(
-        `SELECT remap_output_field(${lit(col)}, ${lit(from_val)}, ${lit(to_val)}) AS updated`
-      );
-      res.json({ updated: result.rows[0].updated });
-    } catch (err) {
-      next(err);
-    }
-  });
   // Get unmapped values
   router.get('/source/:source_name/unmapped', async (req, res, next) => {
     try {

@@ -73,9 +73,7 @@ module.exports = (pool) => {
       const result = await pool.query(
         `SELECT * FROM create_rule(${lit(source_name)}, ${lit(name)}, ${lit(field)}, ${lit(pattern)}, ${lit(output_field)}, ${lit(function_type || 'extract')}, ${lit(flags || '')}, ${lit(replace_value || '')}, ${lit(enabled !== false)}, ${lit(retain === true)}, ${lit(sequence || 0)})`
       );
-      const rule = result.rows[0];
-      await pool.query(`SELECT reprocess_records(${lit(source_name)})`);
-      res.status(201).json(rule);
+      res.status(201).json(result.rows[0]);
     } catch (err) {
       if (err.code === '23505') return res.status(409).json({ error: 'Rule already exists for this source' });
       if (err.code === '23503') return res.status(404).json({ error: 'Source not found' });
@@ -95,9 +93,7 @@ module.exports = (pool) => {
         `SELECT * FROM update_rule(${lit(parseInt(req.params.id))}, ${n(name)}, ${n(field)}, ${n(pattern)}, ${n(output_field)}, ${n(function_type)}, ${n(flags)}, ${n(replace_value)}, ${n(enabled)}, ${n(retain)}, ${n(sequence)})`
       );
       if (result.rows.length === 0) return res.status(404).json({ error: 'Rule not found' });
-      const rule = result.rows[0];
-      await pool.query(`SELECT reprocess_records(${lit(rule.source_name)})`);
-      res.json(rule);
+      res.json(result.rows[0]);
     } catch (err) {
       next(err);
     }

@@ -52,22 +52,19 @@ module.exports = (pool) => {
       const records = parse(req.file.buffer, { columns: true, skip_empty_lines: true, trim: true });
       if (records.length === 0) return res.status(400).json({ error: 'CSV file is empty' });
-      const ISO_DATE_RE = /^\d{4}-\d{2}-\d{2}(T[\d:.Z+-]+)?$/;
       const sample = records[0];
-      const sampleRows = records.slice(0, 50);
       const fields = Object.keys(sample).map(key => {
-        const vals = sampleRows.map(r => r[key]).filter(v => v !== '' && v != null);
+        const val = sample[key];
         let type = 'text';
-        if (vals.length > 0 && vals.every(v => !isNaN(parseFloat(v)) && isFinite(v) && String(v).charAt(0) !== '0')) {
+        if (!isNaN(parseFloat(val)) && isFinite(val) && val.charAt(0) !== '0') {
           type = 'numeric';
-        } else if (vals.length > 0 && vals.every(v => ISO_DATE_RE.test(String(v)))) {
+        } else if (Date.parse(val) > Date.parse('1950-01-01') && Date.parse(val) < Date.parse('2050-01-01')) {
           type = 'date';
         }
         return { name: key, type };
       });
-      res.json({ name: '', constraint_fields: [], fields, sampleRows });
+      res.json({ name: '', dedup_fields: [], fields });
     } catch (err) {
       next(err);
     }
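
The hunk above replaces a multi-row type check with a single-row one. As an illustration, the removed (left-hand) logic can be pulled out into a standalone function and exercised on sample columns. The name `inferType` and the sample values are hypothetical; the predicate itself is copied from the removed lines.

```javascript
const ISO_DATE_RE = /^\d{4}-\d{2}-\d{2}(T[\d:.Z+-]+)?$/;

// Hypothetical standalone version of the per-column check from the removed code.
// `values` is the column's values across the sampled rows.
function inferType(values) {
  const vals = values.filter(v => v !== '' && v != null); // ignore blanks
  if (vals.length === 0) return 'text';
  // Numeric only if every value parses as a finite number and has no leading '0'
  // (leading zeros usually indicate codes such as ZIP or account numbers).
  if (vals.every(v => !isNaN(parseFloat(v)) && isFinite(v) && String(v).charAt(0) !== '0')) {
    return 'numeric';
  }
  // Date only if every value is an ISO-style date or timestamp.
  if (vals.every(v => ISO_DATE_RE.test(String(v)))) return 'date';
  return 'text';
}

console.log(inferType(['12.50', '-3', '']));                      // numeric
console.log(inferType(['2024-01-15', '2024-02-01T10:30:00Z']));   // date
console.log(inferType(['00123', '00456']));                       // text
console.log(inferType(['12', 'abc']));                            // text
```

Two consequences are worth noting. The leading-zero guard also classifies values such as `0.5` as `text`. And the right-hand version relies on `Date.parse`, whose handling of non-ISO strings is implementation-dependent, so it may label free-text columns as dates where the regular expression would not.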
@@ -76,12 +73,12 @@ module.exports = (pool) => {
   // Create source
   router.post('/', async (req, res, next) => {
     try {
-      const { name, constraint_fields, config, global_picklist } = req.body;
-      if (!name || !constraint_fields || !Array.isArray(constraint_fields)) {
-        return res.status(400).json({ error: 'Missing required fields: name, constraint_fields (array)' });
+      const { name, dedup_fields, config } = req.body;
+      if (!name || !dedup_fields || !Array.isArray(dedup_fields)) {
+        return res.status(400).json({ error: 'Missing required fields: name, dedup_fields (array)' });
       }
       const result = await pool.query(
-        `SELECT * FROM create_source(${lit(name)}, ${arr(constraint_fields)}, ${lit(config || {})}, ${lit(global_picklist !== false)})`
+        `SELECT * FROM create_source(${lit(name)}, ${arr(dedup_fields)}, ${lit(config || {})})`
       );
       res.status(201).json(result.rows[0]);
     } catch (err) {
@@ -93,10 +90,9 @@ module.exports = (pool) => {
   // Update source
   router.put('/:name', async (req, res, next) => {
     try {
-      const { constraint_fields, config, global_picklist } = req.body;
-      const gpVal = global_picklist !== undefined ? lit(global_picklist) : 'NULL';
+      const { dedup_fields, config } = req.body;
       const result = await pool.query(
-        `SELECT * FROM update_source(${lit(req.params.name)}, ${constraint_fields ? arr(constraint_fields) : 'NULL'}, ${config ? lit(config) : 'NULL'}, ${gpVal})`
+        `SELECT * FROM update_source(${lit(req.params.name)}, ${dedup_fields ? arr(dedup_fields) : 'NULL'}, ${config ? lit(config) : 'NULL'})`
       );
       if (result.rows.length === 0) return res.status(404).json({ error: 'Source not found' });
       res.json(result.rows[0]);
@@ -126,8 +122,6 @@ module.exports = (pool) => {
       );
       const importData = importResult.rows[0].result;
-      if (!importData.success) return res.json(importData);
       const transformResult = await pool.query(
         `SELECT apply_transformations(${lit(req.params.name)}) as result`
       );
@@ -216,13 +210,9 @@ module.exports = (pool) => {
   // Get view data (paginated, sortable)
   router.get('/:name/view-data', async (req, res, next) => {
     try {
-      const { limit = 100, offset = 0, sort_col, sort_dir, filters } = req.query;
-      let parsedFilters = null;
-      if (filters) {
-        try { parsedFilters = JSON.parse(filters); } catch { /* ignore bad JSON */ }
-      }
+      const { limit = 100, offset = 0, sort_col, sort_dir } = req.query;
       const result = await pool.query(
-        `SELECT get_view_data(${lit(req.params.name)}, ${lit(parseInt(limit))}, ${lit(parseInt(offset))}, ${lit(sort_col || null)}, ${lit(sort_dir || 'asc')}, ${parsedFilters ? lit(parsedFilters) : 'NULL'}) as result`
+        `SELECT get_view_data(${lit(req.params.name)}, ${lit(parseInt(limit))}, ${lit(parseInt(offset))}, ${lit(sort_col || null)}, ${lit(sort_dir || 'asc')}) as result`
       );
       res.json(result.rows[0].result);
     } catch (err) {
@@ -230,32 +220,5 @@ module.exports = (pool) => {
     }
   });
-  // Pivot layouts
-  router.get('/:name/layouts', async (req, res, next) => {
-    try {
-      const result = await pool.query(`SELECT * FROM list_pivot_layouts(${lit(req.params.name)})`);
-      res.json(result.rows);
-    } catch (err) { next(err); }
-  });
-  router.post('/:name/layouts', async (req, res, next) => {
-    try {
-      const { layout_name, config } = req.body;
-      if (!layout_name || !config) return res.status(400).json({ error: 'layout_name and config required' });
-      const result = await pool.query(
-        `SELECT * FROM save_pivot_layout(${lit(req.params.name)}, ${lit(layout_name)}, ${lit(config)})`
-      );
-      res.json(result.rows[0]);
-    } catch (err) { next(err); }
-  });
-  router.delete('/:name/layouts/:id', async (req, res, next) => {
-    try {
-      const result = await pool.query(`SELECT * FROM delete_pivot_layout(${lit(parseInt(req.params.id))})`);
-      if (result.rows.length === 0) return res.status(404).json({ error: 'Layout not found' });
-      res.json({ success: true });
-    } catch (err) { next(err); }
-  });
   return router;
 };

@@ -14,16 +14,17 @@ CREATE OR REPLACE FUNCTION import_records(
     p_data JSONB -- Array of records
 ) RETURNS JSON AS $$
 DECLARE
-    v_constraint_fields TEXT[];
+    v_dedup_fields TEXT[];
     v_inserted INTEGER;
     v_duplicates INTEGER;
     v_log_id INTEGER;
 BEGIN
-    SELECT constraint_fields INTO v_constraint_fields
+    -- Get dedup fields for this source
+    SELECT dedup_fields INTO v_dedup_fields
     FROM dataflow.sources
     WHERE name = p_source_name;
-    IF v_constraint_fields IS NULL THEN
+    IF v_dedup_fields IS NULL THEN
         RETURN json_build_object(
             'success', false,
             'error', 'Source not found: ' || p_source_name
@@ -31,49 +32,52 @@ BEGIN
     END IF;
     WITH
-    -- All incoming records with their constraint keys
+    -- All incoming records with their dedup keys and readable field values
     pending AS (
         SELECT
             rec.value AS data,
             rec.ordinality AS seq,
+            dataflow.generate_dedup_key(rec.value, v_dedup_fields) AS dedup_key,
             (SELECT jsonb_object_agg(f, rec.value->>f)
-             FROM unnest(v_constraint_fields) AS f) AS constraint_key
+             FROM unnest(v_dedup_fields) AS f) AS dedup_values
         FROM jsonb_array_elements(p_data) WITH ORDINALITY AS rec
     ),
-    -- Keys already in the database (excluded)
+    -- Keys already in the database (excluded) with their readable values
     existing AS (
-        SELECT DISTINCT r.constraint_key
+        SELECT DISTINCT ON (r.dedup_key) r.dedup_key,
+            (SELECT jsonb_object_agg(f, r.data->>f)
+             FROM unnest(v_dedup_fields) AS f) AS dedup_values
         FROM dataflow.records r
-        INNER JOIN pending p ON p.constraint_key = r.constraint_key
+        INNER JOIN pending p ON p.dedup_key = r.dedup_key
         WHERE r.source_name = p_source_name
     ),
-    -- Rows whose constraint key is not yet in the database
-    new_records AS (
-        SELECT p.data, p.constraint_key, p.seq
-        FROM pending p
-        WHERE NOT EXISTS (SELECT 1 FROM existing e WHERE e.constraint_key = p.constraint_key)
+    -- Keys that are new
+    new_keys AS (
+        SELECT p.dedup_key, p.dedup_values FROM pending p
+        WHERE NOT EXISTS (SELECT 1 FROM existing e WHERE e.dedup_key = p.dedup_key)
     ),
-    -- Write the log entry
+    -- Write the log entry with readable field values instead of hashes
     log_entry AS (
         INSERT INTO dataflow.import_log (source_name, records_imported, records_duplicate, info)
         VALUES (
             p_source_name,
-            (SELECT count(*) FROM new_records),
-            (SELECT count(*) FROM pending) - (SELECT count(*) FROM new_records),
+            (SELECT count(*) FROM new_keys),
+            (SELECT count(*) FROM existing),
             jsonb_build_object(
                 'total', jsonb_array_length(p_data),
-                'inserted_keys', (SELECT jsonb_agg(constraint_key ORDER BY constraint_key) FROM new_records),
-                'excluded_keys', (SELECT jsonb_agg(constraint_key) FROM existing)
+                'inserted_keys', (SELECT jsonb_agg(dedup_values) FROM new_keys),
+                'excluded_keys', (SELECT jsonb_agg(dedup_values) FROM existing)
             )
         )
         RETURNING id, records_imported, records_duplicate
     ),
-    -- Insert new records
+    -- Insert only new records
     inserted AS (
-        INSERT INTO dataflow.records (source_name, data, constraint_key, import_id)
-        SELECT p_source_name, nr.data, nr.constraint_key, (SELECT id FROM log_entry)
-        FROM new_records nr
-        ORDER BY nr.seq
+        INSERT INTO dataflow.records (source_name, data, dedup_key, import_id)
+        SELECT p_source_name, p.data, p.dedup_key, (SELECT id FROM log_entry)
+        FROM pending p
+        INNER JOIN new_keys nk ON nk.dedup_key = p.dedup_key
+        ORDER BY p.seq
         RETURNING id
     )
     SELECT le.id, le.records_imported, le.records_duplicate

@@ -19,14 +19,14 @@ CREATE EXTENSION IF NOT EXISTS dblink;
 \echo ''
 \echo '=== 1. Sources ==='
-INSERT INTO dataflow.sources (name, constraint_fields, config)
+INSERT INTO dataflow.sources (name, dedup_fields, config)
 SELECT
     srce AS name,
-    -- Strip {} wrappers from constraint paths → constraint field names
+    -- Strip {} wrappers from constraint paths → dedup field names
     ARRAY(
         SELECT regexp_replace(c, '^\{|\}$', '', 'g')
         FROM jsonb_array_elements_text(defn->'constraint') AS c
-    ) AS constraint_fields,
+    ) AS dedup_fields,
     -- Build config.fields from the first schema (index 0 = "mapped" for dcard, "default" for others)
     jsonb_build_object('fields',
         (SELECT jsonb_agg(
@@ -44,7 +44,7 @@ FROM dblink(:'tps_conn',
 ) AS t(srce TEXT, defn JSONB)
 ON CONFLICT (name) DO NOTHING;
-SELECT name, constraint_fields, jsonb_array_length(config->'fields') AS field_count
+SELECT name, dedup_fields, jsonb_array_length(config->'fields') AS field_count
 FROM dataflow.sources ORDER BY name;
 \echo ''
@@ -95,11 +95,11 @@ FROM dataflow.mappings GROUP BY source_name, rule_name ORDER BY source_name, rul
 \echo '=== 4. Records ==='
 \echo ' (13 000+ rows — may take a moment)'
-INSERT INTO dataflow.records (source_name, data, constraint_key, transformed, imported_at, transformed_at)
+INSERT INTO dataflow.records (source_name, data, dedup_key, transformed, imported_at, transformed_at)
 SELECT
     t.srce AS source_name,
     t.rec AS data,
-    (SELECT jsonb_object_agg(f, t.rec->>f) FROM unnest(s.constraint_fields) AS f) AS constraint_key,
+    dataflow.generate_dedup_key(t.rec, s.dedup_fields) AS dedup_key,
     t.allj AS transformed,
     CURRENT_TIMESTAMP AS imported_at,
     CASE WHEN t.allj IS NOT NULL THEN CURRENT_TIMESTAMP END AS transformed_at
@@ -107,7 +107,7 @@ FROM dblink(:'tps_conn',
     'SELECT srce, rec, allj FROM tps.trans'
 ) AS t(srce TEXT, rec JSONB, allj JSONB)
 JOIN dataflow.sources s ON s.name = t.srce
-ON CONFLICT (source_name, constraint_key) DO NOTHING;
+ON CONFLICT (source_name, dedup_key) DO NOTHING;
 SELECT source_name, COUNT(*) AS records, COUNT(transformed) AS transformed
 FROM dataflow.records GROUP BY source_name ORDER BY source_name;

@@ -206,56 +206,3 @@ BEGIN
     ORDER BY count(*) DESC;
 END;
 $$ LANGUAGE plpgsql;
--- ── Global picklist ───────────────────────────────────────────────────────────
-CREATE OR REPLACE FUNCTION get_global_output_values()
-RETURNS TABLE (col TEXT, val TEXT) AS $$
-    SELECT DISTINCT e.key AS col, e.value AS val
-    FROM dataflow.mappings m
-    JOIN dataflow.sources s ON s.name = m.source_name
-    CROSS JOIN LATERAL jsonb_each_text(m.output) AS e(key, value)
-    WHERE s.global_picklist = true
-      AND e.value IS NOT NULL
-      AND e.value <> ''
-    ORDER BY e.key, e.value;
-$$ LANGUAGE sql STABLE;
--- ── Remap output field values ─────────────────────────────────────────────────
--- Search for distinct (field, value) pairs across all mapping outputs
-CREATE OR REPLACE FUNCTION search_mapping_outputs(p_search TEXT)
-RETURNS TABLE (col TEXT, val TEXT, mapping_count BIGINT) AS $$
-    SELECT e.key AS col, e.value AS val, COUNT(*) AS mapping_count
-    FROM dataflow.mappings m
-    CROSS JOIN LATERAL jsonb_each_text(m.output) AS e(key, value)
-    WHERE e.value ILIKE '%' || p_search || '%'
-      AND e.value IS NOT NULL
-      AND e.value <> ''
-    GROUP BY e.key, e.value
-    ORDER BY e.key, e.value;
-$$ LANGUAGE sql STABLE;
--- Get individual mappings matching a specific output field value
-CREATE OR REPLACE FUNCTION get_mappings_by_output_field(p_col TEXT, p_val TEXT)
-RETURNS TABLE (id INT, source_name TEXT, rule_name TEXT, input_value JSONB, output JSONB) AS $$
-    SELECT m.id, m.source_name, m.rule_name, m.input_value, m.output
-    FROM dataflow.mappings m
-    WHERE m.output->>(p_col) = p_val
-    ORDER BY m.source_name, m.rule_name, m.input_value::text;
-$$ LANGUAGE sql STABLE;
--- Replace a specific field value across all matching mappings
-CREATE OR REPLACE FUNCTION remap_output_field(p_col TEXT, p_from_val TEXT, p_to_val TEXT)
-RETURNS INTEGER AS $$
-DECLARE
-    updated_count INTEGER;
-BEGIN
-    UPDATE dataflow.mappings
-    SET output = jsonb_set(output, ARRAY[p_col], to_jsonb(p_to_val))
-    WHERE output->>(p_col) = p_from_val;
-    GET DIAGNOSTICS updated_count = ROW_COUNT;
-    RETURN updated_count;
-END;
-$$ LANGUAGE plpgsql;
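
For readers less familiar with `jsonb_set`, the effect of the removed `remap_output_field()` function can be approximated in memory as below. `remapOutputField` and the sample rows are hypothetical, and the sketch assumes output values are plain strings, so that `output->>col` is simply the value itself.

```javascript
// Hypothetical in-memory model of remap_output_field(p_col, p_from_val, p_to_val).
// `mappings` stands in for rows of dataflow.mappings.
function remapOutputField(mappings, col, fromVal, toVal) {
  let updated = 0;
  for (const m of mappings) {
    // ASSUMPTION: output values are strings, so a strict comparison is enough.
    if (m.output && m.output[col] === fromVal) {
      // Replace only the one key, leaving the rest of the object untouched.
      m.output = { ...m.output, [col]: toVal };
      updated += 1;
    }
  }
  return updated; // plays the role of ROW_COUNT
}

const mappings = [
  { id: 1, output: { vendor: 'Wal-Mart', category: 'Groceries' } },
  { id: 2, output: { vendor: 'Target', category: 'Groceries' } },
  { id: 3, output: { vendor: 'Wal-Mart', category: 'Pharmacy' } },
];
const updated = remapOutputField(mappings, 'vendor', 'Wal-Mart', 'Walmart');
console.log(updated, mappings.map(m => m.output.vendor));
```

Rows whose `output` lacks the column, or holds a different value, are left alone, which matches the `WHERE` clause of the removed SQL.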

@@ -85,7 +85,7 @@ CREATE OR REPLACE FUNCTION preview_rule(
     p_replace_value TEXT DEFAULT '',
     p_limit INT DEFAULT 20
 )
-RETURNS TABLE (id INT, raw_value TEXT, extracted_value JSONB) AS $$
+RETURNS TABLE (id BIGINT, raw_value TEXT, extracted_value JSONB) AS $$
 BEGIN
     IF p_function_type = 'replace' THEN
         RETURN QUERY


@ -17,20 +17,19 @@ RETURNS dataflow.sources AS $$
SELECT * FROM dataflow.sources WHERE name = p_name; SELECT * FROM dataflow.sources WHERE name = p_name;
$$ LANGUAGE sql STABLE; $$ LANGUAGE sql STABLE;
CREATE OR REPLACE FUNCTION create_source(p_name TEXT, p_constraint_fields TEXT[], p_config JSONB DEFAULT '{}', p_global_picklist BOOLEAN DEFAULT true) CREATE OR REPLACE FUNCTION create_source(p_name TEXT, p_dedup_fields TEXT[], p_config JSONB DEFAULT '{}')
RETURNS dataflow.sources AS $$ RETURNS dataflow.sources AS $$
INSERT INTO dataflow.sources (name, constraint_fields, config, global_picklist) INSERT INTO dataflow.sources (name, dedup_fields, config)
VALUES (p_name, p_constraint_fields, p_config, p_global_picklist) VALUES (p_name, p_dedup_fields, p_config)
RETURNING *; RETURNING *;
$$ LANGUAGE sql; $$ LANGUAGE sql;
CREATE OR REPLACE FUNCTION update_source(p_name TEXT, p_constraint_fields TEXT[] DEFAULT NULL, p_config JSONB DEFAULT NULL, p_global_picklist BOOLEAN DEFAULT NULL) CREATE OR REPLACE FUNCTION update_source(p_name TEXT, p_dedup_fields TEXT[] DEFAULT NULL, p_config JSONB DEFAULT NULL)
RETURNS dataflow.sources AS $$ RETURNS dataflow.sources AS $$
UPDATE dataflow.sources UPDATE dataflow.sources
SET constraint_fields = COALESCE(p_constraint_fields, constraint_fields), SET dedup_fields = COALESCE(p_dedup_fields, dedup_fields),
config = COALESCE(p_config, config), config = COALESCE(p_config, config),
global_picklist = COALESCE(p_global_picklist, global_picklist), updated_at = CURRENT_TIMESTAMP
updated_at = CURRENT_TIMESTAMP
WHERE name = p_name WHERE name = p_name
RETURNING *; RETURNING *;
$$ LANGUAGE sql; $$ LANGUAGE sql;
@ -42,6 +41,13 @@ $$ LANGUAGE sql;
-- ── Import log ──────────────────────────────────────────────────────────────── -- ── Import log ────────────────────────────────────────────────────────────────
CREATE OR REPLACE FUNCTION get_import_log(p_source_name TEXT)
RETURNS SETOF dataflow.import_log AS $$
SELECT * FROM dataflow.import_log
WHERE source_name = p_source_name
ORDER BY imported_at DESC;
$$ LANGUAGE sql STABLE;
-- ── Stats ───────────────────────────────────────────────────────────────────── -- ── Stats ─────────────────────────────────────────────────────────────────────
CREATE OR REPLACE FUNCTION get_source_stats(p_source_name TEXT) CREATE OR REPLACE FUNCTION get_source_stats(p_source_name TEXT)
@ -81,21 +87,16 @@ $$ LANGUAGE sql STABLE;
CREATE OR REPLACE FUNCTION get_view_data( CREATE OR REPLACE FUNCTION get_view_data(
p_source_name TEXT, p_source_name TEXT,
p_limit INT DEFAULT 100, p_limit INT DEFAULT 100,
p_offset INT DEFAULT 0, p_offset INT DEFAULT 0,
p_sort_col TEXT DEFAULT NULL, p_sort_col TEXT DEFAULT NULL,
p_sort_dir TEXT DEFAULT 'asc', p_sort_dir TEXT DEFAULT 'asc'
p_filters JSONB DEFAULT NULL -- [{col, pattern}, ...] — postgres regex (~*)
) )
RETURNS JSON AS $$ RETURNS JSON AS $$
DECLARE DECLARE
v_exists BOOLEAN; v_exists BOOLEAN;
v_where TEXT := ''; v_order TEXT := '';
v_order TEXT := ''; v_rows JSON;
v_rows JSON;
v_filter JSONB;
v_col TEXT;
v_pattern TEXT;
BEGIN BEGIN
SELECT EXISTS ( SELECT EXISTS (
SELECT 1 FROM information_schema.views SELECT 1 FROM information_schema.views
@ -106,24 +107,6 @@ BEGIN
RETURN json_build_object('exists', FALSE, 'rows', '[]'::json); RETURN json_build_object('exists', FALSE, 'rows', '[]'::json);
END IF; END IF;
-- Build WHERE from filters (validate each column exists in the view)
IF p_filters IS NOT NULL THEN
FOR v_filter IN SELECT value FROM jsonb_array_elements(p_filters) LOOP
v_col := v_filter->>'col';
v_pattern := v_filter->>'pattern';
IF v_pattern IS NOT NULL AND v_pattern <> '' AND EXISTS (
SELECT 1 FROM information_schema.columns
WHERE table_schema = 'dfv'
AND table_name = p_source_name
AND column_name = v_col
) THEN
v_where := v_where ||
CASE WHEN v_where = '' THEN ' WHERE ' ELSE ' AND ' END ||
quote_ident(v_col) || '::text ~* ' || quote_literal(v_pattern);
END IF;
END LOOP;
END IF;
IF p_sort_col IS NOT NULL AND EXISTS ( IF p_sort_col IS NOT NULL AND EXISTS (
SELECT 1 FROM information_schema.columns SELECT 1 FROM information_schema.columns
WHERE table_schema = 'dfv' WHERE table_schema = 'dfv'
@ -135,15 +118,156 @@ BEGIN
|| ' NULLS LAST'; || ' NULLS LAST';
END IF; END IF;
-- Subquery applies ORDER BY + LIMIT first, then json_agg collects in that order.
-- json_agg on the outer query preserves column order (json not jsonb).
EXECUTE format( EXECUTE format(
'SELECT COALESCE(json_agg(row_to_json(t)), ''[]''::json) FROM (SELECT * FROM dfv.%I%s%s LIMIT %s OFFSET %s) t', 'SELECT COALESCE(json_agg(row_to_json(t)), ''[]''::json) FROM (SELECT * FROM dfv.%I%s LIMIT %s OFFSET %s) t',
p_source_name, v_where, v_order, p_limit, p_offset p_source_name, v_order, p_limit, p_offset
) INTO v_rows; ) INTO v_rows;
RETURN json_build_object('exists', TRUE, 'rows', v_rows); RETURN json_build_object('exists', TRUE, 'rows', v_rows);
END; END;
$$ LANGUAGE plpgsql STABLE; $$ LANGUAGE plpgsql STABLE;
-- ── Import (deduplication) ────────────────────────────────────────────────────
CREATE OR REPLACE FUNCTION import_records(p_source_name TEXT, p_data JSONB)
RETURNS JSON AS $$
DECLARE
v_dedup_fields TEXT[];
v_record JSONB;
v_dedup_key TEXT;
v_inserted INTEGER := 0;
v_duplicates INTEGER := 0;
v_log_id INTEGER;
BEGIN
SELECT dedup_fields INTO v_dedup_fields
FROM dataflow.sources WHERE name = p_source_name;
IF v_dedup_fields IS NULL THEN
RETURN json_build_object('success', false, 'error', 'Source not found: ' || p_source_name);
END IF;
FOR v_record IN SELECT * FROM jsonb_array_elements(p_data) LOOP
v_dedup_key := dataflow.generate_dedup_key(v_record, v_dedup_fields);
BEGIN
INSERT INTO dataflow.records (source_name, data, dedup_key)
VALUES (p_source_name, v_record, v_dedup_key);
v_inserted := v_inserted + 1;
EXCEPTION WHEN unique_violation THEN
v_duplicates := v_duplicates + 1;
END;
END LOOP;
INSERT INTO dataflow.import_log (source_name, records_imported, records_duplicate)
VALUES (p_source_name, v_inserted, v_duplicates)
RETURNING id INTO v_log_id;
RETURN json_build_object('success', true, 'imported', v_inserted, 'duplicates', v_duplicates, 'log_id', v_log_id);
END;
$$ LANGUAGE plpgsql;
-- ── Transformations ───────────────────────────────────────────────────────────
CREATE OR REPLACE FUNCTION dataflow.jsonb_merge(a JSONB, b JSONB)
RETURNS JSONB AS $$
SELECT COALESCE(a, '{}') || COALESCE(b, '{}')
$$ LANGUAGE sql IMMUTABLE;
DROP AGGREGATE IF EXISTS dataflow.jsonb_concat_obj(JSONB);
CREATE AGGREGATE dataflow.jsonb_concat_obj(JSONB) (
sfunc = dataflow.jsonb_merge,
stype = JSONB,
initcond = '{}'
);
DROP FUNCTION IF EXISTS apply_transformations(TEXT, INTEGER[]);
CREATE OR REPLACE FUNCTION apply_transformations(
p_source_name TEXT,
p_record_ids INTEGER[] DEFAULT NULL,
p_overwrite BOOLEAN DEFAULT FALSE
) RETURNS JSON AS $$
WITH
qualifying AS (
SELECT id, data FROM dataflow.records
WHERE source_name = p_source_name
AND (p_overwrite OR transformed IS NULL)
AND (p_record_ids IS NULL OR id = ANY(p_record_ids))
),
rx AS (
SELECT
q.id,
r.name AS rule_name,
r.sequence,
r.output_field,
r.retain,
r.function_type,
COALESCE(mt.rn, rp.rn, 1) AS result_number,
CASE WHEN array_length(mt.mt, 1) = 1 THEN to_jsonb(mt.mt[1]) ELSE to_jsonb(mt.mt) END AS match_val,
to_jsonb(rp.rp) AS replace_val
FROM dataflow.rules r
INNER JOIN qualifying q ON q.data ? r.field
LEFT JOIN LATERAL regexp_matches(q.data ->> r.field, r.pattern, r.flags)
WITH ORDINALITY AS mt(mt, rn) ON r.function_type = 'extract'
LEFT JOIN LATERAL regexp_replace(q.data ->> r.field, r.pattern, r.replace_value, r.flags)
WITH ORDINALITY AS rp(rp, rn) ON r.function_type = 'replace'
WHERE r.source_name = p_source_name AND r.enabled = true
),
agg_matches AS (
SELECT
id, rule_name, sequence, output_field, retain, function_type,
CASE function_type
WHEN 'replace' THEN jsonb_agg(replace_val) -> 0
ELSE
CASE WHEN max(result_number) = 1
THEN jsonb_agg(match_val ORDER BY result_number) -> 0
ELSE jsonb_agg(match_val ORDER BY result_number)
END
END AS extracted
FROM rx
GROUP BY id, rule_name, sequence, output_field, retain, function_type
),
linked AS (
SELECT
a.id, a.sequence, a.output_field, a.retain, a.extracted, m.output AS mapped
FROM agg_matches a
LEFT JOIN dataflow.mappings m ON
m.source_name = p_source_name
AND m.rule_name = a.rule_name
AND m.input_value = a.extracted
WHERE a.extracted IS NOT NULL
),
rule_output AS (
SELECT
id, sequence,
CASE
WHEN mapped IS NOT NULL THEN
mapped || CASE WHEN retain THEN jsonb_build_object(output_field, extracted) ELSE '{}'::jsonb END
ELSE jsonb_build_object(output_field, extracted)
END AS output
FROM linked
),
record_additions AS (
SELECT id, dataflow.jsonb_concat_obj(output ORDER BY sequence) AS additions
FROM rule_output GROUP BY id
),
updated AS (
UPDATE dataflow.records rec
SET transformed = rec.data || COALESCE(ra.additions, '{}'::jsonb),
transformed_at = CURRENT_TIMESTAMP
FROM qualifying q
LEFT JOIN record_additions ra ON ra.id = q.id
WHERE rec.id = q.id
RETURNING rec.id
)
SELECT json_build_object('success', true, 'transformed', count(*)) FROM updated
$$ LANGUAGE sql;
CREATE OR REPLACE FUNCTION reprocess_records(p_source_name TEXT)
RETURNS JSON AS $$
SELECT dataflow.apply_transformations(p_source_name, NULL, TRUE)
$$ LANGUAGE sql;
-- ── View generation ─────────────────────────────────────────────────────────── -- ── View generation ───────────────────────────────────────────────────────────
CREATE OR REPLACE FUNCTION generate_source_view(p_source_name TEXT) CREATE OR REPLACE FUNCTION generate_source_view(p_source_name TEXT)
@ -196,28 +320,3 @@ BEGIN
RETURN json_build_object('success', true, 'view', v_view, 'sql', v_sql); RETURN json_build_object('success', true, 'view', v_view, 'sql', v_sql);
END; END;
$$ LANGUAGE plpgsql; $$ LANGUAGE plpgsql;
-- List saved pivot layouts for a source
CREATE OR REPLACE FUNCTION list_pivot_layouts(p_source_name TEXT)
RETURNS TABLE(id INT, source_name TEXT, layout_name TEXT, config JSONB, created_at TIMESTAMPTZ) AS $$
SELECT id, source_name, layout_name, config, created_at
FROM dataflow.pivot_layouts
WHERE source_name = p_source_name
ORDER BY layout_name;
$$ LANGUAGE sql;
-- Save (upsert) a named pivot layout
CREATE OR REPLACE FUNCTION save_pivot_layout(p_source_name TEXT, p_layout_name TEXT, p_config JSONB)
RETURNS TABLE(id INT, source_name TEXT, layout_name TEXT, config JSONB, created_at TIMESTAMPTZ) AS $$
INSERT INTO dataflow.pivot_layouts (source_name, layout_name, config)
VALUES (p_source_name, p_layout_name, p_config)
ON CONFLICT (source_name, layout_name) DO UPDATE
SET config = EXCLUDED.config
RETURNING id, source_name, layout_name, config, created_at;
$$ LANGUAGE sql;
-- Delete a named pivot layout
CREATE OR REPLACE FUNCTION delete_pivot_layout(p_id INT)
RETURNS TABLE(id INT) AS $$
DELETE FROM dataflow.pivot_layouts WHERE id = p_id RETURNING id;
$$ LANGUAGE sql;


@ -15,15 +15,14 @@ SET search_path TO dataflow, public;
------------------------------------------------------ ------------------------------------------------------
CREATE TABLE sources ( CREATE TABLE sources (
name TEXT PRIMARY KEY, name TEXT PRIMARY KEY,
constraint_fields TEXT[] NOT NULL, -- Fields that uniquely identify a record (e.g., ['date', 'amount', 'description']) dedup_fields TEXT[] NOT NULL, -- Fields used for deduplication (e.g., ['date', 'amount', 'description'])
config JSONB DEFAULT '{}'::jsonb, config JSONB DEFAULT '{}'::jsonb,
global_picklist BOOLEAN NOT NULL DEFAULT true, -- Contribute output values to global autocomplete suggestions
created_at TIMESTAMPTZ DEFAULT CURRENT_TIMESTAMP, created_at TIMESTAMPTZ DEFAULT CURRENT_TIMESTAMP,
updated_at TIMESTAMPTZ DEFAULT CURRENT_TIMESTAMP updated_at TIMESTAMPTZ DEFAULT CURRENT_TIMESTAMP
); );
COMMENT ON TABLE sources IS 'Data source definitions'; COMMENT ON TABLE sources IS 'Data source definitions';
COMMENT ON COLUMN sources.constraint_fields IS 'Array of field names that uniquely identify a record'; COMMENT ON COLUMN sources.dedup_fields IS 'Array of field names used to identify duplicate records';
COMMENT ON COLUMN sources.config IS 'Additional source configuration (optional)'; COMMENT ON COLUMN sources.config IS 'Additional source configuration (optional)';
------------------------------------------------------ ------------------------------------------------------
@ -36,7 +35,7 @@ CREATE TABLE records (
-- Data -- Data
data JSONB NOT NULL, -- Original imported data data JSONB NOT NULL, -- Original imported data
constraint_key JSONB, -- Fields that uniquely identify this record (set on import) dedup_key TEXT NOT NULL, -- Hash of dedup fields for fast lookup
transformed JSONB, -- Data after transformations applied transformed JSONB, -- Data after transformations applied
-- Metadata -- Metadata
@ -44,17 +43,18 @@ CREATE TABLE records (
imported_at TIMESTAMPTZ DEFAULT CURRENT_TIMESTAMP, imported_at TIMESTAMPTZ DEFAULT CURRENT_TIMESTAMP,
transformed_at TIMESTAMPTZ, transformed_at TIMESTAMPTZ,
-- Constraints
UNIQUE(source_name, dedup_key) -- Prevent duplicates
); );
COMMENT ON TABLE records IS 'Imported records with raw and transformed data'; COMMENT ON TABLE records IS 'Imported records with raw and transformed data';
COMMENT ON COLUMN records.data IS 'Original data as imported'; COMMENT ON COLUMN records.data IS 'Original data as imported';
COMMENT ON COLUMN records.constraint_key IS 'JSONB object of constraint field values — uniquely identifies this record within its source'; COMMENT ON COLUMN records.dedup_key IS 'Hash of deduplication fields for fast duplicate detection';
COMMENT ON COLUMN records.transformed IS 'Data after applying transformation rules'; COMMENT ON COLUMN records.transformed IS 'Data after applying transformation rules';
-- Indexes -- Indexes
CREATE INDEX idx_records_source ON records(source_name); CREATE INDEX idx_records_source ON records(source_name);
CREATE INDEX idx_records_constraint ON records USING gin(constraint_key); CREATE INDEX idx_records_dedup ON records(source_name, dedup_key);
CREATE INDEX idx_records_data ON records USING gin(data); CREATE INDEX idx_records_data ON records USING gin(data);
CREATE INDEX idx_records_transformed ON records USING gin(transformed); CREATE INDEX idx_records_transformed ON records USING gin(transformed);
@ -139,22 +139,33 @@ COMMENT ON COLUMN import_log.info IS 'Import details: inserted_keys and excluded
CREATE INDEX idx_import_log_source ON import_log(source_name); CREATE INDEX idx_import_log_source ON import_log(source_name);
CREATE INDEX idx_import_log_timestamp ON import_log(imported_at); CREATE INDEX idx_import_log_timestamp ON import_log(imported_at);
------------------------------------------------------
-- Helper function: Generate dedup key
------------------------------------------------------
CREATE OR REPLACE FUNCTION generate_dedup_key(
data JSONB,
dedup_fields TEXT[]
) RETURNS TEXT AS $$
DECLARE
field TEXT;
values TEXT := '';
BEGIN
-- Concatenate values from dedup fields
FOREACH field IN ARRAY dedup_fields LOOP
values := values || COALESCE(data->>field, '') || '|';
END LOOP;
CREATE TABLE pivot_layouts ( -- Return MD5 hash of concatenated values
id SERIAL PRIMARY KEY, RETURN md5(values);
source_name TEXT NOT NULL REFERENCES sources(name) ON DELETE CASCADE, END;
layout_name TEXT NOT NULL, $$ LANGUAGE plpgsql IMMUTABLE;
config JSONB NOT NULL,
created_at TIMESTAMPTZ DEFAULT CURRENT_TIMESTAMP,
UNIQUE (source_name, layout_name)
);
CREATE INDEX idx_pivot_layouts_source ON pivot_layouts(source_name); COMMENT ON FUNCTION generate_dedup_key IS 'Generate hash key from specified fields for deduplication';
------------------------------------------------------ ------------------------------------------------------
-- Summary -- Summary
------------------------------------------------------ ------------------------------------------------------
-- Tables: 6 (sources, records, rules, mappings, import_log, pivot_layouts) -- Tables: 5 (sources, records, rules, mappings, import_log)
-- Simple, clear structure -- Simple, clear structure
-- JSONB for flexibility -- JSONB for flexibility
-- Deduplication via hash key -- Deduplication via hash key


@ -1,285 +0,0 @@
# Perspective Pivot — Technical Reference
Version tested: `@perspective-dev` v4.4.0 (client, viewer, viewer-datagrid, viewer-d3fc), loaded from CDN.
This document captures everything learned about controlling Perspective programmatically. The official docs are incomplete for some of these APIs — treat this as a ground-truth supplement.
---
## Loading from CDN
```js
const [{ default: perspective }] = await Promise.all([
import('https://cdn.jsdelivr.net/npm/@perspective-dev/client@4.4.0/dist/cdn/perspective.js'),
import('https://cdn.jsdelivr.net/npm/@perspective-dev/viewer@4.4.0/dist/cdn/perspective-viewer.js'),
import('https://cdn.jsdelivr.net/npm/@perspective-dev/viewer-datagrid@4.4.0/dist/cdn/perspective-viewer-datagrid.js'),
import('https://cdn.jsdelivr.net/npm/@perspective-dev/viewer-d3fc@4.4.0/dist/cdn/perspective-viewer-d3fc.js'),
])
```
Stylesheet:
```html
<link rel="stylesheet" crossorigin="anonymous"
href="https://cdn.jsdelivr.net/npm/@perspective-dev/viewer/dist/css/themes.css" />
```
---
## Core Objects
```
perspective — the module default export
.worker() — creates a Web Worker instance
worker
.table(rows, opts) — creates a named Table; returns the Table object
.open_table(name) — re-opens a previously created named table
table
.view(config) — creates a View (filtered/grouped projection)
.update(rows) — incremental row upsert/insert
view
.to_json() — returns rows as array of objects
.set_depth(n) — sets expansion depth for all grouped rows (see below)
.delete() — frees the view; always call when done
viewer (the <perspective-viewer> DOM element)
.load(worker) — attaches the worker to the viewer
.save() — returns full viewer config as plain object
.restore(config) — applies a config object to the viewer
.flush() — forces viewer to synchronize (limited effect on plugin state)
.getPlugin() — returns the active plugin element (e.g. datagrid)
.getView() — returns the current View object
.toggleConfig() — shows/hides the settings panel
plugin (datagrid element, from viewer.getPlugin())
.save() — returns plugin-specific state: { columns, scroll_lock, edit_mode }
.restore(config) — applies plugin-specific state
.draw(view) — redraws the plugin against the given View
```
---
## viewer.save() — Config Shape
```js
{
table: "source_name",
plugin: "datagrid", // or "d3_y_bar", etc.
plugin_config: { ... }, // NOT reliably populated — use plugin.save() instead
group_by: ["field1"],
split_by: ["field2"],
columns: ["Amount"],
filter: [["field", "op", "value"]],
sort: [["field", "asc"]],
expressions: { "ExprName": "// formula\n..." },
settings: false, // whether the config panel is open
}
```
**Important:** `plugin_config` in `viewer.save()` is NOT reliably populated in v4.4.0. Use `plugin.save()` separately to capture plugin state.
---
## plugin.save() — Plugin State Shape (datagrid)
```js
{
columns: {}, // per-column formatting overrides
scroll_lock: false,
edit_mode: "SELECT_REGION" // see valid values below
}
```
---
## Selection Modes (edit_mode)
Valid values for the datagrid plugin's `edit_mode` field:
| Value | Button label | Behavior |
|---|---|---|
| `READ_ONLY` | Read-Only | No selection highlight |
| `SELECT_ROW` | Select Row | Highlights full rows |
| `SELECT_COLUMN` | Select Column | Highlights full columns |
| `SELECT_REGION` | Select Region | Highlights clicked cell region |
| `EDIT` | Edit | Enables cell editing |
The built-in button in the viewer toolbar cycles through these in order.
**Setting the default:**
```js
// After viewer.restore(...), set it directly on the plugin:
const plugin = await viewer.getPlugin()
await plugin.restore({ edit_mode: 'SELECT_REGION' })
```
Setting via `viewer.restore({ plugin_config: { edit_mode: ... } })` does NOT reliably work in v4.4.0.
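
A saved layout may carry a missing or stale `edit_mode`, so it can help to validate the value before handing it to `plugin.restore()`. The following is a minimal sketch, not part of Perspective: `normalizeEditMode` and `DEFAULT_EDIT_MODE` are hypothetical names, and the list of modes is simply copied from the table above.

```js
// Valid datagrid edit modes, copied from the table above (v4.4.0).
const EDIT_MODES = ['READ_ONLY', 'SELECT_ROW', 'SELECT_COLUMN', 'SELECT_REGION', 'EDIT']

// Hypothetical default; pick whichever mode suits the application.
const DEFAULT_EDIT_MODE = 'SELECT_REGION'

// Returns `value` if it is a known edit mode, otherwise the fallback.
function normalizeEditMode(value, fallback = DEFAULT_EDIT_MODE) {
  return EDIT_MODES.includes(value) ? value : fallback
}
```

It would be used as `await plugin.restore({ edit_mode: normalizeEditMode(saved.edit_mode) })`, where `saved` stands for whatever plugin state was stored earlier.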
---
## Expand/Collapse Row Depth
Controls how many levels of the `group_by` hierarchy are expanded. This is the only working mechanism found in v4.4.0:
```js
const view = await viewer.getView()
await view.set_depth(depth) // 0 = collapse all, 1 = expand one level, etc.
const plugin = await viewer.getPlugin()
await plugin.draw(view) // required — viewer does not redraw automatically
```
**What does NOT work:**
- `viewer.restore({ plugin_config: { expand_depth: d } })` — silently ignored
- `view.set_depth(d)` alone — view state changes but display doesn't update
- `view.set_depth(d)` + `viewer.flush()` — still no visual update
- `plugin.restore({ expand_depth: d })` — "Unknown" field, ignored
**The `plugin.draw(view)` call is required** to make the datagrid re-render after `set_depth`.
---
## Saving and Restoring Full State
To capture complete state (viewer + plugin + expand depth):
```js
async function captureConfig(viewer, expandDepth) {
const plugin = await viewer.getPlugin()
const [viewerConfig, pluginConfig] = await Promise.all([viewer.save(), plugin.save()])
return { ...viewerConfig, plugin_config: pluginConfig, expand_depth: expandDepth }
}
```
To restore:
```js
async function restoreConfig(viewer, config, applyDepth) {
await viewer.restore(config)
if (config.plugin_config) {
const plugin = await viewer.getPlugin()
await plugin.restore(config.plugin_config)
}
if (config.expand_depth != null) {
await applyDepth(viewer, config.expand_depth)
}
await viewer.flush()
}
async function applyDepth(viewer, depth) {
const view = await viewer.getView()
await view.set_depth(depth)
const plugin = await viewer.getPlugin()
await plugin.draw(view)
}
```
---
## The perspective-click Event
Fires when the user clicks a cell. The event detail:
```js
viewer.addEventListener('perspective-click', async (e) => {
const { row, column_names, config } = e.detail
// row — aggregated values for the clicked cell (keyed by "split|metric" format)
// column_names — array of metric column names clicked
// config — { filter: [[field, op, value], ...] }
// filter includes:
// - group_by coordinate filters (field == value, one per group_by level)
// - split_by coordinate filters (field == value, one per split_by field)
// - user-set filters (any op)
})
```
`__ROW_PATH__` in `row` contains the group_by path as an array.
**The `config.filter` array is the reliable way to get cell coordinates.** Do not try to zip `__ROW_PATH__` with `group_by` — the filter approach handles all cases including partial paths.
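
As a rough illustration of reading cell coordinates out of `config.filter`, the sketch below separates coordinate filters from user-set ones. `splitClickFilters` is a hypothetical helper and the field names are made up. It assumes coordinate filters always use the `==` operator on a `group_by` or `split_by` field, as described above, so a user-set `==` filter on one of those fields cannot be told apart and is treated as a coordinate.

```js
// Splits a perspective-click filter array into cell coordinates and user filters.
// Assumption: coordinate filters are "==" filters on group_by / split_by fields.
function splitClickFilters(filters, groupBy = [], splitBy = []) {
  const coordFields = new Set([...groupBy, ...splitBy])
  const coordinates = {}
  const userFilters = []
  for (const [field, op, value] of filters) {
    if (op === '==' && coordFields.has(field)) {
      coordinates[field] = value
    } else {
      userFilters.push([field, op, value])
    }
  }
  return { coordinates, userFilters }
}

// Example with made-up field names:
const { coordinates, userFilters } = splitClickFilters(
  [['category', '==', 'Food'], ['year', '==', 2024], ['amount', '>', 100]],
  ['category'],
  ['year'],
)
console.log(coordinates)  // { category: 'Food', year: 2024 }
console.log(userFilters)  // [ [ 'amount', '>', 100 ] ]
```

A click on a partially expanded row simply yields fewer coordinate entries, which is why no zipping against `group_by` is needed.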
---
## Filtering Rows for a Clicked Cell
The click event's `config.filter` array (`eventFilters` below) can be applied to the underlying table (`table` below) via a new View. Unlike filtering raw JS rows, this correctly evaluates expression (computed) columns:
```js
const config = await viewer.save()
const view = await table.view({
filter: eventFilters,
expressions: config.expressions || {},
})
const rows = await view.to_json()
await view.delete()
// Strip expression columns from results (they're computed, not source fields)
const exprNames = new Set(Object.keys(config.expressions || {}))
const clean = rows.map(r =>
Object.fromEntries(Object.entries(r).filter(([k]) => !exprNames.has(k)))
)
```
**Why not filter raw JS rows?** Expression columns (computed in Perspective) don't exist in the source data. `filterRowsByConfig` on raw rows will skip those filters, returning all rows for the group rather than the specific cell.
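
The failure mode can be reproduced without Perspective. The sketch below is a stand-in only: the real `filterRowsByConfig` is not shown in this document, so `naiveFilterRows` is a hypothetical simplification that handles just `==` and skips any filter whose field is missing from the row.

```js
// Hypothetical raw-row filter: it can only test fields present on the row,
// so a filter on a computed (expression) column is skipped.
function naiveFilterRows(rows, filters) {
  return rows.filter(row =>
    filters.every(([field, op, value]) => {
      if (!(field in row)) return true   // unknown column: filter is skipped
      return op === '==' ? row[field] === value : true
    })
  )
}

const rawRows = [
  { category: 'Food', amount: 12 },
  { category: 'Food', amount: 250 },
]
// "size" stands in for an expression column that exists only inside Perspective.
const clickFilters = [['category', '==', 'Food'], ['size', '==', 'large']]

console.log(naiveFilterRows(rawRows, clickFilters).length)  // 2
```

Both rows come back because the `size` filter is never evaluated, even though the clicked cell would cover only the rows whose computed `size` is `large`.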
**Guard against no group_by:** Without `group_by`, the filter array has no coordinate filters and the view query returns the entire table (slow). Check first:
```js
const config = await viewer.save()
if ((config.group_by || []).length === 0) return // no hierarchy — skip inspector
```
---
## Viewer Methods (full list, v4.4.0)
From `Object.getOwnPropertyNames(Object.getPrototypeOf(viewer))`:
`constructor`, `__destroy_into_raw`, `free`, `__get_model`, `connectedCallback`, `copy`, `delete`, `download`, `eject`, `export`, `flush`, `getAllPlugins`, `getClient`, `getEditPort`, `getPlugin`, `getRenderStats`, `getSelection`, `getTable`, `getView`, `getViewConfig`, `load`, `openColumnSettings`, `reset`, `resetError`, `resetThemes`, `resize`, `restore`, `restyleElement`, `save`, `setAutoPause`, `setAutoSize`, `setSelection`, `setThrottle`, `toggleColumnSettings`, `toggleConfig`
## Plugin Methods (datagrid, full list)
From `Object.getOwnPropertyNames(Object.getPrototypeOf(plugin))`:
`constructor`, `connectedCallback`, `disconnectedCallback`, `activate`, `name`, `category`, `select_mode`, `min_config_columns`, `config_column_names`, `group_rollups`, `priority`, `can_render_column_styles`, `column_style_controls`, `draw`, `update`, `render`, `resize`, `clear`, `save`, `restore`, `restyle`, `delete`
## View Methods (partial list)
From `Object.getOwnPropertyNames(Object.getPrototypeOf(view))` — includes `set_depth`, `expand`, `collapse`, `to_json`, `to_csv`, `to_arrow`, `schema`, `num_rows`, `num_columns`, `delete`, and others.
---
## settings Panel
The `settings` key in `viewer.restore()` controls whether the config panel (gear icon) is open:
```js
// Hide on load:
await viewer.restore({ table: "name", settings: false, plugin_config: DEFAULT_PLUGIN_CONFIG })
// Toggle programmatically:
viewer.toggleConfig()
```
The settings state is saved by `viewer.save()` and restored on `viewer.restore()`, so it persists across layout saves automatically.
---
## Incremental Updates
To update the table data without a full reload:
```js
table.update(newRows) // appends rows; upserts by key only if the table was created with an index column
```
The viewer re-renders automatically after `table.update()`.
---
## Common Pitfalls
- **`plugin_config` in `viewer.restore()` is unreliable.** Always set plugin state via `plugin.restore()` separately after `viewer.restore()`.
- **`view.set_depth()` requires `plugin.draw(view)`.** The viewer won't redraw automatically.
- **Expression columns don't exist in raw data.** Filter via a Perspective View (`table.view({ filter, expressions })`), not against raw JS rows.
- **Always `await view.delete()`** after using a temporary view, or you'll leak worker memory.
- **Named tables:** `worker.table(rows, { name: 'foo' })` — the name is used by the viewer's `table` config key. Re-open with `worker.open_table('foo')`.
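
One way to make the `view.delete()` rule hard to forget is to wrap temporary views in a helper that frees them in a `finally` block. This is a sketch under stated assumptions: `withTempView` is a hypothetical name, and it relies only on `table.view(config)` and `view.delete()` as listed under Core Objects.

```js
// Runs `fn` against a temporary View and always frees the View afterwards.
// `table` must expose `view(config)`; the returned View must expose `delete()`.
async function withTempView(table, viewConfig, fn) {
  const view = await table.view(viewConfig)
  try {
    return await fn(view)
  } finally {
    await view.delete()   // runs on success and on error
  }
}
```

A call such as `const rows = await withTempView(table, { filter, expressions }, v => v.to_json())` then cannot leak the view, even if `to_json()` throws.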


@ -42,7 +42,7 @@ curl -X POST http://localhost:3000/api/sources \
-H "Content-Type: application/json" \ -H "Content-Type: application/json" \
-d '{ -d '{
"name": "bank_transactions", "name": "bank_transactions",
"constraint_fields": ["date", "description", "amount"] "dedup_fields": ["date", "description", "amount"]
}' }'
``` ```
@ -303,7 +303,7 @@ curl -X POST http://localhost:3000/api/records/search \
**Import fails:** **Import fails:**
- Verify source exists: `curl http://localhost:3000/api/sources` - Verify source exists: `curl http://localhost:3000/api/sources`
- Check CSV format matches expectations - Check CSV format matches expectations
- Ensure constraint_fields match CSV column names - Ensure dedup_fields match CSV column names
**Transformations not working:** **Transformations not working:**
- Check rules exist: `curl http://localhost:3000/api/rules/source/bank_transactions` - Check rules exist: `curl http://localhost:3000/api/rules/source/bank_transactions`


@ -8,17 +8,13 @@ import Rules from './pages/Rules'
import Mappings from './pages/Mappings' import Mappings from './pages/Mappings'
import Records from './pages/Records' import Records from './pages/Records'
import Log from './pages/Log' import Log from './pages/Log'
import Pivot from './pages/Pivot'
import Remap from './pages/Remap'
const NAV = [ const NAV = [
{ to: '/sources', label: 'Sources' }, { to: '/sources', label: 'Sources' },
{ to: '/import', label: 'Import' }, { to: '/import', label: 'Import' },
{ to: '/rules', label: 'Rules' }, { to: '/rules', label: 'Rules' },
{ to: '/mappings', label: 'Mappings' }, { to: '/mappings', label: 'Mappings' },
{ to: '/remap', label: 'Remap' },
{ to: '/records', label: 'Records' }, { to: '/records', label: 'Records' },
{ to: '/pivot', label: 'Pivot' },
{ to: '/log', label: 'Log' }, { to: '/log', label: 'Log' },
] ]
@ -81,7 +77,7 @@ export default function App() {
<div className="px-3 py-3 border-b border-gray-200"> <div className="px-3 py-3 border-b border-gray-200">
<div className="flex items-center justify-between mb-1"> <div className="flex items-center justify-between mb-1">
<label className="text-xs text-gray-500">Source</label> <label className="text-xs text-gray-500">Source</label>
<NavLink to="/sources?new=1" className="text-xs text-blue-400 hover:text-blue-600 leading-none" title="New source" onClick={() => setSidebarOpen(false)}>+</NavLink> <NavLink to="/sources" className="text-xs text-blue-400 hover:text-blue-600 leading-none" title="New source" onClick={() => setSidebarOpen(false)}>+</NavLink>
</div> </div>
<select <select
className="w-full text-sm border border-gray-200 rounded px-2 py-1 bg-white focus:outline-none focus:border-blue-400" className="w-full text-sm border border-gray-200 rounded px-2 py-1 bg-white focus:outline-none focus:border-blue-400"
@ -146,9 +142,7 @@ export default function App() {
<Route path="/import" element={<Import source={source} />} /> <Route path="/import" element={<Import source={source} />} />
<Route path="/rules" element={<Rules source={source} />} /> <Route path="/rules" element={<Rules source={source} />} />
<Route path="/mappings" element={<Mappings source={source} />} /> <Route path="/mappings" element={<Mappings source={source} />} />
<Route path="/remap" element={<Remap />} />
<Route path="/records" element={<Records source={source} />} /> <Route path="/records" element={<Records source={source} />} />
<Route path="/pivot" element={<Pivot source={source} />} />
<Route path="/log" element={<Log />} /> <Route path="/log" element={<Log />} />
</Routes> </Routes>
</div> </div>


@ -10,11 +10,6 @@ export function clearCredentials() {
_credentials = null _credentials = null
} }
export function authHeaders() {
if (!_credentials) return {}
return { 'Authorization': `Basic ${btoa(`${_credentials.user}:${_credentials.pass}`)}` }
}
async function request(method, path, body, isFormData = false) { async function request(method, path, body, isFormData = false) {
const opts = { method, headers: {} } const opts = { method, headers: {} }
@ -70,10 +65,9 @@ export const api = {
reprocess: (name) => request('POST', `/sources/${name}/reprocess`), reprocess: (name) => request('POST', `/sources/${name}/reprocess`),
generateView: (name) => request('POST', `/sources/${name}/view`), generateView: (name) => request('POST', `/sources/${name}/view`),
getFields: (name) => request('GET', `/sources/${name}/fields`), getFields: (name) => request('GET', `/sources/${name}/fields`),
getViewData: (name, limit = 100, offset = 0, sortCol = null, sortDir = 'asc', filters = null) => { getViewData: (name, limit = 100, offset = 0, sortCol = null, sortDir = 'asc') => {
const params = new URLSearchParams({ limit, offset }) const params = new URLSearchParams({ limit, offset })
if (sortCol) { params.set('sort_col', sortCol); params.set('sort_dir', sortDir) } if (sortCol) { params.set('sort_col', sortCol); params.set('sort_dir', sortDir) }
if (filters && filters.length > 0) params.set('filters', JSON.stringify(filters))
return request('GET', `/sources/${name}/view-data?${params}`) return request('GET', `/sources/${name}/view-data?${params}`)
}, },
@ -87,7 +81,6 @@ export const api = {
request('GET', `/rules/preview?source=${encodeURIComponent(source)}&field=${encodeURIComponent(field)}&pattern=${encodeURIComponent(pattern)}&flags=${encodeURIComponent(flags || '')}&function_type=${function_type}&replace_value=${encodeURIComponent(replace_value)}&limit=${limit}`), request('GET', `/rules/preview?source=${encodeURIComponent(source)}&field=${encodeURIComponent(field)}&pattern=${encodeURIComponent(pattern)}&flags=${encodeURIComponent(flags || '')}&function_type=${function_type}&replace_value=${encodeURIComponent(replace_value)}&limit=${limit}`),
// Mappings // Mappings
getGlobalValues: () => request('GET', '/mappings/global-values'),
getMappings: (source, rule) => request('GET', `/mappings/source/${source}${rule ? `?rule_name=${rule}` : ''}`), getMappings: (source, rule) => request('GET', `/mappings/source/${source}${rule ? `?rule_name=${rule}` : ''}`),
getMappingCounts: (source, rule) => request('GET', `/mappings/source/${source}/counts${rule ? `?rule_name=${rule}` : ''}`), getMappingCounts: (source, rule) => request('GET', `/mappings/source/${source}/counts${rule ? `?rule_name=${rule}` : ''}`),
getUnmapped: (source, rule) => request('GET', `/mappings/source/${source}/unmapped${rule ? `?rule_name=${rule}` : ''}`), getUnmapped: (source, rule) => request('GET', `/mappings/source/${source}/unmapped${rule ? `?rule_name=${rule}` : ''}`),
@ -103,16 +96,6 @@ export const api = {
updateMapping: (id, body) => request('PUT', `/mappings/${id}`, body), updateMapping: (id, body) => request('PUT', `/mappings/${id}`, body),
deleteMapping: (id) => request('DELETE', `/mappings/${id}`), deleteMapping: (id) => request('DELETE', `/mappings/${id}`),
// Global remap
searchMappingOutputs: (search) => request('GET', `/mappings/outputs?search=${encodeURIComponent(search)}`),
getMappingsByOutputField: (col, val) => request('GET', `/mappings/outputs/${encodeURIComponent(col)}/${encodeURIComponent(val)}`),
remapOutputField: (col, from_val, to_val) => request('POST', '/mappings/remap-field', { col, from_val, to_val }),
// Pivot layouts
getPivotLayouts: (source) => request('GET', `/sources/${source}/layouts`),
savePivotLayout: (source, layout_name, config) => request('POST', `/sources/${source}/layouts`, { layout_name, config }),
deletePivotLayout: (source, id) => request('DELETE', `/sources/${source}/layouts/${id}`),
// Records // Records
getRecords: (source, limit = 100, offset = 0) => getRecords: (source, limit = 100, offset = 0) =>
request('GET', `/records/source/${source}?limit=${limit}&offset=${offset}`), request('GET', `/records/source/${source}?limit=${limit}&offset=${offset}`),


@ -196,24 +196,8 @@ export default function Import({ source }) {
{error && <p className="text-sm text-red-500 mb-3">{error}</p>} {error && <p className="text-sm text-red-500 mb-3">{error}</p>}
{result && ( {result && (
<div className={`border rounded p-4 mb-4 text-sm ${result.success === false ? 'bg-red-50 border-red-200' : 'bg-white border-gray-200'}`}> <div className="bg-white border border-gray-200 rounded p-4 mb-4 text-sm">
{result.success === false ? ( {result.imported !== undefined ? (
<>
<p className="text-red-600 font-medium mb-2">{result.error}</p>
{result.duplicate_rows && (
<div>
<p className="text-xs text-red-500 mb-1">Offending rows:</p>
<div className="max-h-48 overflow-y-auto bg-white rounded border border-red-100 p-2 font-mono text-xs text-red-700 space-y-0.5">
{result.duplicate_rows.map((row, i) => (
<div key={i}>
{Object.entries(row).map(([f, v]) => `${f}: ${v}`).join(' · ')}
</div>
))}
</div>
</div>
)}
</>
) : result.imported !== undefined ? (
<> <>
<span className="text-green-600 font-medium">{result.imported} imported</span> <span className="text-green-600 font-medium">{result.imported} imported</span>
<span className="text-gray-400 mx-2">·</span> <span className="text-gray-400 mx-2">·</span>


@ -1,86 +1,5 @@
import { useState, useEffect, useRef } from 'react' import { useState, useEffect } from 'react'
import { api, authHeaders } from '../api' import { api } from '../api'
function AutocompleteInput({ value, onChange, onEnter, suggestions = [], className, placeholder }) {
const [open, setOpen] = useState(false)
const [highlighted, setHighlighted] = useState(0)
const inputRef = useRef()
const listRef = useRef()
const filtered = value
? suggestions.filter(s => s.toLowerCase().includes(value.toLowerCase()))
: suggestions
function openList() {
setOpen(true)
setHighlighted(0)
}
function select(val) {
onChange(val)
setOpen(false)
inputRef.current?.focus()
}
function handleKeyDown(e) {
if (e.altKey && e.key === 'ArrowDown') {
e.preventDefault()
openList()
return
}
if (open && filtered.length > 0) {
if (e.key === 'Tab') {
e.preventDefault()
setHighlighted(h => (h + 1) % filtered.length)
return
}
if (e.key === 'ArrowDown') { e.preventDefault(); setHighlighted(h => Math.min(h + 1, filtered.length - 1)); return }
if (e.key === 'ArrowUp') { e.preventDefault(); setHighlighted(h => Math.max(h - 1, 0)); return }
if (e.key === 'Enter') { e.preventDefault(); select(filtered[highlighted]); return }
if (e.key === 'Escape') { setOpen(false); return }
}
if (e.key === 'Enter') onEnter?.()
}
// Scroll highlighted item into view
useEffect(() => {
if (!open || !listRef.current) return
const item = listRef.current.children[highlighted]
item?.scrollIntoView({ block: 'nearest' })
}, [highlighted, open])
return (
<div className="relative">
<input
ref={inputRef}
className={className}
value={value}
placeholder={placeholder}
onChange={e => { onChange(e.target.value); if (!open && e.target.value) openList() }}
onKeyDown={handleKeyDown}
onBlur={e => { if (!listRef.current?.contains(e.relatedTarget)) setOpen(false) }}
/>
{open && filtered.length > 0 && (
<div
ref={listRef}
className="absolute z-50 left-0 top-full mt-0.5 bg-white border border-gray-200 rounded shadow-lg max-h-48 overflow-y-auto min-w-full"
>
{filtered.map((s, i) => (
<div
key={s}
className={`px-2 py-1 text-xs cursor-pointer whitespace-nowrap ${
i === highlighted ? 'bg-blue-50 text-blue-700' : 'text-gray-700 hover:bg-gray-50'
}`}
onMouseDown={e => { e.preventDefault(); select(s) }}
>
{s}
</div>
))}
</div>
)}
</div>
)
}
function valueKey(v) { function valueKey(v) {
return Array.isArray(v) ? JSON.stringify(v) : String(v) return Array.isArray(v) ? JSON.stringify(v) : String(v)
@ -116,16 +35,9 @@ export default function Mappings({ source }) {
const [loading, setLoading] = useState(false) const [loading, setLoading] = useState(false)
const [importing, setImporting] = useState(false) const [importing, setImporting] = useState(false)
const [sortBy, setSortBy] = useState(null) const [sortBy, setSortBy] = useState(null)
const [globalValues, setGlobalValues] = useState({})
const [selected, setSelected] = useState(new Set())
const [bulkDraft, setBulkDraft] = useState({})
const [cursorKey, setCursorKey] = useState(null)
const [rowFilter, setRowFilter] = useState('')
const rowRefs = useRef({})
useEffect(() => { useEffect(() => {
if (!source) return if (!source) return
api.getGlobalValues().then(setGlobalValues).catch(() => {})
api.getRules(source).then(r => setRules(r)).catch(() => {}) api.getRules(source).then(r => setRules(r)).catch(() => {})
}, [source]) }, [source])
@ -140,28 +52,12 @@ export default function Mappings({ source }) {
setAllValues(a) setAllValues(a)
setDrafts({}) setDrafts({})
setExtraCols([]) setExtraCols([])
setSelected(new Set())
setBulkDraft({})
setCursorKey(null)
setRowFilter('')
}) })
.catch(() => {}) .catch(() => {})
.finally(() => setLoading(false)) .finally(() => setLoading(false))
}, [source, selectedRule]) }, [source, selectedRule])
// Auto-select all rows matching the regex filter when it changes // Derive output columns and datalist suggestions from mapped rows
useEffect(() => {
if (!rowFilter) return
let re = null
try { re = new RegExp(rowFilter, 'i') } catch { return }
const tabF = filter === 'unmapped' ? allValues.filter(r => !r.is_mapped)
: filter === 'mapped' ? allValues.filter(r => r.is_mapped)
: allValues
const matches = tabF.filter(r => re.test(displayValue(r.extracted_value)))
setSelected(new Set(matches.map(r => valueKey(r.extracted_value))))
}, [rowFilter, filter, allValues])
// Derive output columns and datalist suggestions from mapped rows + global pool
const existingCols = [] const existingCols = []
const valuesByCol = {} const valuesByCol = {}
allValues.forEach(row => { allValues.forEach(row => {
@ -172,31 +68,17 @@ export default function Mappings({ source }) {
valuesByCol[k].add(String(v)) valuesByCol[k].add(String(v))
}) })
}) })
// Merge global picklist values into suggestions
Object.entries(globalValues).forEach(([k, vals]) => {
if (!valuesByCol[k]) valuesByCol[k] = new Set()
vals.forEach(v => valuesByCol[k].add(v))
})
const cols = [...existingCols, ...extraCols] const cols = [...existingCols, ...extraCols]
const unmappedCount = allValues.filter(r => !r.is_mapped).length const unmappedCount = allValues.filter(r => !r.is_mapped).length
const mappedCount = allValues.filter(r => r.is_mapped).length const mappedCount = allValues.filter(r => r.is_mapped).length
const tabFiltered = filter === 'unmapped' const filteredRows = filter === 'unmapped'
? allValues.filter(r => !r.is_mapped) ? allValues.filter(r => !r.is_mapped)
: filter === 'mapped' : filter === 'mapped'
? allValues.filter(r => r.is_mapped) ? allValues.filter(r => r.is_mapped)
: allValues : allValues
let rowFilterRe = null
let rowFilterError = false
if (rowFilter) {
try { rowFilterRe = new RegExp(rowFilter, 'i') } catch { rowFilterError = true }
}
const filteredRows = rowFilterRe
? tabFiltered.filter(r => rowFilterRe.test(displayValue(r.extracted_value)))
: tabFiltered
function toggleSort(col) { function toggleSort(col) {
setSortBy(s => { setSortBy(s => {
if (s?.col === col) return { col, dir: s.dir === 'asc' ? 'desc' : 'asc' } if (s?.col === col) return { col, dir: s.dir === 'asc' ? 'desc' : 'asc' }
@ -226,12 +108,7 @@ export default function Mappings({ source }) {
function setCellValue(extractedValue, col, value) { function setCellValue(extractedValue, col, value) {
const k = valueKey(extractedValue) const k = valueKey(extractedValue)
const targets = selected.has(k) && selected.size > 1 ? [...selected] : [k] setDrafts(d => ({ ...d, [k]: { ...(d[k] || {}), [col]: value } }))
setDrafts(d => {
const next = { ...d }
for (const sk of targets) next[sk] = { ...(next[sk] || {}), [col]: value }
return next
})
} }
async function saveRow(row) { async function saveRow(row) {
@ -282,35 +159,6 @@ export default function Mappings({ source }) {
return drafts[k] && Object.keys(drafts[k]).length > 0 return drafts[k] && Object.keys(drafts[k]).length > 0
}) })
await Promise.all(dirty.map(row => saveRow(row))) await Promise.all(dirty.map(row => saveRow(row)))
setRowFilter('')
}
async function applyBulk() {
const output = Object.fromEntries(
Object.entries(bulkDraft).filter(([, v]) => v.trim())
)
if (Object.keys(output).length === 0) return
const rows = sortedRows(filteredRows).filter(r => selected.has(valueKey(r.extracted_value)))
await Promise.all(rows.map(async row => {
const k = valueKey(row.extracted_value)
const merged = { ...(row.is_mapped ? row.output : {}), ...output }
setSaving(s => ({ ...s, [k]: true }))
try {
if (row.is_mapped && row.mapping_id) {
const updated = await api.updateMapping(row.mapping_id, { output: merged })
setAllValues(av => av.map(x => valueKey(x.extracted_value) === k ? { ...x, output: updated.output } : x))
} else {
const created = await api.createMapping({ source_name: source, rule_name: row.rule_name, input_value: row.extracted_value, output: merged })
setAllValues(av => av.map(x => valueKey(x.extracted_value) === k ? { ...x, is_mapped: true, mapping_id: created.id, output: merged } : x))
}
} catch (err) {
alert(err.message)
} finally {
setSaving(s => ({ ...s, [k]: false }))
}
}))
setSelected(new Set())
setBulkDraft({})
} }
async function deleteRow(row) { async function deleteRow(row) {
@ -382,24 +230,6 @@ export default function Mappings({ source }) {
</div> </div>
)} )}
{selectedRule && (
<div className="relative">
<input
className={`text-xs font-mono border rounded px-2 py-1.5 w-44 focus:outline-none focus:border-blue-400 ${
rowFilterError ? 'border-red-400 bg-red-50' : rowFilter ? 'border-blue-300' : 'border-gray-200'
}`}
placeholder="filter regex…"
value={rowFilter}
onChange={e => setRowFilter(e.target.value)}
/>
{rowFilter && !rowFilterError && (
<span className="absolute right-2 top-1/2 -translate-y-1/2 text-xs text-gray-400">
{filteredRows.length}
</span>
)}
</div>
)}
{dirtyCount > 0 && ( {dirtyCount > 0 && (
<button <button
onClick={saveAllPending} onClick={saveAllPending}
@ -411,26 +241,13 @@ export default function Mappings({ source }) {
<div className="ml-auto flex items-center gap-2"> <div className="ml-auto flex items-center gap-2">
{selectedRule && ( {selectedRule && (
<button <a
onClick={async () => { href={api.exportMappingsUrl(source, selectedRule)}
try { download
const url = api.exportMappingsUrl(source, selectedRule)
const res = await fetch(url, { headers: authHeaders() })
if (!res.ok) throw new Error('Export failed')
const blob = await res.blob()
const a = document.createElement('a')
a.href = URL.createObjectURL(blob)
a.download = `mappings_${source}.tsv`
a.click()
URL.revokeObjectURL(a.href)
} catch (err) {
alert(err.message)
}
}}
className="text-sm px-3 py-1.5 border border-gray-200 rounded hover:bg-gray-50 text-gray-600" className="text-sm px-3 py-1.5 border border-gray-200 rounded hover:bg-gray-50 text-gray-600"
> >
Export TSV Export TSV
</button> </a>
)} )}
<label className={`text-sm px-3 py-1.5 border border-gray-200 rounded cursor-pointer hover:bg-gray-50 text-gray-600 ${importing ? 'opacity-50 pointer-events-none' : ''}`}> <label className={`text-sm px-3 py-1.5 border border-gray-200 rounded cursor-pointer hover:bg-gray-50 text-gray-600 ${importing ? 'opacity-50 pointer-events-none' : ''}`}>
{importing ? 'Importing…' : 'Import TSV'} {importing ? 'Importing…' : 'Import TSV'}
@ -452,49 +269,16 @@ export default function Mappings({ source }) {
)} )}
{selectedRule && !loading && allValues.length > 0 && ( {selectedRule && !loading && allValues.length > 0 && (
<div className="overflow-x-auto"> <div className="overflow-x-auto">
{/* Bulk assign bar */} {cols.map(col => (
{selected.size > 0 && ( <datalist key={col} id={`dl-${col}`}>
<div className="flex items-center gap-2 mb-2 p-2 bg-blue-50 border border-blue-200 rounded flex-wrap"> {[...(valuesByCol[col] || [])].sort().map(v => (
<span className="text-xs text-blue-700 font-medium whitespace-nowrap">{selected.size} selected</span> <option key={v} value={v} />
{cols.map(col => (
<AutocompleteInput
key={col}
className="border border-blue-300 rounded px-2 py-1 text-xs min-w-24 focus:outline-none focus:border-blue-500 bg-white"
placeholder={col}
value={bulkDraft[col] || ''}
onChange={v => setBulkDraft(d => ({ ...d, [col]: v }))}
suggestions={[...(valuesByCol[col] || [])].sort()}
/>
))} ))}
<button </datalist>
onClick={applyBulk} ))}
disabled={Object.values(bulkDraft).every(v => !v.trim())}
className="text-xs bg-blue-600 text-white px-3 py-1 rounded hover:bg-blue-700 disabled:opacity-40 whitespace-nowrap"
>
Apply to {selected.size}
</button>
<button
onClick={() => { setSelected(new Set()); setBulkDraft({}) }}
className="text-xs text-blue-400 hover:text-blue-600"
>
cancel
</button>
</div>
)}
<table className="w-full text-xs bg-white border border-gray-200 rounded"> <table className="w-full text-xs bg-white border border-gray-200 rounded">
<thead> <thead>
<tr className="text-left text-gray-400 border-b border-gray-100 bg-gray-50"> <tr className="text-left text-gray-400 border-b border-gray-100 bg-gray-50">
<th className="px-2 py-2 w-6">
<input
type="checkbox"
className="cursor-pointer"
checked={displayRows.length > 0 && displayRows.every(r => selected.has(valueKey(r.extracted_value)))}
onChange={e => {
if (e.target.checked) setSelected(new Set(displayRows.map(r => valueKey(r.extracted_value))))
else setSelected(new Set())
}}
/>
</th>
<SortHeader col="input_value" label="input_value" sortBy={sortBy} onSort={toggleSort} /> <SortHeader col="input_value" label="input_value" sortBy={sortBy} onSort={toggleSort} />
<SortHeader col="count" label="count" sortBy={sortBy} onSort={toggleSort} className="text-right" /> <SortHeader col="count" label="count" sortBy={sortBy} onSort={toggleSort} className="text-right" />
{existingCols.map(col => ( {existingCols.map(col => (
@ -513,7 +297,7 @@ export default function Mappings({ source }) {
<th className="px-2 py-2"> <th className="px-2 py-2">
<button <button
onClick={() => setExtraCols(ec => [...ec, ''])} onClick={() => setExtraCols(ec => [...ec, ''])}
className="text-gray-400 hover:text-gray-700 font-medium" className="text-gray-300 hover:text-gray-500"
title="Add column" title="Add column"
>+</button> >+</button>
</th> </th>
@ -524,29 +308,9 @@ export default function Mappings({ source }) {
<tbody> <tbody>
{displayRows.map(row => { {displayRows.map(row => {
const k = valueKey(row.extracted_value) const k = valueKey(row.extracted_value)
const rowIdx = displayRows.indexOf(row)
const isSaving = saving[k] const isSaving = saving[k]
const isSelected = selected.has(k)
const hasDraft = !!(drafts[k] && Object.keys(drafts[k]).length > 0) const hasDraft = !!(drafts[k] && Object.keys(drafts[k]).length > 0)
const rowBg = isSelected ? 'bg-blue-50' : hasDraft ? 'bg-blue-50' : row.is_mapped ? '' : 'bg-yellow-50' const rowBg = hasDraft ? 'bg-blue-50' : row.is_mapped ? '' : 'bg-yellow-50'
function handleRowClick(e) {
if (e.target.closest('input,button,a,select')) return
setSelected(s => { const n = new Set(s); n.has(k) ? n.delete(k) : n.add(k); return n })
setCursorKey(k)
}
function handleRowKeyDown(e) {
if (!e.shiftKey || (e.key !== 'ArrowDown' && e.key !== 'ArrowUp')) return
e.preventDefault()
const delta = e.key === 'ArrowDown' ? 1 : -1
const curIdx = cursorKey ? displayRows.findIndex(r => valueKey(r.extracted_value) === cursorKey) : rowIdx
const nextIdx = Math.max(0, Math.min(displayRows.length - 1, curIdx + delta))
const nextKey = valueKey(displayRows[nextIdx].extracted_value)
setSelected(s => new Set([...s, nextKey]))
setCursorKey(nextKey)
rowRefs.current[nextKey]?.focus()
}
const samples = row.sample const samples = row.sample
? (Array.isArray(row.sample) ? row.sample : [row.sample]) ? (Array.isArray(row.sample) ? row.sample : [row.sample])
: [] : []
@ -559,37 +323,19 @@ export default function Mappings({ source }) {
return ( return (
<> <>
<tr <tr key={k} className={`border-t border-gray-50 hover:bg-gray-50 ${rowBg}`}>
key={k}
ref={el => rowRefs.current[k] = el}
tabIndex={0}
className={`border-t border-gray-50 hover:bg-gray-50 cursor-pointer outline-none ${rowBg}`}
onClick={handleRowClick}
onKeyDown={handleRowKeyDown}
>
<td className="px-2 py-1.5">
<input
type="checkbox"
className="cursor-pointer"
checked={isSelected}
onChange={() => {
setSelected(s => { const n = new Set(s); n.has(k) ? n.delete(k) : n.add(k); return n })
setCursorKey(k)
}}
/>
</td>
<td className="px-3 py-1.5 font-mono text-gray-800 whitespace-nowrap">{displayValue(row.extracted_value)}</td> <td className="px-3 py-1.5 font-mono text-gray-800 whitespace-nowrap">{displayValue(row.extracted_value)}</td>
<td className="px-3 py-1.5 text-right text-gray-400">{row.record_count}</td> <td className="px-3 py-1.5 text-right text-gray-400">{row.record_count}</td>
{cols.map(col => ( {cols.map(col => (
<td key={col} className="px-3 py-1.5"> <td key={col} className="px-3 py-1.5">
<AutocompleteInput <input
list={`dl-${col}`}
className={`border rounded px-2 py-1 w-full min-w-24 focus:outline-none focus:border-blue-400 ${ className={`border rounded px-2 py-1 w-full min-w-24 focus:outline-none focus:border-blue-400 ${
hasDraft ? 'border-blue-300' : row.is_mapped ? 'border-gray-200' : 'border-yellow-300' hasDraft ? 'border-blue-300' : row.is_mapped ? 'border-gray-200' : 'border-yellow-300'
}`} }`}
value={cellVal(col)} value={cellVal(col)}
onChange={v => setCellValue(row.extracted_value, col, v)} onChange={e => setCellValue(row.extracted_value, col, e.target.value)}
onEnter={() => saveRow(row)} onKeyDown={e => e.key === 'Enter' && saveRow(row)}
suggestions={[...(valuesByCol[col] || [])].sort()}
/> />
</td> </td>
))} ))}
@ -627,7 +373,7 @@ export default function Mappings({ source }) {
const sampleCols = [...new Set(samples.flatMap(r => Object.keys(r)))] const sampleCols = [...new Set(samples.flatMap(r => Object.keys(r)))]
return ( return (
<tr key={`${k}-sample`} className="border-t border-gray-50 bg-gray-50"> <tr key={`${k}-sample`} className="border-t border-gray-50 bg-gray-50">
<td colSpan={3 + cols.length + 4} className="px-3 py-2"> <td colSpan={2 + cols.length + 4} className="px-3 py-2">
<table className="w-full text-xs border border-gray-100 rounded bg-white"> <table className="w-full text-xs border border-gray-100 rounded bg-white">
<thead> <thead>
<tr className="bg-gray-50 border-b border-gray-100"> <tr className="bg-gray-50 border-b border-gray-100">


@ -1,503 +0,0 @@
import { useEffect, useRef, useState, useCallback } from 'react'
import { api } from '../api'
async function fetchAllRows(source) {
const res = await api.getViewData(source, 100000, 0)
return res.rows || []
}
let perspectivePromise = null
function loadPerspective() {
if (perspectivePromise) return perspectivePromise
perspectivePromise = (async () => {
if (!document.getElementById('psp-theme')) {
const link = document.createElement('link')
link.id = 'psp-theme'
link.rel = 'stylesheet'
link.crossOrigin = 'anonymous'
link.href = 'https://cdn.jsdelivr.net/npm/@perspective-dev/viewer/dist/css/themes.css'
document.head.appendChild(link)
}
const [{ default: perspective }] = await Promise.all([
import(/* @vite-ignore */ 'https://cdn.jsdelivr.net/npm/@perspective-dev/client@4.4.0/dist/cdn/perspective.js'),
import(/* @vite-ignore */ 'https://cdn.jsdelivr.net/npm/@perspective-dev/viewer@4.4.0/dist/cdn/perspective-viewer.js'),
import(/* @vite-ignore */ 'https://cdn.jsdelivr.net/npm/@perspective-dev/viewer-datagrid@4.4.0/dist/cdn/perspective-viewer-datagrid.js'),
import(/* @vite-ignore */ 'https://cdn.jsdelivr.net/npm/@perspective-dev/viewer-d3fc@4.4.0/dist/cdn/perspective-viewer-d3fc.js'),
])
return perspective
})()
return perspectivePromise
}
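The `loadPerspective` helper above keeps its in-flight promise in a module-level variable, so the CDN modules are requested only once no matter how many times the component mounts. A minimal sketch of that pattern in isolation follows. `memoizeAsync` is a hypothetical helper, not part of this codebase, and unlike the original it clears the cached promise on failure so a later call can retry (an assumption about desired behavior, not something the diff does):

```javascript
// Hypothetical helper: wraps an async loader so repeated and concurrent
// callers share a single promise. Not part of the code shown in this diff.
function memoizeAsync(loader) {
  let promise = null
  return function load() {
    if (promise) return promise
    promise = Promise.resolve()
      .then(loader)
      .catch(err => {
        // Assumption: a failed load should not be cached forever.
        promise = null
        throw err
      })
    return promise
  }
}

// Usage: both calls receive the same promise object, so the loader body
// is scheduled only once.
let calls = 0
const loadOnce = memoizeAsync(async () => { calls += 1; return 'ready' })
const first = loadOnce()
const second = loadOnce()
```

As written in the diff, a rejected import stays cached in `perspectivePromise`, so a transient CDN failure would persist until the page is reloaded.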
function formatVal(v, decimals = 2) {
if (v == null) return null
if (typeof v === 'number') {
if (v > 1e11 && v < 2e12) {
const d = new Date(v)
if (!isNaN(d)) return d.toISOString().slice(0, 10)
}
return v.toLocaleString(undefined, { minimumFractionDigits: decimals, maximumFractionDigits: decimals })
}
return String(v)
}
function normalize(v) {
if (v == null) return null
if (typeof v === 'number' && v > 1e11 && v < 2e12) return new Date(v).toISOString().slice(0, 10)
return String(v).trim()
}
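Both `formatVal` and `normalize` treat any number strictly between `1e11` and `2e12` as an epoch timestamp in milliseconds and render it as an ISO date. The sketch below isolates that heuristic under a hypothetical name, `toIsoDateIfEpochMs`, to show what falls inside the window and one caveat of the approach:

```javascript
// Bounds copied from formatVal/normalize: numbers strictly between these
// values are interpreted as epoch milliseconds.
const MIN_MS = 1e11 // 1973-03-03 UTC
const MAX_MS = 2e12 // 2033-05-18 UTC

// Hypothetical helper name; returns null when the value is not in range.
function toIsoDateIfEpochMs(v) {
  if (typeof v !== 'number' || !(v > MIN_MS && v < MAX_MS)) return null
  return new Date(v).toISOString().slice(0, 10)
}

const asDate = toIsoDateIfEpochMs(1700000000000) // '2023-11-14'
const smallNumber = toIsoDateIfEpochMs(1234.5)   // null, below the window
// Caveat: an ordinary numeric value that happens to fall inside the window
// is also converted to a date.
const ambiguous = toIsoDateIfEpochMs(150000000000)
```

The heuristic avoids needing column type metadata, at the cost of misreading very large non-date numbers.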
function filterRowsByConfig(allRows, filters) {
if (!filters || filters.length === 0) return allRows
const knownFields = allRows.length > 0 ? new Set(Object.keys(allRows[0])) : new Set()
const applicable = filters.filter(([field]) => knownFields.has(field))
if (applicable.length === 0) return allRows
return allRows.filter(row =>
applicable.every(([field, op, value]) => {
const rawVal = row[field]
if (rawVal == null) return op === '!=' || op === 'not contains'
const a = normalize(rawVal)
const b = value != null ? String(value).trim() : ''
const aNum = parseFloat(a), bNum = parseFloat(b)
const numeric = !isNaN(aNum) && !isNaN(bNum)
switch (op) {
case '==': return a === b
case '!=': return a !== b
case '>': return numeric ? aNum > bNum : a > b
case '>=': return numeric ? aNum >= bNum : a >= b
case '<': return numeric ? aNum < bNum : a < b
case '<=': return numeric ? aNum <= bNum : a <= b
case 'contains': return a.toLowerCase().includes(b.toLowerCase())
case 'not contains': return !a.toLowerCase().includes(b.toLowerCase())
default: return true
}
})
)
}
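`filterRowsByConfig` expects filters as `[field, operator, value]` triples, compares numerically when both sides parse as numbers, and treats a null cell as matching only the negative operators. The following sketch restates the per-filter comparison in simplified form (it omits the date normalization and some operators) so the behavior can be tried on sample data. The `matches` name and the sample rows are made up for illustration:

```javascript
// Simplified restatement of the per-filter comparison, for illustration only.
function matches(row, [field, op, value]) {
  const raw = row[field]
  // A null cell satisfies only the negative operators.
  if (raw == null) return op === '!=' || op === 'not contains'
  const a = String(raw).trim()
  const b = value != null ? String(value).trim() : ''
  const aNum = parseFloat(a), bNum = parseFloat(b)
  const numeric = !isNaN(aNum) && !isNaN(bNum)
  switch (op) {
    case '==': return a === b
    case '!=': return a !== b
    case '>': return numeric ? aNum > bNum : a > b
    case '<': return numeric ? aNum < bNum : a < b
    case 'contains': return a.toLowerCase().includes(b.toLowerCase())
    default: return true
  }
}

// Hypothetical sample rows; the field names are not taken from the project.
const rows = [
  { category: 'Groceries', amount: 42.5 },
  { category: 'Rent', amount: 1200 },
  { category: null, amount: 9 },
]
const filters = [['category', 'contains', 'gro'], ['amount', '>', '10']]
const kept = rows.filter(row => filters.every(f => matches(row, f)))
```

Because filter values arrive as strings, the numeric branch matters: without it `'9' < '10'` would be evaluated as a string comparison and return false.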
const LAYOUT_KEY = (source) => `psp_layout_${source}`
const DEFAULT_PLUGIN_CONFIG = { edit_mode: 'SELECT_REGION' }
export default function Pivot({ source }) {
const viewerRef = useRef()
const workerRef = useRef()
const tableRef = useRef()
const allRowsRef = useRef([])
const expandDepthRef = useRef(null)
const [status, setStatus] = useState('idle')
const [error, setError] = useState('')
const [inspectedRows, setInspectedRows] = useState(null)
const [clickDetail, setClickDetail] = useState(null)
const [decimals, setDecimals] = useState(2)
// Named layouts
const [layouts, setLayouts] = useState([])
const [activeLayoutId, setActiveLayoutId] = useState(null)
const [saveAsName, setSaveAsName] = useState('')
const [showSaveAs, setShowSaveAs] = useState(false)
const [layoutMsg, setLayoutMsg] = useState('')
const flashMsg = (msg) => {
setLayoutMsg(msg)
setTimeout(() => setLayoutMsg(''), 2000)
}
const loadLayouts = useCallback(async () => {
if (!source) return
try {
const rows = await api.getPivotLayouts(source)
setLayouts(rows)
} catch {}
}, [source])
useEffect(() => {
if (!source) return
let cancelled = false
setInspectedRows(null)
setClickDetail(null)
setActiveLayoutId(null)
setShowSaveAs(false)
allRowsRef.current = []
loadLayouts()
async function init() {
setStatus('loading')
setError('')
try {
const [perspective, rows] = await Promise.all([
loadPerspective(),
fetchAllRows(source),
])
if (cancelled) return
if (!rows.length) { setStatus('noview'); return }
allRowsRef.current = rows
if (workerRef.current) { try { workerRef.current.terminate() } catch {} }
const worker = await perspective.worker()
if (cancelled) { worker.terminate(); return }
workerRef.current = worker
const table = await worker.table(rows, { name: source })
if (cancelled) return
tableRef.current = table
const viewer = viewerRef.current
viewer.addEventListener('perspective-click', async (e) => {
const detail = e.detail || {}
const { row, column_names } = detail
if (!row) return
const eventFilters = (detail.config || {}).filter || []
const config = await viewer.save()
          // Without a group_by hierarchy there are no coordinate filters, so the
          // query would return the entire dataset; skip the inspector in that case
const hasHierarchy = (config.group_by || []).length > 0
if (!hasHierarchy) return
setClickDetail({ row, config, column_names, eventFilters })
// Use a Perspective view with the event filters + expressions so computed
// columns (split_by) are evaluated and filtered correctly
try {
const view = await tableRef.current.view({
filter: eventFilters,
expressions: config.expressions || [],
})
const data = await view.to_json()
await view.delete()
            // Strip expression columns; only show raw source columns
const exprNames = new Set(Object.keys(config.expressions || {}))
const cleaned = data.map(r =>
Object.fromEntries(Object.entries(r).filter(([k]) => !exprNames.has(k)))
)
setInspectedRows(cleaned)
} catch {
setInspectedRows(filterRowsByConfig(allRowsRef.current, eventFilters))
}
})
await viewer.load(worker)
const plugin = await viewer.getPlugin()
const savedLayout = localStorage.getItem(LAYOUT_KEY(source))
if (savedLayout) {
const parsed = JSON.parse(savedLayout)
await viewer.restore(parsed)
await plugin.restore(parsed.plugin_config || DEFAULT_PLUGIN_CONFIG)
if (parsed.expand_depth != null) await applyExpandDepth(viewer, parsed.expand_depth)
} else {
await viewer.restore({ table: source, settings: false, plugin_config: DEFAULT_PLUGIN_CONFIG })
await plugin.restore(DEFAULT_PLUGIN_CONFIG)
}
await viewer.flush()
setStatus('ready')
} catch (err) {
if (!cancelled) { setStatus('error'); setError(err.message) }
}
}
init()
return () => { cancelled = true }
}, [source])
async function applyExpandDepth(viewer, depth) {
if (depth == null) return
const view = await viewer.getView()
await view.set_depth(depth)
const plugin = await viewer.getPlugin()
await plugin.draw(view)
expandDepthRef.current = depth
}
async function applyLayout(layout) {
const viewer = viewerRef.current
if (!viewer) return
await viewer.restore(layout.config)
if (layout.config.plugin_config) {
const plugin = await viewer.getPlugin()
await plugin.restore(layout.config.plugin_config)
}
await applyExpandDepth(viewer, layout.config.expand_depth ?? null)
setActiveLayoutId(layout.id)
// also persist to localStorage so it survives refresh
localStorage.setItem(LAYOUT_KEY(source), JSON.stringify(layout.config))
}
async function captureConfig() {
const viewer = viewerRef.current
if (!viewer) return null
const plugin = await viewer.getPlugin()
const [viewerConfig, pluginConfig] = await Promise.all([viewer.save(), plugin.save()])
return { ...viewerConfig, plugin_config: pluginConfig, expand_depth: expandDepthRef.current }
}
async function handleSaveOver() {
const layout = layouts.find(l => l.id === activeLayoutId)
if (!layout) return
const config = await captureConfig()
if (!config) return
try {
const saved = await api.savePivotLayout(source, layout.layout_name, config)
localStorage.setItem(LAYOUT_KEY(source), JSON.stringify(config))
await loadLayouts()
setActiveLayoutId(saved.id)
flashMsg('Saved!')
} catch (err) {
flashMsg(err.message)
}
}
async function handleSaveAs() {
const name = saveAsName.trim()
if (!name) return
const config = await captureConfig()
if (!config) return
try {
const saved = await api.savePivotLayout(source, name, config)
localStorage.setItem(LAYOUT_KEY(source), JSON.stringify(config))
await loadLayouts()
setActiveLayoutId(saved.id)
setShowSaveAs(false)
setSaveAsName('')
flashMsg('Saved!')
} catch (err) {
flashMsg(err.message)
}
}
async function handleDelete(layout, e) {
e.stopPropagation()
try {
await api.deletePivotLayout(source, layout.id)
if (activeLayoutId === layout.id) setActiveLayoutId(null)
await loadLayouts()
flashMsg('Deleted')
} catch (err) {
flashMsg(err.message)
}
}
function handleResetToDefault() {
const viewer = viewerRef.current
if (!viewer) return
localStorage.removeItem(LAYOUT_KEY(source))
setActiveLayoutId(null)
viewer.restore({ table: source, settings: true, plugin_config: DEFAULT_PLUGIN_CONFIG })
}
if (!source) return <div className="p-6 text-sm text-gray-400">Select a source first.</div>
const cols = inspectedRows?.length ? Object.keys(inspectedRows[0]) : []
const groupBy = clickDetail?.config?.group_by || []
const splitBy = clickDetail?.config?.split_by || []
const coordFields = new Set([...groupBy, ...splitBy])
const coordMap = Object.fromEntries(
(clickDetail?.eventFilters || [])
.filter(([f, op]) => coordFields.has(f) && op === '==')
.map(([f, , v]) => [f, v])
)
const cellCoords = [...groupBy, ...splitBy].map(f => coordMap[f]).filter(Boolean)
const splitVals = splitBy.map(f => coordMap[f]).filter(Boolean)
const metrics = clickDetail?.column_names || []
const cellKey = splitVals.length > 0 && metrics.length > 0
? [...splitVals, ...metrics].join('|')
: null
return (
<div className="w-full h-full flex flex-col">
{/* Layout toolbar */}
<div className="flex items-center gap-2 px-3 py-1.5 bg-white border-b border-gray-200 flex-shrink-0">
<span className="text-xs text-gray-400 uppercase tracking-wide mr-1">Layouts</span>
{layouts.map(l => (
<div key={l.id}
onClick={() => applyLayout(l)}
className={`flex items-center gap-1 text-xs rounded px-2 py-0.5 cursor-pointer border transition-colors
${activeLayoutId === l.id
? 'bg-blue-50 border-blue-300 text-blue-700'
: 'bg-white border-gray-200 text-gray-600 hover:border-gray-400'}`}>
{l.layout_name}
<button
onClick={(e) => handleDelete(l, e)}
className="text-gray-300 hover:text-red-400 leading-none ml-0.5 text-sm">×</button>
</div>
))}
{activeLayoutId !== null && !showSaveAs && (
<button onClick={handleSaveOver}
className="text-xs text-blue-500 hover:text-blue-700 border border-blue-200 rounded px-2 py-0.5">
Save
</button>
)}
{showSaveAs ? (
<div className="flex items-center gap-1">
<input
autoFocus
value={saveAsName}
onChange={e => setSaveAsName(e.target.value)}
onKeyDown={e => { if (e.key === 'Enter') handleSaveAs(); if (e.key === 'Escape') { setShowSaveAs(false); setSaveAsName('') } }}
placeholder="Layout name…"
className="text-xs border border-gray-300 rounded px-2 py-0.5 w-36 focus:outline-none focus:border-blue-400"
/>
<button onClick={handleSaveAs} className="text-xs text-blue-600 hover:text-blue-800 px-1">Save</button>
<button onClick={() => { setShowSaveAs(false); setSaveAsName('') }} className="text-xs text-gray-400 hover:text-gray-600 px-1">Cancel</button>
</div>
) : (
<button
onClick={() => setShowSaveAs(true)}
className="text-xs text-gray-400 hover:text-gray-600 border border-dashed border-gray-200 rounded px-2 py-0.5">
+ Save as
</button>
)}
{activeLayoutId !== null && (
<button onClick={handleResetToDefault}
className="text-xs text-gray-300 hover:text-gray-500 ml-1">
reset
</button>
)}
{layoutMsg && <span className="text-xs text-green-600 ml-1">{layoutMsg}</span>}
<div className="ml-auto flex items-center gap-1">
<span className="text-xs text-gray-400">depth:</span>
{[0, 1, 2, 3].map(d => (
<button key={d} onClick={async () => {
const v = viewerRef.current; if (!v) return
const view = await v.getView()
await view.set_depth(d)
const p = await v.getPlugin()
await p.draw(view)
expandDepthRef.current = d
}} className="text-xs border border-gray-200 rounded px-1.5 py-0.5 text-gray-500 hover:border-gray-400">
{d}
</button>
))}
</div>
</div>
{/* Pivot + inspector */}
<div className="relative flex-1 flex min-h-0">
<div className="relative flex-1">
{status === 'loading' && (
<div className="absolute inset-0 flex items-center justify-center z-10 bg-gray-50">
<p className="text-sm text-gray-400">Loading</p>
</div>
)}
{status === 'error' && (
<div className="absolute inset-0 flex items-center justify-center z-10 bg-gray-50">
<p className="text-sm text-red-500">Error: {error}</p>
</div>
)}
{status === 'noview' && (
<div className="absolute inset-0 flex items-center justify-center z-10 bg-gray-50">
<p className="text-sm text-gray-400">No view data. Generate a view and transform records first.</p>
</div>
)}
<perspective-viewer
ref={viewerRef}
style={{ position: 'absolute', top: 0, left: 0, right: 0, bottom: 0 }}
/>
</div>
{inspectedRows && clickDetail && (
<div className="w-96 border-l border-gray-200 bg-white flex flex-col overflow-hidden flex-shrink-0">
<div className="flex items-center justify-between px-3 py-2 border-b border-gray-100">
<span className="text-xs font-semibold text-gray-600 uppercase tracking-wide">
{inspectedRows.length} row{inspectedRows.length !== 1 ? 's' : ''}
</span>
<div className="flex items-center gap-2">
<div className="flex items-center gap-0.5">
<button onClick={() => setDecimals(d => Math.max(0, d - 1))}
className="text-xs text-gray-400 hover:text-gray-600 w-4 text-center">−</button>
<span className="text-xs text-gray-400 w-4 text-center">{decimals}</span>
<button onClick={() => setDecimals(d => Math.min(8, d + 1))}
className="text-xs text-gray-400 hover:text-gray-600 w-4 text-center">+</button>
</div>
<button onClick={() => { setInspectedRows(null); setClickDetail(null) }}
className="text-gray-300 hover:text-gray-500 leading-none text-lg">×</button>
</div>
</div>
<div className="flex-1 overflow-y-auto">
{/* Cell coordinates */}
<div className="px-3 py-2 border-b border-gray-100">
<div className="text-xs text-gray-400 uppercase tracking-wide mb-1">
{[...groupBy, ...splitBy].join(' ') || clickDetail.column_names?.join(', ') || 'Cell'}
</div>
{cellCoords.length > 0 && (
<div className="text-xs text-gray-700 font-mono font-semibold">
{cellCoords.join(' ')}
</div>
)}
{Object.entries(clickDetail.row)
.filter(([k, v]) => k !== '__ROW_PATH__' && v != null)
.map(([k, v]) => {
const isSelected = cellKey != null && k === cellKey
return (
<div key={k} className={`flex justify-between py-0.5 gap-2 ${isSelected ? 'font-semibold' : ''}`}>
<span className={`text-xs font-mono shrink-0 ${isSelected ? 'text-gray-700' : 'text-gray-400'}`}>{k}</span>
<span className={`text-xs font-mono text-right ${isSelected ? 'text-blue-600' : 'text-gray-700'}`}>{formatVal(v, decimals)}</span>
</div>
)
})}
</div>
{/* User-set filters */}
{(() => {
const userFilters = (clickDetail.eventFilters || []).filter(([f]) => !coordFields.has(f))
return userFilters.length > 0 ? (
<div className="px-3 py-2 border-b border-gray-100">
<div className="text-xs text-gray-400 uppercase tracking-wide mb-1">Filters</div>
{userFilters.map((f, i) => (
<div key={i} className="text-xs text-gray-500 py-0.5 font-mono">{f.join(' ')}</div>
))}
</div>
) : null
})()}
{/* Underlying rows */}
{inspectedRows.length > 0 && (
<div className="overflow-auto">
<table className="w-full text-xs">
<thead>
<tr className="text-left text-gray-400 border-b border-gray-100 bg-gray-50 sticky top-0">
{cols.map(c => (
<th key={c} className="px-2 py-1 font-medium whitespace-nowrap">{c}</th>
))}
</tr>
</thead>
<tbody>
{inspectedRows.map((row, i) => (
<tr key={i} className="border-t border-gray-50 hover:bg-gray-50">
{cols.map(c => {
const f = formatVal(row[c], decimals)
return (
<td key={c} className="px-2 py-1 font-mono whitespace-nowrap text-gray-700 max-w-40 truncate">
{f == null ? <span className="text-gray-300"></span> : f}
</td>
)
})}
</tr>
))}
</tbody>
</table>
</div>
)}
</div>
</div>
)}
</div>
</div>
)
}
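
The inspector above highlights the entry of the clicked row whose key equals `cellKey`, which is built by joining the split-by values and the metric names with `|`. A minimal sketch of that logic as pure functions follows. The names `buildCellKey` and `markSelected` are hypothetical and do not appear in the diff, and the sample keys in the usage note are made up for illustration.

```javascript
// Hypothetical helper: mirrors the cellKey expression at the top of the render code.
// Returns null unless both split-by values and metrics are present.
function buildCellKey(splitVals, metrics) {
  return splitVals.length > 0 && metrics.length > 0
    ? [...splitVals, ...metrics].join('|')
    : null
}

// Hypothetical helper: mirrors the filter/map over clickDetail.row.
// Drops the __ROW_PATH__ entry and null values, then flags the selected cell.
function markSelected(row, cellKey) {
  return Object.entries(row)
    .filter(([k, v]) => k !== '__ROW_PATH__' && v != null)
    .map(([k, v]) => ({ key: k, value: v, isSelected: cellKey != null && k === cellKey }))
}
```

With `buildCellKey(['2024'], ['amount'])` the key is `'2024|amount'`, so only the row entry with exactly that name would be rendered in the highlighted style.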

View File

@ -1,4 +1,4 @@
import { useState, useEffect, useRef } from 'react' import { useState, useEffect } from 'react'
import { api } from '../api' import { api } from '../api'
const DATE_RE = /^\d{4}-\d{2}-\d{2}(T[\d:.Z+-]+)?$/ const DATE_RE = /^\d{4}-\d{2}-\d{2}(T[\d:.Z+-]+)?$/
@ -20,79 +20,47 @@ function formatVal(val) {
export default function Records({ source }) { export default function Records({ source }) {
const [rows, setRows] = useState([]) const [rows, setRows] = useState([])
const [cols, setCols] = useState([])
const [exists, setExists] = useState(null) const [exists, setExists] = useState(null)
const [offset, setOffset] = useState(0) const [offset, setOffset] = useState(0)
const [loading, setLoading] = useState(false) const [loading, setLoading] = useState(false)
const [viewError, setViewError] = useState(null)
const [sort, setSort] = useState({ col: null, dir: 'asc' }) const [sort, setSort] = useState({ col: null, dir: 'asc' })
const [filters, setFilters] = useState([])
const debounceRef = useRef(null)
const LIMIT = 100 const LIMIT = 100
useEffect(() => { useEffect(() => {
if (!source) return if (!source) return
setOffset(0) setOffset(0)
setSort({ col: null, dir: 'asc' }) setSort({ col: null, dir: 'asc' })
setFilters([]) load(0, null, 'asc')
setViewError(null)
load(0, null, 'asc', [])
}, [source]) }, [source])
async function load(off, col, dir, filt) { async function load(off, col, dir) {
setLoading(true) setLoading(true)
try { try {
const active = (filt || []).filter(f => f.col && f.pattern) const res = await api.getViewData(source, LIMIT, off, col, dir)
const res = await api.getViewData(source, LIMIT, off, col, dir, active)
setExists(res.exists) setExists(res.exists)
setRows(res.rows) setRows(res.rows)
if (res.rows.length > 0 && cols.length === 0) setCols(Object.keys(res.rows[0]))
else if (res.rows.length > 0) setCols(Object.keys(res.rows[0]))
} catch (err) { } catch (err) {
setViewError(err.message) console.error(err)
} finally { } finally {
setLoading(false) setLoading(false)
} }
} }
function triggerLoad(off, col, dir, filt) {
clearTimeout(debounceRef.current)
debounceRef.current = setTimeout(() => load(off, col, dir, filt), 350)
}
function toggleSort(col) { function toggleSort(col) {
const next = sort.col === col const next = sort.col === col
? { col, dir: sort.dir === 'asc' ? 'desc' : 'asc' } ? { col, dir: sort.dir === 'asc' ? 'desc' : 'asc' }
: { col, dir: 'asc' } : { col, dir: 'asc' }
setSort(next) setSort(next)
setOffset(0) setOffset(0)
load(0, next.col, next.dir, filters) load(0, next.col, next.dir)
} }
function addFilter() { function prev() { const o = Math.max(0, offset - LIMIT); setOffset(o); load(o, sort.col, sort.dir) }
setFilters(f => [...f, { col: cols[0] || '', pattern: '' }]) function next() { const o = offset + LIMIT; setOffset(o); load(o, sort.col, sort.dir) }
}
function removeFilter(i) {
const next = filters.filter((_, idx) => idx !== i)
setFilters(next)
setOffset(0)
load(0, sort.col, sort.dir, next)
}
function updateFilter(i, key, val) {
const next = filters.map((f, idx) => idx === i ? { ...f, [key]: val } : f)
setFilters(next)
setOffset(0)
triggerLoad(0, sort.col, sort.dir, next)
}
function prev() { const o = Math.max(0, offset - LIMIT); setOffset(o); load(o, sort.col, sort.dir, filters) }
function next() { const o = offset + LIMIT; setOffset(o); load(o, sort.col, sort.dir, filters) }
if (!source) return <div className="p-6 text-sm text-gray-400">Select a source first.</div> if (!source) return <div className="p-6 text-sm text-gray-400">Select a source first.</div>
const displayCols = rows.length > 0 ? Object.keys(rows[0]) : cols const cols = rows.length > 0 ? Object.keys(rows[0]) : []
return ( return (
<div className="p-6"> <div className="p-6">
@ -103,54 +71,8 @@ export default function Records({ source }) {
)} )}
</div> </div>
{/* Filter bar */}
{exists !== false && displayCols.length > 0 && (
<div className="mb-4 flex flex-wrap gap-2 items-center">
{filters.map((f, i) => (
<div key={i} className="flex items-center gap-1 bg-white border border-gray-200 rounded px-2 py-1">
<select
className="text-xs text-gray-600 border-0 focus:outline-none bg-transparent"
value={f.col}
onChange={e => updateFilter(i, 'col', e.target.value)}
>
{displayCols.map(c => <option key={c} value={c}>{c}</option>)}
</select>
<span className="text-xs text-gray-300 mx-0.5">~*</span>
<input
className="text-xs font-mono border-0 focus:outline-none w-36 bg-transparent"
placeholder="regex…"
value={f.pattern}
onChange={e => updateFilter(i, 'pattern', e.target.value)}
/>
<button
onClick={() => removeFilter(i)}
className="text-gray-300 hover:text-gray-500 ml-1 leading-none"
>×</button>
</div>
))}
<button
onClick={addFilter}
className="text-xs text-gray-400 hover:text-gray-600 border border-dashed border-gray-200 rounded px-2 py-1"
>
+ filter
</button>
{filters.length > 0 && (
<button
onClick={() => { setFilters([]); setOffset(0); load(0, sort.col, sort.dir, []) }}
className="text-xs text-gray-400 hover:text-red-500"
>
clear
</button>
)}
</div>
)}
{loading && <p className="text-sm text-gray-400">Loading</p>} {loading && <p className="text-sm text-gray-400">Loading</p>}
{!loading && viewError && (
<p className="text-sm text-red-500">View error: {viewError}. Check field types in Sources.</p>
)}
{!loading && exists === false && ( {!loading && exists === false && (
<p className="text-sm text-gray-400"> <p className="text-sm text-gray-400">
No view generated yet. Go to <span className="font-medium text-gray-600">Sources</span>, check fields as <span className="font-medium text-gray-600">In view</span>, then click <span className="font-medium text-gray-600">Generate view</span>. No view generated yet. Go to <span className="font-medium text-gray-600">Sources</span>, check fields as <span className="font-medium text-gray-600">In view</span>, then click <span className="font-medium text-gray-600">Generate view</span>.
@ -158,9 +80,7 @@ export default function Records({ source }) {
)} )}
{!loading && exists && rows.length === 0 && ( {!loading && exists && rows.length === 0 && (
<p className="text-sm text-gray-400"> <p className="text-sm text-gray-400">View exists but no transformed records yet. Import data and run a transform first.</p>
{filters.some(f => f.col && f.pattern) ? 'No records match the current filters.' : 'View exists but no transformed records yet. Import data and run a transform first.'}
</p>
)} )}
{!loading && exists && rows.length > 0 && ( {!loading && exists && rows.length > 0 && (
@ -169,7 +89,7 @@ export default function Records({ source }) {
<table className="w-full text-sm"> <table className="w-full text-sm">
<thead> <thead>
<tr className="text-left text-xs text-gray-400 border-b border-gray-100 bg-gray-50"> <tr className="text-left text-xs text-gray-400 border-b border-gray-100 bg-gray-50">
{displayCols.map(col => { {cols.map(col => {
const active = sort.col === col const active = sort.col === col
return ( return (
<th <th
@ -189,7 +109,7 @@ export default function Records({ source }) {
<tbody> <tbody>
{rows.map((row, i) => ( {rows.map((row, i) => (
<tr key={i} className="border-t border-gray-50 hover:bg-gray-50"> <tr key={i} className="border-t border-gray-50 hover:bg-gray-50">
{displayCols.map((col, j) => { {cols.map((col, j) => {
const formatted = formatVal(row[col]) const formatted = formatVal(row[col])
return ( return (
<td key={j} className="px-3 py-2 text-xs text-gray-600 whitespace-nowrap max-w-48 truncate"> <td key={j} className="px-3 py-2 text-xs text-gray-600 whitespace-nowrap max-w-48 truncate">
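
The state transitions in `Records` (sort toggling, paging, and dropping incomplete filters before a request) are small enough to express as pure functions. The sketch below restates them outside React, assuming the same shapes as in the diff (`{ col, dir }` for sort, `{ col, pattern }` for a filter). The function names are hypothetical.

```javascript
// Hypothetical helper: same rule as toggleSort, returned as a value.
// Clicking the active column flips direction; a new column starts ascending.
function nextSort(sort, col) {
  return sort.col === col
    ? { col, dir: sort.dir === 'asc' ? 'desc' : 'asc' }
    : { col, dir: 'asc' }
}

// Hypothetical helper: same rule as the `active` list in load().
// Only filters with both a column and a non-empty pattern are sent.
function activeFilters(filters) {
  return (filters || []).filter(f => f.col && f.pattern)
}

// Hypothetical helpers: same arithmetic as prev() and next().
function prevOffset(offset, limit) {
  return Math.max(0, offset - limit)
}
function nextOffset(offset, limit) {
  return offset + limit
}
```

Keeping these as plain functions makes the rules easy to check without rendering the component.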

View File

@ -1,214 +0,0 @@
import { useState, useRef } from 'react'
import { api } from '../api'
export default function Remap() {
const [search, setSearch] = useState('')
const [results, setResults] = useState(null)
const [searching, setSearching] = useState(false)
const [selected, setSelected] = useState(null) // { col, val }
const [matches, setMatches] = useState(null) // individual mappings
const [loadingMatches, setLoadingMatches] = useState(false)
const [toVal, setToVal] = useState('')
const [applying, setApplying] = useState(false)
const [msg, setMsg] = useState(null) // { text, ok }
const searchRef = useRef()
async function handleSearch(e) {
e.preventDefault()
const q = search.trim()
if (!q) return
setSearching(true)
setResults(null)
setSelected(null)
setMatches(null)
setMsg(null)
try {
const rows = await api.searchMappingOutputs(q)
setResults(rows)
} catch (err) {
setMsg({ text: err.message, ok: false })
} finally {
setSearching(false)
}
}
async function handleSelect(row) {
setSelected(row)
setToVal(row.val)
setMatches(null)
setMsg(null)
setLoadingMatches(true)
try {
const rows = await api.getMappingsByOutputField(row.col, row.val)
setMatches(rows)
} catch (err) {
setMsg({ text: err.message, ok: false })
} finally {
setLoadingMatches(false)
}
}
async function handleApply() {
if (!selected || !toVal.trim() || toVal === selected.val) return
setApplying(true)
setMsg(null)
try {
const { updated } = await api.remapOutputField(selected.col, selected.val, toVal.trim())
setMsg({ text: `Updated ${updated} mapping${updated !== 1 ? 's' : ''}.`, ok: true })
// Refresh match list to show new values
const rows = await api.getMappingsByOutputField(selected.col, toVal.trim())
setMatches(rows)
setSelected({ ...selected, val: toVal.trim() })
// Re-run search to refresh counts
const refreshed = await api.searchMappingOutputs(search.trim())
setResults(refreshed)
} catch (err) {
setMsg({ text: err.message, ok: false })
} finally {
setApplying(false)
}
}
return (
<div className="p-6 max-w-4xl">
<h1 className="text-base font-semibold text-gray-800 mb-4">Remap Output Values</h1>
{/* Search */}
<form onSubmit={handleSearch} className="flex items-center gap-2 mb-5">
<input
ref={searchRef}
value={search}
onChange={e => setSearch(e.target.value)}
placeholder="Search output values…"
className="text-sm border border-gray-300 rounded px-3 py-1.5 w-72 focus:outline-none focus:border-blue-400"
/>
<button type="submit" disabled={searching}
className="text-sm bg-blue-600 text-white rounded px-3 py-1.5 hover:bg-blue-700 disabled:opacity-50">
{searching ? 'Searching…' : 'Search'}
</button>
</form>
{/* Search results */}
{results !== null && (
<div className="mb-6">
{results.length === 0 ? (
<p className="text-sm text-gray-400">No matching output values found.</p>
) : (
<>
<div className="text-xs text-gray-400 uppercase tracking-wide mb-1">
{results.length} result{results.length !== 1 ? 's' : ''}. Click one to remap
</div>
<table className="w-full text-sm border border-gray-200 rounded overflow-hidden">
<thead>
<tr className="bg-gray-50 text-left text-xs text-gray-400 uppercase tracking-wide">
<th className="px-3 py-2">Field</th>
<th className="px-3 py-2">Value</th>
<th className="px-3 py-2 text-right">Mappings</th>
</tr>
</thead>
<tbody>
{results.map((r, i) => {
const isActive = selected?.col === r.col && selected?.val === r.val
return (
<tr key={i}
onClick={() => handleSelect(r)}
className={`border-t border-gray-100 cursor-pointer transition-colors
${isActive ? 'bg-blue-50' : 'hover:bg-gray-50'}`}>
<td className="px-3 py-2 font-mono text-gray-500">{r.col}</td>
<td className="px-3 py-2 font-mono text-gray-800">{r.val}</td>
<td className="px-3 py-2 text-right text-gray-400">{r.mapping_count}</td>
</tr>
)
})}
</tbody>
</table>
</>
)}
</div>
)}
{/* Remap panel */}
{selected && (
<div className="border border-gray-200 rounded p-4 mb-6 bg-white">
<div className="text-xs text-gray-400 uppercase tracking-wide mb-3">
Remap <span className="font-mono text-gray-600">{selected.col}</span>
</div>
<div className="flex items-center gap-3 mb-4">
<div className="flex-1">
<div className="text-xs text-gray-400 mb-1">From</div>
<div className="text-sm font-mono bg-gray-50 border border-gray-200 rounded px-3 py-1.5 text-gray-700">
{selected.val}
</div>
</div>
<div className="text-gray-300 mt-4">→</div>
<div className="flex-1">
<div className="text-xs text-gray-400 mb-1">To</div>
<input
value={toVal}
onChange={e => setToVal(e.target.value)}
onKeyDown={e => e.key === 'Enter' && handleApply()}
className="w-full text-sm font-mono border border-gray-300 rounded px-3 py-1.5 focus:outline-none focus:border-blue-400"
/>
</div>
<div className="mt-4">
<button
onClick={handleApply}
disabled={applying || !toVal.trim() || toVal.trim() === selected.val}
className="text-sm bg-blue-600 text-white rounded px-3 py-1.5 hover:bg-blue-700 disabled:opacity-40 whitespace-nowrap">
{applying ? 'Applying…' : `Apply to all ${matches?.length ?? '…'}`}
</button>
</div>
</div>
{msg && (
<div className={`text-sm mb-3 ${msg.ok ? 'text-green-600' : 'text-red-500'}`}>
{msg.text}
</div>
)}
{/* Affected mappings */}
{loadingMatches ? (
<p className="text-xs text-gray-400">Loading</p>
) : matches && matches.length > 0 && (
<div>
<div className="text-xs text-gray-400 uppercase tracking-wide mb-1">
Affected mappings
</div>
<table className="w-full text-xs border border-gray-100 rounded overflow-hidden">
<thead>
<tr className="bg-gray-50 text-left text-gray-400">
<th className="px-2 py-1">Source</th>
<th className="px-2 py-1">Rule</th>
<th className="px-2 py-1">Input</th>
<th className="px-2 py-1">Output</th>
</tr>
</thead>
<tbody>
{matches.map(m => (
<tr key={m.id} className="border-t border-gray-50">
<td className="px-2 py-1 font-mono text-gray-500">{m.source_name}</td>
<td className="px-2 py-1 font-mono text-gray-500">{m.rule_name}</td>
<td className="px-2 py-1 font-mono text-gray-700">
{typeof m.input_value === 'string' ? m.input_value : JSON.stringify(m.input_value)}
</td>
<td className="px-2 py-1 font-mono text-gray-700">
{Object.entries(m.output).map(([k, v]) => (
<span key={k} className={k === selected.col ? 'text-blue-600 font-semibold' : ''}>
{k}: {v}{' '}
</span>
))}
</td>
</tr>
))}
</tbody>
</table>
</div>
)}
</div>
)}
</div>
)
}
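
In the removed `Remap` component, `handleApply` compares the untrimmed `toVal` against `selected.val`, while the button's `disabled` check compares the trimmed value. Because the input's Enter key calls `handleApply` directly, a value that differs only by surrounding whitespace appears to pass the guard. One way to keep both checks consistent is a single shared predicate. This is a sketch only, and `canApplyRemap` is a hypothetical name that does not appear in the diff.

```javascript
// Hypothetical helper: one predicate for both the Enter-key path and the
// button's disabled state. Trims before comparing so "Food " and "Food"
// count as the same target value.
function canApplyRemap(selected, toVal) {
  if (!selected) return false
  const target = toVal.trim()
  return target !== '' && target !== selected.val
}
```

A caller could then use `if (!canApplyRemap(selected, toVal)) return` in the handler and `disabled={applying || !canApplyRemap(selected, toVal)}` on the button.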

View File

@ -1,42 +1,12 @@
import { useState, useEffect, useRef } from 'react' import { useState, useEffect, useRef } from 'react'
import { useSearchParams } from 'react-router-dom'
import { api } from '../api' import { api } from '../api'
const FIELD_TYPES = ['text', 'numeric', 'date'] const FIELD_TYPES = ['text', 'numeric', 'date']
function SampleTable({ rows }) {
if (!rows || rows.length === 0) return null
const cols = Object.keys(rows[0])
return (
<div className="overflow-auto border border-gray-100 rounded bg-gray-50 max-h-36">
<table className="text-xs w-full">
<thead>
<tr className="text-left text-gray-400 border-b border-gray-100 bg-gray-50 sticky top-0">
{cols.map(c => <th key={c} className="px-2 py-1 font-medium whitespace-nowrap">{c}</th>)}
</tr>
</thead>
<tbody>
{rows.map((row, i) => (
<tr key={i} className="border-t border-gray-100">
{cols.map(c => (
<td key={c} className="px-2 py-1 whitespace-nowrap text-gray-600 max-w-32 truncate font-mono">
{row[c] == null ? <span className="text-gray-300"></span> : String(row[c])}
</td>
))}
</tr>
))}
</tbody>
</table>
</div>
)
}
export default function Sources({ source, sources, setSources, setSource }) { export default function Sources({ source, sources, setSources, setSource }) {
const [constraintFields, setConstraintFields] = useState('') const [dedup, setDedup] = useState('')
const [globalPicklist, setGlobalPicklist] = useState(true)
const [schemaFields, setSchemaFields] = useState([]) const [schemaFields, setSchemaFields] = useState([])
const [stats, setStats] = useState(null) const [stats, setStats] = useState(null)
const [sampleRows, setSampleRows] = useState([])
const [saving, setSaving] = useState(false) const [saving, setSaving] = useState(false)
const [reprocessing, setReprocessing] = useState(false) const [reprocessing, setReprocessing] = useState(false)
const [generating, setGenerating] = useState(false) const [generating, setGenerating] = useState(false)
@ -44,38 +14,25 @@ export default function Sources({ source, sources, setSources, setSource }) {
const [error, setError] = useState('') const [error, setError] = useState('')
const [viewName, setViewName] = useState('') const [viewName, setViewName] = useState('')
const [availableFields, setAvailableFields] = useState([]) const [availableFields, setAvailableFields] = useState([])
const [fieldSort, setFieldSort] = useState({ col: 'key', dir: 'asc' })
const [creating, setCreating] = useState(false) const [creating, setCreating] = useState(false)
const [form, setForm] = useState({ name: '', constraint_fields: '', fields: [], schema: [], importSample: true }) const [form, setForm] = useState({ name: '', dedup_fields: '', fields: [], schema: [] })
const [createError, setCreateError] = useState('') const [createError, setCreateError] = useState('')
const [createLoading, setCreateLoading] = useState(false) const [createLoading, setCreateLoading] = useState(false)
const [csvFileName, setCsvFileName] = useState('')
const fileRef = useRef() const fileRef = useRef()
const [searchParams, setSearchParams] = useSearchParams()
const sourceObj = sources.find(s => s.name === source) const sourceObj = sources.find(s => s.name === source)
useEffect(() => {
if (searchParams.get('new') === '1') {
setCreating(true)
setSearchParams({})
}
}, [searchParams])
useEffect(() => { useEffect(() => {
if (!sourceObj) return if (!sourceObj) return
setConstraintFields(sourceObj.constraint_fields?.join(', ') || '') setDedup(sourceObj.dedup_fields?.join(', ') || '')
setGlobalPicklist(sourceObj.global_picklist !== false)
setSchemaFields((sourceObj.config?.fields || []).map((f, i) => ({ seq: i + 1, ...f }))) setSchemaFields((sourceObj.config?.fields || []).map((f, i) => ({ seq: i + 1, ...f })))
setViewName(sourceObj.config?.fields?.length ? `dfv.${sourceObj.name}` : '') setViewName(sourceObj.config?.fields?.length ? `dfv.${sourceObj.name}` : '')
setResult('') setResult('')
setError('') setError('')
setStats(null) setStats(null)
setAvailableFields([]) setAvailableFields([])
setSampleRows([])
api.getStats(sourceObj.name).then(setStats).catch(() => {}) api.getStats(sourceObj.name).then(setStats).catch(() => {})
api.getFields(sourceObj.name).then(setAvailableFields).catch(() => {}) api.getFields(sourceObj.name).then(setAvailableFields).catch(() => {})
api.getRecords(sourceObj.name, 50).then(rows => setSampleRows(rows.map(r => r.data).filter(Boolean))).catch(() => {})
}, [source, sourceObj?.name]) }, [source, sourceObj?.name])
async function handleSave(e) { async function handleSave(e) {
@ -83,14 +40,10 @@ export default function Sources({ source, sources, setSources, setSource }) {
setSaving(true) setSaving(true)
setError('') setError('')
try { try {
const constraint_fields = constraintFields.split(',').map(s => s.trim()).filter(Boolean) const dedup_fields = dedup.split(',').map(s => s.trim()).filter(Boolean)
const fields = [...schemaFields.filter(f => f.name)].sort((a, b) => (a.seq ?? 0) - (b.seq ?? 0)) const fields = [...schemaFields.filter(f => f.name)].sort((a, b) => (a.seq ?? 0) - (b.seq ?? 0))
const config = { ...(sourceObj.config || {}), fields } const config = { ...(sourceObj.config || {}), fields }
await api.updateSource(sourceObj.name, { constraint_fields, config, global_picklist: globalPicklist }) await api.updateSource(sourceObj.name, { dedup_fields, config })
if (fields.length > 0) {
const res = await api.generateView(sourceObj.name)
if (res.success) setViewName(res.view)
}
const updated = await api.getSources() const updated = await api.getSources()
setSources(updated) setSources(updated)
setResult('Saved.') setResult('Saved.')
@ -106,10 +59,10 @@ export default function Sources({ source, sources, setSources, setSource }) {
setResult('') setResult('')
setError('') setError('')
try { try {
const constraint_fields = constraintFields.split(',').map(s => s.trim()).filter(Boolean) const dedup_fields = dedup.split(',').map(s => s.trim()).filter(Boolean)
const fields = [...schemaFields.filter(f => f.name)].sort((a, b) => (a.seq ?? 0) - (b.seq ?? 0)) const fields = [...schemaFields.filter(f => f.name)].sort((a, b) => (a.seq ?? 0) - (b.seq ?? 0))
const config = { ...(sourceObj.config || {}), fields } const config = { ...(sourceObj.config || {}), fields }
await api.updateSource(sourceObj.name, { constraint_fields, config, global_picklist: globalPicklist }) await api.updateSource(sourceObj.name, { dedup_fields, config })
const res = await api.generateView(sourceObj.name) const res = await api.generateView(sourceObj.name)
if (res.success) { if (res.success) {
setViewName(res.view) setViewName(res.view)
@ -156,15 +109,13 @@ export default function Sources({ source, sources, setSources, setSource }) {
async function handleSuggest(e) { async function handleSuggest(e) {
const file = e.target.files[0] const file = e.target.files[0]
if (!file) return if (!file) return
setCsvFileName(file.name)
try { try {
const suggestion = await api.suggestSource(file) const suggestion = await api.suggestSource(file)
setForm(f => ({ setForm(f => ({
...f, ...f,
fields: suggestion.fields, fields: suggestion.fields,
constraint_fields: '', dedup_fields: '',
schema: suggestion.fields.map(f => ({ name: f.name, type: f.type, seq: suggestion.fields.indexOf(f) + 1 })), schema: suggestion.fields.map(f => ({ name: f.name, type: f.type }))
sampleRows: suggestion.sampleRows || []
})) }))
} catch (err) { } catch (err) {
setCreateError(err.message) setCreateError(err.message)
@ -174,25 +125,19 @@ export default function Sources({ source, sources, setSources, setSource }) {
async function handleCreate(e) { async function handleCreate(e) {
e.preventDefault() e.preventDefault()
setCreateError('') setCreateError('')
const constraintArr = form.constraint_fields.split(',').map(s => s.trim()).filter(Boolean) const dedupArr = form.dedup_fields.split(',').map(s => s.trim()).filter(Boolean)
if (!form.name || constraintArr.length === 0) { if (!form.name || dedupArr.length === 0) {
setCreateError('Name and at least one constraint field required') setCreateError('Name and at least one dedup field required')
return return
} }
setCreateLoading(true) setCreateLoading(true)
try { try {
const config = form.schema.length > 0 ? { fields: form.schema } : {} const config = form.schema.length > 0 ? { fields: form.schema } : {}
await api.createSource({ name: form.name, constraint_fields: constraintArr, config, global_picklist: form.global_picklist !== false }) await api.createSource({ name: form.name, dedup_fields: dedupArr, config })
if (form.schema.length > 0) {
await api.generateView(form.name)
}
if (form.importSample && fileRef.current?.files[0]) {
await api.importCSV(form.name, fileRef.current.files[0])
}
const updated = await api.getSources() const updated = await api.getSources()
setSources(updated) setSources(updated)
setSource(form.name) setSource(form.name)
setForm({ name: '', constraint_fields: '', fields: [], schema: [], importSample: true }) setForm({ name: '', dedup_fields: '', fields: [], schema: [] })
setCreating(false) setCreating(false)
} catch (err) { } catch (err) {
setCreateError(err.message) setCreateError(err.message)
@ -202,7 +147,7 @@ export default function Sources({ source, sources, setSources, setSource }) {
} }
return ( return (
<div className="p-6 max-w-5xl"> <div className="p-6 max-w-2xl">
<div className="flex items-center justify-between mb-6"> <div className="flex items-center justify-between mb-6">
<h1 className="text-xl font-semibold text-gray-800"> <h1 className="text-xl font-semibold text-gray-800">
{sourceObj ? sourceObj.name : 'Sources'} {sourceObj ? sourceObj.name : 'Sources'}
@ -238,45 +183,18 @@ export default function Sources({ source, sources, setSources, setSource }) {
<table className="w-full text-xs"> <table className="w-full text-xs">
<thead> <thead>
<tr className="text-left text-gray-400 border-b border-gray-100"> <tr className="text-left text-gray-400 border-b border-gray-100">
{[ <th className="pb-1 font-medium">Key</th>
{ col: 'key', label: 'Key' }, <th className="pb-1 font-medium">Origin</th>
{ col: 'origin', label: 'Origin' }, <th className="pb-1 font-medium">Type</th>
{ col: 'type', label: 'Type' }, <th className="pb-1 font-medium text-center">Dedup</th>
{ col: 'constraint', label: 'Constraint', center: true }, <th className="pb-1 font-medium text-center">In view</th>
{ col: 'inview', label: 'In view', center: true }, <th className="pb-1 font-medium text-center">Seq</th>
{ col: 'seq', label: 'Seq', center: true },
].map(({ col, label, center }) => (
<th
key={col}
onClick={() => setFieldSort(s => ({ col, dir: s.col === col && s.dir === 'asc' ? 'desc' : 'asc' }))}
className={`pb-1 font-medium cursor-pointer select-none hover:text-gray-600 ${center ? 'text-center' : ''}`}
>
{label}
<span className="ml-1 text-gray-300">
{fieldSort.col === col ? (fieldSort.dir === 'asc' ? '▲' : '▼') : '⇅'}
</span>
</th>
))}
</tr> </tr>
</thead> </thead>
<tbody> <tbody>
{[...availableFields].sort((a, b) => { {availableFields.map(f => {
const constraintList = constraintFields.split(',').map(s => s.trim())
const aSchema = schemaFields.find(sf => sf.name === a.key)
const bSchema = schemaFields.find(sf => sf.name === b.key)
let av, bv
if (fieldSort.col === 'key') { av = a.key; bv = b.key }
else if (fieldSort.col === 'origin') { av = a.origins.join(','); bv = b.origins.join(',') }
else if (fieldSort.col === 'type') { av = aSchema?.type || ''; bv = bSchema?.type || '' }
else if (fieldSort.col === 'constraint') { av = constraintList.includes(a.key) ? 0 : 1; bv = constraintList.includes(b.key) ? 0 : 1 }
else if (fieldSort.col === 'inview') { av = aSchema ? 0 : 1; bv = bSchema ? 0 : 1 }
else if (fieldSort.col === 'seq') { av = aSchema?.seq ?? 999; bv = bSchema?.seq ?? 999 }
if (av < bv) return fieldSort.dir === 'asc' ? -1 : 1
if (av > bv) return fieldSort.dir === 'asc' ? 1 : -1
return 0
}).map(f => {
const isRaw = f.origins.includes('raw') const isRaw = f.origins.includes('raw')
const constraintChecked = constraintFields.split(',').map(s => s.trim()).includes(f.key) const dedupChecked = dedup.split(',').map(s => s.trim()).includes(f.key)
const schemaEntry = schemaFields.find(sf => sf.name === f.key) const schemaEntry = schemaFields.find(sf => sf.name === f.key)
const inView = !!schemaEntry const inView = !!schemaEntry
return ( return (
@ -310,13 +228,13 @@ export default function Sources({ source, sources, setSources, setSource }) {
{isRaw && ( {isRaw && (
<input <input
type="checkbox" type="checkbox"
checked={constraintChecked} checked={dedupChecked}
onChange={e => { onChange={e => {
const current = constraintFields.split(',').map(s => s.trim()).filter(Boolean) const current = dedup.split(',').map(s => s.trim()).filter(Boolean)
const next = e.target.checked const next = e.target.checked
? [...current, f.key] ? [...current, f.key]
: current.filter(k => k !== f.key) : current.filter(k => k !== f.key)
setConstraintFields(next.join(', ')) setDedup(next.join(', '))
}} }}
/> />
)} )}
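
Both sides of this hunk store the selected fields as a comma-separated string and re-parse it on every checkbox change. A minimal sketch of that parse-and-toggle step as standalone functions is shown here. `parseFieldList` and `toggleField` are hypothetical names, and the field names in the usage note are invented for illustration.

```javascript
// Hypothetical helper: same parsing as split(',').map(trim).filter(Boolean).
function parseFieldList(text) {
  return text.split(',').map(s => s.trim()).filter(Boolean)
}

// Hypothetical helper: same rule as the checkbox onChange handler.
// Appends the key when checked, removes it when unchecked, and
// returns the list re-joined with ', '.
function toggleField(text, key, checked) {
  const current = parseFieldList(text)
  const next = checked
    ? [...current, key]
    : current.filter(k => k !== key)
  return next.join(', ')
}
```

For example, `toggleField('date, amount', 'memo', true)` yields `'date, amount, memo'`. Like the handler in the diff, this sketch does not de-duplicate; it relies on the checkbox state to avoid adding a key twice.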
@ -355,11 +273,7 @@ export default function Sources({ source, sources, setSources, setSource }) {
</tbody> </tbody>
</table> </table>
<div className="flex items-center gap-3 pt-1 flex-wrap"> <div className="flex items-center gap-3 pt-1">
<label className="flex items-center gap-1.5 text-xs text-gray-500 cursor-pointer">
<input type="checkbox" checked={globalPicklist} onChange={e => setGlobalPicklist(e.target.checked)} />
Global picklist
</label>
<form onSubmit={handleSave}> <form onSubmit={handleSave}>
<button type="submit" disabled={saving} <button type="submit" disabled={saving}
className="text-sm bg-blue-600 text-white px-3 py-1.5 rounded hover:bg-blue-700 disabled:opacity-50"> className="text-sm bg-blue-600 text-white px-3 py-1.5 rounded hover:bg-blue-700 disabled:opacity-50">
@@ -381,24 +295,17 @@ export default function Sources({ source, sources, setSources, setSource }) {
</> </>
)} )}
</div> </div>
<SampleTable rows={sampleRows} />
</div> </div>
)} )}
{/* Save button when no fields loaded yet */} {/* Save button when no fields loaded yet */}
{availableFields.length === 0 && ( {availableFields.length === 0 && (
<div className="flex items-center gap-3"> <form onSubmit={handleSave}>
<label className="flex items-center gap-1.5 text-xs text-gray-500 cursor-pointer"> <button type="submit" disabled={saving}
<input type="checkbox" checked={globalPicklist} onChange={e => setGlobalPicklist(e.target.checked)} /> className="text-sm bg-blue-600 text-white px-3 py-1.5 rounded hover:bg-blue-700 disabled:opacity-50">
Global picklist {saving ? 'Saving…' : 'Save'}
</label> </button>
<form onSubmit={handleSave}> </form>
<button type="submit" disabled={saving}
className="text-sm bg-blue-600 text-white px-3 py-1.5 rounded hover:bg-blue-700 disabled:opacity-50">
{saving ? 'Saving…' : 'Save'}
</button>
</form>
</div>
)} )}
{/* Reprocess */} {/* Reprocess */}
@@ -430,14 +337,8 @@ export default function Sources({ source, sources, setSources, setSource }) {
<h2 className="text-sm font-semibold text-gray-700 mb-3">New source</h2> <h2 className="text-sm font-semibold text-gray-700 mb-3">New source</h2>
<div className="mb-4"> <div className="mb-4">
<input type="file" accept=".csv" ref={fileRef} onChange={handleSuggest} className="hidden" /> <label className="text-xs text-gray-500 block mb-1">Upload a CSV to auto-detect fields</label>
<button <input type="file" accept=".csv" ref={fileRef} onChange={handleSuggest} className="text-sm text-gray-600" />
type="button"
onClick={() => fileRef.current?.click()}
className="text-sm border border-gray-300 rounded px-3 py-1.5 text-gray-600 hover:bg-gray-50 hover:border-gray-400"
>
{csvFileName || 'Choose CSV…'}
</button>
</div> </div>
<form onSubmit={handleCreate} className="space-y-3"> <form onSubmit={handleCreate} className="space-y-3">
@@ -452,123 +353,53 @@ export default function Sources({ source, sources, setSources, setSource }) {
</div> </div>
{form.fields.length > 0 && ( {form.fields.length > 0 && (
<div className="pt-2 border-t border-gray-100 space-y-2"> <div>
<label className="text-xs text-gray-500 block mb-1">Detected fields check to use as dedup keys</label>
<table className="w-full text-xs"> <table className="w-full text-xs">
<thead> <thead>
<tr className="text-left text-gray-400 border-b border-gray-100"> <tr className="text-left text-gray-400 border-b border-gray-100">
<th className="pb-1 font-medium">Key</th> <th className="pb-1 font-medium">Field</th>
<th className="pb-1 font-medium">Type</th> <th className="pb-1 font-medium">Type</th>
<th className="pb-1 font-medium text-center">Constraint</th> <th className="pb-1 font-medium text-center">Dedup</th>
<th className="pb-1 font-medium text-center">In view</th>
<th className="pb-1 font-medium text-center">Seq</th>
</tr> </tr>
</thead> </thead>
<tbody> <tbody>
{form.fields.map(f => { {form.fields.map(f => (
const schemaEntry = form.schema.find(s => s.name === f.name) <tr key={f.name} className="border-t border-gray-50">
const inView = !!schemaEntry <td className="py-1 font-mono text-gray-700">{f.name}</td>
const currentType = schemaEntry?.type || f.type <td className="py-1 text-gray-400">{f.type}</td>
return ( <td className="py-1 text-center">
<tr key={f.name} className="border-t border-gray-50"> <input
<td className="py-1 font-mono text-gray-700">{f.name}</td> type="checkbox"
<td className="py-1"> checked={form.dedup_fields.split(',').map(s => s.trim()).includes(f.name)}
{inView && ( onChange={e => {
<select const current = form.dedup_fields.split(',').map(s => s.trim()).filter(Boolean)
className="border border-gray-200 rounded px-1 py-0.5 text-xs focus:outline-none focus:border-blue-400" const next = e.target.checked
value={currentType} ? [...current, f.name]
onChange={e => setForm(ff => ({ : current.filter(n => n !== f.name)
...ff, setForm(ff => ({ ...ff, dedup_fields: next.join(', ') }))
schema: ff.schema.map(s => s.name === f.name ? { ...s, type: e.target.value } : s) }}
}))} />
> </td>
{FIELD_TYPES.map(t => <option key={t} value={t}>{t}</option>)} </tr>
</select> ))}
)}
</td>
<td className="py-1 text-center">
<input
type="checkbox"
checked={form.constraint_fields.split(',').map(s => s.trim()).includes(f.name)}
onChange={e => {
const current = form.constraint_fields.split(',').map(s => s.trim()).filter(Boolean)
const next = e.target.checked
? [...current, f.name]
: current.filter(n => n !== f.name)
setForm(ff => ({ ...ff, constraint_fields: next.join(', ') }))
}}
/>
</td>
<td className="py-1 text-center">
<input
type="checkbox"
checked={inView}
onChange={e => {
if (e.target.checked) {
const nextSeq = form.schema.length > 0
? Math.max(...form.schema.map(s => s.seq ?? 0)) + 1
: 1
setForm(ff => ({ ...ff, schema: [...ff.schema, { name: f.name, type: f.type, seq: nextSeq }] }))
} else {
setForm(ff => ({ ...ff, schema: ff.schema.filter(s => s.name !== f.name) }))
}
}}
/>
</td>
<td className="py-1 text-center">
{inView && (
<input
type="number"
className="w-12 border border-gray-200 rounded px-1 py-0.5 text-xs text-center focus:outline-none focus:border-blue-400"
value={schemaEntry.seq ?? ''}
onChange={e => setForm(ff => ({
...ff,
schema: ff.schema.map(s => s.name === f.name ? { ...s, seq: parseInt(e.target.value) || 0 } : s)
}))}
/>
)}
</td>
</tr>
)
})}
</tbody> </tbody>
</table> </table>
<SampleTable rows={form.sampleRows || []} />
</div> </div>
)} )}
{form.fields.length === 0 && ( {form.fields.length === 0 && (
<div> <div>
<label className="text-xs text-gray-500 block mb-1">Constraint fields (comma-separated)</label> <label className="text-xs text-gray-500 block mb-1">Dedup fields (comma-separated)</label>
<input <input
className="w-full border border-gray-200 rounded px-3 py-1.5 text-sm focus:outline-none focus:border-blue-400" className="w-full border border-gray-200 rounded px-3 py-1.5 text-sm focus:outline-none focus:border-blue-400"
value={form.constraint_fields} value={form.dedup_fields}
onChange={e => setForm(f => ({ ...f, constraint_fields: e.target.value }))} onChange={e => setForm(f => ({ ...f, dedup_fields: e.target.value }))}
placeholder="e.g. date, amount, description" placeholder="e.g. date, amount, description"
/> />
</div> </div>
)} )}
<div className="flex gap-4">
<label className="flex items-center gap-1.5 text-xs text-gray-500 cursor-pointer">
<input
type="checkbox"
checked={form.global_picklist !== false}
onChange={e => setForm(f => ({ ...f, global_picklist: e.target.checked }))}
/>
Global picklist
</label>
{form.fields.length > 0 && (
<label className="flex items-center gap-1.5 text-xs text-gray-500 cursor-pointer">
<input
type="checkbox"
checked={form.importSample !== false}
onChange={e => setForm(f => ({ ...f, importSample: e.target.checked }))}
/>
Import sample data
</label>
)}
</div>
{createError && <p className="text-xs text-red-500">{createError}</p>} {createError && <p className="text-xs text-red-500">{createError}</p>}
<div className="flex gap-2"> <div className="flex gap-2">
@@ -577,7 +408,7 @@ export default function Sources({ source, sources, setSources, setSource }) {
{createLoading ? 'Creating…' : 'Create'} {createLoading ? 'Creating…' : 'Create'}
</button> </button>
<button type="button" <button type="button"
onClick={() => { setCreating(false); setCreateError(''); setForm({ name: '', constraint_fields: '', fields: [], schema: [] }) }} onClick={() => { setCreating(false); setCreateError(''); setForm({ name: '', dedup_fields: '', fields: [], schema: [] }) }}
className="text-sm text-gray-500 px-3 py-1.5 rounded hover:bg-gray-100"> className="text-sm text-gray-500 px-3 py-1.5 rounded hover:bg-gray-100">
Cancel Cancel
</button> </button>