Overview
+--------------+
|csv data      |
+-----+--------+
      |
      |
      v
+----web ui----+          +----func+----+           +---table----+
|import screen +--------->|srce.sql     +---------->|tps.srce    | <-------------------+
+--------------+          +-------------+           +------------+                     |
                          |p1:srce      |                                              |
                          |p2:file path |                                              |
+-----web ui---+          +-------------+           +----table---+                     |
|create map    |                                    |tps.map_rm  |                  +--+--db proc-----+
|profile       +----------------------------------->|            |                  |update tps.trans |
+------+-------+                                    +-----+------+                  |column allj to   |
       |                                                  ^                         |contain map data |
       |                                                  |                         +--+--------------+
       v                                             foreign key                       ^
+----web ui+----+                                         |                            |
|assign maps    |                                         +                            |
|for return     |                                   +---table----+                     |
|values         +---------------------------------->|tps.map_rv  |                     |
+---------------+                                   |            +---------------------+
                                                    +------------+
The goal is to:
- house external data and prevent duplication on insert
- apply mappings to the data to make it meaningful
- be able to reference it from outside sources (no action required)
There are 5 tables
- tps.srce : definition of source
- tps.trans : actual data
- tps.trans_log : log of inserts
- tps.map_rm : map profile
- tps.map_rv : profile associated values
tps.srce schema
{
    "name": "WMPD",
    "descr": "Williams Paid File",
    "type": "csv",
    "schema": [
        {
            "key": "Carrier",
            "type": "text"
        },
        {
            "key": "Pd Amt",
            "type": "numeric"
        },
        {
            "key": "Pay Dt",
            "type": "date"
        }
    ],
    "unique_constraint": {
        "fields": [
            "{Pay Dt}",
            "{Carrier}"
        ]
    }
}
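To illustrate how the unique_constraint fields could prevent duplication on insert, here is a minimal Python sketch that runs outside Postgres. It is not the repo's import function. The helper names (constraint_key, insert_new) and the sample rows are hypothetical. It assumes each constraint field is a single-element path such as {Pay Dt}, and it assumes that rows sharing a key within one incoming file are all kept, while a later file that repeats an already stored key is skipped.

```python
import json

# Trimmed copy of the tps.srce example definition.
SRCE = json.loads("""
{
    "name": "WMPD",
    "unique_constraint": {"fields": ["{Pay Dt}", "{Carrier}"]}
}
""")


def constraint_key(record, srce):
    """Return the record's values for the unique_constraint fields as a tuple."""
    # Fields look like "{Pay Dt}"; strip the braces to get the key name.
    # Assumption: only single-element paths are used.
    names = [f.strip("{}") for f in srce["unique_constraint"]["fields"]]
    return tuple(record.get(n) for n in names)


def insert_new(stored_keys, batch, srce):
    """Keep rows whose key was not stored before this batch, then record the new keys."""
    fresh = [r for r in batch if constraint_key(r, srce) not in stored_keys]
    stored_keys.update(constraint_key(r, srce) for r in fresh)
    return fresh


# Hypothetical rows standing in for one imported file.
rows = [
    {"Carrier": "ABC", "Pd Amt": "100.00", "Pay Dt": "2018-01-05"},
    {"Carrier": "ABC", "Pd Amt": "250.00", "Pay Dt": "2018-01-05"},
    {"Carrier": "XYZ", "Pd Amt": "75.00", "Pay Dt": "2018-01-05"},
]

seen = set()
first = insert_new(seen, rows, SRCE)   # first import: every row is new
second = insert_new(seen, rows, SRCE)  # importing the same file again adds nothing
```

Keying on the constraint fields rather than the whole row means a file can be re-imported safely as long as its Pay Dt and Carrier combinations are already present.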
tps.map_rm schema
{
    "name": "Strip Amount Commas",
    "description": "the Amount field comes from PNC with commas embedded so it cannot be cast to numeric",
    "defn": [
        {
            "key": "{Amount}",   /* this is a Postgres text array stored in json */
            "field": "amount",   /* key name assigned to result of regex */
            "regex": ",",        /* regular expression */
            "flag": "g",
            "retain": "y",
            "map": "n"
        }
    ],
    "function": "replace",
    "where": [
        {
        }
    ]
}
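As a rough illustration of what the "Strip Amount Commas" definition describes, the following Python sketch applies a "replace" map to one record. It is an assumption-laden stand-in for the Postgres logic, not the repo's code: apply_replace is a hypothetical helper, the replacement string is assumed to be empty because the definition does not state one, and the retain, map, and where settings are ignored. Python's re module is used here, and its regex dialect differs from Postgres for more complex patterns.

```python
import re

# Trimmed copy of the tps.map_rm example, without the inline comments.
MAP = {
    "name": "Strip Amount Commas",
    "defn": [
        {"key": "{Amount}", "field": "amount", "regex": ",",
         "flag": "g", "retain": "y", "map": "n"}
    ],
    "function": "replace",
}


def apply_replace(record, map_def, replacement=""):
    """Run each defn entry's regex over its source key and return the results."""
    out = {}
    for d in map_def["defn"]:
        source_key = d["key"].strip("{}")  # "{Amount}" -> "Amount"
        value = record.get(source_key)
        if value is None:
            continue
        # flag "g" means replace every match; otherwise only the first one.
        count = 0 if d.get("flag") == "g" else 1
        out[d["field"]] = re.sub(d["regex"], replacement, value, count=count)
    return out


result = apply_replace({"Amount": "1,234,567.89"}, MAP)
amount = float(result["amount"])  # the cleaned text can now be cast to a number
```

The original record is left untouched and the cleaned value is returned under the "field" name, which mirrors the idea of keeping mapped results in an associated document.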
Notes
Pull various static files into Postgres and do basic transformation without losing the original document or writing custom code for each scenario.
This is an in-between for a foreign data wrapper and custom programming.
Storage
All records are stored as jsonb. Applied mappings are stored in associated jsonb documents.
Import
COPY is utilized to load files.
Mappings
- regular expressions are used to extract pieces of the json objects
- the results of the regular expressions are bumped up against a list of basic mappings and written to an associated jsonb document
- each regex within a targeted pattern can be set to map or not. The mapping items should then be joined to map_rv with = as opposed to @> to avoid duplication of rows
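The reason for joining with = rather than @> can be shown with a small Python sketch. The contains function below is only a flat-object analogue of jsonb containment, and the map_rv entries and key names (party, reason, category) are hypothetical. When one stored map key is a subset of another, a containment join matches both and the transaction row is duplicated, while an equality join matches exactly one.

```python
def contains(doc, pattern):
    """Flat-object analogue of jsonb `doc @> pattern`."""
    return all(k in doc and doc[k] == v for k, v in pattern.items())


# Hypothetical values extracted from one transaction by the regex step.
extracted = {"party": "ACME", "reason": "FEE"}

# Hypothetical map_rv rows: (extracted values to match, values to return).
map_rv = [
    ({"party": "ACME"}, {"category": "vendor"}),
    ({"party": "ACME", "reason": "FEE"}, {"category": "bank fee"}),
]

# Containment matches both rows, so the transaction would appear twice.
by_containment = [ret for key, ret in map_rv if contains(extracted, key)]

# Equality matches only the row whose key is identical to the extracted values.
by_equality = [ret for key, ret in map_rv if extracted == key]
```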
Transformation tools
- COPY
- regexp_matches()
Difficulties
Non-standard file formats will require additional logic, for example the PNC loan balance and collateral CSV files.
- External: anything not in CSV should be converted outside Postgres and then imported as CSV
- Direct: outside logic can be set up to push new records to tps.trans directly from non-CSV formatted sources or fdw sources
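For the External route, a conversion step can be as small as the Python sketch below. It assumes the non-CSV source has already been parsed into a list of dictionaries. The to_csv helper and the sample record are hypothetical, and the column names are borrowed from the tps.srce example. The output is ordinary CSV text that could then go through the normal import.

```python
import csv
import io


def to_csv(records, fieldnames):
    """Serialize a list of dict records to CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()


# Hypothetical record parsed from some non-CSV source.
records = [{"Carrier": "ABC", "Pd Amt": "1,000.00", "Pay Dt": "2018-01-05"}]
csv_text = to_csv(records, ["Carrier", "Pd Amt", "Pay Dt"])
```

The csv module quotes any value that contains a comma, so amounts with embedded commas survive the round trip and can be cleaned later by a mapping.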
Interface
Maybe start out in Excel until it gets firmed up:
- list existing mappings
- apply mappings to see what results come back
- experiment with new mappings