Paul Trowbridge 545ba4e2b5 merge wk
2018-02-05 22:54:26 -05:00
.gitignore add ignore 2018-02-02 09:37:35 -05:00
build_json.xlsx update readme 2018-02-02 17:34:10 -05:00
col_balance.pgsql added partion by major item, and then sort on import id 2017-10-25 10:45:24 -04:00
dcard_bal.pgsql add running discover card balance 2017-10-26 00:16:46 -04:00
do_map_g_option.pgsql merge wk branch concept of placing the mapping instructions in the regex items themselves 2017-11-04 09:33:16 -04:00
do_map.pgsql merge wk 2017-10-26 18:46:42 -04:00
event_log.md expirement with different formats and add a receipt 2017-09-20 00:12:37 -04:00
event_log.pgsql link in header 2017-08-24 23:46:15 -04:00
evt_log_gl_extract.pgsql build interpretation of evt.log for running totals 2017-10-27 01:44:41 -04:00
header_item_template.pgsql add function to build a basic entry with an offset for each item of a header 2017-11-02 23:51:04 -04:00
LICENSE add event record example 2017-07-24 23:13:34 -04:00
list_maps.pgsql add some info queries - listing of maps, and items not mapped 2017-11-04 09:52:40 -04:00
loan_bal.pgsql add query to give items that have no map but are supposed to be mapped 2017-11-02 15:07:47 -04:00
log_readme.md thought on evt.log 2017-10-30 17:38:50 -04:00
log.md add excel file with json build macro 2017-09-18 21:07:22 -04:00
map_rm_template.pgsql merge wk branch concept of placing the mapping instructions in the regex items themselves 2017-11-04 09:33:16 -04:00
map_rm.pgsql update maps for first 20, get rid of pretty and tag on summary of mappings 2017-10-19 23:57:03 -04:00
map_rv_items_not_mapped.pgsql add some info queries - listing of maps, and items not mapped 2017-11-04 09:52:40 -04:00
readme.md update readme 2018-02-02 17:34:10 -05:00
rebuild_pg.cmd add batch file to re-create database 2017-10-13 03:23:41 -04:00
rec.json updates 2017-08-24 22:57:21 -04:00
sqitch.conf add sqitch 2017-10-25 01:35:21 -04:00
sqitch.plan add sqitch 2017-10-25 01:35:21 -04:00
srce_defn.pgsql try using correlated subquery for unique list of keys, is pretty slow 2017-10-25 12:05:28 -04:00
srce_template.pgsql add query to give items that have no map but are supposed to be mapped 2017-11-02 15:07:47 -04:00
srce_unq.pgsql rename fle to srce_unq 2017-10-25 00:57:41 -04:00
srce.pgsql build template for applying gl on the server side per item with offset 2017-11-01 01:12:33 -04:00
summary.xlsx build template for applying gl on the server side per item with offset 2017-11-01 01:12:33 -04:00
trans_log_template.pgsql catch up with wk branch 2017-10-25 00:42:22 -04:00
transaction_range.pgsql merge wk 2018-02-05 22:54:26 -05:00
ubm_backup.cmd dont export tps.trans table 2017-10-19 09:24:55 -04:00
ubm_data.sql merge wk 2018-02-05 22:54:26 -05:00
ubm_schema.sql merge wk 2018-02-05 22:54:26 -05:00

Overview

                        +--------------+
                        |csv data      |
                        +-----+--------+
                              |
                              |
                              v
+----web ui----+        +----func+----+            +---table----+
|import screen +------> |srce.sql     +----------> |tps.srce    | <-------------------+
+--------------+        +-------------+            +------------+                     |
                        |p1:srce      |                                               |
                        |p2:file path |                                               |
+-----web ui---+        +-------------+            +----table---+                     |
|create map    |                                   |tps.map_rm  |                  +--+--db proc-----+
|profile       +---------------------------------> |            |                  |update tps.trans |
+------+-------+                                   +-----+------+                  |column allj to   |
       |                                                 ^                         |contain map data |
       |                                                 |                         +--+--------------+
       v                                                foreign key                   ^
+----web ui+----+                                        |                            |
|assign maps    |                                        +                            |
|for return     |                                  +---table----+                     |
+values         +--------------------------------> |tps.map_rv  |                     |
+---------------+                                  |            +---------------------+
                                                   +------------+

The goal is to:

  1. house external data and prevent duplication on insert
  2. apply mappings to the data to make it meaningful
  3. be able to reference it from outside sources (no action required)

There are 5 tables

  • tps.srce : definition of source
  • tps.trans : actual data
  • tps.trans_log : log of inserts
  • tps.map_rm : map profile
  • tps.map_rv : profile associated values

tps.srce schema

{
    "name": "WMPD",
    "descr": "Williams Paid File",
    "type":"csv",
    "schema": [
        {
            "key": "Carrier",
            "type": "text"
        },
        {
            "key": "Pd Amt",
            "type": "numeric"
        },
        {
            "key": "Pay Dt",
            "type": "date"
        }
    ],
    "unique_constraint": {
        "fields":[
            "{Pay Dt}",
            "{Carrier}" 
        ]
    }
}
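
A minimal sketch of how the unique_constraint fields could be used to reject duplicates on insert. It is written in Python purely to illustrate the logic and is not the repository's implementation. It assumes each entry in fields is a one-element Postgres text array such as "{Pay Dt}"; the helper names constraint_key and insert_new, and the sample rows, are hypothetical.

```python
import json

# Source definition, trimmed from the tps.srce example above.
SRCE = {
    "name": "WMPD",
    "unique_constraint": {"fields": ["{Pay Dt}", "{Carrier}"]},
}

def constraint_key(record, srce):
    """Build a hashable key from the fields named in unique_constraint."""
    # Assumption: each field is a one-element text array literal, e.g. "{Pay Dt}".
    names = [f.strip("{}") for f in srce["unique_constraint"]["fields"]]
    return json.dumps({n: record.get(n) for n in names}, sort_keys=True)

def insert_new(records, existing_keys, srce):
    """Return only records whose key is not yet known; record new keys."""
    accepted = []
    for rec in records:
        key = constraint_key(rec, srce)
        if key not in existing_keys:
            existing_keys.add(key)
            accepted.append(rec)
    return accepted

seen = set()
batch = [
    {"Carrier": "ACME", "Pd Amt": "10.00", "Pay Dt": "2017-10-01"},
    {"Carrier": "ACME", "Pd Amt": "12.50", "Pay Dt": "2017-10-01"},  # same key as first row
    {"Carrier": "ACME", "Pd Amt": "10.00", "Pay Dt": "2017-10-02"},
]
loaded = insert_new(batch, seen, SRCE)
```

With this definition the second row is dropped because it shares Pay Dt and Carrier with the first, even though Pd Amt differs.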

tps.map_rm schema

{
    "name":"Strip Amount Commas",
    "description":"the Amount field comes from PNC with commas embedded, so it cannot be cast to numeric",
    "defn": [
        {
            "key": "{Amount}",        /*this is a Postgres text array stored in json*/
            "field": "amount",        /*key name assigned to result of regex*/
            "regex": ",",             /*regular expression*/
            "flag":"g",
            "retain":"y",
            "map":"n"
        }
    ],
    "function":"replace",
    "where": [
        {
        }
    ]
}
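
To make the "replace" function concrete, here is a hedged Python sketch of what applying this defn entry to one record might look like. It assumes the replacement text is an empty string and that the "g" flag means "replace every match", as it does for Postgres regexp_replace; neither is stated in the profile itself. The function name apply_replace and the sample amount are hypothetical.

```python
import re

# One entry from the "defn" array of the map profile above.
DEFN = {"key": "{Amount}", "field": "amount", "regex": ",", "flag": "g"}

def apply_replace(record, defn, replacement=""):
    """Apply a regex replace to the source key and return it under the new field name."""
    # Assumption: "key" is a one-element text array literal naming the source field.
    source_key = defn["key"].strip("{}")
    # Assumption: flag "g" replaces all matches; otherwise only the first one.
    count = 0 if defn.get("flag") == "g" else 1
    cleaned = re.sub(defn["regex"], replacement, record[source_key], count=count)
    return {defn["field"]: cleaned}

result = apply_replace({"Amount": "1,234,567.89"}, DEFN)
first_only = apply_replace({"Amount": "1,234,567.89"}, {**DEFN, "flag": ""})
```

Here result holds the amount with every comma removed, so it can be cast to numeric, while first_only shows the effect of leaving the global flag off.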

Notes

Pull various static files into Postgres and apply basic transformations without losing the original document or writing custom code for each scenario.

This is an in-between for a foreign data wrapper and custom programming.

Storage

All records are stored as jsonb. Applied mappings are stored in associated jsonb documents.

Import

Data is loaded with the Postgres COPY command.

Mappings

  1. regular expressions are used to extract pieces of the json objects
  2. the results of the regular expressions are matched against a list of basic mappings and written to an associated jsonb document

Each regular expression within a targeted pattern can be set to map or not. The items flagged for mapping should then be joined to map_rv with = rather than @> to avoid duplicating rows.
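
The difference between joining on equality and joining on containment can be sketched outside Postgres. The Python below is only an illustration: contains is a rough analogue of jsonb @> for flat objects, and the map_rv rows and their values are made up for the example.

```python
def contains(doc, sub):
    """Rough analogue of jsonb @> for flat objects: every pair in sub appears in doc."""
    return all(k in doc and doc[k] == v for k, v in sub.items())

# Result of the regular expressions for one record (hypothetical values).
extracted = {"party": "ACME", "reason": "fee"}

# Hypothetical map_rv rows as (return value, mapped value) pairs.
map_rv = [
    ({"party": "ACME"}, {"account": "vendor"}),
    ({"party": "ACME", "reason": "fee"}, {"account": "bank fees"}),
]

# Containment matches every map_rv row that is a subset of the extracted result.
by_containment = [mapped for ret, mapped in map_rv if contains(extracted, ret)]

# Equality matches only the row whose return value is exactly the extracted result.
by_equality = [mapped for ret, mapped in map_rv if ret == extracted]
```

Containment returns two mapped values for the single record, which is the row duplication the equality join avoids.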

Transformation tools

  • COPY
  • regexp_matches()

Difficulties

Non-standard file formats will require additional logic, for example the PNC loan balance and collateral CSV files.

  1. External: Anything not in CSV should be converted external to Postgres and then imported as CSV
  2. Direct: Outside logic can be set up to push new records to tps.trans directly from non-CSV formatted sources or fdw sources
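
As an illustration of the External option, the sketch below converts newline-delimited JSON into CSV text that could then be imported. The input format, the function name json_lines_to_csv, and the sample row are assumptions chosen for the example; any other non-CSV source would need its own conversion.

```python
import csv
import io
import json

def json_lines_to_csv(lines, columns):
    """Convert newline-delimited JSON objects into CSV text with a header row."""
    out = io.StringIO()
    writer = csv.DictWriter(
        out,
        fieldnames=columns,
        extrasaction="ignore",  # drop keys that are not part of the source schema
        lineterminator="\n",
    )
    writer.writeheader()
    for line in lines:
        if line.strip():  # skip blank lines
            writer.writerow(json.loads(line))
    return out.getvalue()

raw = ['{"Carrier": "ACME", "Pd Amt": "1,200.00", "Pay Dt": "2017-10-01"}', ""]
csv_text = json_lines_to_csv(raw, ["Carrier", "Pd Amt", "Pay Dt"])
```

The csv module quotes the amount because it contains a comma, so the embedded comma survives the round trip and can be stripped later by a map profile.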

Interface

Maybe start out in Excel until it gets firmed up.

  • list existing mappings
    • apply mappings to see what results come back
  • experiment with new mappings