3458 Commits

Author SHA1 Message Date
John Bodley
f102eab33c
[crud] Improving performance (#5136) 2018-06-05 17:24:19 -07:00
Xiao Hanyu
b71f551493 Optimize Presto SQL Lab query performance. (#5132)
By stopping polling once the Presto query has already finished.

When a user queries Presto via SQL Lab, Presto runs the query and can
then return all the data to Superset in one shot.

However, Superset's default implementation polls Presto in order to:

- Get the fancy progress bar
- Get the data back when the query has finished.

However, Superset's polling implementation is flawed.

I profiled a query against a table of about 1 billion rows. Here is some data:

- Total number of rows: 1.02 Billion
- SQL Lab query limit: 1 million
- Output Data: 1.5 GB
- Superset memory consumed: about 10-20 GB
- Time: 7 minutes to finish in Presto, plus an additional 15 minutes for
  Superset to fetch and store the data.

The problem with the default behavior is that, even after Presto has
finished the query (7 minutes in the profiling above), Superset still does
a lot of wasted polling. In the profiling above, Superset sent about 540
polling requests in total, and at least half of them were unnecessary.

Part of the simplified polling response:

```
{
  "infoUri": "http://10.65.204.39:8000/query.html?20180525_042715_03742_nza9u",
  "id": "20180525_042715_03742_nza9u",
  "nextUri": "http://10.65.204.39:8000/v1/statement/20180525_042715_03742_nza9u/11",
  "stats": {
    "state": "FINISHED",
    "queuedSplits": 21701,
    "progressPercentage": 35.98264191882267,
    "elapsedTimeMillis": 1029,
    "nodes": 116,
    "completedSplits": 15257,
    "scheduled": true,
    "wallTimeMillis": 2571904,
    "peakMemoryBytes": 0,
    "processedBytes": 40825519532,
    "processedRows": 47734066,
    "queuedTimeMillis": 0,
    "queued": false,
    "cpuTimeMillis": 849228,
    "rootStage": {
      "state": "FINISHED",
      "queuedSplits": 0,
      "nodes": 1,
      "totalSplits": 17,
      "processedBytes": 16829644,
      "processedRows": 11495,
      "completedSplits": 17,
      "stageId": "0",
      "done": true,
      "cpuTimeMillis": 69,
      "subStages": [
        {
          "state": "CANCELED",
          "queuedSplits": 21701,
          "nodes": 116,
          "totalSplits": 42384,
          "processedBytes": 40825519532,
          "processedRows": 47734066,
          "completedSplits": 15240,
          "stageId": "1",
          "done": true,
          "cpuTimeMillis": 849159,
          "subStages": [],
          "wallTimeMillis": 2570374,
          "userTimeMillis": 730020,
          "runningSplits": 5443
        }
      ],
      "wallTimeMillis": 1530,
      "userTimeMillis": 50,
      "runningSplits": 0
    },
    "totalSplits": 42401,
    "userTimeMillis": 730070,
    "runningSplits": 5443
  }
}
```

Superset stops polling only when it finds that `nextUri` has become
none. In fact, `["stats"]["state"] == "FINISHED"` already means that
Presto has finished the query, so Superset can stop polling at that
point and get the data back.
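
The early-exit condition described above can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: `poll_until_finished` and the `fetch` callable are hypothetical names, and `fetch` stands in for whatever client retrieves the next polling response.

```python
from typing import Any, Callable, Dict


def poll_until_finished(
    first_response: Dict[str, Any],
    fetch: Callable[[str], Dict[str, Any]],
) -> int:
    """Follow `nextUri` links, stopping early once the state is FINISHED.

    `fetch` is a hypothetical callable that takes a URI and returns the
    decoded polling response as a dict. Returns the number of polling
    requests that were issued.
    """
    polls = 0
    response = first_response
    while True:
        state = response.get("stats", {}).get("state")
        next_uri = response.get("nextUri")
        # Stop as soon as the query reports FINISHED, even if a nextUri
        # is still present; also stop when there is nothing left to follow.
        if state == "FINISHED" or next_uri is None:
            return polls
        response = fetch(next_uri)
        polls += 1
```

With the response shown earlier, where `state` is `FINISHED` while `nextUri` is still set, a loop of this shape would return immediately instead of issuing further requests.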

After this simple optimization, we get a 2-5x performance boost for
Presto SQL Lab queries.
2018-06-05 08:56:18 -07:00
Maxime Beauchemin
d2bc4ece3e
Bump celery to 4.1.1 (#5134)
* Bump celery to 4.1.1

Docs reference `celery worker --app=superset.sql_lab:celery_app
--pool=gevent -Ofair` command which seems only to work with Celery 4.1.1

* Add UPDATING.md message
2018-06-04 14:54:36 -07:00
Maxime Beauchemin
ffd65ce623
Pin FAB to 1.10.0 (#5133)
Related to
https://github.com/apache/incubator-superset/issues/5088#issuecomment-394064133
2018-06-04 09:03:30 -07:00
Tamika Tannis
dc21e0dd78 URL shortener for dashboards (#4760)
* Added support for URLShortLinkButton to work for the dashboard case

* Fix lint errors and test

* Change references to 'slice' to 'chart'.

* Add unit tests to improve coverage

* Fixing lint errors

* Refactor to make URLShortLink more generic. Remove history modification code, redirect should be handling this.

* Remove history modification code, redirect should be handling this

* Generate a shorter link without the directory, and delegate default linked to the contents of window.location

* Fix lint errors

* Fix test_shortner test to check for new pattern

* Remove usage of addHistory to manipulate explore shortlink redirection

* Address build failure and using better practices for shortlink defaults

* Fixing alphabetical order

* More syntax mistakes

* Revert explore view history changes

* Fix use of component props, & rebase
2018-06-02 11:08:43 -07:00
Michelle Thomas
47768284d0 Adding tests for adhoc metric as timeseries_limit_metric 2018-05-31 23:48:12 -07:00
Gabe Lyons
cc0942ac98 updating adhoc metric filtering (#5105) 2018-05-31 23:34:48 -07:00
Ky-Anh Huynh
556ef44fac docs: Add new Athena URI scheme awsathena+rest:// (#5112)
See also some discussions on https://github.com/laughingman7743/PyAthenaJDBC/pull/62
2018-05-31 22:19:07 -07:00
Beto Dealmeida
1d3e96bce0 Allow multiple time shifts (#5067)
* Allow multiple time shifts

* Handle old form data
2018-05-31 21:18:36 -07:00
Gabe Lyons
40fadfcb4f adding null checks to adhoc filter popover (#5111) 2018-05-31 21:00:20 -07:00
Maxime Beauchemin
28611108d7
Refactor NULL handling into method, disable for DECK.gl vizes (#5106) 2018-05-31 16:01:49 -07:00
michellethomas
ff4b103025 Fixing time table viz for adhoc metrics (#5117) 2018-05-31 13:53:26 -07:00
Michelle Thomas
6f05b48385 Adding the MetricsControl to the timeseries_limit_metric field 2018-05-31 12:52:00 -07:00
Maxime Beauchemin
4ecd95a318
[bugfix] deck.gl on druid always shows animation (#5107) 2018-05-31 11:57:53 -07:00
Gabe Lyons
f3778c3c81 fixing LIKE constant name (#5110) 2018-05-31 11:34:51 -07:00
timifasubaa
cefc206a36
Merge pull request #5023 from timifasubaa/fix_sqllab_commit
[sqllab] force limit queries only when there is no existing limit
2018-05-31 11:12:46 -07:00
Beto Dealmeida
875d0b5ad2 Override time grain in annotations (#5084) 2018-05-30 15:29:48 -07:00
timifasubaa
e8b25988e2
Merge pull request #5109 from cxmcc/patch-1
Add Lime to Superset user list.
2018-05-30 15:17:16 -07:00
Xiuming Chen
21967f40e7
Add Lime to Superset user list.
2018-05-30 14:24:03 -07:00
Timi Fasubaa
a9d7fafd9f add tests 2018-05-30 12:50:27 -07:00
Maxime Beauchemin
f6117973e9
Bump dep on pydruid to 0.4.3 (#5098) 2018-05-30 09:15:10 -07:00
John Bodley
0511d1f38d
[get_df] Adding support for multi-statement SQL (#5086) 2018-05-29 14:20:17 -07:00
谢邵虎
7dbb45e5fb add CnOvit to Superset users list (#5094) 2018-05-29 13:59:49 -07:00
Beto Dealmeida
6c3e469154 Add more time grains (#5083)
* Add more time grains

* Use FLOOR

* Fix quotes for lint
2018-05-29 12:43:48 -07:00
Maciej Bryński
ae50845843 Proper error handling in Hive Queries (#4428)
* Proper error handling in Hive Queries

* Change quotes

* Trigger checks

* Adding call to parent class

* Small fix

* Fix in method call
2018-05-29 12:42:45 -07:00
zjj
459267785f Fix python2 str() in visualization (#5093) 2018-05-29 10:33:22 -07:00
Timi Fasubaa
d38315a307 reuse_regex_logic 2018-05-25 15:07:27 -07:00
Timi Fasubaa
1aced9b562 force limit only when there is no existing limit 2018-05-25 14:54:11 -07:00
Yongjie Zhao
c18ef89034 [bugfix] fix visualization with adhocMetric (#5080)
* fix visualization with adhocMetric

* update
2018-05-25 09:48:18 -07:00
Alexander Ko
e30215c3d8 Add 24 hours refresh for dashboard (#5068)
* adding 24 hours refresh

* adding additional hours
2018-05-24 17:44:24 -07:00
Maxime Beauchemin
42d0597b90
Use a dummy version number on master (#5000)
Currently we assign release version number in release branches and
master was still pointing to some old version number from when the
process was different. We need a dummy version number that both setuptools
and npm are ok with.
2018-05-24 17:42:46 -07:00
John Bodley
3207116535
Revert "[get_df] Adding support for multi-statement SQL" (#5078) 2018-05-24 14:59:34 -07:00
michellethomas
1aaa73b548 Translate string to array for multi fields in getControlsState (#5057)
* Translate string to array for multi fields in getControlsState

* Updating format to fit on one line
2018-05-23 22:30:44 -07:00
Maxime Beauchemin
05061a73ce
Fix time shift color assignments (#5065)
closes https://github.com/apache/incubator-superset/pull/4765
2018-05-23 21:30:03 -07:00
John Bodley
d322e48c57
[markup] Enable allow-forms (#5062) 2018-05-23 13:30:02 -07:00
Gabe Lyons
fa3e4e23b3 integrating dashboard filters with adhoc filters (#5056) 2018-05-23 11:46:00 -07:00
John Bodley
17d6464aa9
[get_df] Adding support for multi-statement SQL (#5060) 2018-05-23 11:40:25 -07:00
Grace Guo
4c44223234
[Dashboard] Allow Superset Alpha, Gamma users to save dashboard as a copy (#5051) 2018-05-22 15:31:37 -07:00
michellethomas
b8aeb1a825 Allow MetricsControl to aggregate on a column with an expression (#5021)
* Allow MetricsControl to aggregate on a column with an expression

* Adding test case for metrics based on columns
2018-05-22 09:58:38 -07:00
Hua Jigang
b312cdad2f fix metrics type error in pivot table viz (#5025)
convert metrics dict labels to a list of strings
2018-05-21 21:13:10 -07:00
Beto Dealmeida
973c661501 Rename "slice" to "chart" and update translations (#5008)
* Rename slice to chart and update translations

* Fix unit tests
2018-05-21 17:49:02 -07:00
Beto Dealmeida
459cb701fb Visualization for multiple line charts (#4819)
* Initial test

* Save

* Working version

* Use since/until from payload

* Option to prefix metric name

* Rename LineMultiLayer to MultiLineViz

* Add more styles

* Explicit nulls

* Add more x controls

* Refactor to reuse nvd3_vis

* Fix x ticks

* Fix spacing

* Fix for druid datasource

* Rename file

* Small fixes and cleanup

* Fix margins

* Add proper thumbnails

* Align yaxis1 and yaxis2 ticks

* Improve code

* Trigger tests

* Move file

* Small fixes plus example

* Fix unit test

* Remove SQL and Filter sections
2018-05-21 17:47:21 -07:00
Gabe Lyons
a746fce383 expanding simple tab (#5032) 2018-05-21 16:22:40 -07:00
Gabe Lyons
0e1fb62db2 forcing ace editor to refresh when it is shown (#5038) 2018-05-21 16:20:46 -07:00
Maxime Beauchemin
ce0011e5fc
Add missing dep on contextlib2 (#5027) 2018-05-21 13:19:07 -07:00
Gabe Lyons
1c9474b4ff treating floats like doubles for druid versions lower than 11.0.0 (#5030) 2018-05-21 11:50:04 -07:00
Yongjie Zhao
9f66dae328 [bugfix] Fix ZeroDivisionError and get metrics label with percent metrics (#5026)
* Fix percent_metrics ZeroDivisionError and the failure to get metrics by label

* convert iterator to list

* get percentage metrics with get_metric_label method

* Adding test case for expression type metrics

* Simplify expression
2018-05-20 11:10:57 -05:00
timifasubaa
5505c116ba
Merge pull request #5019 from timifasubaa/fix_error_message_for_missing_datasource
fix missing datasource error message
2018-05-18 01:13:47 -07:00
timifasubaa
63115fbb87
nit 2018-05-17 23:34:12 -07:00
Timi Fasubaa
f52f7aa7cf raise exception early 2018-05-17 17:39:59 -07:00