* fetch datasources from broker endpoint when refreshing new datasources
* remove get_base_coordinator_url as it is no longer used
* add broker_endpoint in get_test_cluster_obj
This commit dockerizes superset for the local development
environment.
The basic design is:
- Run superset, redis and postgres as services instead of using sqlite,
to simulate production environment settings
- Use environment variables to configure the various app settings.
Environment variables make it easier than traditional config files to
run and configure superset in any environment
- For the local development environment, expose postgres and redis on
the host machine so that you can connect to the local ports via `psql`
or `redis-cli`
- Wrap the start-up command in a standard `docker-entrypoint.sh`, and
use `tail -f /dev/null` combined with a manual `superset runserver -d`
to make sure that a code error doesn't cause the container to exit.
- Use volumes to share code between host and container, so you can use
your favourite tools to modify the code and it will run in the
containerized environment
- Use volumes to persist postgres and redis data, and also
`node_modules` data.
- If we don't cache `node_modules` in a docker volume, then every time
docker build runs, the `node_modules` directory, which is about 500 MB,
is sent to the docker daemon and makes the build quite slow.
- Wrap initialization commands in a single script `docker-init.sh`
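As a rough illustration of the environment-variable approach in the
list above, a config module could derive its settings like this. This
is only a sketch: the variable names (`POSTGRES_HOST`, `REDIS_HOST`,
...), their defaults, the `REDIS_URL` key and the `build_settings`
helper are assumptions made for the example, not the contents of
`contrib/docker/superset_config.py`.

```python
import os
from typing import Mapping
from urllib.parse import quote_plus


def build_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Build app settings from environment variables.

    Every variable name and default below is hypothetical.
    """
    pg_user = env.get("POSTGRES_USER", "superset")
    # Escape the password so characters like '@' do not break the URI.
    pg_password = quote_plus(env.get("POSTGRES_PASSWORD", "superset"))
    pg_host = env.get("POSTGRES_HOST", "postgres")
    pg_port = env.get("POSTGRES_PORT", "5432")
    pg_db = env.get("POSTGRES_DB", "superset")

    redis_host = env.get("REDIS_HOST", "redis")
    redis_port = env.get("REDIS_PORT", "6379")

    return {
        "SQLALCHEMY_DATABASE_URI": (
            f"postgresql://{pg_user}:{pg_password}@{pg_host}:{pg_port}/{pg_db}"
        ),
        # Hypothetical key; used here only to show the same pattern for redis.
        "REDIS_URL": f"redis://{redis_host}:{redis_port}/0",
    }
```

Passing the environment in as a mapping keeps the function easy to
exercise with a plain dict instead of the real process environment.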
After this setup, any developer who wants to contribute to
superset can just follow these steps:
```
git clone https://github.com/apache/incubator-superset/
cd incubator-superset
cp contrib/docker/{docker-build.sh,docker-compose.yml,docker-entrypoint.sh,docker-init.sh,Dockerfile} .
cp contrib/docker/superset_config.py superset/
bash -x docker-build.sh
docker-compose up -d
docker-compose exec superset bash
bash docker-init.sh
```
* Improve time shift color and pattern
* Revert change
* Fix js unit test
* Move code to better place, add unit test
* Move classed code to backend
* Remove console.log
* Remove 1 hour time compare
* Remove unused import
By stopping polling once the presto query has already finished.
When a user runs a query against Presto via SQL Lab, presto runs the
query and can then return all the data back to superset in one shot.
However, superset enables polling for presto by default, in order to:
- Get the fancy progress bar
- Get the data back when the query finished.
However, the polling implementation in superset polls longer than it
needs to.
I profiled a query against a table of 1 billion rows; here is some data:
- Total number of rows: 1.02 Billion
- SQL Lab query limit: 1 million
- Output Data: 1.5 GB
- Superset memory consumed: about 10-20 GB
- Time: 7 minutes to finish in Presto, plus an additional 15 minutes for
superset to get and store the data.
The problem with the default implementation is that even after presto
has finished the query (7 minutes in the above profiling), superset
still does a lot of wasted polling: in the above profiling, superset
sent about 540 polling requests in total, and at least half of them
were unnecessary.
Part of the simplified polling response:
```
{
"infoUri": "http://10.65.204.39:8000/query.html?20180525_042715_03742_nza9u",
"id": "20180525_042715_03742_nza9u",
"nextUri": "http://10.65.204.39:8000/v1/statement/20180525_042715_03742_nza9u/11",
"stats": {
"state": "FINISHED",
"queuedSplits": 21701,
"progressPercentage": 35.98264191882267,
"elapsedTimeMillis": 1029,
"nodes": 116,
"completedSplits": 15257,
"scheduled": true,
"wallTimeMillis": 2571904,
"peakMemoryBytes": 0,
"processedBytes": 40825519532,
"processedRows": 47734066,
"queuedTimeMillis": 0,
"queued": false,
"cpuTimeMillis": 849228,
"rootStage": {
"state": "FINISHED",
"queuedSplits": 0,
"nodes": 1,
"totalSplits": 17,
"processedBytes": 16829644,
"processedRows": 11495,
"completedSplits": 17,
"stageId": "0",
"done": true,
"cpuTimeMillis": 69,
"subStages": [
{
"state": "CANCELED",
"queuedSplits": 21701,
"nodes": 116,
"totalSplits": 42384,
"processedBytes": 40825519532,
"processedRows": 47734066,
"completedSplits": 15240,
"stageId": "1",
"done": true,
"cpuTimeMillis": 849159,
"subStages": [],
"wallTimeMillis": 2570374,
"userTimeMillis": 730020,
"runningSplits": 5443
}
],
"wallTimeMillis": 1530,
"userTimeMillis": 50,
"runningSplits": 0
},
"totalSplits": 42401,
"userTimeMillis": 730070,
"runningSplits": 5443
}
}
```
Superset terminates the polling when it finds that `nextUri`
is none, but actually `["stats"]["state"] == "FINISHED"` already
means that presto has finished the query, so superset can stop
polling and get the data back.
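The stopping rule described above can be sketched as follows. This is a
minimal illustration, not the actual superset or presto client code:
`poll_until_finished` and the `fetch` callable (which takes a `nextUri`
and returns the decoded JSON response) are hypothetical names
introduced for the example.

```python
from typing import Callable, Iterator


def poll_until_finished(
    first: dict,
    fetch: Callable[[str], dict],
) -> Iterator[dict]:
    """Yield polling responses, stopping once presto reports FINISHED.

    `fetch` is a hypothetical callable: given a `nextUri`, it returns
    the next polling response as a dict.
    """
    response = first
    while True:
        yield response
        state = response.get("stats", {}).get("state")
        next_uri = response.get("nextUri")
        # Stop on the original condition (no nextUri) or on the new one
        # (the query state is already FINISHED).
        if next_uri is None or state == "FINISHED":
            return
        response = fetch(next_uri)
```

With a response like the one shown above, where `state` is `FINISHED`
but `nextUri` is still set, the loop ends right away instead of
following `nextUri` any further.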
After this simple optimization, we get a 2-5x performance boost for
Presto SQL Lab queries.
* Bump celery to 4.1.1
The docs reference the `celery worker --app=superset.sql_lab:celery_app
--pool=gevent -Ofair` command, which seems to work only with Celery 4.1.1
* Add UPDATING.md message
* Added support for URLShortLinkButton to work for the dashboard case
* Fix lint errors and test
* Change references to 'slice' to 'chart'.
* Add unit tests to improve coverage
* Fixing lint errors
* Refactor to make URLShortLink more generic. Remove history modification code, redirect should be handling this.
* Remove history modification code, redirect should be handling this
* Generate a shorter link without the directory, and delegate the default link to the contents of window.location
* Fix lint errors
* Fix test_shortner test to check for new pattern
* Remove usage of addHistory to manipulate explore shortlink redirection
* Address build failure and using better practices for shortlink defaults
* Fixing alphabetical order
* More syntax mistakes
* Revert explore view history changes
* Fix use of component props, & rebase
Currently we assign the release version number in release branches, and
master was still pointing to an old version number from when the
process was different. We need a dummy version number that both setuptools
and npm are ok with.