Apache Superset is a Data Visualization and Data Exploration Platform
Maxime Beauchemin b839608c32
[sql lab] a better approach at limiting queries (#4947)
* [sql lab] a better approach at limiting queries

Currently there are two mechanisms that we use to enforce the row
limiting constraints, depending on the database engine:
1. use dbapi's `cursor.fetchmany()`
2. wrap the SQL into a limiting subquery

Method 1 isn't great as it can result in the database server storing
larger than required result sets in memory expecting another fetch
command while we know we don't need that.

Method 2 has a positive side of working with all database engines,
whether they use LIMIT, ROWNUM, TOP or whatever else since sqlalchemy
does the work as specified for the dialect. On the downside though
the query optimizer might not be able to optimize this as much as an
approach that doesn't use a subquery.
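
As a rough illustration of the two existing mechanisms (not the code from this commit), the sketch below uses the standard-library `sqlite3` driver. The helper names `fetch_with_fetchmany` and `fetch_with_subquery`, the `events` table, and the `inner_qry` alias are all hypothetical, and the wrapper hard-codes `LIMIT` where the commit message says sqlalchemy renders the dialect-specific syntax.

```python
import sqlite3


def fetch_with_fetchmany(conn, sql, limit):
    """Method 1: run the query unchanged, then fetch at most `limit` rows."""
    cursor = conn.cursor()
    cursor.execute(sql)
    # The server may still have prepared the full result set at this point.
    return cursor.fetchmany(limit)


def fetch_with_subquery(conn, sql, limit):
    """Method 2: wrap the user's SQL in a limiting subquery.

    Assumption: LIMIT is hard-coded here for brevity; a real implementation
    would let the dialect decide between LIMIT, ROWNUM, TOP and so on.
    """
    inner = sql.strip().rstrip(";")
    wrapped = "SELECT * FROM ({}) AS inner_qry LIMIT {}".format(inner, int(limit))
    cursor = conn.cursor()
    cursor.execute(wrapped)
    return cursor.fetchall()


# Hypothetical in-memory table with 100 rows to query against.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER)")
conn.executemany("INSERT INTO events VALUES (?)", [(i,) for i in range(100)])

rows_m1 = fetch_with_fetchmany(conn, "SELECT id FROM events ORDER BY id", 5)
rows_m2 = fetch_with_subquery(conn, "SELECT id FROM events ORDER BY id;", 5)
```

Both calls return five rows; the difference is where the limit is enforced, in the client for method 1 and in the database for method 2.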

Since most modern DBs use the LIMIT syntax, this adds a regex approach
that modifies the query and forces a LIMIT clause without using a
subquery for the databases that support this syntax, and uses method 2
for all others.
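
A minimal sketch of what such a regex approach could look like, assuming the query ends in at most one plain `LIMIT <n>` clause. The function name `force_limit` and the pattern are hypothetical, not taken from the commit, and the sketch ignores `OFFSET`, trailing comments, and dialect-specific variants.

```python
import re

# Matches a trailing "LIMIT <integer>" at the very end of the statement.
_TRAILING_LIMIT = re.compile(r"\s+LIMIT\s+\d+\s*$", re.IGNORECASE)


def force_limit(sql, limit):
    """Return `sql` with its trailing LIMIT replaced by (or set to) `limit`."""
    # Drop surrounding whitespace and a trailing semicolon, if any.
    stripped = sql.strip().rstrip(";").rstrip()
    # Remove an existing outermost LIMIT so that exactly one remains.
    without_limit = _TRAILING_LIMIT.sub("", stripped)
    return "{} LIMIT {}".format(without_limit, int(limit))


unlimited = force_limit("SELECT * FROM t", 100)
overridden = force_limit("select * from t limit 5000;", 100)
nested = force_limit("SELECT * FROM (SELECT * FROM t LIMIT 5) x", 10)
```

Only a `LIMIT` at the very end of the statement is rewritten, so a limit inside a subquery is left alone. Note that this sketch always overrides the user's limit, even when the user asked for fewer rows; a real implementation would probably keep the smaller of the two.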

* Fixing build

* Fix lint

* Added more tests

* Fix tests
2018-05-14 14:44:05 -05:00
docs [docs] add entry for Hive in installation.rst (#4942) 2018-05-07 15:26:19 -07:00
install/helm/superset Install superset in Kubernetes with helm chart (#4923) 2018-05-03 17:35:38 -07:00
scripts [flake8] Adding flake8-coding (#4477) 2018-02-25 15:06:11 -08:00
superset [sql lab] a better approach at limiting queries (#4947) 2018-05-14 14:44:05 -05:00
tests [sql lab] a better approach at limiting queries (#4947) 2018-05-14 14:44:05 -05:00
.gitignore [docs] add entry for Hive in installation.rst (#4942) 2018-05-07 15:26:19 -07:00
.pylintrc [sql lab] Use context manager for sqllab sessions (#4927) 2018-05-10 10:32:31 -07:00
.travis.yml [setup] Dropping 3.4 and adding 3.6 (#4835) 2018-04-17 21:30:12 -07:00
alembic.ini [WiP] rename project from Caravel to Superset (#1576) 2016-11-09 23:08:22 -08:00
CHANGELOG.md CHANGELOG for 0.25.0 (#4948) 2018-05-08 08:24:54 -07:00
CODE_OF_CONDUCT.md Create CODE_OF_CONDUCT.md (#3991) 2017-12-02 14:57:54 -08:00
CONTRIBUTING.md [docs] minor file name and format fix for the setup document (#4844) 2018-04-19 11:34:23 -07:00
gen_changelog.sh CHANGELOG for 0.20.0 (#3545) 2017-09-28 14:42:57 -07:00
ISSUE_TEMPLATE.md [WiP] rename project from Caravel to Superset (#1576) 2016-11-09 23:08:22 -08:00
LICENSE.txt LICENSE 2015-07-21 20:54:31 +00:00
MANIFEST.in Removing files from MANIFEST.in (#4542) 2018-03-06 09:39:31 -08:00
pypi_push.sh Fixing pypi_push.sh 2017-01-24 11:42:49 -08:00
README.md add Airboxlab to Superset users list (#4938) 2018-05-07 09:59:35 -07:00
requirements-dev.txt RFC: add logger that logs into browser console (#4702) 2018-04-12 21:48:17 -07:00
requirements.txt bump pyhive version 2018-05-10 11:13:44 -07:00
setup.cfg [travis/tox] Restructuring configuration (#4552) 2018-04-10 15:59:44 -07:00
setup.py [deps] force flask<=1.0.0 (#4959) 2018-05-13 11:16:09 -07:00
tox.ini [pylint] prepping for enabling pylint for non-errors (#4884) 2018-04-28 20:08:09 -07:00
UPDATING.md CHANGELOG for 0.25.0 (#4948) 2018-05-08 08:24:54 -07:00

Superset

Join the chat at https://gitter.im/airbnb/superset

Apache Superset (incubating) is a modern, enterprise-ready business intelligence web application.

[this project used to be named Caravel, and Panoramix in the past]

Screenshots & Gifs

View Dashboards


Slice & dice your data


Query and visualize your data with SQL Lab


Visualize geospatial data with deck.gl


Choose from a wide array of visualizations


Apache Superset

Apache Superset is a data exploration and visualization web application.

Superset provides:

  • An intuitive interface to explore and visualize datasets, and create interactive dashboards.
  • A wide array of beautiful visualizations to showcase your data.
  • Easy, code-free user flows to drill down and slice and dice the data underlying exposed dashboards. The dashboards and charts act as a starting point for deeper analysis.
  • A state of the art SQL editor/IDE exposing a rich metadata browser, and an easy workflow to create visualizations out of any result set.
  • An extensible, high granularity security model allowing intricate rules on who can access which product features and datasets. Integration with major authentication backends (database, OpenID, LDAP, OAuth, REMOTE_USER, ...)
  • A lightweight semantic layer that lets you control how data sources are exposed to the user by defining dimensions and metrics
  • Out of the box support for most SQL-speaking databases
  • Deep integration with Druid allows for Superset to stay blazing fast while slicing and dicing large, realtime datasets
  • Fast loading dashboards with configurable caching

Database Support

Superset speaks many SQL dialects through SQLAlchemy, a Python ORM that is compatible with most common databases.

Superset can be used to visualize data out of most databases:

  • MySQL
  • Postgres
  • Vertica
  • Oracle
  • Microsoft SQL Server
  • SQLite
  • Greenplum
  • Firebird
  • MariaDB
  • Sybase
  • IBM DB2
  • Exasol
  • MonetDB
  • Snowflake
  • Redshift
  • More! Check whether a SQLAlchemy dialect is available for your database to find out whether it will work with Superset

Druid!

In addition to querying your relational databases, Superset ships with deep integration with Druid (a real-time distributed column store). When querying Druid, Superset can query humongous amounts of data on top of real-time datasets. Note that Superset does not require Druid in any way to function; it's simply another database backend that it can query.

Here's a description of Druid from the http://druid.io website:

Druid is an open-source analytics data store designed for business intelligence (OLAP) queries on event data. Druid provides low latency (real-time) data ingestion, flexible data exploration, and fast data aggregation. Existing Druid deployments have scaled to trillions of events and petabytes of data. Druid is best used to power analytic dashboards and applications.

Installation & Configuration

See the documentation.

Resources

Contributing

Interested in contributing? Casual hacking? Check out CONTRIBUTING.md

Who uses Apache Superset (incubating)?

Here's a list of organizations that have taken the time to send a PR to let the world know they are using Superset. Join our growing community!