Installation & Configuration
============================

Getting Started
---------------

Caravel is tested using Python 2.7 and Python 3.4+. Python 3 is the
recommended version; Python 2.6 won't be supported.

OS dependencies
---------------

Caravel stores database connection information in its metadata database.
For that purpose, we use the ``cryptography`` Python library to encrypt
connection passwords. Unfortunately, this library has OS-level dependencies.

You may want to attempt the next step
("Caravel installation and initialization") and come back to this step if
you encounter an error.

Here's how to install them:

For **Debian** and **Ubuntu**, the following command will ensure that
the required dependencies are installed: ::

    sudo apt-get install build-essential libssl-dev libffi-dev python-dev python-pip

For **Fedora** and **RHEL-derivatives**, the following commands will ensure
that the required dependencies are installed: ::

    sudo yum upgrade python-setuptools
    sudo yum install gcc libffi-devel python-devel python-pip python-wheel openssl-devel

On **OSX**, the system Python is not recommended. Homebrew's Python also
ships with pip: ::

    brew install pkg-config libffi openssl python

    env LDFLAGS="-L$(brew --prefix openssl)/lib" CFLAGS="-I$(brew --prefix openssl)/include" pip install cryptography

**Windows** isn't officially supported at this point, but if you want to
attempt it, download `get-pip.py <https://bootstrap.pypa.io/get-pip.py>`_
and run ``python get-pip.py``, which may need admin access. Then run the
following: ::

    C:\> pip install cryptography

    # You may also have to create C:\Temp
    C:\> md C:\Temp

Python virtualenv
-----------------

It is recommended to install Caravel inside a virtualenv. Python 3 already
ships with virtualenv; for Python 2 you need to install it. If it's packaged
for your operating system, install it from there; otherwise you can install
it with pip: ::

    pip install virtualenv

You can create and activate a virtualenv with: ::

    # virtualenv is shipped in Python 3 as pyvenv
    virtualenv venv
    . ./venv/bin/activate

On Windows, the syntax for activating it is a bit different: ::

    venv\Scripts\activate

Once you have activated your virtualenv, everything you do is confined to
it. To exit a virtualenv, just type ``deactivate``.

Caravel installation and initialization
---------------------------------------

Follow these few simple steps to install Caravel: ::

    # Install caravel
    pip install caravel

    # Create an admin user
    fabmanager create-admin --app caravel

    # Initialize the database
    caravel db upgrade

    # Create default roles and permissions
    caravel init

    # Load some data to play with
    caravel load_examples

    # Start the web server on port 8088
    caravel runserver -p 8088

    # To start a development web server, use the -d switch
    # caravel runserver -d

After installation, you should be able to point your browser to the right
hostname:port `http://localhost:8088 <http://localhost:8088>`_, log in using
the credentials you entered while creating the admin account, and navigate to
`Menu -> Admin -> Refresh Metadata`. This action should bring in all of
your datasources for Caravel to be aware of, and they should show up in
`Menu -> Datasources`, from where you can start playing with your data!

Configuration behind a load balancer
------------------------------------

If you are running Caravel behind a load balancer or reverse proxy (e.g. NGINX
or ELB on AWS), you may need to use a healthcheck endpoint so that your
load balancer knows whether your Caravel instance is running. This is provided
at ``/health``, which returns a 200 response containing "OK" if the
webserver is running.
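
As a rough illustration, a monitoring script could poll the ``/health``
endpoint as sketched below. This is not part of Caravel: the function name
``is_healthy`` is an assumption made for the example, and the code relies on
Python 3's ``urllib.request``: ::

    from urllib.error import URLError
    from urllib.request import urlopen


    def is_healthy(base_url, timeout=5):
        """Return True if GET <base_url>/health answers 200 with "OK" in the body."""
        try:
            response = urlopen(base_url.rstrip("/") + "/health", timeout=timeout)
            try:
                body = response.read().decode("utf-8", "replace")
                return response.getcode() == 200 and "OK" in body
            finally:
                response.close()
        except (URLError, OSError):
            # Connection refused, timeout, or an HTTP error status
            return False

For a local instance started with ``caravel runserver -p 8088``, this would be
called as ``is_healthy("http://localhost:8088")``.
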

Configuration
-------------

To configure your application, you need to create a file (module)
``caravel_config.py`` and make sure it is in your PYTHONPATH. Here are some
of the parameters you can copy / paste in that configuration module: ::

    #---------------------------------------------------------
    # Caravel specific config
    #---------------------------------------------------------
    ROW_LIMIT = 5000
    CARAVEL_WORKERS = 16

    CARAVEL_WEBSERVER_PORT = 8088
    #---------------------------------------------------------

    #---------------------------------------------------------
    # Flask App Builder configuration
    #---------------------------------------------------------
    # Your App secret key
    SECRET_KEY = '\2\1thisismyscretkey\1\2\e\y\y\h'

    # The SQLAlchemy connection string to your database backend
    # This connection defines the path to the database that stores your
    # caravel metadata (slices, connections, tables, dashboards, ...).
    # Note that the connection information to connect to the datasources
    # you want to explore is managed directly in the web UI
    SQLALCHEMY_DATABASE_URI = 'sqlite:////tmp/caravel.db'

    # Flask-WTF flag for CSRF
    CSRF_ENABLED = True

This file also allows you to define configuration parameters used by
Flask App Builder, the web framework used by Caravel. Please consult
the `Flask App Builder Documentation
<http://flask-appbuilder.readthedocs.org/en/latest/config.html>`_
for more information on how to configure Caravel.
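
Because ``caravel_config.py`` is an ordinary Python module, values can be
computed rather than hard-coded. The sketch below reads the secret key from an
environment variable instead of committing it to the file. The variable name
``CARAVEL_SECRET_KEY`` and the helper ``load_secret_key`` are hypothetical
choices for this example, not something Caravel defines: ::

    import os


    def load_secret_key(env_var="CARAVEL_SECRET_KEY"):
        """Return the secret key stored in the given environment variable."""
        key = os.environ.get(env_var)
        if not key:
            # Fail loudly rather than fall back to a guessable default
            raise RuntimeError("Set the %s environment variable" % env_var)
        return key


    # In caravel_config.py (commented out so the snippet runs without the variable):
    # SECRET_KEY = load_secret_key()
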

Database dependencies
---------------------

Caravel does not ship bundled with connectivity to databases, except
for SQLite, which is part of the Python standard library.
You'll need to install the required packages for the database you
want to use as your metadata database as well as the packages needed to
connect to the databases you want to access through Caravel.

Here's a list of some of the recommended packages.

+---------------+-------------------------------------+-------------------------------------------------+
| Database      | PyPI package                        | SQLAlchemy URI prefix                           |
+===============+=====================================+=================================================+
| MySQL         | ``pip install mysqlclient``         | ``mysql://``                                    |
+---------------+-------------------------------------+-------------------------------------------------+
| Postgres      | ``pip install psycopg2``            | ``postgresql+psycopg2://``                      |
+---------------+-------------------------------------+-------------------------------------------------+
| Presto        | ``pip install pyhive``              | ``presto://``                                   |
+---------------+-------------------------------------+-------------------------------------------------+
| Oracle        | ``pip install cx_Oracle``           | ``oracle://``                                   |
+---------------+-------------------------------------+-------------------------------------------------+
| SQLite        |                                     | ``sqlite://``                                   |
+---------------+-------------------------------------+-------------------------------------------------+
| Redshift      | ``pip install sqlalchemy-redshift`` | ``redshift+psycopg2://``                        |
+---------------+-------------------------------------+-------------------------------------------------+
| MSSQL         | ``pip install pymssql``             | ``mssql://``                                    |
+---------------+-------------------------------------+-------------------------------------------------+
| Impala        | ``pip install impyla``              | ``impala://``                                   |
+---------------+-------------------------------------+-------------------------------------------------+
| SparkSQL      | ``pip install pyhive``              | ``jdbc+hive://``                                |
+---------------+-------------------------------------+-------------------------------------------------+

Note that many other databases are supported, the main criteria being the
existence of a functional SQLAlchemy dialect and Python driver. Googling
the keyword ``sqlalchemy`` in addition to a keyword that describes the
database you want to connect to should get you to the right place.
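
For instance, pointing the metadata database at Postgres means installing
``psycopg2`` and using the ``postgresql+psycopg2://`` prefix in
``SQLALCHEMY_DATABASE_URI``. The following sketch assembles such a URI; the
helper name, host, user, password, and database name are placeholders invented
for illustration: ::

    try:
        from urllib.parse import quote_plus  # Python 3
    except ImportError:
        from urllib import quote_plus  # Python 2


    def metadata_uri(prefix, user, password, host, port, database):
        """Assemble a SQLAlchemy URI, escaping special characters in the credentials."""
        return "%s%s:%s@%s:%d/%s" % (
            prefix, quote_plus(user), quote_plus(password), host, port, database)


    # Placeholder credentials; substitute your own
    SQLALCHEMY_DATABASE_URI = metadata_uri(
        "postgresql+psycopg2://", "caravel", "p@ss/word", "localhost", 5432, "caravel")

Escaping matters because characters such as ``@`` or ``/`` in a password would
otherwise be read as URI delimiters.
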

Caching
-------

Caravel uses `Flask-Cache <https://pythonhosted.org/Flask-Cache/>`_ for
caching purposes. Configuring your caching backend is as easy as providing
a ``CACHE_CONFIG`` constant in your ``caravel_config.py`` that
complies with the Flask-Cache specifications.

Flask-Cache supports multiple caching backends (Redis, Memcached,
SimpleCache (in-memory), or the local filesystem). If you are going to use
Memcached, please use the pylibmc client library, as python-memcached does
not handle storing binary data correctly. If you use Redis, please install
`python-redis <https://pypi.python.org/pypi/redis>`_.
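
As an illustration, a Redis-backed ``CACHE_CONFIG`` in ``caravel_config.py``
might look like the following. The keys are standard Flask-Cache settings,
while the timeout, key prefix, and Redis URL are assumed values to adapt to
your deployment: ::

    CACHE_CONFIG = {
        # Backend to use; 'redis' requires the redis package
        'CACHE_TYPE': 'redis',
        # Global default timeout, in seconds
        'CACHE_DEFAULT_TIMEOUT': 60 * 60,
        # Prefix added to every cache key
        'CACHE_KEY_PREFIX': 'caravel_',
        # Assumed local Redis instance, database 0
        'CACHE_REDIS_URL': 'redis://localhost:6379/0',
    }
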

Timeouts are set in the Caravel metadata and are resolved by going
up the "timeout searchpath": from your slice configuration, to your
data source's configuration, to your database's, ultimately falling back
to the global default defined in ``CACHE_CONFIG``.
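
The fallback order can be pictured with the sketch below. It only illustrates
the lookup described above and is not Caravel's actual code; the function name
and the convention that ``None`` means "not set" are assumptions: ::

    def resolve_cache_timeout(slice_timeout, datasource_timeout,
                              database_timeout, cache_config):
        """Return the first timeout that is set, walking up the search path."""
        for timeout in (slice_timeout, datasource_timeout, database_timeout):
            if timeout is not None:
                return timeout
        # Nothing set in the metadata: use the global default
        return cache_config.get('CACHE_DEFAULT_TIMEOUT')


    default = {'CACHE_DEFAULT_TIMEOUT': 3600}
    print(resolve_cache_timeout(None, 600, 86400, default))  # datasource wins: 600
    print(resolve_cache_timeout(None, None, None, default))  # global default: 3600
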

Deeper SQLAlchemy integration
-----------------------------

It is possible to tweak the database connection information using the
parameters exposed by SQLAlchemy. In the ``Database`` edit view, you will
find an ``extra`` field as a ``JSON`` blob.

.. image:: _static/img/tutorial/add_db.png
   :scale: 30 %

This JSON string contains extra configuration elements. The ``engine_params``
object gets unpacked into the
`sqlalchemy.create_engine <http://docs.sqlalchemy.org/en/latest/core/engines.html#sqlalchemy.create_engine>`_ call,
while the ``metadata_params`` get unpacked into the
`sqlalchemy.MetaData <http://docs.sqlalchemy.org/en/rel_1_0/core/metadata.html#sqlalchemy.schema.MetaData>`_ call. Refer to the SQLAlchemy docs for more information.
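
To make the unpacking concrete, the sketch below builds an ``extra`` value and
shows, in comments, where each part would end up. ``pool_recycle`` and ``echo``
are real ``create_engine`` arguments and ``schema`` is a real ``MetaData``
argument, but the specific values are arbitrary examples, and the commented
calls paraphrase the behaviour described above rather than quote Caravel's
code: ::

    import json

    extra = json.dumps({
        "engine_params": {"pool_recycle": 3600, "echo": False},
        "metadata_params": {"schema": "analytics"},
    }, indent=4)

    params = json.loads(extra)
    # Roughly what happens with the parsed blob:
    #   engine = sqlalchemy.create_engine(uri, **params["engine_params"])
    #   metadata = sqlalchemy.MetaData(**params["metadata_params"])
    print(extra)  # paste this text into the ``extra`` field
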

Schemas (Postgres & Redshift)
-----------------------------

Postgres and Redshift, as well as other databases,
use the concept of **schema** as a logical entity
on top of the **database**. For Caravel to connect to a specific schema,
there's a **schema** parameter you can set in the table form.

SSL Access to databases
-----------------------

This example worked with a MySQL database that requires SSL. The configuration
may differ with other backends, since the keys accepted in ``connect_args``
depend on the database driver. This is what was put in the ``extra``
parameter: ::

    {
        "metadata_params": {},
        "engine_params": {
            "connect_args": {
                "sslmode": "require",
                "sslrootcert": "/path/to/my/pem"
            }
        }
    }

Druid
-----

* From the UI, enter the information about your clusters in the
  ``Admin->Clusters`` menu by hitting the + sign.

* Once the Druid cluster connection information is entered, hit the
  ``Admin->Refresh Metadata`` menu item to populate your datasources.

* Navigate to your datasources.

Note that you can run the ``caravel refresh_druid`` command to refresh the
metadata from your Druid cluster(s).

CORS
----

The extra CORS dependency must be installed, for example with pip: ::

    pip install caravel[cors]

The following keys in ``caravel_config.py`` can be specified to configure CORS:

* ``ENABLE_CORS``: must be set to ``True`` in order to enable CORS
* ``CORS_OPTIONS``: options passed to Flask-CORS
  (`documentation <http://flask-cors.corydolphin.com/en/latest/api.html#extension>`_)
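
A minimal sketch of such a configuration follows. ``origins``, ``methods``,
and ``supports_credentials`` are Flask-CORS options; the origin shown is a
made-up placeholder: ::

    ENABLE_CORS = True
    CORS_OPTIONS = {
        # Only this (hypothetical) front-end origin may call the API
        'origins': ['https://dashboards.example.com'],
        'methods': ['GET', 'POST'],
        # Allow cookies so the user's Caravel session is sent along
        'supports_credentials': True,
    }
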

Upgrading
---------

Upgrading should be as straightforward as running::

    pip install caravel --upgrade
    caravel db upgrade