Added FAQ and db dependencies to docs

Maxime Beauchemin 2016-04-08 08:44:28 -07:00
parent eff0beb195
commit 0afa5d2cba
3 changed files with 70 additions and 1 deletions

docs/faq.rst Normal file

@ -0,0 +1,34 @@
FAQ
===

Can I query/join multiple tables at one time?
---------------------------------------------

Not directly, no. A Caravel SQLAlchemy datasource can only be a single table
or a view.

When working with tables, the solution would be to materialize
a table that contains all the fields needed for your analysis, most likely
through some scheduled batch process.

A view is a simple logical layer that abstracts an arbitrary SQL query as
a virtual table. This can allow you to join and union multiple tables, and
to apply some transformation using arbitrary SQL expressions. The limitation
there is your database performance as Caravel effectively will run a query
on top of your query (view). A good practice may be to limit yourself to
joining your main large table to one or many small tables only, and avoid
using ``GROUP BY`` where possible as Caravel will do its own ``GROUP BY`` and
doing the work twice might slow down performance.
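
As a rough illustration of the view approach described above, here is a minimal sketch using Python's built-in ``sqlite3`` module. The table names (``orders``, ``countries``), the view name (``orders_enriched``) and the sample rows are all hypothetical, and the final query only approximates the kind of ``GROUP BY`` a tool like Caravel would run on top of the view.

```python
import sqlite3

# In-memory database with a hypothetical large table and a small lookup table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, country_code TEXT, amount REAL);
    CREATE TABLE countries (code TEXT PRIMARY KEY, name TEXT);
    INSERT INTO orders (country_code, amount)
        VALUES ('CA', 10.0), ('CA', 5.0), ('FR', 7.5);
    INSERT INTO countries VALUES ('CA', 'Canada'), ('FR', 'France');

    -- The view joins the main table to a small table and contains no
    -- GROUP BY, leaving the aggregation to the tool that queries it.
    CREATE VIEW orders_enriched AS
    SELECT o.id, o.amount, c.name AS country
    FROM orders o
    JOIN countries c ON c.code = o.country_code;
""")

# An aggregation issued on top of the view, as a datasource query might be.
rows = conn.execute(
    "SELECT country, SUM(amount) AS total "
    "FROM orders_enriched GROUP BY country ORDER BY country"
).fetchall()
print(rows)
```

The view would then be registered in Caravel as if it were a single table.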
Whether you use a table or a view, the important factor is whether your
database is fast enough to serve it in an interactive fashion to provide
a good user experience in Caravel.

How BIG can my data source be?
------------------------------

It can be gigantic! As mentioned above, the main criterion is whether your
database can execute queries and return results in a time frame that is
acceptable to your users. Many distributed databases out there can execute
queries that scan through terabytes in an interactive fashion.


@ -34,6 +34,7 @@ Contents
tutorial
videos
gallery
faq
Indices and tables


@ -121,6 +121,40 @@ the `Flask App Builder Documentation
<http://flask-appbuilder.readthedocs.org/en/latest/config.html>`_
for more information on how to configure Caravel.

Database dependencies
---------------------

Caravel does not ship bundled with connectivity to databases, except
for SQLite, which is part of the Python standard library.

You'll need to install the required packages for the database you
want to use as your metadata database as well as the packages needed to
connect to the databases you want to access through Caravel.

Here's a list of some of the recommended packages.

+---------------+-------------------------------------+-------------------------------------------------+
| Database      | PyPI package                        | SQLAlchemy URI prefix                           |
+===============+=====================================+=================================================+
| MySQL         | ``pip install mysqlclient``         | ``mysql://``                                    |
+---------------+-------------------------------------+-------------------------------------------------+
| Postgres      | ``pip install psycopg2``            | ``postgresql+psycopg2://``                      |
+---------------+-------------------------------------+-------------------------------------------------+
| Presto        | ``pip install pyhive``              | ``presto://``                                   |
+---------------+-------------------------------------+-------------------------------------------------+
| Oracle        | ``pip install cx_Oracle``           | ``oracle://``                                   |
+---------------+-------------------------------------+-------------------------------------------------+
| SQLite        |                                     | ``sqlite://``                                   |
+---------------+-------------------------------------+-------------------------------------------------+
| Redshift      | ``pip install sqlalchemy-redshift`` | ``redshift+psycopg2://``                        |
+---------------+-------------------------------------+-------------------------------------------------+
| MSSQL         | ``pip install pymssql``             | ``mssql://``                                    |
+---------------+-------------------------------------+-------------------------------------------------+

Note that many other databases are supported, the main criterion being the
existence of a functional SQLAlchemy dialect and Python driver. Googling
the keyword ``sqlalchemy`` in addition to a keyword that describes the
database you want to connect to should get you to the right place.
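
To make the relationship between a SQLAlchemy URI and the package to install more concrete, here is a small sketch that looks up the recommendation from the table above. The helper name ``package_for_uri`` and the example URI are hypothetical and not part of Caravel. The mapping only mirrors the packages listed in this section.

```python
# Recommendations copied from the table in this section.
# SQLite needs no extra package, so it maps to None.
DRIVER_PACKAGES = {
    "mysql": "mysqlclient",
    "postgresql": "psycopg2",
    "presto": "pyhive",
    "oracle": "cx_Oracle",
    "sqlite": None,
    "redshift": "sqlalchemy-redshift",
    "mssql": "pymssql",
}


def package_for_uri(uri):
    """Return the recommended PyPI package for a SQLAlchemy URI, or None.

    Hypothetical helper for illustration only.
    """
    scheme = uri.split("://", 1)[0]    # e.g. "postgresql+psycopg2"
    dialect = scheme.split("+", 1)[0]  # e.g. "postgresql"
    if dialect not in DRIVER_PACKAGES:
        raise KeyError("no recommendation listed for dialect %r" % dialect)
    return DRIVER_PACKAGES[dialect]


# Placeholder credentials and host, used only to show the URI shape.
print(package_for_uri("postgresql+psycopg2://user:password@host/dbname"))
```

For a dialect that is not in the table, the helper raises ``KeyError``, which is the point where searching for a SQLAlchemy dialect for that database comes in.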
Caching
-------
@ -147,7 +181,7 @@ parameters exposed by SQLAlchemy. In the ``Database`` edit view, you will
find an ``extra`` field as a ``JSON`` blob.
.. image:: _static/img/tutorial/add_db.png
:scale: 50 %
:scale: 30 %
This JSON string contains extra configuration elements. The ``engine_params``
object gets unpacked into the