docs: update AWS Athena and Redshift docs (#24751)

Multazim Deshmukh 2023-07-24 10:04:14 +05:30 committed by GitHub
parent 0631a8086c
commit b8a3eeffdb
GPG Key ID: 4AEE18F83AFDEB23
2 changed files with 46 additions and 8 deletions


@@ -26,17 +26,14 @@ s3://... -> s3%3A//...
### PyAthena
You can also use [PyAthena library](https://pypi.org/project/PyAthena/) (no Java required) with the
You can also use the [PyAthena library](https://pypi.org/project/PyAthena/) (no Java required) with the
following connection string:
```
awsathena+rest://{aws_access_key_id}:{aws_secret_access_key}@athena.{region_name}.amazonaws.com/{schema_name}?s3_staging_dir={s3_staging_dir}&...
```
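Values placed in this connection string generally need to be percent-encoded, as the `s3://... -> s3%3A//...` hint in the hunk context suggests. The following is a minimal sketch of assembling the URI with the standard library only; the helper name `athena_uri` and all sample values (access key, secret, bucket) are hypothetical and not part of Superset or PyAthena.

```python
from urllib.parse import quote


def athena_uri(access_key, secret_key, region, schema, s3_staging_dir):
    """Assemble an awsathena+rest URI (hypothetical helper, not a PyAthena API)."""
    # Credentials may contain '/', '+' or '@', so encode every reserved character.
    user = quote(access_key, safe="")
    password = quote(secret_key, safe="")
    # Keep '/' readable in the staging dir so the result has the s3%3A//... shape.
    staging = quote(s3_staging_dir, safe="/")
    return (
        f"awsathena+rest://{user}:{password}"
        f"@athena.{region}.amazonaws.com/{schema}"
        f"?s3_staging_dir={staging}"
    )


# Sample values are placeholders only.
uri = athena_uri(
    "AKIAEXAMPLE", "secret/with+chars", "us-east-1", "default",
    "s3://my-bucket/results/",
)
print(uri)
```

Further query parameters (the `&...` in the template) would be appended in the same encoded form.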
The PyAthena library also allows to assume a specific IAM role, by [importing the datasource from YAML](https://superset.apache.org/docs/miscellaneous/importing-exporting-datasources/#importing-datasources-from-yaml) and passing extra parameters:
The PyAthena library can also assume a specific IAM role, which you define by adding the following parameters in Superset's Athena database connection UI under ADVANCED --> Other --> ENGINE PARAMETERS:
```
databases:
- database_name: awsathena
sqlalchemy_uri: awsathena+rest://athena.{region_name}.amazonaws.com/{schema_name}?s3_staging_dir={s3_staging_dir}&...
extra: "{\"engine_params\": {\"connect_args\": {\"role_arn\": \"{{ ROLE_ARN }}\" }}}"
{"connect_args":{"role_arn":"<role arn>"}}
```
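Because the ENGINE PARAMETERS field takes raw JSON, a stray quote or a misplaced key is easy to miss. As a rough sketch, the string could be checked before pasting it in; `check_engine_params` and the sample role ARN are hypothetical, and the check assumes only the `connect_args.role_arn` shape shown above.

```python
import json


def check_engine_params(raw):
    """Parse the engine-parameters JSON and confirm it carries a role ARN."""
    params = json.loads(raw)  # raises ValueError on malformed JSON
    if not isinstance(params, dict):
        raise ValueError("engine parameters must be a JSON object")
    connect_args = params.get("connect_args")
    if not isinstance(connect_args, dict):
        raise ValueError("connect_args must be a JSON object")
    role_arn = connect_args.get("role_arn")
    if not isinstance(role_arn, str) or not role_arn.startswith("arn:"):
        raise ValueError("connect_args.role_arn must be an IAM role ARN string")
    return params


# The ARN below is a placeholder, not a real role.
params = check_engine_params(
    '{"connect_args":{"role_arn":"arn:aws:iam::123456789012:role/example-superset-athena"}}'
)
print(params["connect_args"]["role_arn"])
```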


@@ -10,7 +10,9 @@ version: 1
The [sqlalchemy-redshift](https://pypi.org/project/sqlalchemy-redshift/) library is the recommended
way to connect to Redshift through SQLAlchemy.
You'll need to the following setting values to form the connection string:
This dialect requires either [redshift_connector](https://pypi.org/project/redshift-connector/) or [psycopg2](https://pypi.org/project/psycopg2/) to work properly.
You'll need to set the following values to form the connection string:
- **User Name**: userName
- **Password**: DBPassword
@@ -18,8 +20,47 @@ You'll need to the following setting values to form the connection string:
- **Database Name**: Database Name
- **Port**: default 5439
Here's what the connection string looks like:
### psycopg2
Here's what the SQLALCHEMY URI looks like:
```
redshift+psycopg2://<userName>:<DBPassword>@<AWS End Point>:5439/<Database Name>
```
### redshift_connector
Here's what the SQLALCHEMY URI looks like:
```
redshift+redshift_connector://<userName>:<DBPassword>@<AWS End Point>:5439/<Database Name>
```
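The two URIs differ only in the driver suffix, so they can be produced by one small function. This is a sketch under the assumption that the user name and password should be percent-encoded when they contain reserved characters such as `@` or `/`; the helper name `redshift_uri` and the sample endpoint are hypothetical.

```python
from urllib.parse import quote

# The two drivers named in this section.
DRIVERS = ("psycopg2", "redshift_connector")


def redshift_uri(driver, user, password, endpoint, database, port=5439):
    """Assemble a redshift+<driver> SQLAlchemy URI (hypothetical helper)."""
    if driver not in DRIVERS:
        raise ValueError(f"driver must be one of {DRIVERS}")
    # Encode reserved characters so the URI still parses correctly.
    user = quote(user, safe="")
    password = quote(password, safe="")
    return f"redshift+{driver}://{user}:{password}@{endpoint}:{port}/{database}"


# The endpoint and credentials are placeholders.
uri = redshift_uri(
    "psycopg2", "admin", "p@ss/word",
    "example.abc123.us-east-1.redshift.amazonaws.com", "dev",
)
print(uri)
```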
#### Using IAM-based credentials with Redshift cluster:
An [Amazon Redshift cluster](https://docs.aws.amazon.com/redshift/latest/mgmt/working-with-clusters.html) also supports generating temporary IAM-based database user credentials.
Your Superset app's [IAM role should have permissions](https://docs.aws.amazon.com/redshift/latest/mgmt/generating-iam-credentials-role-permissions.html) to call the `redshift:GetClusterCredentials` operation.
Define the following arguments in Superset's Redshift database connection UI under ADVANCED --> Other --> ENGINE PARAMETERS:
```
{"connect_args":{"iam":true,"database":"<database>","cluster_identifier":"<cluster_identifier>","db_user":"<db_user>"}}
```
The SQLALCHEMY URI should be set to `redshift+redshift_connector://`.
#### Using IAM-based credentials with Redshift serverless:
[Redshift Serverless](https://docs.aws.amazon.com/redshift/latest/mgmt/serverless-whatis.html) supports connections using IAM roles.
Your Superset app's IAM role should have the `redshift-serverless:GetCredentials` and `redshift-serverless:GetWorkgroup` permissions on the Redshift Serverless workgroup.
Define the following arguments in Superset's Redshift database connection UI under ADVANCED --> Other --> ENGINE PARAMETERS:
```
{"connect_args":{"iam":true,"is_serverless":true,"serverless_acct_id":"<aws account number>","serverless_work_group":"<redshift work group>","database":"<database>","user":"IAMR:<superset iam role name>"}}
```
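The cluster and serverless variants use the same `connect_args` wrapper with different keys. As a sketch, both JSON strings could be generated rather than typed by hand; the function names and sample values are hypothetical, and the keys are taken only from the two snippets in this section.

```python
import json


def cluster_iam_params(database, cluster_identifier, db_user):
    """Engine parameters for IAM credentials on a provisioned cluster."""
    args = {
        "iam": True,
        "database": database,
        "cluster_identifier": cluster_identifier,
        "db_user": db_user,
    }
    return json.dumps({"connect_args": args}, separators=(",", ":"))


def serverless_iam_params(database, account_id, work_group, role_name):
    """Engine parameters for IAM credentials on a serverless workgroup."""
    args = {
        "iam": True,
        "is_serverless": True,
        "serverless_acct_id": account_id,
        "serverless_work_group": work_group,
        "database": database,
        # The serverless snippet prefixes the IAM role name with "IAMR:".
        "user": f"IAMR:{role_name}",
    }
    return json.dumps({"connect_args": args}, separators=(",", ":"))


# All values below are placeholders.
cluster_json = cluster_iam_params("dev", "my-cluster", "superset")
serverless_json = serverless_iam_params("dev", "123456789012", "my-workgroup", "superset-role")
print(cluster_json)
print(serverless_json)
```

Either string can be pasted into the ENGINE PARAMETERS field as-is.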