'Connecting athena to superset

So, I am using AWS athena where I have Data Source set to AwsDataCatalog, database set to test_db, under which I have a table named debaprc.

Now, I have superset installed on an EC2 instance (in virtual environment). On the Instance, I have installed PyAthenaJDBC and PyAthena. Now, when I launch Superset and try to add a database, the syntax given is this:

awsathena+rest://{aws_access_key_id}:{aws_secret_access_key}@athena.{region_name}.amazonaws.com/{schema_name}?s3_staging_dir={s3_staging_dir}

Now I have 2 questions -

  1. What do I provide for schema_name?
  2. I tried putting test_db as schema_name but it couldn't connect for some reason. Am I doing this right or do I need to do stuff differently?


Solution 1:[1]

Beware of the encoding:

awsathena+rest://{aws_access_key_id}:{aws_secret_access_key}@athena.{region_name}.amazonaws.com:443/{schema_name}?AwsRegion={region_name}&s3_staging_dir=s3%3A%2F%2Faws-athena-results-xxxxxxx

For example, for me it has been necessary to:

  • transform s3:// to s3%3A%2F%2F (and not just the : like in Superset doc?)
  • add the region again in the extra parameters

If you do not provide schema name (also called database), I think it defaults to a value of default

Sadly when a connection string fails on Superset, nothing very helpful is displayed...

Solution 2:[2]

It worked for me adding port 443 to the connection string as below and you can use test_db as schema_name:

awsathena+rest://{aws_access_key_id}:{aws_secret_access_key}@athena.{region_name}.amazonaws.com:443/{schema_name}?s3_staging_dir={s3_staging_dir}

Solution 3:[3]

Check PyAthena version. Superset docs tell PyAthena>1.2.0 while PyAthena PyPI page says PyAthena[SQLAlchemy]>=1.0.0, <2.0.0. In my case PyAthena[SQLAlchemy]>1.2.0, <2.0.0 (combining both constraints) solved an issue and the tables were present in dropdown list in SQL Lab (it was empty with PyAthena==2.5.1 (latest) version before).

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Pedro MS
Solution 2 Luan Felipe Ulrich
Solution 3 Anton Bryzgalov