cultureamp/dbt-athena

dbt-athena

Installation

pip install git+https://github.com/Tomme/dbt-athena.git

Configuring your profile

A dbt profile can be configured to run against AWS Athena using the following configuration:

| Option | Description | Required? | Example |
|---|---|---|---|
| s3_staging_dir | S3 location to store Athena query results and metadata | Required | s3://bucket/dbt/ |
| region_name | AWS region of your Athena instance | Required | eu-west-1 |
| schema | Specify the schema (Athena database) to build models into (lowercase only) | Required | dbt |
| database | Specify the database (Data catalog) to build models into (lowercase only) | Required | awsdatacatalog |

Example profiles.yml entry:

athena:
  target: dev
  outputs:
    dev:
      type: athena
      s3_staging_dir: s3://athena-query-results/dbt/
      region_name: eu-west-1
      schema: dbt
      database: awsdatacatalog

Additional information

  • threads is supported
  • database and catalog can be used interchangeably
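
The requirements listed for the profile options can be checked before dbt runs. The following is a minimal sketch only: `validate_athena_target` and `REQUIRED_KEYS` are hypothetical names, not part of dbt or this adapter, and the `s3://` prefix check is an assumption based on the example values.

```python
# Hypothetical pre-flight check for one output entry of profiles.yml.
# Not part of dbt or dbt-athena; it only mirrors the options documented above.
REQUIRED_KEYS = ("s3_staging_dir", "region_name", "schema", "database")


def validate_athena_target(target: dict) -> list:
    """Return a list of human-readable problems; an empty list means no problem found."""
    problems = []
    for key in REQUIRED_KEYS:
        if not target.get(key):
            problems.append(f"missing required option: {key}")
    # schema and database are documented as lowercase only
    for key in ("schema", "database"):
        value = target.get(key)
        if value and value != value.lower():
            problems.append(f"{key} must be lowercase: {value!r}")
    # Assumption: the staging directory is an S3 URI, as in the example profile
    staging = target.get("s3_staging_dir")
    if staging and not staging.startswith("s3://"):
        problems.append(f"s3_staging_dir should be an S3 URI: {staging!r}")
    return problems


dev = {
    "type": "athena",
    "s3_staging_dir": "s3://athena-query-results/dbt/",
    "region_name": "eu-west-1",
    "schema": "dbt",
    "database": "awsdatacatalog",
}
print(validate_athena_target(dev))  # prints []
```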

Usage notes

Models

Table Configuration

  • external_location (default=none)
    • The location where Athena saves your table in Amazon S3
    • If none, it defaults to {s3_staging_dir}/tables
    • Note: If you use a static value, Athena does not remove the existing underlying data when the table is recreated, which causes a HIVE_PATH_ALREADY_EXISTS error. Ideally, set a dynamic value to avoid this issue. Here are some examples of dynamic values:
      • external_location="s3://dbt-tables/example-directory/" + run_started_at.isoformat()
      • external_location="s3://dbt-tables/example-directory/" + invocation_id
  • partitioned_by (default=none)
    • A list of columns by which the table will be partitioned
    • Currently limited to the creation of 100 partitions
  • bucketed_by (default=none)
    • A list of columns by which to bucket the data
  • bucket_count (default=none)
    • The number of buckets for bucketing your data
  • format (default='parquet')
    • The data format for the table
    • Supports ORC, PARQUET, AVRO, JSON, or TEXTFILE
  • field_delimiter (default=none)
    • Custom field delimiter, for when format is set to TEXTFILE
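
To illustrate the external_location behaviour described above, here is a rough Python sketch of the default and of the two dynamic suffixes. It is not the adapter's code: `resolve_external_location` and `dynamic_location` are hypothetical helpers, the trailing-slash handling is an assumption, and a UTC timestamp and a random UUID stand in for dbt's `run_started_at` and `invocation_id`.

```python
import uuid
from datetime import datetime, timezone
from typing import Optional


def resolve_external_location(s3_staging_dir: str,
                              external_location: Optional[str] = None) -> str:
    """Use the configured location, else fall back to {s3_staging_dir}/tables."""
    if external_location is not None:
        return external_location
    # Assumption: a trailing slash on the staging dir is dropped before joining
    return s3_staging_dir.rstrip("/") + "/tables"


def dynamic_location(base: str, suffix: str) -> str:
    """Append a per-run suffix so every rebuild writes to a fresh S3 prefix."""
    return base.rstrip("/") + "/" + suffix


base = "s3://dbt-tables/example-directory/"
# Stand-ins for run_started_at.isoformat() and invocation_id
by_time = dynamic_location(base, datetime.now(timezone.utc).isoformat())
by_run = dynamic_location(base, str(uuid.uuid4()))

print(resolve_external_location("s3://bucket/dbt/"))  # prints s3://bucket/dbt/tables
print(by_time)
print(by_run)
```

Because each run produces a different suffix, a recreated table never points at a prefix that already holds data. The trade-off is that older prefixes stay in S3 until something else cleans them up.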

More information: CREATE TABLE AS
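
Since partitioned_by is limited to 100 partitions, it can help to estimate how many partitions a column choice would create before building the table. A minimal sketch, assuming the data is available as a list of dicts; `count_partitions` and `MAX_PARTITIONS` are hypothetical names for illustration only:

```python
MAX_PARTITIONS = 100  # limit stated in the table configuration notes


def count_partitions(rows, partitioned_by):
    """Count the distinct combinations of partition column values in rows."""
    return len({tuple(row[col] for col in partitioned_by) for row in rows})


# Sample data: 24 rows spread over the 12 months of one year
rows = [{"year": 2020, "month": m % 12 + 1, "value": m} for m in range(24)]

n = count_partitions(rows, ["year", "month"])
print(n, n <= MAX_PARTITIONS)  # prints 12 True
```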

Supported functionality

Experimental support for incremental models.

  • Only inserts
  • Does not support the use of unique_key
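
The practical effect of insert-only incremental models can be sketched in plain Python. This is an illustration of the semantics, not the adapter's implementation; `incremental_insert` and the `ts` cursor column are hypothetical. New rows are appended, and because there is no unique_key, a later version of an existing record is added as a second row rather than replacing the first.

```python
def incremental_insert(existing, source, cursor_column):
    """Append source rows newer than anything already in the table (no updates)."""
    if existing:
        high_water = max(row[cursor_column] for row in existing)
        new_rows = [row for row in source if row[cursor_column] > high_water]
    else:
        # First run: the table is empty, so every source row is inserted
        new_rows = list(source)
    return existing + new_rows


table = incremental_insert([], [{"id": 1, "ts": 1}], "ts")
# Second run: id 1 was modified upstream (ts=3) and id 2 is new
table = incremental_insert(
    table, [{"id": 1, "ts": 1}, {"id": 2, "ts": 2}, {"id": 1, "ts": 3}], "ts"
)
print(len(table))                     # prints 3
print([row["id"] for row in table])  # prints [1, 2, 1]
```

If duplicates like this matter, they have to be handled downstream, for example by selecting the latest row per id when querying.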

Due to the nature of AWS Athena, not all core dbt functionality is supported. The following features of dbt are not implemented on Athena:

  • Snapshots

Known issues

Running tests

First, install the adapter and its dependencies using make (see Makefile):

make install_deps

Next, configure the environment variables in dev.env to match your Athena development environment. Finally, run the tests using make:

make run_tests

Community

About

The athena adapter plugin for dbt (https://getdbt.com)
