Lariat Data

AWS Athena

Instructions for installing and configuring the AWS Athena Agent


Last updated 1 year ago

AWS Athena is a serverless query engine, based on Presto, for querying data in Amazon S3.

Use this integration to build Indicators and Virtual Datasets on top of the databases defined in AWS Athena.

Pre-requisites:

  • git

  • terraform-cli (> v1.2): detailed installation instructions

  • Your Lariat API Key and Application Key. Contact someone on the Lariat Team to get these.

Download the Athena Terraform Installer

git clone git@github.com:lariat-data/terraform-lariat-athena-agent.git

Run the installation

LARIAT_API_KEY=<your_api_key> LARIAT_APPLICATION_KEY=<your_app_key> terraform apply

During installation you create a YAML config. In this config you specify the databases and tables to track, as well as a unique source id that identifies the metrics coming from this source.

The installer scopes permissions based on the databases selected. It uploads the config file to your object storage. To add further tables or databases, edit this file directly and then run the update command with the installer. You may use the wildcard * to match any table names that follow a chosen pattern.

Athena-specific configuration is stored in S3, in a bucket created by Lariat. The bucket name should start with lariat-athena-default-config and end in a timestamp.
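If your account has many buckets, the config bucket can be picked out by its name prefix. The following is a minimal sketch, assuming you already have a list of bucket names; the helper name find_config_buckets and the sample bucket names are hypothetical, and the exact timestamp format in the suffix is not specified here, so only the prefix is checked.

```python
# Prefix taken from the text above; the timestamp suffix format is not assumed.
CONFIG_BUCKET_PREFIX = "lariat-athena-default-config"


def find_config_buckets(bucket_names):
    """Return the bucket names that start with the Lariat config prefix, sorted."""
    return sorted(
        name for name in bucket_names if name.startswith(CONFIG_BUCKET_PREFIX)
    )


if __name__ == "__main__":
    # Hypothetical bucket names, for illustration only.
    names = [
        "my-company-logs",
        "lariat-athena-default-config-20240101120000",
        "athena-query-results",
    ]
    print(find_config_buckets(names))
```

The list of names could come from any source, for example the Name fields returned by boto3's list_buckets call on an S3 client.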

Update the config file with the databases and tables you want to monitor. You may use the wildcard * to match any table names that follow a chosen pattern:

databases:
  my_database_name:
    - catalog_sales_*
    - user_sessions
    - my_other_table

source_id:
  a_unique_source_id # e.g. lariat-aws-athena-us-east-1
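To preview which tables a wildcard pattern would select, the example config above can be sketched in Python. This assumes the agent treats * as a shell-style glob, which the documentation does not state explicitly; the function name tracked_tables and the sample table names are hypothetical. Note that fnmatchcase also treats ? and [...] as special characters, which may differ from the agent's behavior.

```python
from fnmatch import fnmatchcase

# Mirrors the example config above, written as a plain dict.
CONFIG = {
    "databases": {
        "my_database_name": ["catalog_sales_*", "user_sessions", "my_other_table"],
    },
    "source_id": "a_unique_source_id",
}


def tracked_tables(config, database, table_names):
    """Return the tables in `database` matched by any configured pattern.

    Assumption: patterns use glob-style matching, where * matches any run
    of characters. Matching is case-sensitive.
    """
    patterns = config["databases"].get(database, [])
    return [t for t in table_names if any(fnmatchcase(t, p) for p in patterns)]


if __name__ == "__main__":
    # Hypothetical table names, for illustration only.
    tables = ["catalog_sales_2023", "catalog_sales_2024", "user_sessions", "orders"]
    print(tracked_tables(CONFIG, "my_database_name", tables))
    # → ['catalog_sales_2023', 'catalog_sales_2024', 'user_sessions']
```

Under this assumption, catalog_sales_* picks up both catalog_sales tables, user_sessions matches by exact name, and orders is left untracked because no pattern covers it.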
