Amazon Athena

Amazon Athena is an interactive query service that enables SQL-based data analysis directly on Amazon S3.

It is serverless: there is no infrastructure to set up, configure, or manage, and you pay only for the queries you run or the compute capacity you reserve.
Athena handles scaling, software updates, and maintenance automatically.
Easy to start: define schemas and run queries in the built-in query editor, or let AWS Glue discover and catalog the data automatically.
Queries data in place on S3, so analysis does not require complex ETL jobs to load it first.
Supports ANSI SQL and is built on the open-source Trino and Presto engines for low-latency, interactive queries.
Compatible with various data formats (CSV, JSON, ORC, Avro, Parquet) and offers federated query capabilities to integrate multiple data sources.
Offers two pricing models: per-query pricing based on the amount of data scanned, or capacity-based pricing for provisioned compute, giving flexibility in cost management.
Executes queries in parallel automatically, so results come back quickly even for large datasets.
Runs queries on compute resources across multiple facilities and relies on Amazon S3’s highly durable storage, providing data redundancy and high availability.
Provides access control through IAM policies, ACLs, and S3 bucket policies, and supports querying data encrypted with either server-side or client-side encryption.
Integrates with AWS Glue for metadata management, schema versioning, and ETL capabilities, enhancing data governance and query performance.
Offers built-in connectors to various AWS, on-premises, and cloud data sources, enabling insights from multiple data locations without data movement.
Supports invoking SageMaker ML models within SQL queries, enabling complex analytical tasks like anomaly detection and sales predictions using SQL syntax.
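
The "define a schema, then run a query" workflow mentioned above can also be driven from code. The following is a minimal Python sketch, assuming a boto3 Athena client is passed in. The database name, bucket paths, and table definition are hypothetical placeholders, not values from this overview.

```python
import time

# Hypothetical names, used only for illustration.
DATABASE = "sales_db"
RESULTS_LOCATION = "s3://example-athena-results/"

# Schema-on-read: the table is only metadata pointing at files already in S3.
CREATE_TABLE_SQL = """
CREATE EXTERNAL TABLE IF NOT EXISTS orders (
    order_id string,
    amount double
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION 's3://example-data-bucket/orders/'
"""


def run_query(client, sql, database, output_location, poll_seconds=1.0):
    """Submit a query and poll until it reaches a terminal state.

    `client` is expected to behave like boto3.client("athena").
    Returns a (query_execution_id, final_state) tuple.
    """
    started = client.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )
    query_id = started["QueryExecutionId"]
    while True:
        status = client.get_query_execution(QueryExecutionId=query_id)
        state = status["QueryExecution"]["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            return query_id, state
        time.sleep(poll_seconds)
```

With AWS credentials configured, this could be called as `run_query(boto3.client("athena"), CREATE_TABLE_SQL, DATABASE, RESULTS_LOCATION)`. Passing the client in as a parameter keeps the function easy to exercise with a stub.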
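
To make the per-query pricing model concrete, here is a rough cost estimate in Python. The rate of 5 USD per terabyte scanned, the 10 MB per-query minimum, and the binary definition of a terabyte are all assumptions for illustration and are not stated in this overview. Check the current AWS pricing page for your Region before relying on them.

```python
# Assumed figures (not from the text above); verify against current AWS pricing.
PRICE_PER_TB_USD = 5.00               # assumed rate per terabyte scanned
MIN_BYTES_PER_QUERY = 10 * 1024**2    # assumed 10 MB minimum charge per query
BYTES_PER_TB = 1024**4                # assumed binary terabyte


def estimate_query_cost(bytes_scanned, price_per_tb=PRICE_PER_TB_USD):
    """Estimate the charge in USD for one query under per-query pricing."""
    billable_bytes = max(bytes_scanned, MIN_BYTES_PER_QUERY)
    return billable_bytes / BYTES_PER_TB * price_per_tb


# Scanning a full terabyte versus a quarter of it, for example after
# converting the data to a compressed columnar format such as Parquet.
full_scan = estimate_query_cost(1024**4)
reduced_scan = estimate_query_cost(256 * 1024**3)
```

Because the charge scales with bytes scanned, anything that reduces the scan (columnar formats, compression, partitioning) reduces the cost proportionally under these assumptions.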
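
As an illustration of calling a SageMaker model from SQL, the sketch below builds a query that uses Athena's `USING EXTERNAL FUNCTION ... SAGEMAKER` clause. The function name, column, endpoint, and table are hypothetical, and the helper is only meant for trusted, hard-coded values since it interpolates them directly into the SQL text.

```python
def build_sagemaker_query(function_name, arg_name, arg_type,
                          return_type, endpoint, table):
    """Return SQL that applies a SageMaker-hosted model to every row of a table.

    All arguments are inserted into the SQL text as-is, so pass only
    trusted values (never raw user input).
    """
    return (
        f"USING EXTERNAL FUNCTION {function_name}({arg_name} {arg_type})\n"
        f"    RETURNS {return_type}\n"
        f"    SAGEMAKER '{endpoint}'\n"
        f"SELECT {arg_name}, {function_name}({arg_name}) AS prediction\n"
        f"FROM {table}"
    )


# Hypothetical anomaly-detection endpoint and table, for illustration only.
ANOMALY_SQL = build_sagemaker_query(
    function_name="detect_anomaly",
    arg_name="amount",
    arg_type="DOUBLE",
    return_type="DOUBLE",
    endpoint="example-anomaly-endpoint",
    table="orders",
)
```

The resulting string can be submitted like any other Athena query. The declared argument and return types need to match what the deployed endpoint actually accepts and returns.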