Amazon Athena is an interactive query service that enables SQL-based data analysis directly on Amazon S3.
- It’s serverless: there is no infrastructure to set up, configure, or manage, and Athena handles scaling, software updates, and maintenance automatically.
- Easy to start: define schemas and run queries in the built-in query editor, or use AWS Glue to discover and catalog data automatically.
- Queries data in place on S3, so analysis does not require complex ETL jobs to prepare or load the data first.
- Supports ANSI SQL and is built on the open-source Trino and Presto engines, which are designed for low-latency, interactive queries.
- Compatible with various data formats (CSV, JSON, ORC, Avro, Parquet) and offers federated query capabilities to integrate multiple data sources.
- Offers two pricing models for flexible cost management: per-query pricing based on the amount of data scanned, or capacity-based pricing for dedicated compute resources.
- Executes queries in parallel automatically, so results typically return quickly even on large datasets.
- Runs queries on compute resources spread across multiple facilities and relies on Amazon S3’s highly durable storage, providing data redundancy and high availability.
- Provides security controls through IAM policies, ACLs, and S3 bucket policies, and can query data encrypted with either server-side or client-side encryption.
- Integrates with AWS Glue for metadata management, schema versioning, and ETL capabilities, enhancing data governance and query performance.
- Offers built-in connectors to various AWS, on-premises, and cloud data sources, enabling insights from multiple data locations without data movement.
- Supports invoking SageMaker ML models within SQL queries, enabling complex analytical tasks like anomaly detection and sales predictions using SQL syntax.
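
To make the query workflow and the per-query pricing model above more concrete, here is a minimal Python sketch, not a definitive implementation. It assumes a client object exposing `start_query_execution` and `get_query_execution` in the shape of the boto3 Athena client. The helper names `run_query` and `estimate_cost_usd`, the rate of 5 USD per TB scanned, and the 10 MB per-query minimum are illustrative assumptions; actual pricing varies by Region and should be checked against the current AWS pricing page.

```python
import time

# Assumptions for illustration only: check current Athena pricing for your Region.
ASSUMED_USD_PER_TB = 5.0
ASSUMED_MIN_BYTES = 10 * 1024**2  # assumed 10 MB minimum billed per query


def estimate_cost_usd(bytes_scanned, usd_per_tb=ASSUMED_USD_PER_TB,
                      min_bytes=ASSUMED_MIN_BYTES):
    """Rough per-query cost estimate from bytes scanned.

    Ignores special cases (for example DDL statements or failed queries).
    """
    billed_bytes = max(bytes_scanned, min_bytes)
    return billed_bytes / 1024**4 * usd_per_tb


def run_query(client, sql, database, output_location, poll_seconds=1.0):
    """Submit a query, poll until it reaches a terminal state, and summarize it."""
    started = client.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )
    query_id = started["QueryExecutionId"]

    # Athena runs queries asynchronously, so poll for a terminal state.
    while True:
        execution = client.get_query_execution(
            QueryExecutionId=query_id)["QueryExecution"]
        state = execution["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            break
        time.sleep(poll_seconds)

    scanned = execution.get("Statistics", {}).get("DataScannedInBytes", 0)
    return {
        "query_id": query_id,
        "state": state,
        "bytes_scanned": scanned,
        "estimated_usd": estimate_cost_usd(scanned),
    }
```

With boto3 installed and AWS credentials configured, the client could be created with `boto3.client("athena")` and passed to `run_query` along with a database name and an S3 output location of your own. Because per-query cost scales with data scanned, columnar formats such as Parquet or ORC generally reduce the estimate compared with CSV or JSON for the same query.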