
Amazon DataZone

AWS

Amazon DataZone provides secure, purpose-driven access to data that aligns with organizational security policies without relying on individual credentials. It gives transparency into asset usage and supports a governed workflow for approving data subscriptions. Its usage auditing capabilities let you monitor data assets across domains and projects.

Key Features:

  • Domains:
    • Scalable building blocks that organize resources according to business teams or lines of business (LOBs).
    • Allow creation of business-specific taxonomies using metadata forms and glossaries.
    • Govern data and control access through a domain’s associated resources.
  • Workflows for Publishing and Subscribing:
    • Facilitate decentralized data ownership and federated governance for data sharing.
    • Data producers publish and govern their data assets and configure subscription rules for consumers.
    • Data consumers access desired data after completing an approval workflow with data owners.
  • Fulfilling Grants on AWS Data:
    • Automatically manages permissions for AWS Lake Formation-managed AWS Glue tables and Amazon Redshift tables and views.
    • For other asset types, emits standard events for user actions (e.g., subscription requests or approvals).
    • Integrates with other AWS services or third-party solutions using these events for custom workflows.
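
The publish-and-subscribe workflow above can be illustrated with a small in-memory model. This is only a sketch of the idea (producers publish, consumers request, owners approve, and access follows the approval); it is not the Amazon DataZone API, and every class, method, and project name in it is hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    REJECTED = "rejected"


@dataclass
class SubscriptionRequest:
    asset: str
    consumer_project: str
    reason: str
    status: Status = Status.PENDING


@dataclass
class Catalog:
    """Hypothetical stand-in for a governed business data catalog."""
    owners: dict = field(default_factory=dict)    # asset -> producer project
    requests: list = field(default_factory=list)  # all subscription requests
    grants: set = field(default_factory=set)      # (asset, consumer project)

    def publish(self, asset, producer_project):
        # The producer publishes an asset and becomes its owner.
        self.owners[asset] = producer_project

    def request_subscription(self, asset, consumer_project, reason):
        # A consumer asks for access; nothing is granted yet.
        if asset not in self.owners:
            raise KeyError(f"{asset} is not published")
        req = SubscriptionRequest(asset, consumer_project, reason)
        self.requests.append(req)
        return req

    def review(self, req, reviewer_project, approve):
        # Only the owning project may approve or reject a request.
        if reviewer_project != self.owners[req.asset]:
            raise PermissionError("only the owning project can review")
        req.status = Status.APPROVED if approve else Status.REJECTED
        if approve:
            self.grants.add((req.asset, req.consumer_project))

    def can_read(self, asset, project):
        # Owners always have access; consumers need an approved grant.
        return project == self.owners.get(asset) or (asset, project) in self.grants


catalog = Catalog()
catalog.publish("sales_orders", "sales-producers")
req = catalog.request_subscription(
    "sales_orders", "marketing-analytics", "campaign attribution"
)
catalog.review(req, "sales-producers", approve=True)
print(catalog.can_read("sales_orders", "marketing-analytics"))
```

In the real service, the grant step for Lake Formation-managed Glue tables and Redshift tables and views is fulfilled automatically; the `grants` set here merely stands in for that step.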

Use Cases:

  • Control Data Sharing:
    • Abstracts the data sharing process between producers and consumers.
    • Domains manage access control through a subscription approval process that supports any account and AWS Region.
  • Understand Access Rights:
    • Allows creating business use case–based groupings for teams, tools, and data.
    • Enables self-service access to data and analytics tools, while administrators manage access centrally.
  • Organize by Business Units:
    • Securely organizes resources according to business-driven domains (e.g., LOBs).
    • Domains serve as scalable containers for Amazon DataZone objects like data assets, projects, and associated AWS accounts.
    • Provides a mechanism for organizational discipline in data governance and cataloging.

Amazon DataZone Capabilities:

  • Automate Catalog Hydration Using LLMs:
    • Uses large language models to automate the curation and hydration of the data catalog.
    • Auto-generates business names for structured data to facilitate data discovery.
  • Start Small and Scale Quickly:
    • Allows LOBs or teams to control their domains and share assets, fostering quick adoption and scalability across the organization.
  • Increase Productivity of Data People:
    • Promotes collaboration by enabling seamless switching between tools and integration with customized tools.

Integrations:

  • Producer Data Sources:
    • Publishes data from built-in sources like AWS Glue Data Catalog and Amazon Redshift.
    • Allows custom asset types and public APIs for other data sources.
    • Integrates with AWS Glue data quality for scheduled data quality scores.
  • Analytics Tools:
    • Works with Amazon Athena and Amazon Redshift Query Editor for direct data querying.
    • Extensible via APIs for third-party tool integration.
    • Shares project access context with these tools.
  • Access Fulfillment:
    • Automatically manages permissions for AWS Lake Formation-managed assets.
    • Emits standard events for other assets to facilitate integration with AWS or third-party solutions.
  • Machine Learning (ML) Tools:
    • Integrates with Amazon SageMaker to enable easy access to data and ML assets.
    • Supports ML governance and facilitates the publication of new ML assets to the business data catalog.
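
The event-based integration described under Access Fulfillment can be sketched as a small router that maps an incoming event to a custom workflow step. This is a minimal sketch, assuming events arrive in an Amazon EventBridge-style envelope with `source`, `detail-type`, and `detail` keys. The source value, the detail-type strings, the payload, and the action names below are all assumptions for illustration and should be checked against the events your own domain actually emits.

```python
# Assumed envelope value; verify against real events before relying on it.
DATAZONE_SOURCE = "aws.datazone"

# Hypothetical mapping from an event's detail-type to a custom workflow step.
ACTIONS = {
    "Subscription Request Created": "notify_data_owner",
    "Subscription Request Accepted": "grant_access_in_external_system",
}


def route_event(event: dict) -> str:
    """Return the name of the workflow step to run for an incoming event."""
    if event.get("source") != DATAZONE_SOURCE:
        return "ignore"
    # Unknown detail-types fall through to "ignore" rather than raising.
    return ACTIONS.get(event.get("detail-type"), "ignore")


# Hypothetical sample event; the detail payload is a placeholder.
sample = {
    "source": "aws.datazone",
    "detail-type": "Subscription Request Accepted",
    "detail": {"note": "hypothetical payload"},
}
print(route_event(sample))
```

A function like this could run inside whatever target receives the events, for example a function that updates permissions in a third-party store for assets that are not fulfilled automatically.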
