Secrets and Blocks
On this page, you will:
- Understand Prefect Blocks for configuration management
- Create blocks via Terraform
- Integrate with AWS Secrets Manager
- Connect to your S3 data lake buckets
Overview
Prefect Blocks are reusable configuration objects that store connection details, credentials, and settings for external systems. They keep sensitive data out of your code and make flows portable across environments.
┌─────────────────────────────────────────────────────────────────────────────┐
│                               PREFECT BLOCKS                                │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│        ┌─────────────────┐  ┌─────────────────┐  ┌─────────────────┐        │
│        │ AWS Credentials │  │    Snowflake    │  │    S3 Bucket    │        │
│        │                 │  │    Connector    │  │                 │        │
│        │   (IAM Role)    │  │  account        │  │  bucket_name    │        │
│        │                 │  │  user           │  │  credentials    │        │
│        │                 │  │  password       │  │                 │        │
│        └────────┬────────┘  └────────┬────────┘  └────────┬────────┘        │
│                 │                    │                    │                 │
│                 └────────────────────┼────────────────────┘                 │
│                                      │                                      │
│                                      ▼                                      │
│                           ┌─────────────────────┐                           │
│                           │     Your Flows      │                           │
│                           │ block = Block.load  │                           │
│                           └─────────────────────┘                           │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘
Creating Blocks via Terraform
Blocks can be managed via Terraform using the Prefect provider, keeping your infrastructure as code.
Add Blocks to Prefect Terraform
Update your Prefect Terraform configuration to include blocks. Add blocks.tf to terraform/prefect/config/:
# =============================================================================
# Prefect Blocks
# =============================================================================
# Blocks for connecting to external systems.
# -----------------------------------------------------------------------------
# Data Sources - Reference AWS Resources
# -----------------------------------------------------------------------------
# Reference the S3 data lake buckets created in AWS Terraform
data "terraform_remote_state" "aws" {
  backend = "s3"

  config = {
    bucket = "your-terraform-state-bucket"
    key    = "aws/terraform.tfstate"
    region = "eu-west-2"
  }
}
# -----------------------------------------------------------------------------
# AWS Credentials Block
# -----------------------------------------------------------------------------
# Uses IAM roles - no access keys needed when running on AWS
resource "prefect_block" "aws_credentials" {
  name      = "data-platform"
  type_slug = "aws-credentials"

  data = jsonencode({
    region_name = var.aws_region
    # When running on ECS/EC2 with IAM roles, credentials are automatic
    # No access keys stored in the block
  })
}
# -----------------------------------------------------------------------------
# S3 Data Lake Blocks
# -----------------------------------------------------------------------------
# Create blocks for each data lake bucket
# The S3 Bucket block's field is named "credentials"; a nested block is
# referenced by its document ID through "$ref"
resource "prefect_block" "s3_data_lake_dev" {
  name      = "data-lake-dev"
  type_slug = "s3-bucket"

  data = jsonencode({
    bucket_name = data.terraform_remote_state.aws.outputs.data_lake_buckets["dev"].id
    credentials = { "$ref" = { block_document_id = prefect_block.aws_credentials.id } }
  })
}

resource "prefect_block" "s3_data_lake_staging" {
  name      = "data-lake-staging"
  type_slug = "s3-bucket"

  data = jsonencode({
    bucket_name = data.terraform_remote_state.aws.outputs.data_lake_buckets["staging"].id
    credentials = { "$ref" = { block_document_id = prefect_block.aws_credentials.id } }
  })
}

resource "prefect_block" "s3_data_lake_prod" {
  name      = "data-lake-prod"
  type_slug = "s3-bucket"

  data = jsonencode({
    bucket_name = data.terraform_remote_state.aws.outputs.data_lake_buckets["prod"].id
    credentials = { "$ref" = { block_document_id = prefect_block.aws_credentials.id } }
  })
}
# -----------------------------------------------------------------------------
# Outputs
# -----------------------------------------------------------------------------
output "block_aws_credentials" {
  description = "AWS credentials block name"
  value       = prefect_block.aws_credentials.name
}

output "block_s3_data_lake_dev" {
  description = "S3 data lake dev block name"
  value       = prefect_block.s3_data_lake_dev.name
}

output "block_s3_data_lake_staging" {
  description = "S3 data lake staging block name"
  value       = prefect_block.s3_data_lake_staging.name
}

output "block_s3_data_lake_prod" {
  description = "S3 data lake prod block name"
  value       = prefect_block.s3_data_lake_prod.name
}
Deploy the Blocks
Commit and push to deploy via CI/CD:
git add terraform/prefect/config/blocks.tf
git commit -m "Add Prefect blocks for AWS and S3"
git push
Verify Blocks
After CI/CD completes:
# List all blocks
prefect block ls
# Show only S3 bucket blocks (prefect block ls has no type filter, so filter its output)
prefect block ls | grep s3-bucket
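The same verification can be scripted, for example as a CI step after the Terraform apply. This is a minimal sketch, assuming that Prefect's `Block.load` accepts a `type-slug/name` slug and raises `ValueError` when the block does not exist. `EXPECTED_BLOCKS`, `find_missing_blocks`, and the `loader` parameter are hypothetical names introduced here for illustration.

```python
from typing import Callable, Iterable, List

# Slugs follow the "<type_slug>/<name>" pattern of the blocks defined in blocks.tf.
EXPECTED_BLOCKS = [
    "aws-credentials/data-platform",
    "s3-bucket/data-lake-dev",
    "s3-bucket/data-lake-staging",
    "s3-bucket/data-lake-prod",
]


def _prefect_loader(slug: str) -> object:
    """Load a block through Prefect (imported lazily so this module imports without it)."""
    from prefect.blocks.core import Block

    return Block.load(slug)


def find_missing_blocks(
    slugs: Iterable[str],
    loader: Callable[[str], object] = _prefect_loader,
) -> List[str]:
    """Return the slugs that could not be loaded, in the order given."""
    missing = []
    for slug in slugs:
        try:
            loader(slug)
        except ValueError:
            # Assumption: a missing block surfaces as ValueError.
            missing.append(slug)
    return missing
```

Calling `find_missing_blocks(EXPECTED_BLOCKS)` against a configured Prefect API should return an empty list once the deployment has finished. Passing a custom `loader` keeps the function testable without a Prefect server.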
Using Blocks in Flows
S3 Block
import json

from prefect import flow, task
from prefect_aws import S3Bucket


@task
def upload_to_s3(data: dict, key: str):
    """Upload data to S3."""
    s3 = S3Bucket.load("data-lake-dev")
    s3.write_path(
        path=key,
        content=json.dumps(data).encode(),
    )


@task
def download_from_s3(key: str) -> dict:
    """Download data from S3."""
    s3 = S3Bucket.load("data-lake-dev")
    content = s3.read_path(path=key)
    return json.loads(content)


@flow
def s3_flow():
    # Upload data
    data = {"key": "value", "timestamp": "2024-01-15T10:30:00Z"}
    upload_to_s3(data, "events/2024/01/15/event_001.json")
    # Download data
    retrieved = download_from_s3("events/2024/01/15/event_001.json")
    print(f"Retrieved: {retrieved}")
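The flow above hard-codes a date-partitioned key. A small helper can derive such keys from an event timestamp instead. This is a sketch only: `build_event_key`, its `prefix` default, and the three-digit sequence number are assumptions modelled on the example key `events/2024/01/15/event_001.json`, not a convention the platform prescribes.

```python
from datetime import datetime, timezone


def build_event_key(event_time: datetime, sequence: int, prefix: str = "events") -> str:
    """Return a key such as events/2024/01/15/event_001.json for the given time."""
    if event_time.tzinfo is None:
        raise ValueError("event_time must be timezone-aware")
    # Normalise to UTC so the partition does not depend on the caller's time zone.
    utc = event_time.astimezone(timezone.utc)
    return f"{prefix}/{utc:%Y/%m/%d}/event_{sequence:03d}.json"
```

Rejecting naive datetimes is a deliberate choice here: it avoids silently filing an event under the wrong day when the worker's local time zone differs from UTC.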
AWS Credentials Block
import json

from prefect import task
from prefect_aws import AwsCredentials


@task
def get_secret(secret_name: str) -> dict:
    """Fetch a secret from AWS Secrets Manager."""
    aws_creds = AwsCredentials.load("data-platform")
    # The session takes its region from the block; with IAM roles,
    # boto3 resolves the credentials automatically
    client = aws_creds.get_boto3_session().client("secretsmanager")
    response = client.get_secret_value(SecretId=secret_name)
    return json.loads(response["SecretString"])
AWS Secrets Manager Integration
For secrets such as database passwords, fetch the value from AWS Secrets Manager at runtime rather than storing it in a block.
Pattern: Fetch Secrets in Flows
import json

import boto3
from prefect import flow, get_run_logger, task


@task
def get_secret(secret_name: str, region: str = "eu-west-2") -> dict:
    """Fetch a secret from AWS Secrets Manager."""
    logger = get_run_logger()
    # Uses IAM role credentials automatically on ECS/EC2
    client = boto3.client("secretsmanager", region_name=region)
    response = client.get_secret_value(SecretId=secret_name)
    secret_value = response["SecretString"]
    # Log the secret's name only, never its value
    logger.info(f"Retrieved secret: {secret_name}")
    try:
        return json.loads(secret_value)
    except json.JSONDecodeError:
        return {"value": secret_value}


@flow
def flow_with_secrets():
    # Fetch Snowflake credentials at runtime
    snowflake_creds = get_secret("terraform/snowflake-credentials")
    # Use the credentials
    account = snowflake_creds["account"]
    user = snowflake_creds["user"]
    # ...
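The parsing step in `get_secret` has two edge cases worth handling: a secret stored as binary arrives under `SecretBinary` rather than `SecretString`, and a value such as `12345` is valid JSON but does not decode to a dict. The following sketch isolates the parsing so it can be tested without AWS. `parse_secret_response` is a hypothetical helper name, and decoding binary secrets as UTF-8 text is an assumption about how the secret was stored.

```python
import json
from typing import Any, Dict


def parse_secret_response(response: Dict[str, Any]) -> Dict[str, Any]:
    """Normalise a get_secret_value response into a dict."""
    if "SecretString" in response:
        raw = response["SecretString"]
    elif "SecretBinary" in response:
        # boto3 hands binary secrets back as bytes; assume they hold UTF-8 text.
        raw = response["SecretBinary"].decode("utf-8")
    else:
        raise KeyError("response has neither SecretString nor SecretBinary")
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        return {"value": raw}
    # A JSON scalar or list is valid JSON but not a mapping, so wrap the raw text.
    return parsed if isinstance(parsed, dict) else {"value": raw}
```

Inside the task, the call would become `return parse_secret_response(response)`, which keeps the declared `dict` return type accurate for every input.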
Reusable Secrets Module
Create a reusable module in your data-pipelines repository:
# utils/secrets.py
import json
from functools import lru_cache

import boto3


@lru_cache(maxsize=32)
def get_secret(secret_name: str, region: str = "eu-west-2") -> dict:
    """
    Fetch a secret from AWS Secrets Manager.

    Results are cached per process, so repeated calls in the same flow run
    reuse the first response. Callers share the cached dict and must not
    mutate it.
    """
    client = boto3.client("secretsmanager", region_name=region)
    response = client.get_secret_value(SecretId=secret_name)
    try:
        return json.loads(response["SecretString"])
    except json.JSONDecodeError:
        return {"value": response["SecretString"]}
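`lru_cache` never expires an entry and hands every caller the same dict object, so a rotated secret is not picked up until the process restarts and one caller's mutation is visible to the next. Where that matters, a time-limited cache is one alternative. The sketch below is an illustration under assumptions: `SecretCache`, the five-minute default, and the injectable `fetch` and `clock` callables are hypothetical and not part of the module above.

```python
import copy
import time
from typing import Any, Callable, Dict, Tuple


class SecretCache:
    """Cache secrets for a limited time and hand out independent copies."""

    def __init__(
        self,
        fetch: Callable[[str], Dict[str, Any]],
        ttl_seconds: float = 300.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        # Maps secret name to (time fetched, value).
        self._entries: Dict[str, Tuple[float, Dict[str, Any]]] = {}

    def get(self, secret_name: str) -> Dict[str, Any]:
        now = self._clock()
        entry = self._entries.get(secret_name)
        if entry is None or now - entry[0] >= self._ttl:
            # Missing or expired: fetch again and remember when.
            entry = (now, self._fetch(secret_name))
            self._entries[secret_name] = entry
        # Return a copy so callers cannot mutate the cached value.
        return copy.deepcopy(entry[1])
```

In practice `fetch` would be an uncached function that calls Secrets Manager. Injecting it, together with the clock, lets the cache be tested without AWS access or real waiting.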
Snowflake Block (Optional)
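To create a Snowflake connector block, add the following to blocks.tf:
# Fetch Snowflake credentials from Secrets Manager
data "aws_secretsmanager_secret_version" "snowflake" {
  secret_id = "terraform/snowflake-credentials"
}

locals {
  snowflake_creds = jsondecode(data.aws_secretsmanager_secret_version.snowflake.secret_string)
}

# Login details belong to a snowflake-credentials block
resource "prefect_block" "snowflake_credentials" {
  name      = "snowflake-analytics"
  type_slug = "snowflake-credentials"

  data = jsonencode({
    account  = local.snowflake_creds.account
    user     = local.snowflake_creds.user
    password = local.snowflake_creds.password
    role     = "ANALYTICS_TRANSFORMER"
  })
}

# The connector references the credentials block and sets session defaults
resource "prefect_block" "snowflake_connector" {
  name      = "snowflake-analytics"
  type_slug = "snowflake-connector"

  data = jsonencode({
    credentials = { "$ref" = { block_document_id = prefect_block.snowflake_credentials.id } }
    warehouse   = "TRANSFORMING"
    database    = "ANALYTICS"
    schema      = "REPORTING" # assumed default; the connector requires a schema
  })
}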
Secrets in Terraform State
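Terraform records every value it reads in its state, so secrets fetched through a data source are stored there in plain text. Restrict access to the state backend and, for production, consider fetching credentials at flow runtime instead of storing them in blocks.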
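One way to follow that advice is to build the Snowflake connection inside the flow from a secret fetched at runtime, so the password never reaches Terraform state or a block. This is a minimal sketch, assuming the secret holds `account`, `user`, and `password` keys and that the `snowflake-connector-python` package is installed. `snowflake_connect_kwargs` and `open_connection` are hypothetical helpers.

```python
from typing import Any, Dict, Mapping

# Keys the secret is assumed to contain.
REQUIRED_KEYS = ("account", "user", "password")


def snowflake_connect_kwargs(
    secret: Mapping[str, Any],
    *,
    role: str,
    warehouse: str,
    database: str,
) -> Dict[str, Any]:
    """Combine secret values with non-secret session settings."""
    missing = [key for key in REQUIRED_KEYS if not secret.get(key)]
    if missing:
        # Report key names only, never the secret values themselves.
        raise KeyError("secret is missing keys: " + ", ".join(missing))
    return {
        "account": secret["account"],
        "user": secret["user"],
        "password": secret["password"],
        "role": role,
        "warehouse": warehouse,
        "database": database,
    }


def open_connection(secret: Mapping[str, Any], **settings: str):
    """Open a connection; the Snowflake driver is imported lazily."""
    import snowflake.connector

    return snowflake.connector.connect(**snowflake_connect_kwargs(secret, **settings))
```

A task could call `open_connection(get_secret("terraform/snowflake-credentials"), role="ANALYTICS_TRANSFORMER", warehouse="TRANSFORMING", database="ANALYTICS")`, keeping the non-secret settings in code and the login details in Secrets Manager.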
Using Snowflake Block in Flows
from prefect import flow, task
from prefect_snowflake import SnowflakeConnector


@task
def query_snowflake(query: str) -> list[dict]:
    """Execute a query against Snowflake."""
    connector = SnowflakeConnector.load("snowflake-analytics")
    with connector.get_connection() as conn:
        cursor = conn.cursor()
        cursor.execute(query)
        # Column names come first in each cursor.description entry
        columns = [col[0] for col in cursor.description]
        rows = cursor.fetchall()
        return [dict(zip(columns, row)) for row in rows]


@flow
def snowflake_flow():
    results = query_snowflake("SELECT * FROM analytics.reporting.summary LIMIT 10")
    print(f"Got {len(results)} rows")
Managing Blocks
List Blocks
# List all blocks
prefect block ls
# List blocks of a specific type (prefect block ls has no type filter, so filter its output)
prefect block ls | grep s3-bucket
Delete Blocks
prefect block delete s3-bucket/data-lake-dev
Best Practices
1. Use Terraform for Block Management
Managing blocks via Terraform ensures:
- Blocks are version controlled
- Changes are reviewed in PRs
- Configuration stays consistent across environments
2. Use IAM Roles Instead of Access Keys
# Good: Uses IAM role automatically
aws_creds = AwsCredentials(region_name="eu-west-2")
# Avoid: Storing access keys
aws_creds = AwsCredentials(
    aws_access_key_id="AKIA...",  # Don't do this
    aws_secret_access_key="...",
)
3. Fetch Sensitive Data at Runtime
# Good: Fetch from Secrets Manager at runtime
creds = get_secret("terraform/snowflake-credentials")
# Avoid: Hard-coding credentials
password = "my-secret-password" # Don't do this!
4. Create Environment-Specific Blocks
Use naming conventions to distinguish environments:
data-lake-dev
data-lake-staging
data-lake-prod
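With this convention, a flow can pick the right block from its environment instead of hard-coding a name. The sketch below assumes the environment is supplied through a `DEPLOY_ENV` environment variable. That variable and both helper names are hypothetical, so adapt them to however your deployments identify their environment.

```python
import os
from typing import Optional

VALID_ENVIRONMENTS = ("dev", "staging", "prod")


def data_lake_block_name(environment: Optional[str] = None) -> str:
    """Return the data lake block name for an environment, defaulting to dev."""
    # DEPLOY_ENV is an assumed variable name, not something Prefect sets.
    env = environment or os.environ.get("DEPLOY_ENV", "dev")
    if env not in VALID_ENVIRONMENTS:
        raise ValueError(f"unknown environment: {env!r}")
    return f"data-lake-{env}"


def load_data_lake(environment: Optional[str] = None):
    """Load the S3 block for the environment (prefect_aws is imported lazily)."""
    from prefect_aws import S3Bucket

    return S3Bucket.load(data_lake_block_name(environment))
```

Validating against a fixed list means a typo in the variable fails fast with a clear error rather than as a missing-block lookup at load time.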
Summary
You've learned how to manage secrets and configuration with Prefect Blocks:
- Created blocks via Terraform
- Connected to S3 data lake buckets
- Integrated with AWS Secrets Manager
- Used blocks in flows
What's Next
You've completed the core Prefect setup. The final page summarises what you've built and outlines next steps.
Continue to Finishing Up →