
Maintain Your Data Stack

You have built a complete, production-ready data stack spanning infrastructure, ingestion, transformation, analytics, and observability. This section covers how to keep it running, evolve it over time, and handle routine maintenance tasks - with or without AI agents.

What You'll Learn

This section focuses on day-to-day operations - adding new resources, onboarding team members, and keeping your platform healthy:

  • Adding new users, data sources, and models
  • Backfill strategies and performance optimisation
  • Disaster recovery and security hardening
  • Troubleshooting common issues

Claude Code Setup

If you haven't already, set up CLAUDE.md files and skills for your repositories so Claude Code can assist with these maintenance tasks. See the Claude Code Setup page in the Getting Started section.

Your Three Repositories

By now you have three repositories that work together:

GitHub Organisation
├── terraform/           Infrastructure as code
│   ├── github/          GitHub organisation, teams, users
│   ├── aws/             S3, IAM, VPC, Secrets Manager
│   └── snowflake/       Warehouses, databases, roles, users
│
├── data-pipelines/      Ingestion and orchestration
│   ├── sources/         dlt source definitions
│   ├── pipelines/       dlt pipeline configurations
│   └── flows/           Prefect flow definitions
│
└── dbt-transform/       Data transformation
    └── models/
        ├── staging/       Clean raw data (views)
        ├── intermediate/  Business logic (ephemeral)
        ├── marts/         Analytics tables (tables/incremental)
        └── reporting/     BI-facing subset (views)

Each repository has its own conventions, module patterns, and safety rules. Maintaining them means understanding these patterns - or having an AI agent that already does.
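
One routine check is confirming that local checkouts still match the layout shown in the tree. The following is a minimal sketch, assuming all three repositories are cloned side by side under one parent directory. The function name `missing_directories` and the `EXPECTED_LAYOUT` mapping are illustrative, not part of any tool in the stack.

```python
from pathlib import Path

# Expected sub-directories per repository, copied from the tree above.
# Assumption: the three repositories are checked out side by side.
EXPECTED_LAYOUT = {
    "terraform": ["github", "aws", "snowflake"],
    "data-pipelines": ["sources", "pipelines", "flows"],
    "dbt-transform": [
        "models/staging",
        "models/intermediate",
        "models/marts",
        "models/reporting",
    ],
}


def missing_directories(root: Path) -> list[str]:
    """Return the expected directories that do not exist under root."""
    missing = []
    for repo, subdirs in EXPECTED_LAYOUT.items():
        for subdir in subdirs:
            if not (root / repo / subdir).is_dir():
                missing.append(f"{repo}/{subdir}")
    return missing
```

Calling `missing_directories(Path.cwd())` from the parent directory returns an empty list when every expected directory is present, and otherwise lists the gaps as `repo/subdir` strings.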

Runbooks

These pages follow the runbook structure - each one is a step-by-step operational procedure with verification, rollback, and escalation paths.

Runbook                     When to Use
Adding Users                New team member or service account needs access
Adding Data Sources         New API, database, or SaaS tool to ingest
Backfills                   Historical data needs reprocessing
Performance Optimisation    Slow queries, long dbt runs, or credit spikes
Disaster Recovery           Data loss, state corruption, or service outages
Security Hardening          Key rotation, access reviews, or audit
Upgrades                    Snowflake, dbt, Prefect, or provider version updates
Troubleshooting             Something is broken and the cause is unclear
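
The runbook structure described above (steps with verification, rollback, and escalation) can be pictured as a small control loop. This is only a sketch of the idea, not how the runbooks themselves are implemented: the `Step` dataclass, the `run_runbook` function, and the `escalate` callback are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Step:
    """One runbook step (hypothetical structure for illustration)."""
    name: str
    apply: Callable[[], None]     # performs the change
    verify: Callable[[], bool]    # returns True if the change took effect
    rollback: Callable[[], None]  # undoes the change


def run_runbook(steps: list[Step], escalate: Callable[[str], None]) -> bool:
    """Run steps in order; on a failed verification, undo and escalate."""
    completed: list[Step] = []
    for step in steps:
        step.apply()
        completed.append(step)
        if not step.verify():
            # Roll back in reverse order, including the step that failed.
            for done in reversed(completed):
                done.rollback()
            escalate(f"step '{step.name}' failed verification")
            return False
    return True
```

The sketch rolls back in reverse order so that later changes are undone before the ones they depend on. It does not handle exceptions raised inside `apply`; a real procedure would need to decide whether those also trigger rollback.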

Prerequisites

Before starting this section, ensure you have completed: