Snowflake Monitoring
On this page, you will:
- Profile slow queries and identify optimisation opportunities
- Right-size warehouses based on usage patterns
- Set up Resource Monitors to prevent budget overruns
- Monitor credit consumption by warehouse and user
- Understand Snowflake's query performance metrics
Overview
Snowflake provides comprehensive monitoring capabilities for compute usage, query performance, and costs. Effective Snowflake monitoring ensures:
- Cost control — Prevent surprise bills from runaway queries or over-provisioned warehouses
- Performance optimisation — Identify and fix slow queries
- Capacity planning — Right-size warehouses for workload demands
- Usage attribution — Track who is using what resources
┌─────────────────────────────────────────────────────────────────────────┐
│ SNOWFLAKE MONITORING LAYERS │
├─────────────────────────────────────────────────────────────────────────┤
│ │
│ Query Performance Warehouse Usage Cost Monitoring │
│ ────────────────── ─────────────── ─────────────── │
│ │
│ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ │
│ │ QUERY_HISTORY │ │ WAREHOUSE_ │ │ Resource │ │
│ │ • Duration │ │ METERING_ │ │ Monitors │ │
│ │ • Bytes scanned │ │ HISTORY │ │ • Budget alerts │ │
│ │ │ Partitions │ │ • Credit usage │ │ • Auto-suspend │ │
│ │ pruned │ │ • Utilisation │ │ • Notifications │ │
│ └─────────────────┘ └─────────────────┘ └─────────────────┘ │
│ │ │ │ │
│ └───────────────────────┴───────────────────────┘ │
│ │ │
│ ▼ │
│ ┌────────────────────┐ │
│ │ ACCOUNT_USAGE │ │
│ │ (System Views) │ │
│ │ • 1 year retention │ │
│ │ • All accounts │ │
│ └────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────────┘
Query Profiling
Snowflake's QUERY_HISTORY view shows all queries executed, with performance metrics.
View Recent Queries
USE ROLE ACCOUNTADMIN;
SELECT
query_id,
query_text,
user_name,
warehouse_name,
execution_status,
total_elapsed_time / 1000 AS duration_seconds,
bytes_scanned,
rows_produced,
credits_used_cloud_services,
start_time
FROM snowflake.account_usage.query_history
WHERE start_time >= DATEADD(day, -7, CURRENT_TIMESTAMP())
ORDER BY total_elapsed_time DESC
LIMIT 20;
Key columns:
- total_elapsed_time — Query duration in milliseconds
- bytes_scanned — Data read from storage
- rows_produced — Rows returned
- partitions_scanned — Micro-partitions read
- partitions_total — Total micro-partitions in table (for pruning ratio)
Identify Slow Queries
Find queries taking >60 seconds:
SELECT
query_id,
LEFT(query_text, 100) AS query_preview,
user_name,
warehouse_name,
total_elapsed_time / 1000 AS duration_seconds,
bytes_scanned / POW(1024, 3) AS gb_scanned,
start_time
FROM snowflake.account_usage.query_history
WHERE start_time >= DATEADD(day, -7, CURRENT_TIMESTAMP())
AND total_elapsed_time > 60000 -- 60 seconds
AND execution_status = 'SUCCESS'
ORDER BY total_elapsed_time DESC
LIMIT 10;
Example output:
| query_id | query_preview | user_name | warehouse_name | duration_seconds | gb_scanned |
|---|---|---|---|---|---|
01a... |
SELECT * FROM fct_orders WHERE ... |
SVC_DBT | TRANSFORMING | 245.3 | 120.5 |
01b... |
CREATE TABLE staging.temp_... |
ALICE | ANALYTICS_WH | 180.2 | 85.2 |
Next steps: Analyse these queries for optimisation (see below).
Analyse Query Profile
For a specific slow query:
- In Snowsight, navigate to Activity → Query History
- Find the query by
query_idor time range - Click on the query → Query Profile
Query Profile shows: - Execution time breakdown — time spent on each operator (scan, join, aggregate) - Data scanned — bytes read from storage - Partition pruning — how many partitions were skipped - Spilling — whether data spilled to disk (bad for performance)
Common Performance Issues
Issue 1: Full Table Scan
Symptom: partitions_scanned = partitions_total (no pruning)
Cause: Query lacks filtering on clustering key or partition column.
Example:
-- BAD: Full table scan
SELECT * FROM fct_orders
WHERE customer_name = 'Alice'; -- customer_name not clustered
-- Scans all partitions (e.g., 1000 partitions for 1GB table)
Fix: Add clustering key or filter on partition column:
-- GOOD: Partition pruning
SELECT * FROM fct_orders
WHERE order_date >= '2026-01-01' -- order_date is clustering key
AND customer_name = 'Alice';
-- Scans only partitions for Jan 2026+ (e.g., 50 partitions)
Issue 2: Excessive Data Spilling
Symptom: Query Profile shows "Bytes spilled to local storage" or "Bytes spilled to remote storage"
Cause: Warehouse too small for query's memory requirements.
Fix: Use a larger warehouse:
-- Temporarily use larger warehouse for this query
USE WAREHOUSE TRANSFORMING_LARGE;
-- Run query
SELECT ...;
-- Switch back
USE WAREHOUSE TRANSFORMING;
Or optimise the query:
- Reduce window function complexity
- Break into smaller CTEs
- Use LIMIT for exploratory queries
Issue 3: Inefficient JOIN
Symptom: Long duration on "Join" operator in Query Profile
Cause: Joining large tables without proper filters
Fix:
-- BAD: Unrestricted join
SELECT *
FROM fct_orders o
LEFT JOIN dim_customers c ON o.customer_id = c.customer_id;
-- GOOD: Filter before joining
SELECT *
FROM fct_orders o
LEFT JOIN dim_customers c ON o.customer_id = c.customer_id
WHERE o.order_date >= '2026-01-01'; -- Reduces fct_orders rows first
Issue 4: SELECT *
Symptom: High bytes_scanned for simple queries
Cause: Selecting all columns when only a few are needed
Fix:
-- BAD: Reads all columns
SELECT * FROM fct_orders WHERE order_id = 12345;
-- GOOD: Only read needed columns
SELECT order_id, order_total, order_date
FROM fct_orders
WHERE order_id = 12345;
Snowflake stores data in columnar format, so reading fewer columns = less data scanned.
Warehouse Sizing
Snowflake warehouses come in sizes from X-Small to 6X-Large. Choosing the right size balances cost and performance.
Warehouse Sizes and Costs
| Size | Credits/Hour | Relative Cost | Use Case |
|---|---|---|---|
| X-Small | 1 | 1x | Development, small queries |
| Small | 2 | 2x | Light production workloads |
| Medium | 4 | 4x | Standard production (dbt, BI queries) |
| Large | 8 | 8x | Heavy transformations, complex queries |
| X-Large | 16 | 16x | Data science, ML workloads |
| 2X-Large | 32 | 32x | Rare, very large batch jobs |
Pricing: ~$2.50 per credit (varies by region and Snowflake edition)
Example monthly costs (assuming 8 hours/day, 22 days/month): - X-Small: 1 credit/hr × 8 hrs/day × 22 days = 176 credits/month = ~$440/month - Medium: 4 credits/hr × 8 hrs/day × 22 days = 704 credits/month = ~$1,760/month
Monitor Warehouse Utilisation
Query warehouse credit usage:
SELECT
warehouse_name,
SUM(credits_used) AS total_credits,
SUM(credits_used) * 2.5 AS estimated_cost_usd,
COUNT(DISTINCT DATE(start_time)) AS days_active,
SUM(credits_used) / days_active AS avg_credits_per_day
FROM snowflake.account_usage.warehouse_metering_history
WHERE start_time >= DATEADD(month, -1, CURRENT_TIMESTAMP())
GROUP BY warehouse_name
ORDER BY total_credits DESC;
Example output:
| warehouse_name | total_credits | estimated_cost_usd | days_active | avg_credits_per_day |
|---|---|---|---|---|
| TRANSFORMING | 125.3 | $313.25 | 30 | 4.18 |
| ANALYTICS_WH | 45.2 | $113.00 | 28 | 1.61 |
| LOADING | 12.5 | $31.25 | 30 | 0.42 |
Right-Sizing Strategies
Strategy 1: Monitor Query Queueing
If queries are queueing (waiting for warehouse capacity):
-- Check for queueing
SELECT
warehouse_name,
COUNT(*) AS queued_queries,
AVG(queued_overload_time) / 1000 AS avg_queue_seconds
FROM snowflake.account_usage.query_history
WHERE start_time >= DATEADD(day, -7, CURRENT_TIMESTAMP())
AND queued_overload_time > 0
GROUP BY warehouse_name
ORDER BY queued_queries DESC;
If queueing is frequent: Increase warehouse size or add multi-cluster scaling.
Strategy 2: Monitor Warehouse Load
Check average query concurrency:
WITH hourly_concurrency AS (
SELECT
warehouse_name,
DATE_TRUNC('hour', start_time) AS hour,
COUNT(*) AS queries_in_hour
FROM snowflake.account_usage.query_history
WHERE start_time >= DATEADD(day, -7, CURRENT_TIMESTAMP())
AND execution_status = 'SUCCESS'
GROUP BY warehouse_name, hour
)
SELECT
warehouse_name,
AVG(queries_in_hour) AS avg_queries_per_hour,
MAX(queries_in_hour) AS peak_queries_per_hour
FROM hourly_concurrency
GROUP BY warehouse_name;
If peak concurrency is high: Enable multi-cluster scaling.
Strategy 3: Test Warehouse Sizes
Run the same query on different warehouse sizes and compare:
-- Run on X-Small
USE WAREHOUSE TRANSFORMING_XSMALL;
SELECT /* query */ FROM fct_orders WHERE ...;
-- Note duration and cost
-- Run on Small
USE WAREHOUSE TRANSFORMING_SMALL;
SELECT /* same query */ FROM fct_orders WHERE ...;
-- Compare duration and cost
Rule of thumb: - Doubling warehouse size = ~2x cost, ~1.5x faster (not linear due to overhead) - If query finishes in 5 minutes on Small and 10 minutes on X-Small, Small is more cost-effective
Multi-Cluster Warehouses
For workloads with variable concurrency (e.g., BI dashboards with many users):
ALTER WAREHOUSE ANALYTICS_WH SET
MIN_CLUSTER_COUNT = 1
MAX_CLUSTER_COUNT = 3
SCALING_POLICY = 'STANDARD'; -- or 'ECONOMY'
Scaling policies: - STANDARD — Starts additional clusters immediately when queue depth increases (favour performance) - ECONOMY — Waits longer before starting clusters (favour cost)
Use case: Lightdash dashboards during business hours (8am-6pm) may spike to 20 concurrent queries. Multi-cluster warehouse scales from 1 to 3 clusters automatically.
Resource Monitors
Resource Monitors prevent budget overruns by setting credit limits and alerting when thresholds are crossed.
Create a Resource Monitor
USE ROLE ACCOUNTADMIN;
CREATE RESOURCE MONITOR monthly_budget
WITH CREDIT_QUOTA = 500 -- 500 credits per month (~$1,250)
FREQUENCY = MONTHLY
START_TIMESTAMP = IMMEDIATELY
TRIGGERS
ON 75 PERCENT DO NOTIFY -- Alert at 75% usage
ON 90 PERCENT DO NOTIFY -- Alert at 90% usage
ON 100 PERCENT DO SUSPEND -- Suspend warehouses at 100%
ON 110 PERCENT DO SUSPEND_IMMEDIATE; -- Immediately suspend (kill running queries)
Trigger actions: - NOTIFY — Send email alert to account admins - SUSPEND — Prevent new queries, allow running queries to finish - SUSPEND_IMMEDIATE — Kill all running queries and suspend
Assign Monitor to Warehouses
-- Apply to specific warehouse
ALTER WAREHOUSE TRANSFORMING
SET RESOURCE_MONITOR = monthly_budget;
-- Apply to all warehouses (account-level)
ALTER ACCOUNT
SET RESOURCE_MONITOR = monthly_budget;
View Resource Monitor Status
SELECT
name AS monitor_name,
credit_quota,
used_credits,
remaining_credits,
(used_credits / credit_quota) * 100 AS percent_used
FROM snowflake.account_usage.resource_monitors
WHERE name = 'monthly_budget';
Example output:
| monitor_name | credit_quota | used_credits | remaining_credits | percent_used |
|---|---|---|---|---|
| monthly_budget | 500 | 387.5 | 112.5 | 77.5% |
Configure Notifications
By default, Snowflake sends emails to ACCOUNTADMIN users. To send to Slack:
Option A: Email-to-Slack Forwarding
- Create an email forwarding rule in Gmail/Outlook
- Forward Snowflake alerts to Slack email integration
- Example:
data-alerts@your-workspace.slack.com
Option B: Snowflake + Lambda + Slack
- Create Snowflake notification integration (sends webhook to Lambda)
- Lambda function formats message and sends to Slack
Lambda function:
import json
import urllib3
def lambda_handler(event, context):
# Parse Snowflake notification
notification = json.loads(event['body'])
# Format Slack message
message = {
"text": f"🚨 Snowflake Resource Monitor Alert",
"blocks": [
{
"type": "section",
"text": {
"type": "mrkdwn",
"text": f"*Monitor:* {notification['monitor_name']}\n*Usage:* {notification['percent_used']}%\n*Threshold:* {notification['threshold']}%"
}
}
]
}
# Send to Slack
http = urllib3.PoolManager()
response = http.request(
'POST',
'https://hooks.slack.com/services/YOUR/WEBHOOK/URL',
body=json.dumps(message),
headers={'Content-Type': 'application/json'}
)
return {'statusCode': 200}
Best Practices for Resource Monitors
- Set account-level monitor — prevents runaway costs across all warehouses
- Use tiered alerts — 75% (warning), 90% (urgent), 100% (suspend)
- Don't use SUSPEND_IMMEDIATE lightly — kills running queries, may corrupt incremental loads
- Set realistic budgets — review last 3 months' usage to set appropriate quota
- Monitor monthly trends — adjust quota if usage consistently grows
Credit Consumption by Warehouse
Track which warehouses consume the most credits:
SELECT
warehouse_name,
DATE(start_time) AS usage_date,
SUM(credits_used) AS daily_credits,
SUM(credits_used) * 2.5 AS daily_cost_usd
FROM snowflake.account_usage.warehouse_metering_history
WHERE start_time >= DATEADD(day, -30, CURRENT_TIMESTAMP())
GROUP BY warehouse_name, usage_date
ORDER BY warehouse_name, usage_date;
Visualise in Snowsight:
- Run the query
- Click Chart tab
- Select Line chart
- X-axis:
usage_date - Y-axis:
daily_credits - Group by:
warehouse_name
Example chart:
Credits/Day
20 │ ╱──TRANSFORMING
│ ╱───
15 │ ╱───
│ ╱───
10 │ ╱───
│ ANALYTICS_WH───────────────────────────────
5 │────LOADING───────────────────────────────────
└────────────────────────────────────────────────
Jan 1 Jan 15 Feb 1 Today
Insight: TRANSFORMING warehouse credit usage spiking in February → investigate if due to data volume growth or inefficient queries.
Credit Consumption by User
Attribute costs to users or teams:
SELECT
user_name,
warehouse_name,
SUM(credits_used_cloud_services) AS cloud_services_credits,
COUNT(*) AS query_count,
AVG(total_elapsed_time) / 1000 AS avg_duration_seconds
FROM snowflake.account_usage.query_history
WHERE start_time >= DATEADD(day, -30, CURRENT_TIMESTAMP())
AND execution_status = 'SUCCESS'
GROUP BY user_name, warehouse_name
ORDER BY cloud_services_credits DESC
LIMIT 20;
Example output:
| user_name | warehouse_name | cloud_services_credits | query_count | avg_duration_seconds |
|---|---|---|---|---|
| SVC_DBT | TRANSFORMING | 15.3 | 450 | 12.5 |
| ALICE | ANALYTICS_WH | 8.7 | 1250 | 3.2 |
| BOB | ANALYTICS_WH | 5.2 | 890 | 2.8 |
Chargeback strategy: Allocate costs to teams based on their warehouse usage.
Query Performance Metrics
Key Metrics to Track
- P95 query duration — 95th percentile query time (ignores outliers)
- Queries per hour — workload intensity
- Average bytes scanned per query — data efficiency
- Cache hit rate — percentage of queries served from cache
P95 Query Duration
SELECT
warehouse_name,
APPROX_PERCENTILE(total_elapsed_time / 1000, 0.95) AS p95_duration_seconds
FROM snowflake.account_usage.query_history
WHERE start_time >= DATEADD(day, -7, CURRENT_TIMESTAMP())
AND execution_status = 'SUCCESS'
GROUP BY warehouse_name;
Target SLO: P95 duration < 10 seconds for BI queries, < 30 minutes for dbt runs.
Cache Hit Rate
Snowflake caches query results. Repeated identical queries return instantly from cache.
SELECT
COUNT(CASE WHEN query_result_used_from_cache = TRUE THEN 1 END) AS cached_queries,
COUNT(*) AS total_queries,
(cached_queries * 100.0 / total_queries) AS cache_hit_rate_pct
FROM snowflake.account_usage.query_history
WHERE start_time >= DATEADD(day, -7, CURRENT_TIMESTAMP());
High cache hit rate (>30%): Good — users are running similar queries (dashboards, reports) Low cache hit rate (<10%): Queries are unique (exploratory analysis, one-off queries)
Summary
You've set up Snowflake monitoring:
- Query profiling — identify slow queries and optimise using Query Profile
- Warehouse sizing — right-size warehouses based on usage and concurrency
- Resource Monitors — prevent budget overruns with credit quotas and alerts
- Credit attribution — track costs by warehouse, user, and team
- Performance metrics — monitor P95 duration, cache hit rate, and query trends
Snowflake's built-in monitoring provides comprehensive visibility into compute usage and costs.
What's Next
Implement comprehensive cost monitoring across your entire data stack.
Continue to Cost Monitoring →