Reverse ETL
On this page, you will:
- Understand what reverse ETL is and when to use it
- Configure Snowflake as an Airbyte source
- Sync enriched data from Snowflake back to HubSpot
- Consider operational patterns and best practices
Overview
Reverse ETL is the process of syncing data from your data warehouse back to SaaS tools. This closes the loop: raw data flows into Snowflake, gets transformed by dbt, and the enriched results are pushed back to operational tools like HubSpot.
┌─────────────────────────────────────────────────────────────────────────────┐
│ REVERSE ETL FLOW │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ Forward ETL (data in) Reverse ETL (data out) │
│ ───────────────────── ────────────────────── │
│ │
│ ┌─────────────┐ ┌───────────┐ ┌─────────────┐ │
│ │ HubSpot │──────▶│ Snowflake │──────▶│ HubSpot │ │
│ │ (raw CRM) │ │ │ │ (enriched) │ │
│ └─────────────┘ │ dbt model │ └─────────────┘ │
│ │ transforms│ │
│ ┌─────────────┐ │ and │ ┌─────────────┐ │
│ │ Stripe │──────▶│ enriches │──────▶│ Slack │ │
│ │ (payments) │ │ the data │ │ (alerts) │ │
│ └─────────────┘ └───────────┘ └─────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
When to Use Reverse ETL
| Use Case | Example |
|---|---|
| Lead scoring | Calculate a lead score in dbt, push it to HubSpot as a contact property |
| Customer segmentation | Segment customers by behaviour, update HubSpot lists |
| Lifetime value | Calculate LTV in Snowflake, sync to CRM for sales prioritisation |
| Product usage | Aggregate product metrics, push to CRM for customer success |
| Alerting | Identify churn risk, notify via Slack or CRM task |
When NOT to Use Reverse ETL
- Real-time requirements: Reverse ETL is batch-oriented (hourly or daily). For real-time, use event streams or webhooks.
- Simple field mapping: If you just need to copy a field from one SaaS tool to another, a direct Zapier/Make integration may be simpler.
- No transformation needed: If the data doesn't need enrichment, reverse ETL adds unnecessary complexity.
Architecture
Airbyte supports Snowflake as a source, enabling reverse ETL through the same platform used for forward ETL.
┌─────────────────────────────────────────────────────────────────────────────┐
│ AIRBYTE REVERSE ETL │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ Source Connection Destination │
│ ────── ────────── ─────────── │
│ │
│ ┌────────────────-─┐ ┌───────────────────┐ ┌─────────────────┐ │
│ │ Snowflake │ │ retl-contacts- │ │ HubSpot │ │
│ │ Source │────▶│ to-hubspot │────▶│ Destination │ │
│ │ │ │ │ │ │ │
│ │ View/Table: │ │ Incremental │ │ Update contacts │ │
│ │ MARTS.CRM. │ │ Append + Dedup │ │ via API │ │
│ │ CONTACTS_ENRICHED│ └───────────────────┘ └─────────────────┘ │
│ └─────────────────-┘ │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
Prepare the Source View
Create a view in Snowflake that contains some example enriched data you want to sync back to HubSpot.
Create the CRM Schema via Terraform
The ANALYTICS.CRM schema should be managed as infrastructure. Add it to your Snowflake Terraform configuration:
# terraform/snowflake/schemas.tf
module "schema_analytics_crm" {
source = "./modules/snowflake_schema"
providers = {
snowflake.sys_admin = snowflake.sys_admin
snowflake.security_admin = snowflake.security_admin
}
schema_name = "CRM"
schema_database_name = "ANALYTICS"
schema_comment = "CRM-domain views and models for analytics and reverse ETL."
}
Deploy via CI/CD, then create the view.
dbt Models
In practice, this view would be a dbt model in your transformation layer. The data transformation section (future) covers building these models. For reverse ETL testing, a manual view works.
Example: Enriched Contacts View
CREATE OR REPLACE VIEW ANALYTICS.CRM.CONTACTS_ENRICHED AS
SELECT
id AS hubspot_contact_id
, email
, firstname
, lastname
-- Example enriched fields to push back to HubSpot
, UNIFORM(1, 10, RANDOM()) AS total_orders
, UNIFORM(1, 10000::number(10, 2), RANDOM()) AS lifetime_value_usd
, ARRAY_CONSTRUCT('low','medium','high')[UNIFORM(0, 3, RANDOM())] AS customer_tier
CURRENT_TIMESTAMP() AS _synced_at
FROM AIRBYTE.HUBSPOT.CONTACTS
Grant Read Access
The reverse ETL source needs read access to the view. If using the same SVC_AIRBYTE account:
-- Grant usage on the schema containing the view
GRANT USAGE ON DATABASE ANALYTICS TO ROLE SVC_AIRBYTE;
GRANT USAGE ON SCHEMA ANALYTICS.CRM TO ROLE SVC_AIRBYTE;
GRANT SELECT ON VIEW ANALYTICS.CRM.CONTACTS_ENRICHED TO ROLE SVC_AIRBYTE;
Least Privilege
Only grant access to the specific views needed for reverse ETL. Do not grant broad read access to the analytics database.
Configure Snowflake as a Source
Via Airbyte UI
- Navigate to Sources > New Source
- Search for Snowflake and select it
- Configure:
| Field | Value |
|---|---|
| Source name | snowflake-reverse-etl |
| Host | xxx.eu-west-2.snowflakecomputing.com |
| Role | SVC_AIRBYTE |
| Warehouse | LOADING |
| Database | ANALYTICS |
| Schema | CRM |
| Authentication | Key Pair |
| Username | SVC_AIRBYTE |
| Private Key | Same key as the destination |
- Click Set up source
Configure HubSpot as a Destination
Via Airbyte UI
- Navigate to Destinations > New Destination
- Search for HubSpot and select it
- Configure:
| Field | Value |
|---|---|
| Destination name | hubspot-reverse-etl |
| Authentication | Private App |
| Access Token | Your private app token (needs write scope) |
Write Scope Required
For reverse ETL, the HubSpot private app needs write scopes:
crm.objects.contacts.write
This is a separate scope from the read scope used for forward ETL. Consider using a separate private app for reverse ETL to maintain least-privilege access.
- Click Set up destination
Create the Reverse ETL Connection
Via Airbyte UI
- Navigate to Connections > New Connection
- Select
snowflake-reverse-etlas the source - Select
hubspot-reverse-etlas the destination - Configure:
| Setting | Value |
|---|---|
| Connection name | retl-contacts-to-hubspot |
| Schedule | Manual (triggered by Prefect) |
- Select streams: Enable
CONTACTS_ENRICHED
| Stream | Sync Mode | Primary Key |
|---|---|---|
CONTACTS_ENRICHED |
Incremental | Append + Dedup | hubspot_contact_id |
- Map fields: Configure how Snowflake columns map to HubSpot properties
| Snowflake Column | HubSpot Property |
|---|---|
hubspot_contact_id |
Contact ID (lookup key) |
total_orders |
total_orders (custom property) |
lifetime_value_usd |
lifetime_value (custom property) |
customer_tier |
customer_tier (custom property) |
- Click Set up connection
Create Custom Properties in HubSpot
Before the first sync, create the custom contact properties in HubSpot:
- Go to Settings > Properties > Contact Properties
- Click Create property for each:
| Property | Type | Group |
|---|---|---|
total_orders |
Number | Contact information |
lifetime_value |
Number (currency) | Contact information |
customer_tier |
Single-line text | Contact information |
Run and Verify
Trigger the Sync
# Via Airbyte API
curl -X POST "https://api.airbyte.com/v1/jobs" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"connectionId": "YOUR_RETL_CONNECTION_ID",
"jobType": "sync"
}'
Verify in HubSpot
- Open a contact in HubSpot
- Check the custom properties section
- Verify
total_orders,lifetime_value, andcustomer_tierare populated
Best Practices
1. Use Views, Not Tables
Source reverse ETL from views or dbt models, not raw tables. Views ensure:
- Data is always current (computed at query time)
- Transformations are applied consistently
- You can change the logic without reconfiguring Airbyte
2. Include a Sync Timestamp
Add a _synced_at or _updated_at column to your source view. This helps with debugging and incremental sync tracking.
3. Limit Record Volume
Only sync records that have changed. Use incremental sync mode and include a cursor field (e.g., _updated_at) to avoid re-syncing unchanged records.
4. Separate Read and Write Scopes
Use separate HubSpot private apps for forward ETL (read-only) and reverse ETL (write). This follows the principle of least privilege.
5. Test in a Sandbox
HubSpot offers sandbox accounts for testing. Configure a separate Airbyte connection pointing to the sandbox before running reverse ETL in production.
Sync Frequency Considerations
| Frequency | Use Case | HubSpot API Impact |
|---|---|---|
| Daily | Lead scoring, customer segmentation | Low |
| Hourly | Time-sensitive alerts | Moderate |
| Real-time | Not supported by Airbyte | Use webhooks instead |
For most use cases, a daily sync is sufficient and stays well within HubSpot's API rate limits.
Summary
You've set up reverse ETL:
- Understood when reverse ETL is valuable
- Configured Snowflake as an Airbyte source
- Set up HubSpot as a destination with write access
- Created a connection to sync enriched contact data
- Learned best practices for reverse ETL
What's Next
With forward and reverse ETL configured, orchestrate everything with Prefect.
Continue to Prefect Orchestration →