How to Build a Salesforce to Snowflake Pipeline with Bruin
A simple guide to ingesting Salesforce data into Snowflake, modelling bronze, silver, and gold layers, and using Bruin agents for dashboards, Slack reports, activation, and self-healing pipelines.
Arsalan Noorafkan
Developer Advocate
Quick answer: a good Salesforce to Snowflake pipeline does three jobs. First, it ingests Salesforce objects like Accounts, Opportunities, Contacts, Leads, Tasks, and custom objects into Snowflake. Second, it models that data into bronze, silver, and gold layers. Third, it gives the business useful surfaces on top: dashboards, Slack reports, alerts, data activation, and agents that can help diagnose pipeline issues.
You can do the first part with plenty of tools. The hard part is usually everything after the sync.
Salesforce data looks simple until it becomes the source of truth for revenue, forecasting, customer health, renewals, territory planning, and sales operations. A new field appears. A stage definition changes. Someone wants churn risk written back to Salesforce. The dashboard is suddenly wrong, and now the "simple connector" is not enough.
This guide explains the common ways to move Salesforce data into Snowflake, then shows the Bruin version: ingestion, modelling, quality checks, scheduled runs, AI dashboards, Slack reports, activation, and a practical self-healing pipeline pattern.
Strong path, but it is a bigger Salesforce Data Cloud decision
Custom API jobs | Very specific extraction or write-back logic | You own retries, schema drift, scheduling, credentials, and monitoring
Bruin | Teams that want ingestion, modelling, checks, lineage, orchestration, MCP, and agents together | Code-first, so it fits teams that want pipelines as files
If all you need is "copy these Salesforce tables every hour", a connector is fine.
If you need a governed CRM data workflow, Bruin becomes more interesting. Bruin is open-source first: the CLI handles ingestion, SQL/Python/R assets, quality checks, materialization, lineage, and local runs. Bruin Cloud adds scheduling, catalog, notifications, dashboards, audit logs, and AI agents. Bruin MCP lets coding agents read Bruin docs and help build or maintain the pipeline from your editor.
Bronze is your recovery layer. Silver is where business logic becomes reusable. Gold is what dashboards, agents, alerts, and activation jobs should use.
In Bruin, you define connections in .bruin.yml. Salesforce can use username, password, and security token auth, or an OAuth access token. Snowflake can use password auth or key-pair auth.
For production, keep secrets outside Git and use separate credentials for pipelines and agents. Most agents should have read-only Snowflake access.
That is enough to show the idea: Salesforce is the source, Snowflake is the destination, merge keeps the table current, and schema_contract: evolve helps with Salesforce field changes.
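To make the merge idea concrete, here is a minimal Python sketch of upsert-by-key semantics. It is not how Bruin or ingestr implement merge. The function name `merge_rows`, the `Id` key, the `Region__c` field, and the sample rows are all assumptions for illustration. Incoming rows with new fields are kept as they arrive, which loosely mirrors what an evolving schema allows.

```python
def merge_rows(existing, incoming, key="Id"):
    """Upsert incoming rows into existing rows by primary key (illustrative only)."""
    # Copy each existing row so the original list is not mutated.
    merged = {row[key]: dict(row) for row in existing}
    for row in incoming:
        # Update rows whose key already exists, insert rows with new keys.
        merged.setdefault(row[key], {}).update(row)
    return list(merged.values())


# Hypothetical sample data: one changed opportunity, one new one with a new field.
current = [
    {"Id": "006A", "StageName": "Prospecting", "Amount": 1000},
    {"Id": "006B", "StageName": "Negotiation", "Amount": 5000},
]
changes = [
    {"Id": "006B", "StageName": "Closed Won", "Amount": 5000},
    {"Id": "006C", "StageName": "Qualification", "Amount": 750, "Region__c": "EMEA"},
]
result = merge_rows(current, changes)
```

After the merge, `result` holds three rows: the untouched one, the updated one, and the new one carrying the extra field.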
For a first version, start with:
account
user
opportunity
opportunity_line_item
contact
lead
task
key custom objects using custom:<custom_object_name>
Run one asset while developing, then run the whole pipeline once the dependency graph is ready:
bruin run assets/bronze/salesforce_opportunity.asset.yml
Silver models should answer "what does this mean?", not just "where did this come from?"
For example:
clean Salesforce field names
remove deleted records
standardize stages and owner mappings
join opportunities to accounts and users
normalize custom fields
add quality checks for IDs, amounts, and required fields
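The silver steps listed above can be sketched in plain Python. This is only an illustration of the transformations, not a Bruin asset. The `STAGE_MAP` values, the output column names, and the choice to treat a missing amount as 0 are assumptions you would replace with your own rules.

```python
# Hypothetical mapping from raw Salesforce stage labels to standard names.
STAGE_MAP = {
    "closed won": "closed_won",
    "closed-won": "closed_won",
    "closed lost": "closed_lost",
    "negotiation/review": "negotiation",
}


def to_silver(raw_opportunities, accounts_by_id):
    """Clean raw opportunity rows and attach the account name."""
    silver = []
    for row in raw_opportunities:
        if row.get("IsDeleted"):
            continue  # remove deleted records
        stage = row["StageName"].strip().lower()
        account = accounts_by_id.get(row["AccountId"], {})
        silver.append({
            "opportunity_id": row["Id"],            # clean field names
            "account_id": row["AccountId"],
            "account_name": account.get("Name"),    # join to accounts
            "stage": STAGE_MAP.get(stage, stage.replace(" ", "_")),
            "amount": float(row["Amount"] or 0),    # assumption: missing amount becomes 0
        })
    return silver


# Hypothetical bronze rows for illustration.
raw = [
    {"Id": "006A", "AccountId": "001A", "StageName": " Closed Won ",
     "Amount": "1200.50", "IsDeleted": False},
    {"Id": "006B", "AccountId": "001B", "StageName": "Prospecting",
     "Amount": None, "IsDeleted": True},
    {"Id": "006C", "AccountId": "001Z", "StageName": "Value Proposition",
     "Amount": 300, "IsDeleted": False},
]
accounts = {"001A": {"Name": "Acme"}}
silver = to_silver(raw, accounts)
```

In a real project this logic would more likely live in a SQL asset. The Python form is used here only to make each step visible.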
Gold models should be closer to business use cases:
Gold table | Used for
gold_revops.pipeline_health | Pipeline by stage, owner, region, close date, and overdue status
gold_revops.account_risk | Customer health, churn signals, renewal risk, and recommended action
gold_revops.forecast_movement | Weekly forecast changes and slipped opportunities
gold_revops.sales_activity | Tasks, events, follow-ups, and activity coverage
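As a rough picture of what a table like gold_revops.pipeline_health might compute, the sketch below sums open pipeline by stage and tracks how much of it is past its close date. The column names (`stage`, `amount`, `close_date`) and the closed-stage labels are assumptions, and the real model would normally be a SQL asset over the silver layer.

```python
from collections import defaultdict
from datetime import date


def pipeline_health(silver_rows, today):
    """Sum open pipeline by stage and how much of it is overdue."""
    totals = defaultdict(lambda: {"open_amount": 0.0, "overdue_amount": 0.0})
    for row in silver_rows:
        if row["stage"] in ("closed_won", "closed_lost"):
            continue  # only open opportunities count as pipeline
        bucket = totals[row["stage"]]
        bucket["open_amount"] += row["amount"]
        if row["close_date"] < today:  # still open but past its close date
            bucket["overdue_amount"] += row["amount"]
    return dict(totals)


# Hypothetical silver rows for illustration.
rows = [
    {"stage": "negotiation", "amount": 5000.0, "close_date": date(2024, 5, 1)},
    {"stage": "negotiation", "amount": 2000.0, "close_date": date(2024, 7, 1)},
    {"stage": "closed_won", "amount": 9000.0, "close_date": date(2024, 4, 1)},
    {"stage": "prospecting", "amount": 1000.0, "close_date": date(2024, 6, 15)},
]
health = pipeline_health(rows, today=date(2024, 6, 1))
```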
This is where Bruin's checks matter. You can declare checks such as not_null, unique, non_negative, accepted_values, and min/max next to the asset definition. That gives both humans and agents a clear contract for what "good data" means.
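To show what such a contract means in practice, here is a small Python sketch that evaluates a few of these checks against rows. It is not Bruin's implementation or its declaration syntax. The `run_checks` function, the tuple format for checks, and the sample rows are assumptions for illustration.

```python
def run_checks(rows, checks):
    """Return (column, check, failing_row_count) for every check that fails."""
    failures = []
    for column, check, arg in checks:
        values = [row.get(column) for row in rows]
        if check == "not_null":
            bad = sum(v is None for v in values)
        elif check == "unique":
            non_null = [v for v in values if v is not None]
            bad = len(non_null) - len(set(non_null))  # number of duplicate values
        elif check == "non_negative":
            bad = sum(v is not None and v < 0 for v in values)
        elif check == "accepted_values":
            bad = sum(v is not None and v not in arg for v in values)
        else:
            raise ValueError(f"unknown check: {check}")
        if bad:
            failures.append((column, check, bad))
    return failures


# Hypothetical rows with one problem per check.
rows = [
    {"opportunity_id": "006A", "amount": 1000.0, "stage": "open"},
    {"opportunity_id": "006A", "amount": -50.0, "stage": "closed_won"},
    {"opportunity_id": None, "amount": 200.0, "stage": "maybe"},
]
checks = [
    ("opportunity_id", "not_null", None),
    ("opportunity_id", "unique", None),
    ("amount", "non_negative", None),
    ("stage", "accepted_values", {"open", "closed_won", "closed_lost"}),
]
failures = run_checks(rows, checks)
```

Each failure names the column, the rule, and how many rows broke it, which is the kind of signal a person or an agent needs to start diagnosing.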
During development, use bruin validate and targeted bruin run commands. In production, schedule the pipeline in Bruin Cloud, CI/CD, or your existing scheduler.
The useful part: the same project can run locally, in CI, and in Bruin Cloud. That makes it easier to test changes before touching production.
Bruin Cloud dashboards can be built from a prompt. You can ask for a RevOps dashboard with open pipeline by stage, overdue opportunities, account risk, and forecast movement. The agent runs SQL, creates widgets, and lets you iterate before publishing.
That is useful because your dashboard and Slack answers use the same governed gold tables.
A scheduled agent can send daily or weekly reports to Slack:
Every weekday morning, summarize Salesforce pipeline movement, closed won amount, closed lost amount, overdue opportunities, and any account risk score above 80. If overdue open pipeline increased more than 10 percent week over week, send it as an alert.
This is a better default than sending everyone another dashboard link. People see the exception, not just the chart.
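The alert rule in that prompt comes down to a week-over-week comparison. A minimal sketch, assuming you already have the two overdue totals from a gold table. The function name, the message wording, and the handling of a zero baseline are illustrative choices, not Bruin behaviour.

```python
def overdue_alert(previous_week, current_week, threshold=0.10):
    """Return an alert message if overdue open pipeline grew past the threshold."""
    if previous_week <= 0:
        # No baseline to compare against: alert only if overdue pipeline appeared.
        return "ALERT: overdue pipeline appeared this week" if current_week > 0 else None
    change = (current_week - previous_week) / previous_week
    if change > threshold:
        return f"ALERT: overdue open pipeline up {change:.0%} week over week"
    return None  # within tolerance, so it stays in the regular report


message = overdue_alert(previous_week=100_000, current_week=115_000)
quiet = overdue_alert(previous_week=100_000, current_week=105_000)
```

Here `message` carries an alert for a 15 percent increase, while `quiet` is `None` because a 5 percent increase stays under the threshold.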
Data activation means using Snowflake outputs to update operational systems like Salesforce.
For example, gold_revops.account_risk could produce:
account ID
risk score
risk reason
recommended next action
whether Salesforce write-back is allowed
The safe pattern is approval first, write-back second. Let the agent prepare and explain the update set, then use a reviewed Python asset or controlled workflow to update only approved Salesforce fields. Log every write.
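The approval-first pattern can be sketched as a planning step that runs before any API call. The code below does not talk to Salesforce. It only decides which updates are eligible and records them. The field names in `ALLOWED_FIELDS`, the `approved_by` and `writeback_allowed` columns, and the row shape are hypothetical.

```python
# Hypothetical allowlist of Salesforce custom fields the workflow may update.
ALLOWED_FIELDS = {"Risk_Score__c", "Risk_Reason__c", "Next_Action__c"}


def plan_writeback(activation_rows, audit_log):
    """Build updates from approved rows only and log each planned write."""
    updates = []
    for row in activation_rows:
        if not row.get("writeback_allowed") or not row.get("approved_by"):
            continue  # approval first: skip anything not explicitly approved
        # Drop any field that is not on the allowlist.
        fields = {k: v for k, v in row["fields"].items() if k in ALLOWED_FIELDS}
        if not fields:
            continue
        updates.append({"account_id": row["account_id"], "fields": fields})
        audit_log.append({
            "account_id": row["account_id"],
            "fields": sorted(fields),
            "approved_by": row["approved_by"],
        })
    return updates


# Hypothetical activation rows for illustration.
audit_log = []
activation = [
    {"account_id": "001A", "writeback_allowed": True, "approved_by": "revops-lead",
     "fields": {"Risk_Score__c": 87, "OwnerId": "005X"}},
    {"account_id": "001B", "writeback_allowed": True, "approved_by": None,
     "fields": {"Risk_Score__c": 91}},
    {"account_id": "001C", "writeback_allowed": False, "approved_by": "revops-lead",
     "fields": {"Risk_Score__c": 40}},
]
updates = plan_writeback(activation, audit_log)
```

Only the first account produces an update, and its `OwnerId` change is dropped because that field is not on the allowlist. The actual write would happen in a separate, reviewed step that consumes `updates`.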
Self-healing should not mean "an agent silently changes production." A better pattern is:
Detect -> diagnose -> propose -> test in dev -> open a PR or ask for approval.
Example: Salesforce adds a new Opportunity field. Bronze ingestion lands it. A downstream model or check fails. A Bruin-aware agent can inspect the failed run, schema change, lineage, and asset definition, then propose the smallest safe update. If the change touches finance metrics or Salesforce write-back fields, it should ask for approval.
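The last step of that flow, choosing between a PR and an approval request, can be expressed as a small routing rule. This is a sketch under assumptions: the asset prefixes in `SENSITIVE_PREFIXES`, the proposal shape, and the extra "revise" outcome for fixes that fail in dev are all hypothetical and not part of Bruin.

```python
# Hypothetical namespaces for finance metrics and Salesforce write-back assets.
SENSITIVE_PREFIXES = ("gold_finance.", "activation.")


def route_fix(proposal):
    """Decide the next step for an agent-proposed fix."""
    if not proposal["dev_tests_passed"]:
        return "revise"  # assumption: a fix that fails in dev goes back to diagnosis
    touches_sensitive = any(
        asset.startswith(SENSITIVE_PREFIXES) for asset in proposal["touched_assets"]
    )
    return "ask_for_approval" if touches_sensitive else "open_pr"


low_risk = route_fix({
    "dev_tests_passed": True,
    "touched_assets": ["silver.opportunities"],
})
sensitive = route_fix({
    "dev_tests_passed": True,
    "touched_assets": ["silver.opportunities", "activation.account_risk_writeback"],
})
```

With these inputs, `low_risk` is `"open_pr"` and `sensitive` is `"ask_for_approval"`. Neither path lets the agent change production on its own.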
This is where Bruin MCP helps. The agent can read Bruin docs, understand the asset format, inspect the pipeline files, and run the CLI. The pipeline is not hidden in a UI the agent cannot reason about.
If you only need replication, managed ELT tools like Fivetran, Airbyte Cloud, Hevo, Stitch, Rivery, and Matillion are common options. If you also need modelling, checks, lineage, orchestration, MCP, and AI agents, Bruin is a more complete path.
Yes. Bruin supports Salesforce as an ingestr source and Snowflake as a destination. You configure both connections, define ingestr assets for Salesforce objects, and run them with Bruin.
For sales analytics, start with account, user, opportunity, opportunity_line_item, contact, lead, and task. Add campaigns, events, and custom objects when the use case needs them.
Bronze stores raw Salesforce objects in Snowflake. Silver cleans and joins them into business entities. Gold publishes reporting-ready or activation-ready tables, such as pipeline health, account risk, forecast coverage, and renewal risk.
Yes, but it should be controlled. The safer pattern is to generate an approved activation table in Snowflake, validate it, then use a reviewed workflow to call the Salesforce API. Agents can prepare and explain the update, but production writes should have allowlists, approvals, and audit logs.
In practice, it means an agent can inspect failed Bruin runs, schema changes, lineage, and checks, then propose the smallest safe fix. For low-risk changes it can open a PR. For sensitive models or Salesforce write-back fields, it should ask for approval.