Technical
10 min read

Agentic Salesforce to Snowflake ELT: From One Prompt to a Governed Pipeline

How Bruin CLI, Bruin MCP, Bruin Cloud, and agent skills can build and maintain a Salesforce to Snowflake ELT pipeline across bronze, silver, and gold layers.

Arsalan Noorafkan

Developer Advocate

Quick answer: an agent can build a Salesforce to Snowflake ELT pipeline when the pipeline is code, the source and warehouse are connected, and the agent can validate its own work. In Bruin, that means Salesforce and Snowflake connections in Bruin Cloud, ingestr assets for bronze ingestion, SQL models for silver and gold, checks and lineage in the repo, Bruin MCP for agent context, and skills for safe activation and self-healing.

This is the part that changed for me: the interesting bit is not "can an AI write SQL?"

Of course it can.

The harder question is whether it can build the whole data engineering workflow around the SQL: source discovery, ingestion, schema drift handling, model layering, checks, run history, Slack alerts, documentation, and the annoying operational work that starts after the first successful sync.

That is where this Salesforce to Snowflake demo is useful. The demo starts with an empty repo and a single prompt. The Bruin agent maps Salesforce objects, creates ingestr assets, loads them into Snowflake, builds silver and gold models, runs the pipeline, updates the README, and leaves behind a proper lineage graph.

The example Salesforce org uses credit union-style demo data. That is just the dataset. The workflow applies to any Salesforce org with accounts, opportunities, contacts, tasks, leads, users, campaigns, and custom objects.

The pipeline the agent built

The architecture is the standard one most teams want, but rarely get in one clean repo:

Salesforce -> Snowflake bronze -> silver models -> gold marts -> dashboards, Slack, activation, agents

In the demo, the agent created:

  • 14 bronze ingestion assets
  • 5 silver models
  • 9 gold reporting assets
  • daily scheduling
  • failed-run Slack notifications
  • README documentation
  • lineage from Salesforce source objects to reporting tables

The bronze layer uses Bruin ingestr assets. Each asset maps a Salesforce object into Snowflake and uses an incremental merge strategy, usually keyed by SystemModstamp.

The silver layer cleans and joins the raw Salesforce objects. That is where objects such as Opportunity, Account, User, Task, and product or campaign tables become usable business entities.

The gold layer is where the pipeline becomes useful to humans: pipeline KPIs, pipeline by stage, activity coverage, channel performance, product performance, account health, and other report-ready tables.

Why this works better than "ask ChatGPT for a connector"

Most generated pipeline code fails because the agent is operating outside the system it is supposed to change.

It does not know your connection names. It does not know your asset format. It does not see the lineage. It cannot validate the graph. It cannot inspect the failed run. It cannot tell whether a Salesforce field appeared in bronze but failed to propagate downstream.

Bruin gives the agent the rails:

  • Bruin CLI validates and runs the project locally.
  • Bruin ingestr assets define ingestion as files.
  • Bruin checks define data quality near the asset.
  • Bruin Cloud stores connections, schedules runs, sends Slack alerts, and records run history.
  • Bruin MCP gives the agent access to Bruin docs and project context.
  • GitHub gives the agent a reviewable change path.
  • Skills tell the agent how to handle specialized workflows.

So the agent is not just generating text. It is changing a real repo, validating the change, and using the same operational system the data team uses.

The one-shot prompt

The demo prompt is almost painfully short:

Use the connections to Salesforce and Snowflake.
Create the bronze layer ingestion assets to ingest all the objects from Salesforce to Snowflake.
The pipeline should run daily and ingest new data only.
A one-time backfill should be completed for each asset to load historical data.
Once the bronze layer is completed, build the silver layer data models.
Once the silver layer is completed, build the gold layer business reports.
Send pipeline notification to Slack channel "#credit-union-demo" but only failed runs.

That kind of prompt only works if the agent can do real discovery and execution. It needs to inspect Salesforce, generate Bruin assets, run validation, fix mistakes, run the pipeline, and document what changed.

The more interesting lesson is not that the prompt was short. It is that the environment was ready for the prompt:

  • Salesforce and Snowflake connections existed in Bruin Cloud. See Manage Connections. The names can follow your team's convention, but the names used in Bruin Cloud, asset YAML files, pipeline config, and agent instructions need to match.
  • The repo had agent instructions.
  • Bruin CLI and MCP were available. See Set Up the Bruin MCP.
  • Bruin Cloud could run and observe the pipeline. See Create a Project and Enable a Pipeline.
  • The agent had permission to work in code.

Without that setup, the prompt would be a wish.

What the bronze layer looks like

A Salesforce bronze asset in Bruin is just YAML. The connection names in this example are placeholders, so replace them with the names your Bruin Cloud project uses:

name: bronze.salesforce_opportunities
type: ingestr
description: Ingests Salesforce Opportunity records into Snowflake.
connection: snowflake-default

parameters:
  destination: snowflake
  source_connection: salesforce
  source_table: opportunity
  incremental_key: SystemModstamp
  incremental_strategy: merge
  schema_contract: evolve
  schema_naming: snake_case

columns:
  - name: id
    type: VARCHAR
    primary_key: true
    checks:
      - name: not_null
      - name: unique
  - name: amount
    type: DOUBLE
    checks:
      - name: non_negative
  - name: system_modstamp
    type: TIMESTAMP
    checks:
      - name: not_null

This is agent-friendly because it is explicit. The source object, incremental strategy, destination connection, schema contract, columns, and checks are all visible in the repo.

It is also human-friendly for the same reason.

The part people usually skip: skills

After the pipeline was created, the demo added two skills.

Skills are repo-local instructions for agent workflows. They are not magic. They are closer to "how our team wants this agent to behave when doing this class of work."

That distinction matters.

Skill 1: Salesforce data activation and admin

The first skill handles Salesforce activation and admin work from a Bruin Slack agent.

It tells the agent how to:

  • scope a Salesforce write request
  • identify object API names and field API names
  • select records from warehouse output, Slack-provided IDs, CSVs, or Salesforce SOQL
  • run a dry-run first
  • use the Bruin Salesforce connection without exposing secrets
  • update, upsert, create, or prepare Salesforce changes
  • stop on unexpectedly broad or risky operations
  • report counts, failures, skipped records, and next actions back to Slack

The use cases are practical:

  • Create follow-up tasks for neglected high-value opportunities.
  • Update account risk fields from a Snowflake gold table.
  • Add a new Opportunity field and verify it flows into bronze.
  • Prepare a write-back set for approval before touching Salesforce.

This is where "agentic data engineering" starts to look different from analytics. The agent is not only answering a question. It is moving a governed output back into an operational system, with approval gates.

Skill 2: self-healing pipeline work

The second skill handles pipeline failures.

It tells the agent how to:

  • inspect Bruin Cloud failed runs
  • read failed logs
  • classify failures
  • inspect asset files and lineage
  • detect Salesforce schema drift
  • identify Snowflake SQL or cast errors
  • validate local fixes
  • open a PR or ask for approval
  • rerun the narrowest affected asset set

The demo covers the kind of failures that happen in real Salesforce pipelines:

  • A new Salesforce field lands in bronze but does not appear downstream.
  • A source value changes format and breaks a cast.
  • A metric definition or label is wrong.
  • A failed run needs diagnosis and a scoped rerun.

Self-healing should not mean "the agent silently changes production." A better default is:

detect -> diagnose -> propose -> validate -> PR or approval -> scoped rerun -> verify

That is boring in the right way.

Why this matters for Salesforce data

Salesforce pipelines are easy to underestimate.

At first, it looks like you just need to copy Account and Opportunity into Snowflake. Then someone adds a custom field. Then RevOps changes a stage. Then Finance wants a different definition of weighted pipeline. Then Customer Success wants risk scores written back. Then a dashboard is wrong because a new field exists in Salesforce but not in the silver model.

The connector is only the first 20 percent.

The real work is maintaining the business model around Salesforce:

  • What counts as open pipeline?
  • Which owner field should reports use?
  • How should deleted or merged records behave?
  • Which custom fields are activation-safe?
  • Which gold table is trusted for Slack answers?
  • What should happen when Salesforce schema drifts?

This is why putting the pipeline in code matters. The agent can only help with the real workflow if the real workflow is visible.

Bruin MCP in the loop

Bruin MCP is the bridge between the agent and the Bruin project.

With MCP configured, the agent can understand Bruin concepts and interact with the project through the right commands. It can inspect asset files, understand how ingestr assets work, validate the graph, and follow Bruin patterns instead of inventing them.

That matters when you ask it to do something like:

The new Salesforce Opportunity field landed in bronze. Propagate it through silver and gold where it belongs, update descriptions and checks, validate the pipeline, and open a PR.

Without MCP and repo context, that is a vague prompt. With Bruin, it becomes a constrained engineering task.

A practical rollout path

I would not start production with "ingest every Salesforce object" unless you have a very clear reason.

A safer version:

  1. Connect Salesforce and Snowflake in Bruin Cloud with Manage Connections.
  2. Initialize the Bruin project repo after reviewing Projects and Initialize from a Template.
  3. Add AGENTS.md with the same connection names used in the repo, plus pipeline rules and safety rules.
  4. Connect the repo to Bruin Cloud with Create a Project.
  5. Enable the pipeline and configure failed-run notifications with Enable a Pipeline.
  6. Create the Bruin Cloud agent with Configure AI Agents, then connect Slack with Deploy an AI Analyst to Bruin Cloud & Slack.
  7. Ingest Account, User, Opportunity, Contact, Lead, and Task.
  8. Build one silver opportunity pipeline model.
  9. Build one gold KPI model.
  10. Add checks and descriptions.
  11. Schedule daily incremental runs.
  12. Add Bruin Cloud MCP if the agent should inspect Cloud runs and trigger scoped reruns.
  13. Add Salesforce activation only after the gold layer is trusted.

The one-shot demo shows what is possible. The incremental path is how I would roll it out when the data affects revenue reporting.

Watch the demo and follow the technical guide

The video demo walks through the prompt-to-pipeline flow, the generated lineage, the Slack agent examples, Salesforce activation, schema drift, and self-healing scenarios:

The demo repository is here: arsalann/bruin-demo-credit-union.

For the detailed setup steps, use the academy guide: Build a Salesforce to Snowflake ELT Pipeline with Bruin.

FAQ

Can an AI agent build a Salesforce to Snowflake ELT pipeline?

Yes, if it has tools and context. The agent needs Salesforce and Snowflake connections, Bruin CLI, Bruin MCP, access to the repo, validation output, run history, and permission to create reviewed code changes.

Use bronze for raw Salesforce objects, silver for cleaned and joined business entities, and gold for reporting-ready or activation-ready marts. Keep checks and descriptions close to each asset.

What does Bruin MCP do here?

Bruin MCP gives the agent Bruin-specific context and tool access. It helps the agent inspect project files, understand asset formats, use Bruin commands, and avoid guessing how the pipeline should be structured.

What are skills used for?

Skills tell the agent how to run specialized workflows. In this demo, one skill handles safe Salesforce activation and admin work. Another handles self-healing pipeline investigation, diagnosis, fixes, validation, PRs, and scoped reruns.

Is this only for credit union data?

No. The credit union data is a demo dataset. The same architecture works for any Salesforce data: RevOps, sales, customer success, support, marketing operations, finance, or custom Salesforce workflows.