
Agentic Salesforce to Snowflake ELT: From One Prompt to a Governed Pipeline
How Bruin CLI, Bruin MCP, Bruin Cloud, and agent skills can build and maintain a Salesforce to Snowflake ELT pipeline across bronze, silver, and gold layers.
A practical comparison of Pentaho and Bruin for teams evaluating PDI, Kettle, and legacy ETL alternatives. Bruin offers onboarding and migration planning for governed pipelines, DAC dashboards, MCP, and AI analytics.

Arsalan Noorafkan
Developer Advocate

Quick answer: This page is not making a claim about Pentaho's business status. It is a general alternatives page for teams asking whether their Pentaho Data Integration, Kettle, or older ETL setup still fits how data teams now build, review, govern, and serve data. If you want open-source-first pipelines as code, quality checks, lineage, Git review, hybrid deployment, DAC dashboards, MCP access, and an AI data analyst on top, Bruin is a cleaner replacement path. The Bruin team can also help with onboarding and migration planning.
This matters because most Pentaho estates are not one product. They are a pile of PDI transformations, scheduled jobs, local Spoon workflows, custom scripts, server configuration, old reports, and tribal knowledge. The migration is not "replace a tool". It is "make the pipeline understandable again".
That is where Bruin is different.
Pentaho has been around for a long time, and there is a reason people used it. PDI made ETL approachable. You could drag steps onto a canvas, connect them, run the job, and hand it to someone who did not want to write much SQL or Python.
For many teams, that was a proper unlock:
If your workflows are stable and the team maintaining them is happy, you do not need a migration because a blog post says so.
But if you are here, it is probably because the old setup is starting to cost you.
The first problem is reviewability. Large visual transformations are easy to start and painful to govern. You can version files, sure, but reviewing a visual ETL diff is not the same thing as reviewing a SQL model, Python asset, or YAML config in Git.
The second problem is support and runtime drift. Pentaho's own lifecycle page treats versions outside its listed lifecycle as unsupported, and it specifically notes that Pentaho 9.3 becomes unsupported on July 1, 2026. If you are sitting on a long-lived PDI estate, that date is not trivia. It is a planning problem.
The third problem is AI readiness. An AI analyst only works if the data underneath is trustworthy. It needs asset ownership, freshness checks, lineage, metric definitions, access control, and auditability. A pile of ETL jobs can produce tables, but it usually does not produce enough context for governed AI analytics.
Bruin is an open-source-first data platform. Locally, teams use Bruin CLI and ingestr to build and run pipelines. In production, Bruin Cloud adds orchestration, scheduling, observability, catalog, lineage, SSO, RBAC, audit logs, cost visibility, and the AI data analyst.
The important part: ingestion, transformation, checks, and metadata live together.
Here is the kind of asset definition you end up with.
Asset 1:
name: raw.salesforce_opportunity
type: ingestr
parameters:
  source_connection: salesforce
  source_table: opportunity
  destination: snowflake
  incremental_strategy: merge
  incremental_key: last_timestamp
columns:
  - name: id
    type: string
    description: "Primary key"
    primary_key: true
    checks:
      - name: unique
      - name: not_null
  - name: amount
    type: float
  - name: close_date
    type: timestamp
Asset 2:
/* @bruin
name: marts.revenue_pipeline
type: sf.sql
depends:
  - raw.salesforce_opportunity
owner: revenue-analytics
materialization:
  type: table
meta:
  tier: gold
  migrated_from: pentaho
columns:
  - name: opportunity_id
    type: string
    checks:
      - name: unique
      - name: not_null
  - name: account_id
    type: string
    checks:
      - name: not_null
  - name: amount
    type: float
    checks:
      - name: non_negative
  - name: close_date
    type: timestamp
    checks:
      - name: not_null
@bruin */
SELECT
    id AS opportunity_id,
    account_id,
    stage_name,
    amount,
    close_date
FROM raw.salesforce_opportunity
WHERE is_deleted = false
And when you add built-in checks, they live in column metadata:
columns:
  - name: id
    type: integer
    description: "Primary key"
    checks:
      - name: unique
      - name: not_null
That is a lot less mystical than a big visual job. The source is clear, the dependency is clear, the owner is clear, and the checks are visible instead of buried in a side process.
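To make the column checks concrete, here is a minimal sketch of what `unique`, `not_null`, and `non_negative` amount to when expressed as plain queries. It is not how Bruin implements its checks. The helper `run_column_checks`, the in-memory SQLite table, and the sample rows are all assumptions made for illustration, with SQLite standing in for Snowflake.

```python
import sqlite3


def run_column_checks(conn, table, column, checks):
    """Return {check_name: number_of_offending_rows} for one column.

    Hypothetical helper: table and column names are trusted identifiers here,
    so this is only suitable for a local illustration.
    """
    queries = {
        # Rows where the column is missing.
        "not_null": f"SELECT COUNT(*) FROM {table} WHERE {column} IS NULL",
        # Surplus rows for every value that appears more than once.
        "unique": (
            f"SELECT COALESCE(SUM(n - 1), 0) FROM ("
            f"SELECT COUNT(*) AS n FROM {table} "
            f"WHERE {column} IS NOT NULL "
            f"GROUP BY {column} HAVING COUNT(*) > 1)"
        ),
        # Rows where the value is below zero.
        "non_negative": f"SELECT COUNT(*) FROM {table} WHERE {column} < 0",
    }
    return {name: conn.execute(queries[name]).fetchone()[0] for name in checks}


# Made-up sample data: one duplicate id, one missing id, one negative amount.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE opportunity (id TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO opportunity VALUES (?, ?)",
    [("a1", 100.0), ("a2", -5.0), ("a2", 40.0), (None, 10.0)],
)

id_failures = run_column_checks(conn, "opportunity", "id", ["unique", "not_null"])
amount_failures = run_column_checks(conn, "opportunity", "amount", ["non_negative"])
print(id_failures, amount_failures)
```

A check passes when its count is zero. Declaring the check next to the column, as in the asset definitions above, means nobody has to remember to write or schedule queries like these by hand.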
| Dimension | Pentaho | Bruin |
|---|---|---|
| Main workflow | Visual ETL jobs and transformations | Code-first assets for ingestion, SQL, Python, checks, and metadata |
| Development | Desktop designer and server projects | Local CLI, VS Code, Git, CI |
| Transformations | Visual steps and job files | SQL and Python as first-class assets |
| Ingestion | Mature ETL components | ingestr sources plus Python materializations for custom systems |
| Quality checks | Usually separate or custom | Built into assets and runs |
| Lineage | Depends on edition and setup | Built into the pipeline graph and Cloud catalog |
| Governance | Enterprise configuration around the platform | Catalog, lineage, meta-keys, asset tiers, SSO, RBAC, audit logs |
| AI analytics | Not the core design | AI data analyst and DAC dashboards on governed pipeline context |
| Deployment | Server-admin heavy | Local, CI, cloud, VPC, on-prem, or Bruin Cloud |
Do not rewrite everything first. That is how migrations become expensive theatre.
Start with one flow:
The small detail that matters: add the checks before you declare victory. A pipeline that merely runs is not migrated. A pipeline that proves its output is healthy is migrated.
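One way to back up "proves its output is healthy" during a migration is a parity check between the table the legacy job writes and the table the new asset writes. The sketch below is an assumption-laden illustration, not a Bruin feature: `table_profile`, `parity_report`, the table names, and the sample rows are hypothetical, and SQLite stands in for the warehouse.

```python
import sqlite3


def table_profile(conn, table, key, amount_col):
    """Summarise a table: row count, distinct keys, and a rounded column total."""
    rows, distinct_keys, total = conn.execute(
        f"SELECT COUNT(*), COUNT(DISTINCT {key}), "
        f"ROUND(COALESCE(SUM({amount_col}), 0), 2) FROM {table}"
    ).fetchone()
    return {"rows": rows, "distinct_keys": distinct_keys, "total": total}


def parity_report(conn, legacy_table, new_table, key, amount_col):
    """Return only the metrics that differ, as {metric: (legacy, new)}."""
    legacy = table_profile(conn, legacy_table, key, amount_col)
    new = table_profile(conn, new_table, key, amount_col)
    return {m: (legacy[m], new[m]) for m in legacy if legacy[m] != new[m]}


conn = sqlite3.connect(":memory:")
sample = [("o1", 100.0), ("o2", 250.5), ("o3", 75.0)]
for name, rows in {
    "legacy_revenue": sample,        # output of the old job (made up)
    "new_revenue": sample,           # migrated asset, same rows
    "drifted_revenue": sample[:2],   # migrated asset that lost a row
}.items():
    conn.execute(f"CREATE TABLE {name} (opportunity_id TEXT, amount REAL)")
    conn.executemany(f"INSERT INTO {name} VALUES (?, ?)", rows)

matching = parity_report(conn, "legacy_revenue", "new_revenue", "opportunity_id", "amount")
drifted = parity_report(conn, "legacy_revenue", "drifted_revenue", "opportunity_id", "amount")
print("matching:", matching)
print("drifted:", drifted)
```

An empty report means the coarse metrics agree. It does not prove row-level equality, so a real migration would likely add per-key comparisons on top of this.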
Pentaho migration
Send us the shape of your current PDI or Kettle setup. The Bruin team can help map what becomes ingestion, SQL/Python, checks, Bruin Cloud orchestration, MCP access, and DAC dashboards.
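If you want a rough picture of that shape before any conversation, a small inventory script can help. The sketch below assumes a file-based PDI project where transformations are `.ktr` XML files containing `<step>` elements and jobs are `.kjb` XML files containing `<entry>` elements, each with a `<type>` child. The function name `inventory_pdi` and the sample transformation are hypothetical, and repository-backed setups would need a different approach.

```python
import tempfile
import xml.etree.ElementTree as ET
from collections import Counter
from pathlib import Path


def inventory_pdi(root_dir):
    """Count step types (.ktr) and job entry types (.kjb) found under root_dir."""
    counts = Counter()
    for path in sorted(Path(root_dir).rglob("*")):
        if path.suffix == ".ktr":
            tag = "step"    # assumed: transformations hold <step> elements
        elif path.suffix == ".kjb":
            tag = "entry"   # assumed: jobs hold <entry> elements
        else:
            continue
        for node in ET.parse(str(path)).getroot().iter(tag):
            type_text = node.findtext("type")
            if type_text:
                counts[(path.suffix, type_text)] += 1
    return counts


# Tiny made-up transformation so the sketch runs without a real PDI project.
sample_ktr = """<transformation>
  <step><name>read opportunities</name><type>TableInput</type></step>
  <step><name>clean amounts</name><type>ScriptValueMod</type></step>
  <step><name>read accounts</name><type>TableInput</type></step>
</transformation>"""

with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "load_revenue.ktr").write_text(sample_ktr, encoding="utf-8")
    counts = inventory_pdi(tmp)

for (suffix, step_type), n in counts.most_common():
    print(suffix, step_type, n)
```

Counts like these give a first guess at how much of the estate is plain table reads and writes, which tend to map to ingestion or SQL, and how much is custom scripting that needs closer review.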
Pentaho can still make sense if:
That is fine. Not every old system needs to be replaced just because it is old.
Bruin is the better fit if:
That last point is the big one. A modern data platform is not just the thing that moves rows. It is the context layer around the rows: ownership, lineage, freshness, definitions, access, quality, and audit.
If Pentaho is still working and nobody is asking for better governance, AI analytics, or code review, keep it.
But if your team is already discussing support windows, CE risk, Java/runtime maintenance, visual-job sprawl, or how to make old ETL feed a modern AI analyst, Bruin is worth testing. Start with one pipeline. Make it boring. Prove parity. Then expand.
For the dedicated side-by-side, see Pentaho vs Bruin. For a broader shortlist, see the best data pipeline tools in 2026.

