Step 5-7 min

Wire Up the AI Agent

Add Bruin MCP, point your agent at context/assets/, and use bruin query to run SQL with the same credentials your dbt project already uses.

What you'll do

  1. Register Bruin MCP in your AI coding tool so the agent can ask Bruin "how do I…?" questions
  2. Add an AGENTS.md at the repo root that points the agent at context/assets/ and at bruin query
  3. Verify the loop end-to-end with a real business question

Why this step matters

The context layer is half the equation. The agent also needs an interface to use it:

  • A way to learn how Bruin itself works (asset types, command flags) — that's what Bruin MCP gives you
  • A way to actually run SQL against the warehouse — that's bruin query, which uses the same connection you defined in step 2
  • A canonical workflow that tells the agent which of the two to use, when — that's AGENTS.md

Get all three in place and the agent stops guessing table names and starts citing the asset YAMLs you generated.

Instructions

1. Register Bruin MCP in your AI tool

Pick the tab that matches the tool you're using. Bruin MCP is stateless — register it once per machine, not per project.

Claude Code

Run this once in your terminal:

claude mcp add bruin -- bruin mcp

Restart your Claude Code session — MCP servers are loaded at session start, so the change won't apply until you open a new one.

To verify, ask Claude Code: "Use the Bruin MCP to list the available bruin commands." It should reply with a list pulled from bruin_get_overview rather than guessing.
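
If you prefer to check from the shell, Claude Code can also list the servers it has registered (exact output varies by version):

claude mcp list

You should see bruin in the output; if not, re-run the add command and open a fresh session.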

2. Confirm bruin query works as the agent will use it

Run a sanity-check query against your warehouse using the scoped config, exactly the way the agent will:

bruin query \
  --config-file context/.bruin.yml \
  --connection contoso_dbt_bq \
  --query "SELECT category_name, SUM(revenue_usd) AS rev
           FROM \`bruin-playground-arsalan.contoso_dbt_reports.rpt_revenue_by_segment\`
           WHERE year = 2024
           GROUP BY 1
           ORDER BY rev DESC
           LIMIT 10"

You should get a small result table back. Because the connection uses Application Default Credentials, no keyfiles change hands — the agent runs SQL as your gcloud auth application-default login identity.
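
If the query fails with a credentials error instead of returning rows, the usual fix is to refresh ADC and retry. A quick sketch:

gcloud auth application-default login
# sanity-check that a token can be minted before retrying bruin query
gcloud auth application-default print-access-token > /dev/null && echo "ADC OK"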

3. Add an AGENTS.md at the repo root

Create AGENTS.md next to dbt_project.yml (the repo root, not inside context/). AI coding tools auto-load this file at session start, so it's the canonical place for "how to use this project". Adapt the paths and connection name to your setup:

# AGENTS.md

This repo contains a dbt project plus a Bruin context layer documenting the
warehouse it builds. Use this guide before reading code or running queries.

## Canonical workflow

1. **Read context first.** Before querying, open the relevant
   `context/assets/<schema>/<table>.asset.yml`. It has the description, grain,
   column docs, and quality checks for that table — written from real samples.
2. **Use Bruin MCP for tooling questions.** Anything like "how does
   `bruin import` work?" or "what asset types exist?" — call the MCP server,
   don't guess from training data.
3. **Use `bruin query` for SQL.** Always pass `--config-file context/.bruin.yml`
   so the scoped connection is used.
4. **Cite the asset(s) you read.** When answering, reference the
   `context/assets/...asset.yml` files you used.

## Data access

- Connection name: `contoso_dbt_bq`
- Auth: Application Default Credentials (inherits `gcloud auth application-default login`)
- This is **read-only.** Never INSERT, UPDATE, DELETE, MERGE, or DROP.
- Always show the SQL before executing it.
- Use `LIMIT 100` (or smaller) when exploring an unfamiliar table.

## Layout

- `models/` — dbt models. Don't run `dbt` for analysis questions; the
  warehouse is already built. Read these only when asked about transformation logic.
- `context/assets/contoso_dbt_raw/` — raw dlt-loaded tables (lowest level).
- `context/assets/contoso_dbt_staging/` — `stg_*` cleaned/typed views.
- `context/assets/contoso_dbt_reports/` — `rpt_*` mart-level reports.
  Prefer these for business questions; staging is for ad-hoc deep dives.

## Things to avoid

- **Don't hand-edit `context/assets/*.asset.yml`.** They're regenerated by
  `generate_context.sh`. Improve descriptions in the dbt model's `schema.yml`
  upstream and re-run the generator.
- **Don't mix configs across pipelines.** This project's connection lives in
  `context/.bruin.yml`; other Bruin pipelines in the repo have their own.
  Always use `--config-file`.
- **Don't trust row counts in descriptions.** They're snapshot-time and may
  be stale. If a question hinges on exact size, run `SELECT COUNT(*)` yourself.

4. Try the loop end-to-end

Open your AI tool in this repo and ask a real question, e.g.:

"Which retail categories had the largest year-over-year revenue change in 2024 vs. 2023? Show me your SQL before running it, and cite which context/assets/... files you used."

A correctly wired agent will:

  1. Open context/assets/contoso_dbt_reports/rpt_revenue_by_segment.asset.yml and confirm the grain
  2. Draft a SQL query against contoso_dbt_reports.rpt_revenue_by_segment filtered by year
  3. Show the SQL, wait for a go-ahead, then run bruin query --config-file context/.bruin.yml --connection contoso_dbt_bq --query "..."
  4. Cite the asset YAML it read

If it skips step 1 (reading the asset), tighten AGENTS.md — the canonical workflow section is what enforces this behavior.
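
For reference, the SQL the agent drafts in step 2 of the list above will look roughly like the sketch below. It reuses the columns from the sanity-check query earlier (category_name, revenue_usd, year); the agent should confirm the real column names from the asset YAML, and you can wrap the ordering in ABS() if you care about the largest moves in either direction:

bruin query \
  --config-file context/.bruin.yml \
  --connection contoso_dbt_bq \
  --query "SELECT category_name,
                  SUM(IF(year = 2024, revenue_usd, 0)) AS rev_2024,
                  SUM(IF(year = 2023, revenue_usd, 0)) AS rev_2023,
                  SUM(IF(year = 2024, revenue_usd, 0))
                    - SUM(IF(year = 2023, revenue_usd, 0)) AS yoy_change
           FROM \`bruin-playground-arsalan.contoso_dbt_reports.rpt_revenue_by_segment\`
           WHERE year IN (2023, 2024)
           GROUP BY 1
           ORDER BY yoy_change DESC
           LIMIT 10"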

Lessons learned

  • Isolate .bruin.yml. A broken sibling connection breaks every command, so always scope with --config-file when working in a sub-pipeline.
  • ADC > keyfiles for agent workflows. No secrets to rotate, no files to gitignore, and the agent runs as the human's identity.
  • Filter loader-internal tables before enhance — otherwise Claude burns time describing _dlt_pipeline_state.
  • Always run bruin validate after ai enhance. Cheap insurance against rare YAML corruption.
  • Re-run generate_context.sh after schema changes. Descriptions are snapshot-time; a column rename without regeneration leaves the agent quietly wrong.
  • Don't hand-edit generated YAMLs. Improve them upstream in the dbt model's schema.yml so the next import + enhance picks the change up.

Adapting this to a different dbt project

The minimal recipe for any existing dbt + warehouse setup:

  1. mkdir -p context/assets
  2. Write context/.bruin.yml (scoped) and context/pipeline.yml
  3. bruin import database --schemas <yours...> context
  4. Delete loader-internal asset YAMLs
  5. bruin ai enhance --claude context/assets
  6. bruin validate --config-file context/.bruin.yml context
  7. Add an AGENTS.md that points agents at context/assets/ and at bruin query

That's the whole context layer. Everything else (generate_context.sh, parity scripts, the contoso reference) is ergonomics on top.
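
Strung together, that recipe fits in one short script. The sketch below uses placeholder schema names, assumes dlt's internal tables are prefixed _dlt_, and follows the <schema>/<table>.asset.yml layout described earlier; adapt all three to your setup:

#!/usr/bin/env bash
set -euo pipefail

SCHEMAS="raw staging reports"   # placeholder schema names; adapt to your warehouse

# 1. Scaffold the context layer
mkdir -p context/assets

# 2. Hand-write context/.bruin.yml (scoped connection) and context/pipeline.yml
#    before running the rest -- see the earlier steps of this guide.

# 3. Import table metadata for your schemas
#    (--schemas usage per the recipe above; the value format may vary by bruin version)
bruin import database --schemas $SCHEMAS context

# 4. Drop loader-internal assets so enhancement doesn't spend time describing them
rm -f context/assets/*/_dlt_*.asset.yml

# 5. Let the model fill in descriptions, grain, and column docs
bruin ai enhance --claude context/assets

# 6. Validate the generated YAMLs against the scoped config
bruin validate --config-file context/.bruin.yml context

# 7. Add an AGENTS.md at the repo root (see the example earlier in this section)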

What just happened

You've turned an existing dbt + warehouse setup into something an AI agent can navigate confidently: it knows your tables (from import), understands what they mean (from enhancement), can ask Bruin tooling questions (from MCP), can run SQL safely (from bruin query + ADC), and follows a canonical workflow (from AGENTS.md). The dbt project keeps doing its job — building tables — and the Bruin context layer keeps doing the new one: making those tables legible to AI.