AI-Enhance and Validate the Context
Use `bruin ai enhance` to fill every asset with descriptions, tags, and quality checks, then `bruin validate` to make sure nothing got corrupted along the way.

What you'll do
- Run `bruin ai enhance` over `context/assets/` so every asset gets a description, semantic tags, per-column docs, and quality checks
- Run `bruin validate` to confirm none of the YAMLs ended up malformed
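Put together, the step is just these two commands; each is covered in the numbered instructions below.

```bash
bruin ai enhance --claude context/assets
bruin validate --config-file context/.bruin.yml context
```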
Why this step matters
Without descriptions, an AI agent can read your schema but doesn't know what it means. It sees `gmv` and guesses; it sees `status = 3` and queries blindly; it sees a `created_at` column and assumes UTC. Enhancement is what turns the structural skeleton from the previous step into something an agent can actually reason about.
Validation matters because `ai enhance` writes to YAML files at scale, and rare edge cases can produce malformed files. A 30-second `bruin validate` is cheap insurance that catches them immediately, before they confuse an agent at query time.
Instructions
1. Run the AI enhancement
From the dbt project root:
```bash
bruin ai enhance --claude context/assets
```
For each asset, Bruin sends the column list + a sample of the data to Claude and fills in:
- A multi-paragraph description covering purpose, grain, lineage, and typical use
- Semantic tags like `domain:retail`, `layer:staging`, `sensitivity:pii`
- Per-column descriptions with business meaning
- Quality checks: `not_null` on keys, `unique` on identifiers, `accepted_values` on enums (sketched below)
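For an enum-like column, the generated block might look roughly like the sketch below. The `order_status` column, its values, and the exact YAML shape of the `accepted_values` check are illustrative assumptions; confirm the check syntax against Bruin's docs rather than copying it verbatim.

```yaml
# Illustrative sketch only - hypothetical column; verify the accepted_values
# syntax against Bruin's documentation before relying on it.
columns:
  - name: order_status
    type: INT64
    description: "Order lifecycle state: 1 = placed, 2 = shipped, 3 = cancelled."
    checks:
      - name: not_null
      - name: accepted_values
        value: [1, 2, 3]
```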
The command auto-detects which AI CLI you have installed. If you have several, pass an explicit flag: `--claude`, `--opencode`, `--codex`, or `--cursor`.
Time estimate: each asset takes minutes of Claude time. For ~40 assets (the contoso reference), expect 30–60 minutes wall-clock. Bruin parallelizes up to 5 assets by default; increase with `--concurrency 10` if you want it faster and your AI quota tolerates it.
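Assuming the flags combine as described above (the placement of `--concurrency` in the command line is an assumption), a faster run might look like:

```bash
# Higher concurrency trades AI quota for wall-clock time; flag placement is assumed.
bruin ai enhance --claude --concurrency 10 context/assets
```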
Gotcha: `ai enhance` doesn't always honor `--config-file`. It can fall back to your global `~/.bruin.yml` for connection lookup, and if that has a broken connection you'll see "fill columns failed" warnings. The warnings are cosmetic; column types were already filled by the import step, and the enhancement still writes correctly.
Gotcha: rare YAML corruption. On a small fraction of assets, `ai enhance` has been known to mangle the `columns:` block. Always run `bruin validate` afterward (next step). If a single asset breaks, regenerate just that file: `bruin ai enhance --claude context/assets/<schema>/<table>.asset.yml`.
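For example, if the revenue report asset used in the spot-check below were the one that got mangled, the single-file regeneration would be:

```bash
bruin ai enhance --claude context/assets/contoso_dbt_reports/rpt_revenue_by_segment.asset.yml
```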
2. Spot-check a single asset
Open one of the report assets — these benefit most from enrichment because the column names alone don't tell the full story:
```bash
cat context/assets/contoso_dbt_reports/rpt_revenue_by_segment.asset.yml
```
You should now see something like:
```yaml
name: contoso_dbt_reports.rpt_revenue_by_segment
type: bq.source
description: |
  Yearly revenue rolled up by product segment and category. Built from the
  staging order-line table joined with the product dimension. One row per
  (segment_id, category_name, year). Used by retail merchandising and
  finance for category-level reporting.
tags:
  - domain:retail
  - layer:reports
  - grain:segment_category_year
columns:
  - name: segment_id
    type: STRING
    description: "Identifier for the product segment (joins to dim_segment)."
    checks:
      - name: not_null
  - name: category_name
    type: STRING
    description: "Human-readable category label, e.g. 'Bikes', 'Components'."
    checks:
      - name: not_null
  - name: year
    type: INT64
    description: "Calendar year of the order date, in UTC."
  - name: revenue_usd
    type: NUMERIC
    description: "Sum of order_line.gross_amount in USD, post-discount."
```
This is what the agent will read before it queries. The richer this gets, the better its SQL gets.
Watch for incorrect `unique` checks. AI enhancement sometimes adds `unique` to columns that look like keys but aren't unique per row (e.g., `segment_id` in a yearly fact table appears once per year, not once total). Skim the generated checks and remove any that don't match how the data actually works.
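As a sketch of that edit (hypothetical output; the checks `ai enhance` actually generates will vary), removing an over-eager `unique` looks like:

```yaml
# Hypothetical over-generated check: segment_id repeats once per year,
# so unique is wrong here while not_null is still fine.
- name: segment_id
  type: STRING
  checks:
    - name: not_null
    - name: unique   # <- delete this line; the column is not unique per row
```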
3. Validate the whole pipeline
```bash
bruin validate --config-file context/.bruin.yml context
```
Expected output:
```
✓ Successfully validated 40 assets across 1 pipeline, all good.
```
If anything fails, the message will name the file and the line. Open it, fix or regenerate that single asset, and re-run validate until it's green.
4. Wrap it in a regenerator script (optional)
The whole sequence (config, import, filter, enhance, validate) is idempotent and worth wrapping in a script so you can refresh the context layer whenever your dbt models change. The reference project has `generate_context.sh` with `--skip-import` (re-enhance only) and `--skip-enhance` (fast structure refresh) flags. A minimal version:
```bash
#!/usr/bin/env bash
set -euo pipefail

CONFIG="context/.bruin.yml"
PIPELINE="context"

# Re-import table structure from the warehouse schemas
bruin import database \
  --config-file "$CONFIG" \
  --connection contoso_dbt_bq \
  --schemas contoso_dbt_raw \
  --schemas contoso_dbt_staging \
  --schemas contoso_dbt_reports \
  "$PIPELINE"

# Filter out the _dlt_* bookkeeping assets the import picked up
find "$PIPELINE/assets" -name "_dlt_*.asset.yml" -delete

# Enrich every asset, then confirm nothing is malformed
bruin ai enhance --claude "$PIPELINE/assets"
bruin validate --config-file "$CONFIG" "$PIPELINE"
```
Save it as `generate_context.sh` next to your dbt project. Run it after meaningful schema changes (column renames, new models, dropped tables) to keep the context layer in sync.
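The `--skip-import` / `--skip-enhance` flags from the reference script aren't implemented in the minimal version above. One way to sketch them — the variable names and structure here are assumptions, not the reference implementation — is:

```bash
#!/usr/bin/env bash
# Hypothetical extension of the minimal script with --skip-import /
# --skip-enhance flags; the reference generate_context.sh may differ.
set -euo pipefail

CONFIG="context/.bruin.yml"
PIPELINE="context"

SKIP_IMPORT=false
SKIP_ENHANCE=false
for arg in "$@"; do
  case "$arg" in
    --skip-import)  SKIP_IMPORT=true ;;   # re-enhance only
    --skip-enhance) SKIP_ENHANCE=true ;;  # fast structure-only refresh
  esac
done

if [[ "$SKIP_IMPORT" == false ]]; then
  bruin import database \
    --config-file "$CONFIG" \
    --connection contoso_dbt_bq \
    --schemas contoso_dbt_raw \
    --schemas contoso_dbt_staging \
    --schemas contoso_dbt_reports \
    "$PIPELINE"
  find "$PIPELINE/assets" -name "_dlt_*.asset.yml" -delete
fi

if [[ "$SKIP_ENHANCE" == false ]]; then
  bruin ai enhance --claude "$PIPELINE/assets"
fi

bruin validate --config-file "$CONFIG" "$PIPELINE"
```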
Don't hand-edit generated YAMLs. They're regenerable artifacts. If a description is consistently wrong, fix it upstream, in the dbt model's `schema.yml` or `description:` block, and the next import + enhance will pick the change up.
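For reference, an upstream fix in dbt might look like the sketch below; the `stg_orders` model and `order_status` column are hypothetical, and the point is only where the text lives, not what it says.

```yaml
# models/staging/schema.yml (hypothetical model and column names)
version: 2

models:
  - name: stg_orders
    description: "One row per order line, deduplicated from the raw export."
    columns:
      - name: order_status
        description: "Order lifecycle state: 1 = placed, 2 = shipped, 3 = cancelled."
```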
What just happened
Your `context/assets/` is now a 40-file knowledge base: every dbt-materialized table is documented with descriptions, tags, and checks that an AI agent can read before writing a single query. Combined with the warehouse connection from step 2, you have everything an agent needs except the wiring that lets it actually call out to all of this. That's the next step.