Step 4 · 12 min

AI-Assisted Development and Analysis

Set up the Bruin MCP, enhance your pipeline with AI-generated metadata, and use an AI agent to iterate on your pipeline and analyze your data.

What you'll do

  1. Set up the Bruin MCP so an AI agent can interact with your project
  2. Create an AGENTS.md to give the agent domain context
  3. Enhance your pipeline assets with AI-generated metadata using bruin ai enhance
  4. Use the agent to iterate on your pipeline, run commands, and analyze your data

Prerequisites

You should have a working pipeline from the previous steps: ingestion, staging, and report assets that run successfully with bruin run .

Instructions

1) Set up the Bruin MCP

The Bruin MCP (Model Context Protocol) gives AI agents direct access to your project — they can read your assets, run Bruin commands, query your data, and access Bruin documentation.

Pick your tool:

Cursor / VS Code

Go to Settings > MCP & Integrations > Add Custom MCP:

{
  "mcpServers": {
    "bruin": {
      "command": "bruin",
      "args": ["mcp"]
    }
  }
}

Claude Code

claude mcp add bruin -- bruin mcp

Codex

Add to ~/.codex/config.toml:

[mcp_servers.bruin]
command = "bruin"
args = ["mcp"]

Restart your IDE after adding the MCP. To verify it's working, ask the agent: "What connections are available in this Bruin project?"
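Under the hood, the MCP server shells out to the same Bruin CLI you use directly, so you can sanity-check the setup yourself before involving the agent. A quick check, assuming the bruin binary is on your PATH (command names here reflect the current CLI; check bruin --help if yours differs):

```shell
# Confirm the Bruin CLI is installed and responds
bruin version

# List the connections the agent will see through the MCP
bruin connections list
```

If the connections list matches what the agent reports, the MCP is wired up correctly.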

2) Create an AGENTS.md

Create an AGENTS.md file at the root of your project to give the agent domain-specific context. This is like onboarding documentation for your AI agent — it tells it how to query data, what terms mean, and what to watch out for.

# AGENTS.md

## Data access
- Use `bruin query --connection duckdb-default --query "<SQL>"` for all data access
- Always show the SQL query and explain your reasoning before executing it
- Use `--limit 10` when exploring unfamiliar tables or testing queries
- Read the `assets/` directory to understand available tables and their schemas before querying
- This is a **read-only** environment — never run INSERT, UPDATE, DELETE, or DROP statements

## Domain context
- Trip data comes from the NYC Taxi & Limousine Commission (TLC)
- Payment types: 1=Credit Card, 2=Cash, 3=No Charge, 4=Dispute, 5=Unknown, 6=Voided
- Rate codes: 1=Standard, 2=JFK, 3=Newark, 4=Nassau/Westchester, 5=Negotiated, 6=Group
- Timestamps are in UTC
- Trip distances are in miles, fares are in USD
- Tip amounts are only reliably recorded for credit card payments

See the AI Data Analyst tutorial for a deeper dive on building context layers and AGENTS.md best practices.
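With those conventions in place, a typical exploratory command the agent would run looks like this. The stg_trips table name is illustrative; substitute one of your own staging tables:

```shell
# Peek at a table following the AGENTS.md rules: read-only, limited rows
bruin query --connection duckdb-default \
  --query "SELECT payment_type, COUNT(*) AS trips FROM stg_trips GROUP BY 1 ORDER BY 2 DESC" \
  --limit 10
```

Because the agent reads AGENTS.md first, it knows to decode payment_type 1 as Credit Card rather than guessing.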

3) Enhance your assets with AI

Your pipeline has working assets, but they're missing the metadata that makes them understandable — column descriptions, quality checks, and tags. The bruin ai enhance command fixes this automatically:

bruin ai enhance assets/

This connects to your DuckDB database, inspects the actual data, and uses AI to:

  • Add descriptions for each asset and column based on names and data patterns
  • Generate quality checks like not_null on IDs, accepted_values on categorical columns, range checks on numeric fields
  • Apply tags to group related assets by domain

The command is conservative — it only adds metadata it's confident about and never overwrites existing content.

Which AI provider? The command auto-detects which AI CLI you have installed (Claude Code, OpenCode, or Codex). To specify one explicitly, use --claude, --opencode, or --codex.
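For example, to pin the provider explicitly while enhancing a narrower path (the subdirectory shown is illustrative; point it at wherever your assets live):

```shell
# Enhance only the staging assets, using Claude Code as the provider
bruin ai enhance assets/staging/ --claude
```

Running on a smaller path first is a cheap way to review the style of metadata it generates before enhancing the whole project.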

4) Review the enhanced assets

Open one of your asset files to see what was generated:

# Example: what bruin ai enhance adds to your staging asset
description: "Cleaned and deduplicated taxi trip records joined with zone lookups"
tags:
  - nyc-taxi
  - staging
columns:
  - name: trip_id
    type: VARCHAR
    description: "Unique identifier for the taxi trip"
    checks:
      - name: not_null
      - name: unique
  - name: payment_type
    type: INTEGER
    description: "Payment method code (1=Credit Card, 2=Cash, 3=No Charge, 4=Dispute)"
    checks:
      - name: accepted_values
        value: [1, 2, 3, 4, 5, 6]

Review the generated metadata: fix any descriptions that don't match your understanding, and remove any generated checks, such as unique, that don't apply to your data.

5) Use the agent to iterate on your pipeline

With MCP connected, the agent can read your assets, understand their structure, and help you make changes. Try:

"Add a new report asset that calculates the average trip distance and fare by pickup zone."

"Add a not_null quality check to all ID columns across my pipeline."

"What dependencies does the staging asset have? Show me the lineage."

The agent will create or modify asset files, then you can validate and run:

bruin validate .
bruin run .

6) Analyze your data with natural language

The agent can also query your DuckDB database directly using bruin query:

"Which day had the highest number of trips? What was the total fare?"

"Show me the top 10 pickup zones by trip count, with average fare and distance."

"Are there any days with unusually low trip counts that might indicate data quality issues?"

The agent translates your questions into SQL, runs them against your database, and returns the results.
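For instance, the first question above might translate into a query like this one. Column names such as pickup_datetime and fare_amount follow the TLC schema, but the stg_trips table name is illustrative:

```shell
# The kind of SQL the agent generates for "Which day had the highest number of trips?"
bruin query --connection duckdb-default \
  --query "SELECT DATE(pickup_datetime) AS trip_day,
                  COUNT(*)              AS trips,
                  SUM(fare_amount)      AS total_fare
           FROM stg_trips
           GROUP BY 1
           ORDER BY trips DESC
           LIMIT 1"
```

Asking the agent to show its SQL before executing (as AGENTS.md instructs) lets you catch wrong assumptions about columns or units early.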

7) Optional: Deploy to Bruin Cloud

Take your pipeline to production by deploying it to Bruin Cloud. Sign up for free — no credit card required. The free tier includes credits to schedule and run your pipelines.

Once deployed, you also get access to:

  • AI Data Analyst — ask questions about your data in natural language from Slack, Teams, Discord, or the browser. See the AI Data Analyst tutorial for a walkthrough.
  • AI Dashboard Builder — generate dashboards with KPIs, charts, and filters from a single chat message. See the AI Dashboard Builder tutorial for details.

Watch the Bruin Cloud onboarding video for a step-by-step walkthrough of deploying your first pipeline.

What just happened

  • The Bruin MCP lets AI agents read your project, run commands, and query your data directly
  • An AGENTS.md file gives the agent domain knowledge for more accurate results
  • bruin ai enhance turns raw assets into well-documented, quality-checked definitions that AI agents can understand
  • You can use the agent for both pipeline development (creating/modifying assets) and data analysis (natural language queries)
  • Bruin Cloud gives you scheduling, monitoring, and AI-powered analysis in production — free to get started