AI-Assisted Development and Analysis
Set up the Bruin MCP, enhance your pipeline with AI-generated metadata, and use an AI agent to iterate on your pipeline and analyze your data.
What you'll do
- Set up the Bruin MCP so an AI agent can interact with your project
- Create an `AGENTS.md` to give the agent domain context
- Enhance your pipeline assets with AI-generated metadata using `bruin ai enhance`
- Use the agent to iterate on your pipeline, run commands, and analyze your data
Prerequisites
You should have a working pipeline from the previous steps: ingestion, staging, and report assets that run successfully with `bruin run .`.
Instructions
1) Set up the Bruin MCP
The Bruin MCP (Model Context Protocol) gives AI agents direct access to your project — they can read your assets, run Bruin commands, query your data, and access Bruin documentation.
Pick your tool:
Cursor / VS Code
Go to Settings > MCP & Integrations > Add Custom MCP:
```json
{
  "mcpServers": {
    "bruin": {
      "command": "bruin",
      "args": ["mcp"]
    }
  }
}
```
Claude Code
```bash
claude mcp add bruin -- bruin mcp
```
Codex
Add to `~/.codex/config.toml`:
```toml
[mcp_servers.bruin]
command = "bruin"
args = ["mcp"]
```
Restart your IDE after adding the MCP. To verify it's working, ask the agent: "What connections are available in this Bruin project?"
2) Create an AGENTS.md
Create an AGENTS.md file at the root of your project to give the agent domain-specific context. This is like onboarding documentation for your AI agent — it tells it how to query data, what terms mean, and what to watch out for.
```markdown
# AGENTS.md

## Data access

- Use `bruin query --connection duckdb-default --query "<SQL>"` for all data access
- Always show the SQL query and explain your reasoning before executing it
- Use `--limit 10` when exploring unfamiliar tables or testing queries
- Read the `assets/` directory to understand available tables and their schemas before querying
- This is a **read-only** environment — never run INSERT, UPDATE, DELETE, or DROP statements

## Domain context

- Trip data comes from the NYC Taxi & Limousine Commission (TLC)
- Payment types: 1=Credit Card, 2=Cash, 3=No Charge, 4=Dispute, 5=Unknown, 6=Voided
- Rate codes: 1=Standard, 2=JFK, 3=Newark, 4=Nassau/Westchester, 5=Negotiated, 6=Group
- Timestamps are in UTC
- Trip distances are in miles, fares are in USD
- Tip amounts are only reliably recorded for credit card payments
```
See the AI Data Analyst tutorial for a deeper dive on building context layers and AGENTS.md best practices.
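The categorical codes above are exactly the kind of context an agent would otherwise have to guess at. As a quick sanity check, here is the same mapping as a small Python lookup — a hypothetical helper for illustration, not part of Bruin or the TLC data itself:

```python
# TLC categorical code lookups, mirroring the AGENTS.md domain context above.
PAYMENT_TYPES = {
    1: "Credit Card",
    2: "Cash",
    3: "No Charge",
    4: "Dispute",
    5: "Unknown",
    6: "Voided",
}

RATE_CODES = {
    1: "Standard",
    2: "JFK",
    3: "Newark",
    4: "Nassau/Westchester",
    5: "Negotiated",
    6: "Group",
}

def decode_payment(code: int) -> str:
    """Return a human-readable label, falling back for unexpected codes."""
    return PAYMENT_TYPES.get(code, f"unrecognized ({code})")
```

With this in `AGENTS.md`, the agent can label query results (e.g. `decode_payment(1)` gives "Credit Card") instead of surfacing raw integer codes.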
3) Enhance your assets with AI
Your pipeline has working assets, but they're missing the metadata that makes them understandable: column descriptions, quality checks, and tags. The `bruin ai enhance` command adds this automatically:
```bash
bruin ai enhance assets/
```
This connects to your DuckDB database, inspects the actual data, and uses AI to:
- Add descriptions for each asset and column based on names and data patterns
- Generate quality checks like `not_null` on IDs, `accepted_values` on categorical columns, and range checks on numeric fields
- Apply tags to group related assets by domain
The command is conservative — it only adds metadata it's confident about and never overwrites existing content.
Which AI provider? The command auto-detects which AI CLI you have installed (Claude Code, OpenCode, or Codex). To specify one explicitly, use `--claude`, `--opencode`, or `--codex`.
4) Review the enhanced assets
Open one of your asset files to see what was generated:
```yaml
# Example: what bruin ai enhance adds to your staging asset
description: "Cleaned and deduplicated taxi trip records joined with zone lookups"
tags:
  - nyc-taxi
  - staging
columns:
  - name: trip_id
    type: VARCHAR
    description: "Unique identifier for the taxi trip"
    checks:
      - name: not_null
      - name: unique
  - name: payment_type
    type: INTEGER
    description: "Payment method code (1=Credit Card, 2=Cash, 3=No Charge, 4=Dispute)"
    checks:
      - name: accepted_values
        value: [1, 2, 3, 4, 5, 6]
```
Review the generated metadata: fix any descriptions that don't match your understanding, and remove any `unique` checks that don't apply.
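To make the generated checks concrete, here is a rough sketch of what `not_null`, `unique`, and `accepted_values` assert about a column's values. This is illustrative Python only — Bruin evaluates its checks as SQL against your warehouse, not with these functions:

```python
def check_not_null(values):
    """Passes only if no value is missing."""
    return all(v is not None for v in values)

def check_unique(values):
    """Passes only if no non-null value repeats."""
    non_null = [v for v in values if v is not None]
    return len(non_null) == len(set(non_null))

def check_accepted_values(values, allowed):
    """Passes only if every non-null value falls inside the allowed set."""
    return all(v in allowed for v in values if v is not None)

# A payment_type column should pass not_null and accepted_values,
# but a unique check on it would be wrong: duplicates are expected.
payment_types = [1, 2, 1, 4, 2]
assert check_not_null(payment_types)
assert check_accepted_values(payment_types, {1, 2, 3, 4, 5, 6})
assert not check_unique(payment_types)
```

This is why the review step matters: a `unique` check belongs on identifier columns like `trip_id`, not on categorical ones.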
5) Use the agent to iterate on your pipeline
With MCP connected, the agent can read your assets, understand their structure, and help you make changes. Try:
"Add a new report asset that calculates the average trip distance and fare by pickup zone."
"Add a `not_null` quality check to all ID columns across my pipeline."
"What dependencies does the staging asset have? Show me the lineage."
The agent will create or modify asset files, then you can validate and run:
```bash
bruin validate .
bruin run .
```
6) Analyze your data with natural language
The agent can also query your DuckDB database directly using `bruin query`:
"Which day had the highest number of trips? What was the total fare?"
"Show me the top 10 pickup zones by trip count, with average fare and distance."
"Are there any days with unusually low trip counts that might indicate data quality issues?"
The agent translates your questions into SQL, runs them against your database, and returns the results.
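For example, the second question above maps to a straightforward GROUP BY over the staging table. The sketch below runs that shape of query against a tiny in-memory SQLite table with made-up rows, purely to show the SQL pattern; in the tutorial, the agent would run the equivalent SQL through `bruin query` against your DuckDB database, and the table and column names are assumptions about your staging schema:

```python
import sqlite3

# Stand-in for the staging table, with fabricated example rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trips (pickup_zone TEXT, fare REAL, distance REAL)")
conn.executemany(
    "INSERT INTO trips VALUES (?, ?, ?)",
    [("Midtown", 14.5, 2.1), ("Midtown", 22.0, 4.3), ("JFK Airport", 55.0, 16.8)],
)

# "Top 10 pickup zones by trip count, with average fare and distance."
rows = conn.execute(
    """
    SELECT pickup_zone,
           COUNT(*) AS trip_count,
           ROUND(AVG(fare), 2) AS avg_fare,
           ROUND(AVG(distance), 2) AS avg_distance
    FROM trips
    GROUP BY pickup_zone
    ORDER BY trip_count DESC
    LIMIT 10
    """
).fetchall()
# rows[0] → ("Midtown", 2, 18.25, 3.2)
```

The agent's real value is doing this translation for you: it picks the table from your `assets/` directory, writes the aggregation, and explains the SQL before running it, as instructed in `AGENTS.md`.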
7) Optional: Deploy to Bruin Cloud
Take your pipeline to production by deploying it to Bruin Cloud. Sign up for free — no credit card required. The free tier includes credits to schedule and run your pipelines.
Once deployed, you also get access to:
- AI Data Analyst — ask questions about your data in natural language from Slack, Teams, Discord, or the browser. See the AI Data Analyst tutorial for a walkthrough.
- AI Dashboard Builder — generate dashboards with KPIs, charts, and filters from a single chat message. See the AI Dashboard Builder tutorial for details.
Watch the Bruin Cloud onboarding video for a step-by-step walkthrough of deploying your first pipeline.
What just happened
- The Bruin MCP lets AI agents read your project, run commands, and query your data directly
- An `AGENTS.md` file gives the agent domain knowledge for more accurate results
- `bruin ai enhance` turns raw assets into well-documented, quality-checked definitions that AI agents can understand
- You can use the agent for both pipeline development (creating/modifying assets) and data analysis (natural language queries)
- Bruin Cloud gives you scheduling, monitoring, and AI-powered analysis in production — free to get started