Teach the AI Your Domain
Use bruin ai enhance to document every column, then add a pipeline-scoped AGENTS.md with finance context so the agent reasons about your data like a real analyst.
What you'll do
- Run
bruin ai enhanceso the agent auto-documents every table and column - Create a pipeline-specific
AGENTS.mdinside thestock-analyst-101/folder with finance-specific domain knowledge, acronyms, and query guidelines
Why this step matters
Data alone is not enough. An AI agent looking at a column called fcf will guess "something cash flow related" - fine for casual questions, wrong for serious analysis (is it free cash flow? operating cash flow? levered or unlevered? annual or TTM?). Your job in this step is to make sure the agent never has to guess.
You're giving the agent three layers of context now:
- Workspace-level rules - the root
AGENTS.mdfrom Step 2 tells the agent how to work with Bruin (CLI, validate, test limits, etc.) across any pipeline in this workspace. - Schema context - what each table and column is. Auto-generated in seconds with
bruin ai enhance. - Pipeline-specific domain context - how a stock analyst thinks about this pipeline's data. Lives next to the pipeline as
stock-analyst-101/AGENTS.mdso it travels with it and doesn't leak into other pipelines.
AI coding tools (Claude Code, Cursor, Codex) read nested AGENTS.md files when they work inside that subfolder, so the pipeline-scoped file automatically layers on top of the workspace-level one. Together these turn the agent from "eager intern" into "actual analyst."
1. Auto-enhance schema context
Prompt:
Run bruin ai enhance stock-analyst-101/ and let me know when it's done. It should enhance assets across all three layers - raw.*, staging.*, and reports.*. Then show me the changes it made to fmp_fundamentals.asset.yml (raw) and reports/ticker_quarterly.sql (reports) so I can spot-check the descriptions.
This command looks at each table, asks an LLM to describe what it sees, and writes column descriptions directly into the asset YAML - across all three layers. After it runs, your fmp_fundamentals.asset.yml will have a columns: block like:
columns:
- name: symbol
type: varchar
description: "Stock ticker symbol (e.g. AAPL)"
- name: date
type: date
description: "Fiscal period end date, not calendar quarter"
- name: revenue
type: bigint
description: "Total revenue in USD for the fiscal quarter"
# ...many more
These descriptions become part of the MCP context the agent reads before every query. You don't need to memorize any of this - the agent will look it up when it needs to.
If the agent got a description wrong (e.g. it misread what a column means), just ask it to fix that specific line. The YAML is yours to edit.
2. Create a pipeline-specific AGENTS.md
The root AGENTS.md you seeded in Step 2 covers how the agent should work with Bruin in general. This new file covers what this specific pipeline's data means. It lives inside the pipeline folder (stock-analyst-101/AGENTS.md) so the finance-specific context stays scoped to the pipeline it belongs to - if you later add a different pipeline in the same workspace, you won't pollute it with finance glossary entries.
Prompt the agent:
Create a new file at stock-analyst-101/AGENTS.md (inside the pipeline folder - not the workspace root) with the content below. Do not modify the root AGENTS.md; the pipeline-specific file layers on top of it automatically when the agent works inside the pipeline folder. After you're done, show me the file.
# AGENTS.md - stock-analyst-101
## Pipeline overview
This is a stock-analyst research pipeline. Data lives in DuckDB
(`./stock-analyst-101/stock.duckdb`), organized in three layers:
**Raw layer** (`raw.*`) - unmodified from source:
- `raw.fmp_fundamentals` - quarterly income statement, balance sheet, cash flow per ticker (from FMP)
- `raw.yahoo_prices` - daily OHLCV stock prices per ticker (from Yahoo Finance)
- `raw.fred_macro` - US macroeconomic time series (from FRED)
**Staging layer** (`staging.*`) - cleaned, typed, deduplicated:
- `staging.fundamentals` - pivoted fundamentals, one row per (symbol, fiscal_quarter_end), includes `free_cash_flow`
- `staging.prices` - cleaned daily prices with `daily_return`, `sma_20`, `sma_50`
- `staging.macro` - pivoted FRED series, forward-filled, with a `regime` column ('inverted' | 'normal')
**Reports layer** (`reports.*`) - analysis-ready, joined, pre-computed metrics:
- `reports.ticker_quarterly` - one row per (symbol, fiscal_quarter_end) with TTM revenue/FCF/FCF margin, QoQ + YoY growth, quarterly price return, and the macro regime at quarter end
The DuckDB connection name is `duckdb-default`.
## Layer usage rules (IMPORTANT)
**Default to `reports.*` for all analysis.** It's pre-cleaned, pre-joined, and contains the metrics an analyst typically wants.
Only drop to lower layers with an explicit reason:
- **`staging.*`** - when reports doesn't surface a column you need (e.g. operating margin, EPS), or when validating a transform
- **`raw.*`** - for **data validation** (is the source complete?), **troubleshooting** (where did a wrong number come from?), or when you genuinely need a column that hasn't been promoted upstream yet
When you do drop down, **say so in your response**: "I'm using `staging.fundamentals` here because `reports.ticker_quarterly` doesn't expose operating margin." This keeps the user aware of which surface you're querying.
## Domain glossary (finance)
- **FCF** - Free Cash Flow: operating cash flow minus capital expenditures
- **FCF margin** - FCF divided by revenue, expressed as a percentage
- **EPS** - Earnings Per Share (diluted unless otherwise stated)
- **P/E** - Price-to-Earnings ratio (trailing 12 months)
- **EBITDA** - Earnings Before Interest, Taxes, Depreciation, Amortization
- **TTM** - Trailing Twelve Months (sum of last 4 quarters)
- **YoY / QoQ** - Year-over-Year / Quarter-over-Quarter comparisons
- **Market cap** - shares outstanding × last closing price
- **Yield curve inversion** - when `T10Y2Y` (10Y–2Y Treasury spread) is negative
- **Real GDP growth** - QoQ change in `GDPC1`, annualized
## FRED series reference
- `FEDFUNDS` - Effective Fed Funds Rate (monthly, %)
- `CPIAUCSL` - CPI All Urban Consumers (monthly, index 1982-84=100). YoY change = inflation.
- `UNRATE` - Unemployment rate (monthly, %)
- `T10Y2Y` - 10Y minus 2Y Treasury yield spread (daily, %). Negative = inversion.
- `GDPC1` - Real GDP (quarterly, billions of chained 2017 dollars)
## Data caveats
- All dates are UTC. FMP dates are fiscal quarter end (may not match calendar quarter)
- `raw.fmp_fundamentals.date` is the fiscal period end, not filing date - there's typically a 30–60 day lag between period end and when the numbers are reported
- `raw.yahoo_prices.close` is adjusted for splits and dividends; `raw.yahoo_prices.open` is not
- `raw.fred_macro` series have different frequencies (daily / monthly / quarterly). When joining to prices, align on the price date and use the most recent macro value (`last_value ignore nulls over ...` or a backward-looking window)
## Query guidelines
- For all analytical questions, start with `reports.ticker_quarterly`. Most questions are a SELECT against this table with a WHERE on `regime_at_qend` or a comparison across symbols.
- When comparing fundamentals across time, prefer the pre-computed `ttm_*` columns over recomputing TTM from `staging.fundamentals`
- If you need price detail beyond `quarterly_return` and `price_at_quarter_end`, drop to `staging.prices` and join on (symbol, date) - but always filter by date range first; the table has years of daily data per ticker
- To get latest fundamentals per ticker from staging: `QUALIFY ROW_NUMBER() OVER (PARTITION BY symbol ORDER BY fiscal_quarter_end DESC) = 1`
- Never join `staging.fundamentals` directly to `staging.prices` on date - their grains differ. Use `reports.ticker_quarterly` (already aligned) or align fundamentals to the next available price date.
When the agent is done, your workspace has two AGENTS.md files working in tandem: the root one with Bruin rules (applies everywhere), and stock-analyst-101/AGENTS.md with finance domain context (applies when the agent is working inside this pipeline). Restart your AI coding tool (new Claude Code session / reopen Cursor) so it picks up the new file.
3. Sanity-check the context
Ask your agent a scoping question that should only be answerable if it read AGENTS.md:
Which FRED series in this project captures yield-curve inversion, and what value indicates one?
The agent should answer with T10Y2Y and "negative values". If it fumbles or asks you instead, the pipeline-level AGENTS.md wasn't loaded - restart the session, and make sure the agent is working inside the stock-analyst-101/ folder.
What just happened
Your agent now has three layers of context: workspace-level Bruin rules (root AGENTS.md), per-column schema descriptions (auto-generated into the asset YAML), and pipeline-specific domain knowledge (stock-analyst-101/AGENTS.md). Every question you ask from here on out is reasoned against this stack rather than guessed from column names. You've just done what takes a new human analyst a month of onboarding.