Teach the AI Your Domain

What you'll do

Run bruin ai enhance so the agent auto-documents every table and column
Create a pipeline-specific AGENTS.md inside the stock-analyst-101/ folder with finance-specific domain knowledge, acronyms, and query guidelines

Data alone is not enough. An AI agent looking at a column called fcf will guess "something cash flow related" - fine for casual questions, wrong for serious analysis (is it free cash flow? operating cash flow? levered or unlevered? annual or TTM?). Your job in this step is to make sure the agent never has to guess.

You're giving the agent three layers of context now:

Workspace-level rules - the root AGENTS.md from Step 2 tells the agent how to work with Bruin (CLI, validate, test limits, etc.) across any pipeline in this workspace.
Schema context - what each table and column is. Auto-generated in seconds with bruin ai enhance.
Pipeline-specific domain context - how a stock analyst thinks about this pipeline's data. Lives next to the pipeline as stock-analyst-101/AGENTS.md so it travels with it and doesn't leak into other pipelines.

AI coding tools (Claude Code, Cursor, Codex) read nested AGENTS.md files when they work inside that subfolder, so the pipeline-scoped file automatically layers on top of the workspace-level one. Together these turn the agent from "eager intern" into "actual analyst."

1. Auto-enhance schema context

Prompt:

AI Prompt

Run bruin ai enhance stock-analyst-101/ and let me know when it's done. It should enhance assets across all three layers - raw.*, staging.*, and reports.*. Then show me the changes it made to fmp_fundamentals.asset.yml (raw) and reports/ticker_quarterly.sql (reports) so I can spot-check the descriptions.

This command looks at each table, asks an LLM to describe what it sees, and writes column descriptions directly into the asset YAML - across all three layers. After it runs, your fmp_fundamentals.asset.yml will have a columns: block like:

columns:
  - name: symbol
    type: varchar
    description: "Stock ticker symbol (e.g. AAPL)"
  - name: date
    type: date
    description: "Fiscal period end date, not calendar quarter"
  - name: revenue
    type: bigint
    description: "Total revenue in USD for the fiscal quarter"
  # ...many more

These descriptions become part of the MCP context the agent reads before every query. You don't need to memorize any of this - the agent will look it up when it needs to.

If the agent got a description wrong (e.g. it misread what a column means), just ask it to fix that specific line. The YAML is yours to edit.

2. Create a pipeline-specific AGENTS.md

The root AGENTS.md you seeded in Step 2 covers how the agent should work with Bruin in general. This new file covers what this specific pipeline's data means. It lives inside the pipeline folder (stock-analyst-101/AGENTS.md) so the finance-specific context stays scoped to the pipeline it belongs to - if you later add a different pipeline in the same workspace, you won't pollute it with finance glossary entries.

Prompt the agent:

AI Prompt

Create a new file at stock-analyst-101/AGENTS.md (inside the pipeline folder - not the workspace root) with the content below. Do not modify the root AGENTS.md; the pipeline-specific file layers on top of it automatically when the agent works inside the pipeline folder. After you're done, show me the file.

# AGENTS.md - stock-analyst-101

## Pipeline overview
This is a stock-analyst research pipeline. Data lives in DuckDB
(`./stock-analyst-101/stock.duckdb`), organized in three layers:

**Raw layer** (`raw.*`) - unmodified from source:
- `raw.fmp_fundamentals` - quarterly income statement, balance sheet, cash flow per ticker (from FMP)
- `raw.yahoo_prices` - daily OHLCV stock prices per ticker (from Yahoo Finance)
- `raw.fred_macro` - US macroeconomic time series (from FRED)

**Staging layer** (`staging.*`) - cleaned, typed, deduplicated:
- `staging.fundamentals` - pivoted fundamentals, one row per (symbol, fiscal_quarter_end), includes `free_cash_flow`
- `staging.prices` - cleaned daily prices with `daily_return`, `sma_20`, `sma_50`
- `staging.macro` - pivoted FRED series, forward-filled, with a `regime` column ('inverted' | 'normal')

**Reports layer** (`reports.*`) - analysis-ready, joined, pre-computed metrics:
- `reports.ticker_quarterly` - one row per (symbol, fiscal_quarter_end) with TTM revenue/FCF/FCF margin, QoQ + YoY growth, quarterly price return, and the macro regime at quarter end

The DuckDB connection name is `duckdb-default`.

## Layer usage rules (IMPORTANT)
**Default to `reports.*` for all analysis.** It's pre-cleaned, pre-joined, and contains the metrics an analyst typically wants.

Only drop to lower layers with an explicit reason:
- **`staging.*`** - when reports doesn't surface a column you need (e.g. operating margin, EPS), or when validating a transform
- **`raw.*`** - for **data validation** (is the source complete?), **troubleshooting** (where did a wrong number come from?), or when you genuinely need a column that hasn't been promoted upstream yet

When you do drop down, **say so in your response**: "I'm using `staging.fundamentals` here because `reports.ticker_quarterly` doesn't expose operating margin." This keeps the user aware of which surface you're querying.

## Domain glossary (finance)
- **FCF** - Free Cash Flow: operating cash flow minus capital expenditures
- **FCF margin** - FCF divided by revenue, expressed as a percentage
- **EPS** - Earnings Per Share (diluted unless otherwise stated)
- **P/E** - Price-to-Earnings ratio (trailing 12 months)
- **EBITDA** - Earnings Before Interest, Taxes, Depreciation, Amortization
- **TTM** - Trailing Twelve Months (sum of last 4 quarters)
- **YoY / QoQ** - Year-over-Year / Quarter-over-Quarter comparisons
- **Market cap** - shares outstanding × last closing price
- **Yield curve inversion** - when `T10Y2Y` (10Y–2Y Treasury spread) is negative
- **Real GDP growth** - QoQ change in `GDPC1`, annualized

## FRED series reference
- `FEDFUNDS` - Effective Fed Funds Rate (monthly, %)
- `CPIAUCSL` - CPI All Urban Consumers (monthly, index 1982-84=100). YoY change = inflation.
- `UNRATE` - Unemployment rate (monthly, %)
- `T10Y2Y` - 10Y minus 2Y Treasury yield spread (daily, %). Negative = inversion.
- `GDPC1` - Real GDP (quarterly, billions of chained 2017 dollars)

## Data caveats
- All dates are UTC. FMP dates are fiscal quarter end (may not match calendar quarter)
- `raw.fmp_fundamentals.date` is the fiscal period end, not filing date - there's typically a 30–60 day lag between period end and when the numbers are reported
- `raw.yahoo_prices.close` is adjusted for splits and dividends; `raw.yahoo_prices.open` is not
- `raw.fred_macro` series have different frequencies (daily / monthly / quarterly). When joining to prices, align on the price date and use the most recent macro value (`last_value ignore nulls over ...` or a backward-looking window)

## Query guidelines
- For all analytical questions, start with `reports.ticker_quarterly`. Most questions are a SELECT against this table with a WHERE on `regime_at_qend` or a comparison across symbols.
- When comparing fundamentals across time, prefer the pre-computed `ttm_*` columns over recomputing TTM from `staging.fundamentals`
- If you need price detail beyond `quarterly_return` and `price_at_quarter_end`, drop to `staging.prices` and join on (symbol, date) - but always filter by date range first; the table has years of daily data per ticker
- To get latest fundamentals per ticker from staging: `QUALIFY ROW_NUMBER() OVER (PARTITION BY symbol ORDER BY fiscal_quarter_end DESC) = 1`
- Never join `staging.fundamentals` directly to `staging.prices` on date - their grains differ. Use `reports.ticker_quarterly` (already aligned) or align fundamentals to the next available price date.

When the agent is done, your workspace has two AGENTS.md files working in tandem: the root one with Bruin rules (applies everywhere), and stock-analyst-101/AGENTS.md with finance domain context (applies when the agent is working inside this pipeline). Restart your AI coding tool (new Claude Code session / reopen Cursor) so it picks up the new file.

3. Sanity-check the context

Ask your agent a scoping question that should only be answerable if it read AGENTS.md:

AI Prompt

Which FRED series in this project captures yield-curve inversion, and what value indicates one?

The agent should answer with T10Y2Y and "negative values". If it fumbles or asks you instead, the pipeline-level AGENTS.md wasn't loaded - restart the session, and make sure the agent is working inside the stock-analyst-101/ folder.

What just happened

Your agent now has three layers of context: workspace-level Bruin rules (root AGENTS.md), per-column schema descriptions (auto-generated into the asset YAML), and pipeline-specific domain knowledge (stock-analyst-101/AGENTS.md). Every question you ask from here on out is reasoned against this stack rather than guessed from column names. You've just done what takes a new human analyst a month of onboarding.

Teach the AI Your Domain

What you'll do

Why this step matters

1. Auto-enhance schema context

2. Create a pipeline-specific AGENTS.md

3. Sanity-check the context

What just happened

Learn more