BigQuery TVFs for AI Agents: A Lightweight Semantic Layer Pattern

Quick Answer

A semantic layer defines business metrics and dimensions in one place, then lets tools generate SQL from those definitions.

A BigQuery table-valued function, or TVF, is a SQL function that returns a table and accepts parameters. For AI agents, TVFs can work like a small, controlled semantic interface: the data team writes the approved query logic once, and the agent only provides predefined inputs like dates, country, platform, or customer segment.

This is useful when you want an AI data analyst to answer common business questions without asking it to write full SQL from scratch every time.

What is a semantic layer?

A semantic layer is where you define the business meaning of your data. Things like revenue, active_user, monthly_recurring_revenue, customer_segment, and so on.

Instead of every dashboard, analyst, or agent writing their own version of the SQL, the definition lives in one place. Tools can then use that definition to generate the query.

So if someone asks for revenue by month, the tool knows what revenue means, which table to query, what timestamp to group by, and which filters should apply.

This is useful for dashboards, reports, and self-serve analytics. It is also relevant for AI agents, because agents need context before they can query data reliably.

The classic text-to-SQL failure mode is pretty simple: the agent sees tables, columns, maybe a few descriptions, and then tries to infer business logic that usually lives in someone's head, a dashboard, or an old dbt model. That works until it doesn't.

What is a BigQuery TVF?

A TVF is basically a SQL function that returns a table. BigQuery's docs describe table functions as behaving a bit like views, except they can take parameters. You can define something like this:

create or replace table function analytics.revenue_by_period(
  p_start_date date,
  p_end_date date,
  p_country string
)
as (
  select
    date_trunc(order_date, month) as month,
    sum(order_amount - refunded_amount) as revenue
  from analytics.orders
  where order_date between p_start_date and p_end_date
    and (p_country is null or shipping_country = p_country)
    and status = 'paid'
  group by 1
);

Then the agent does not need to write the full query. It only needs to call the approved function:

select *
from analytics.revenue_by_period('2026-01-01', '2026-03-31', 'Canada');

The useful part is the constraint.

The agent gets a small menu of approved functions. Each function has predefined inputs. The query logic lives inside BigQuery, where it can be reviewed and maintained like any other SQL object.

For example, the agent does not need to understand the full orders schema, refund logic, order statuses, or date handling. It only needs to understand that revenue_by_period accepts a start date, end date, and optional country, then returns revenue grouped by month.

How TVFs can work like a semantic layer for agents

For some workflows, a TVF covers the most important part of a semantic layer: one reviewed place for business logic.

revenue means whatever the TVF defines it to mean. If the definition changes, you update the function, review the change, and future agent calls use the new logic.

A TVF is usually narrower than a semantic layer. A semantic layer often supports many combinations of metrics, dimensions, filters, and granularities. A TVF can focus on one question or one family of questions:

revenue_by_period(start_date, end_date, country)
active_users(start_date, end_date, platform)
top_customers_by_revenue(start_date, end_date, max_rows)
inventory_risk(category, warehouse_region)

This can be useful when the consumer is an AI agent. Agents are generally better at choosing parameters from a constrained interface than writing fresh SQL against a warehouse with limited context.

In practice, the agent prompt or tool description can say:

Use analytics.revenue_by_period when the user asks for revenue over time.
Required inputs:
- p_start_date: start date for the analysis
- p_end_date: end date for the analysis
- p_country: optional country filter, or null for all countries

That gives the agent enough structure to call the function without exposing every table and join behind it.

Why this helps with LLM reliability

LLMs are much better at mapping a question to a constrained interface than they are at reconstructing business logic from raw schema context.

With a TVF pattern, the hard part moves out of the prompt and into reviewed SQL:

Metric logic lives in BigQuery, not in a prompt.
The agent chooses parameters instead of inventing joins.
The data team can test, review, and version the function definition.
The query surface is small enough to document clearly.
Business users still ask questions naturally in Slack, Teams, or the browser.

This is not magic, and it will not make an unreliable data model reliable. But it does reduce the number of things the agent has to get right.

That's the real win.

When TVFs are enough

TVFs will not cover every semantic-layer use case. If you need flexible dashboard exploration, metric catalogues, lineage, access workflows, descriptions, ownership, and lots of reusable dimensions, then a semantic layer gives you a broader system.

If your goal is to let an agent answer common business questions with a limited set of approved patterns, TVFs are a practical starting point.

You can expose a handful of reviewed functions, document the inputs, give the agent examples, and stop asking it to generate complex SQL from scratch every time someone asks a question in Slack.

The query surface stays smaller, the data team keeps control over the logic, and the business still gets answers inside the conversation where the question came up.

For many teams, this is a reasonable first step before implementing a full semantic layer, especially if the first priority is defining and protecting the first few important metrics.

Where Bruin fits

This pattern works best when the TVFs are part of a proper analytics workflow, not a few disconnected SQL snippets sitting in the warehouse.

With Bruin, teams can keep ingestion, transformations, checks, and metadata in code. Bruin Cloud then adds the managed layer around that work: orchestration, catalog, lineage, audit trails, access controls, and observability.

That matters for AI agents because the agent is only as good as the context it can trust. A TVF can define the query interface, but the surrounding pipeline still needs freshness checks, ownership, lineage, and review.

BigQuery TVFs for AI Agents: A Lightweight Semantic Layer Pattern

Quick Answer

What is a semantic layer?

What is a BigQuery TVF?

How TVFs can work like a semantic layer for agents

Why this helps with LLM reliability

When TVFs are enough

Where Bruin fits

FAQ

Can BigQuery TVFs replace a semantic layer?

Why are TVFs useful for AI agents?

Is this the same as text-to-SQL?

What should I document for an agent that uses TVFs?

Other posts you might be interested in

NULL Keys Silently Break BigQuery Joins and MERGEs

What Is a Semantic Layer?

The Best AI Data Analyst Tools for Slack in 2026

Sign up to our newsletter