Education
8 min read

What Is a Semantic Layer?

A practical explanation of semantic layers, why metric definitions drift across dashboards, and how tools like DAC turn reusable metrics, dimensions, filters, and segments into SQL.

Arsalan Noorafkan

Developer Advocate

Most dashboard problems start with a simple question: what does this metric mean?

Take revenue as an example. Does it include refunds? Is tax included? What about cancelled orders? Do we count the transaction date, the invoice date, or the day the payment settled?

Questions like these are where the semantic layer conversation usually starts.

Metric logic can spread across dashboards very quickly. One dashboard has sum(amount), another has sum(amount) where status = 'paid', and a third dashboard joins refunds differently. After a while, the team spends more time reconciling numbers than improving the analysis.

A semantic layer defines the business meaning of your data once, then lets people reuse those definitions wherever they analyze that data. The hard part is keeping those definitions close to the dashboards, queries, and business context that depend on them.

This post covers what a semantic layer does, where it helps, where it can become painful, and how this usually works in practice.

I'll use DAC as the example implementation throughout the post. DAC is Bruin's open-source dashboard-as-code tool, and it has a built-in semantic layer. The concepts are common across semantic-layer tools though. Give or take, most of them follow a similar pattern: define models, dimensions, metrics, filters, and segments, then let the tool compile those definitions into SQL.

What a Semantic Layer Actually Does

A semantic layer sits between raw tables and the tools people use to ask questions, build dashboards, or run analysis.

Instead of every dashboard writing its own SQL for the same concept, the semantic layer gives you reusable definitions:

  • dimensions: the fields you group, filter, or slice by, like region, channel, or created_at
  • metrics: the values you calculate, like revenue, sales_count, or average_order_value
  • segments: reusable filters, like online_orders, paid_customers, or enterprise_accounts
  • formats and labels: the human-facing details that stop every chart from inventing its own naming convention

For example, a dashboard author might currently write this in every dashboard:

select
  date_trunc('month', created_at) as created_at,
  sum(amount) as revenue
from sales
where channel = 'online'
group by 1
order by 1

You can say:

dimension: created_at
granularity: month
metrics: [revenue]
segments: [online]

The semantic tool turns that small definition into SQL by looking up each referenced piece:

  • model tells it which table or relation to query
  • dimension: created_at tells it which column or expression to select
  • granularity: month tells it to use the monthly version of the time dimension, such as date_trunc('month', created_at)
  • metrics: [revenue] tells it to select the SQL expression behind revenue, such as sum(amount)
  • segments: [online] tells it to apply the reusable filter behind online, such as channel = 'online'
  • sort and limit, when present, become the final order by and limit clauses

The tool is assembling a SQL query from definitions that already exist in the semantic model. In plain terms: the dashboard asks for revenue by month for online sales, and the semantic layer knows which table, columns, aggregations, and filters are needed to write that query.

You still need to understand the data and review the definitions. Repeated logic moves into one place where it can be version-controlled, validated, and reused.

Why Semantic Layers Usually Get Painful

Semantic layers are useful when they stay close to the actual data work. They become harder to maintain when they are separated from the models, dashboards, and review process.

In many teams, transformation logic lives in one place, dashboard logic lives somewhere else, metric definitions live in a separate tool, and business context lives in Slack, Notion, or a half-updated glossary. When someone asks why active_user means three different things, the investigation is usually manual.

If definitions are easy to review and update, people are more likely to use them. If they are hard to change, people tend to bypass them.

There are a few common failure modes:

  • definitions are difficult to edit
  • validation is weak or missing
  • metric changes are hard to review
  • dashboards can reference stale or incorrect definitions
  • the semantic layer becomes too abstract for normal dashboard development

One way to reduce this maintenance problem is to connect the semantic layer to the places where business context already gets discussed.

For example, imagine two sales managers are talking in Slack about what revenue should mean for a new dashboard. One says revenue should exclude cancelled orders. The other adds that refunds should be subtracted based on the refund date, not the original order date.

If there is an AI data analyst connected to the semantic definitions and available in Slack, they can tag it in the thread:

@Bruin update the revenue definition based on this thread.

The agent can read the conversation, find the relevant semantic model, propose an update to the revenue metric description or SQL expression, and open a change for review. The data team still owns the definition. The agent just helps turn context from a Slack thread into a reviewable change where the semantic definition actually lives.

This is where semantic maintenance can become part of normal conversation. Business people clarify meaning where they already work, and the data team reviews the resulting definition change in the same workflow they use for dashboards and models.

For a dashboard workflow, a practical goal is smaller: make reusable metric definitions part of normal dashboard development.

How Semantic Layers Are Usually Structured

A semantic layer usually starts with a model. A model points to a source table or relation, then defines the dimensions, metrics, and filters that are safe to reuse.

Some tools define this in a UI. Some use YAML. Some use code. The shape is usually similar, even if the syntax changes.

In this example, I'll use DAC's YAML format because it is easy to read. A DAC project can look like this:

my-dashboards/
  .bruin.yml
  dashboards/
    semantic-sales.yml
  semantic/
    sales.yml

The dashboard lives under dashboards/, and the semantic model lives under semantic/. DAC also reads connections from .bruin.yml, but in other tools this could be a warehouse connection configured in the UI or a separate project config.

Here is a simplified semantic model:

name: sales
label: Sales
description: Semantic model over the sales table

source:
  table: sales

dimensions:
  - name: created_at
    type: time
    granularities:
      day: date_trunc('day', created_at)
      month: date_trunc('month', created_at)
  - name: region
    type: string
  - name: channel
    type: string

metrics:
  - name: revenue
    expression: sum(amount)
  - name: sales_count
    expression: count(*)
  - name: unique_customers
    expression: count(distinct customer_id)
  - name: avg_sale_value
    expression: "{revenue} / {sales_count}"

segments:
  - name: online
    filter: "channel = 'online'"

In this model:

  • the source table is sales
  • created_at, region, and channel are available as dimensions
  • created_at has day and month granularities
  • revenue, sales_count, unique_customers, and avg_sale_value are available as metrics
  • online is a reusable segment

Metrics can also reference other metrics with {metric_name}. In this example, avg_sale_value is defined as {revenue} / {sales_count}, so the dashboard author does not need to repeat the underlying expressions.

How a Dashboard Uses a Semantic Model

Once a dashboard sets model: sales, widgets can reference semantic fields directly:

name: Semantic Sales Example
connection: local_duckdb
model: sales

rows:
  - widgets:
      - name: Revenue
        type: metric
        metric: revenue
        filters:
          - dimension: region
            operator: equals
            value: "Europe"
        prefix: "$"
        format: number
        col: 3

      - name: Revenue Trend
        type: chart
        chart: area
        dimension: created_at
        granularity: month
        metrics: [revenue]
        segments: [online]
        sort:
          - name: created_at
            direction: asc
        col: 9

The dashboard author is describing the query in semantic terms:

  • show revenue
  • group it by created_at
  • use month granularity
  • apply the online segment
  • sort by the date

A semantic tool resolves that into SQL, then executes it against the selected connection.

The dashboard file stays readable. During review, one person can inspect how the dashboard uses the metric, and another can inspect the metric definition in the semantic model.

What Happens Under the Hood

Most semantic-layer tools follow the same rough flow when a dashboard runs.

The tool resolves the model, renders filter values, validates the referenced metrics and dimensions, compiles the semantic query into SQL, and executes it like any other query.

In DAC specifically, the backend loads the model from semantic/, or from a model alias if the dashboard uses one. It also renders dashboard filter values with the same Jinja-style syntax used by SQL dashboards.

So this:

dimension: created_at
granularity: month
metrics: [revenue]
segments: [online]

becomes a generated SQL query using the model's dimension expression, metric expression, source table, segment predicate, sort, and limit.

The semantic engine also handles a few practical cases:

  • metric filters, like online_revenue using sum(amount) only where channel = 'online'
  • derived metrics, like {revenue} / {sales_count}
  • time granularities on time dimensions
  • reusable named semantic queries in the dashboard queries map
  • filter operators like equals, between, in, gt, lte, is_null, and so on
  • window metrics such as running totals, lag, lead, rank, and percent of total

SQL generation happens inside the tool. The dashboard describes the semantic query, and the tool turns it into SQL before execution.

Validation Matters

Dashboards-as-code is useful when the files can be checked before they are shared or deployed.

Most semantic-layer tools need some kind of validation. For file-based tools, this can happen locally or in CI.

In DAC, dac validate checks dashboard structure and semantic references without executing queries. It catches things like:

  • missing semantic models
  • invalid model aliases
  • missing metric expressions
  • derived metrics that reference unknown metrics
  • dashboard widgets that reference missing dimensions, metrics, segments, filters, or sort fields
  • invalid semantic models only failing the dashboards that actually reference them

Then dac check executes the widget queries. For semantic widgets, DAC resolves the model, compiles the widget into SQL, and runs the generated query against the database.

These two commands answer different questions.

validate tells you the project is structurally coherent.

check tells you the dashboard works against live data.

And dac query lets you execute a single widget in isolation:

dac query --dir examples/semantic-yaml \
  --dashboard "Semantic Sales Example" \
  --widget "Revenue"

This gives analytics engineers a small local feedback loop for testing a dashboard widget.

Why This Fits Dashboard-as-Code

The semantic layer becomes easier to maintain when it lives with the dashboard code.

You can review a metric change in a PR. You can see which dashboard started using a new segment. You can run validation before the dashboard is shared. You can keep the semantic model small at first, then expand it as repeated logic shows up.

For most teams, the best starting point is the repeated definitions that already cause confusion.

Revenue. Active users. Churn. Trial conversion. Paid accounts. Whatever metric keeps getting copied around and slightly changed.

Define those properly. Add the dimensions people actually use. Add a few segments that show up everywhere. Then let the layer grow from real dashboard usage.

File-based dashboard tools are useful here because the semantic model sits next to the dashboard files, and the generated SQL can be checked through the same development workflow.

A Small Example Using DAC

Here is what the workflow looks like in DAC. The command details are specific to DAC, but the pattern is common across semantic-layer tools: initialize a project, define semantic models, reference them from dashboards, validate, then execute the queries.

dac init my-dashboards --template semantic
cd my-dashboards
dac validate --dir .
dac check --dir .
dac serve --dir . --open

The generated project includes a semantic model under semantic/sales.yml and a dashboard under dashboards/semantic-sales.yml.

From there, the workflow is:

  • define the source table once
  • add dimensions and time granularities
  • add metrics and derived metrics
  • reference them from metric, chart, or table widgets
  • validate before review
  • check against live data before sharing

A semantic layer is only useful if people trust it. Trust comes from visible definitions, reviewable changes, and tests against the database.

When This Is Useful

Some teams need a large semantic platform, especially when governance, access control, multi-tool consumption, and cross-domain modelling are major requirements.

A lot of teams need something smaller to start with: a clear place to define the metrics and dimensions their dashboards already depend on.

DAC is one example of that approach. Other tools may use different interfaces or syntax, but the core workflow is similar.

Define metrics once, reuse them in dashboards, validate references, compile the SQL, and keep the definitions somewhere the team can review.

It still requires clear business definitions and careful modelling. If the team has not agreed on what revenue means, the tool cannot solve that by itself.

Once the definition exists, the tool should make it easy to reuse that definition across dashboards and check that the references still work.