Macros
Macros are reusable pieces of Jinja code that allow you to define SQL patterns once and use them throughout your pipeline. Think of them as functions for your SQL queries—they help you write cleaner, more maintainable code by eliminating repetition.
Why Use Macros?
Macros are useful when you find yourself:
- Writing the same SQL patterns repeatedly across multiple assets
- Building complex queries that could be simplified with reusable components
- Maintaining consistency in how certain operations are performed
- Wanting to abstract away complexity from your main query logic
Getting Started
Directory Structure
To use macros in your pipeline, create a macros/
folder at the root of your pipeline directory:
my-pipeline/
├── pipeline.yml
├── macros/
│ ├── aggregations.sql
│ ├── filters.sql
│ └── transformations.sql
└── assets/
├── customers.sql
└── orders.sql
File Naming
Macro files must end with .sql
. You can organize your macros into multiple files based on their purpose (e.g., aggregations.sql
, filters.sql
, transformations.sql
), or keep them all in a single file.
Automatic Loading
All macros defined in the macros/
folder are automatically loaded and available in all your assets. You don't need to import them explicitly—just define them once and use them anywhere.
Defining Macros
Basic Syntax
A macro is defined using the {% macro %}
tag:
{% macro macro_name(parameter1, parameter2) -%}
SQL code here
{%- endmacro %}
The {%-
and -%}
syntax strips whitespace, keeping your rendered output clean.
Simple Macro Without Parameters
Here's a basic macro that selects all columns from a table:
{% macro select_all(table_name) -%}
SELECT * FROM {{ table_name }}
{%- endmacro %}
Usage in an asset:
/* @bruin
name: my_asset
type: duckdb.sql
@bruin */
{{ select_all('customers') }}
Macro With Multiple Parameters
{% macro date_range_filter(table, date_column, start_date, end_date) -%}
SELECT *
FROM {{ table }}
WHERE {{ date_column }} >= '{{ start_date }}'
AND {{ date_column }} < '{{ end_date }}'
{%- endmacro %}
Macro With Default Parameters
{% macro top_n(table, column, n=10) -%}
SELECT *
FROM {{ table }}
ORDER BY {{ column }} DESC
LIMIT {{ n }}
{%- endmacro %}
Usage:
-- Uses default n=10
{{ top_n('sales', 'revenue') }}
-- Override with n=5
{{ top_n('sales', 'revenue', 5) }}
Using Macros in Assets
Once defined in your macros/
folder, macros can be used in any SQL asset by calling them with the syntax.
Simple Usage
Example
macros/aggregations.sql
{% macro count_by(table, column) -%}
SELECT
{{ column }},
COUNT(*) as count
FROM {{ table }}
GROUP BY {{ column }}
ORDER BY count DESC
{%- endmacro %}
assets/country_summary.sql
/* @bruin
name: country_summary
type: duckdb.sql
materialization:
type: table
@bruin */
{{ count_by('customers', 'country') }}
This renders to:
SELECT
country,
COUNT(*) as count
FROM customers
GROUP BY country
ORDER BY count DESC
Using Macros with Bruin Variables
Macros work seamlessly with Bruin's built-in variables like start_date
and end_date
:
/* @bruin
name: daily_orders
type: duckdb.sql
@bruin */
{{ date_range_filter('orders', 'order_date', start_date, end_date) }}
Combining Macros with Jinja Features
You can use macros together with other Jinja features like loops and conditionals:
/* @bruin
name: multi_country_report
type: duckdb.sql
@bruin */
{% for country in ['USA', 'UK', 'Canada'] %}
{{ count_by('orders', 'product_id') }}
WHERE country = '{{ country }}'
{% if not loop.last %}UNION ALL{% endif %}
{% endfor %}
Practical Examples
Aggregation Patterns
Count By Column
{% macro count_by(table, column, order_by='count') -%}
SELECT
{{ column }},
COUNT(*) as count
FROM {{ table }}
GROUP BY {{ column }}
ORDER BY {{ order_by }} DESC
{%- endmacro %}
Usage:
{{ count_by('orders', 'customer_id') }}
Sum By Column
{% macro sum_by(table, group_column, sum_column) -%}
SELECT
{{ group_column }},
SUM({{ sum_column }}) as total
FROM {{ table }}
GROUP BY {{ group_column }}
ORDER BY total DESC
{%- endmacro %}
Usage:
{{ sum_by('orders', 'customer_id', 'order_amount') }}
Filtering Patterns
Recent Records
{% macro recent_records(table, date_column, days=7) -%}
SELECT *
FROM {{ table }}
WHERE {{ date_column }} >= CURRENT_DATE - INTERVAL '{{ days }} days'
{%- endmacro %}
Usage:
-- Get orders from last 7 days (default)
{{ recent_records('orders', 'order_date') }}
-- Get orders from last 30 days
{{ recent_records('orders', 'order_date', 30) }}
Filter Null Values
{% macro filter_null(table, columns) -%}
SELECT *
FROM {{ table }}
WHERE {% for col in columns %}
{{- col }} IS NOT NULL
{%- if not loop.last %} AND {% endif %}
{%- endfor %}
{%- endmacro %}
Usage:
{{ filter_null('customers', ['email', 'phone', 'address']) }}
Renders to:
SELECT *
FROM customers
WHERE email IS NOT NULL
AND phone IS NOT NULL
AND address IS NOT NULL
Filter by List of Values
{% macro in_list(table, column, values) -%}
SELECT *
FROM {{ table }}
WHERE {{ column }} IN (
{%- for val in values %}
'{{ val }}'
{%- if not loop.last %}, {% endif %}
{%- endfor %}
)
{%- endmacro %}
Usage:
{{ in_list('orders', 'status', ['pending', 'processing', 'shipped']) }}
Transformation Patterns
Deduplicate Records
{% macro deduplicate(table, partition_column, order_column) -%}
SELECT * FROM (
SELECT *,
ROW_NUMBER() OVER (
PARTITION BY {{ partition_column }}
ORDER BY {{ order_column }} DESC
) as rn
FROM {{ table }}
)
WHERE rn = 1
{%- endmacro %}
Usage:
-- Keep most recent record for each customer
{{ deduplicate('customer_events', 'customer_id', 'event_timestamp') }}
Generate Surrogate Key
{% macro generate_surrogate_key(columns) -%}
MD5(CONCAT_WS('||',
{%- for col in columns %}
CAST({{ col }} AS VARCHAR)
{%- if not loop.last %}, {% endif %}
{%- endfor %}
)) as surrogate_key
{%- endmacro %}
Usage:
SELECT
{{ generate_surrogate_key(['customer_id', 'order_id']) }},
*
FROM orders
Safe Division
{% macro safe_divide(numerator, denominator, default=0) -%}
CASE
WHEN {{ denominator }} = 0 OR {{ denominator }} IS NULL
THEN {{ default }}
ELSE {{ numerator }}::DOUBLE / {{ denominator }}::DOUBLE
END
{%- endmacro %}
Usage:
SELECT
product_id,
revenue,
units_sold,
{{ safe_divide('revenue', 'units_sold') }} as avg_price
FROM product_sales
Advanced Usage
Macro Composition
Macros can call other macros, allowing you to build more complex functionality:
Example
macros/base.sql
{% macro date_range_filter(table, date_column, start_date, end_date) -%}
SELECT *
FROM {{ table }}
WHERE {{ date_column }} >= '{{ start_date }}'
AND {{ date_column }} < '{{ end_date }}'
{%- endmacro %}
macros/aggregations.sql
{% macro daily_summary(table, date_column, start_date, end_date) -%}
WITH filtered AS (
{{ date_range_filter(table, date_column, start_date, end_date) }}
)
SELECT
{{ date_column }}::DATE as day,
COUNT(*) as total_count,
COUNT(DISTINCT customer_id) as unique_customers
FROM filtered
GROUP BY {{ date_column }}::DATE
ORDER BY day
{%- endmacro %}
assets/daily_orders.sql
/* @bruin
name: daily_orders
type: duckdb.sql
@bruin */
{{ daily_summary('orders', 'created_at', start_date, end_date) }}
Using Macros with Pipeline Variables
Combine macros with pipeline variables for powerful parameterization:
Example
pipeline.yml
name: sales_pipeline
variables:
regions:
type: array
default: ['US', 'EU', 'APAC']
min_revenue:
type: integer
default: 1000
macros/filters.sql
{% macro revenue_filter(table, min_amount) -%}
SELECT *
FROM {{ table }}
WHERE revenue >= {{ min_amount }}
{%- endmacro %}
assets/regional_sales.sql
/* @bruin
name: regional_sales
type: duckdb.sql
@bruin */
{% for region in var.regions %}
{{ revenue_filter('sales', var.min_revenue) }}
AND region = '{{ region }}'
{% if not loop.last %}UNION ALL{% endif %}
{% endfor %}
Dynamic Column Generation
Use macros to generate columns dynamically:
{% macro revenue_by_days(days_list) -%}
{% for day in days_list %}
SUM(CASE WHEN days_since_start <= {{ day }} THEN revenue ELSE 0 END) AS revenue_{{ day }}_days
{%- if not loop.last %},{% endif %}
{% endfor %}
{%- endmacro %}
Usage:
SELECT
cohort_date,
{{ revenue_by_days([7, 14, 30, 90]) }}
FROM user_cohorts
GROUP BY cohort_date
Complete End-to-End Example
Here's a full example showing how to structure and use macros in a pipeline:
Complete Example
sales-pipeline/
├── pipeline.yml
├── macros/
│ ├── aggregations.sql
│ └── filters.sql
└── assets/
├── daily_sales.sql
└── top_customers.sql
pipeline.yml
name: sales_pipeline
default_connections:
duckdb: "duckdb-default"
macros/aggregations.sql
{% macro count_by(table, column) -%}
SELECT
{{ column }},
COUNT(*) as count
FROM {{ table }}
GROUP BY {{ column }}
ORDER BY count DESC
{%- endmacro %}
{% macro sum_by(table, group_column, sum_column) -%}
SELECT
{{ group_column }},
SUM({{ sum_column }}) as total
FROM {{ table }}
GROUP BY {{ group_column }}
ORDER BY total DESC
{%- endmacro %}
macros/filters.sql
{% macro date_range(table, date_column, start_date, end_date) -%}
SELECT *
FROM {{ table }}
WHERE {{ date_column }} >= '{{ start_date }}'
AND {{ date_column }} < '{{ end_date }}'
{%- endmacro %}
assets/daily_sales.sql
/* @bruin
name: daily_sales
type: duckdb.sql
materialization:
type: table
@bruin */
WITH filtered_orders AS (
{{ date_range('raw_orders', 'order_date', start_date, end_date) }}
)
{{ sum_by('filtered_orders', 'order_date::DATE', 'order_amount') }}
assets/top_customers.sql
/* @bruin
name: top_customers
type: duckdb.sql
materialization:
type: table
depends:
- daily_sales
@bruin */
{{ sum_by('raw_orders', 'customer_id', 'order_amount') }}
LIMIT 100
To see the rendered output:
bruin render assets/daily_sales.sql
Best Practices
Organization
- Group related macros: Keep aggregations, filters, and transformations in separate files
- Use descriptive names:
calculate_revenue_by_region
is better thancalc1
- Document complex macros: Add comments explaining parameters and behavior
{#
Deduplicates records keeping the most recent one per partition.
Args:
table: The source table name
partition_column: Column to partition by (e.g., customer_id)
order_column: Column to order by (e.g., updated_at)
#}
{% macro deduplicate(table, partition_column, order_column) -%}
...
{%- endmacro %}
Commenting Macros
Important
SQL comments (--
) don't prevent Jinja from processing macros. To show example macro code without executing it, use {% raw %}
blocks:
-- ❌ Wrong: This will still execute the macro!
-- {{ top_n('orders', 'amount', 10) }}
-- ✅ Correct: Wrap in {% raw %} to show the code without executing
{% raw %}
-- {{ top_n('orders', 'amount', 10) }}
{% endraw %}
Everything inside {% raw %}
... {% endraw %}
is treated as literal text, allowing you to document macro examples without executing them.
Design
- Keep macros focused: Each macro should do one thing well
- Use default parameters: Make macros easier to use with sensible defaults
- Make macros reusable: Avoid hardcoding table names or column names
Testing
- Test with
bruin render
: Always preview the rendered output
bruin render assets/my_asset.sql
- Start simple: Build complex macros incrementally
- Verify rendered SQL: Ensure the output is valid SQL for your platform
Performance
- Avoid over-abstraction: Don't create macros for everything
- Consider inline code: Simple patterns might not need a macro
- Test performance: Some complex macros might impact query performance
Debugging Tips
Preview Macro Output
Use bruin render
to see exactly what your macros generate:
# Render a specific asset
bruin render assets/my_asset.sql
# See the full SQL that will be executed
bruin render assets/my_asset.sql --output json
Check Macro Loading
If your macros aren't working:
- Verify the
macros/
folder is at the pipeline root - Ensure macro files end with
.sql
- Check for syntax errors in macro definitions
- Verify macro names match what you're calling in assets
Common Issues
Extra whitespace in output: Use {%-
and -%}
to control whitespace
{# Good: strips whitespace #}
{% macro my_macro() -%}
SELECT 1
{%- endmacro %}
{# Bad: includes whitespace #}
{% macro my_macro() %}
SELECT 1
{% endmacro %}
Macro not found: Ensure the macro is defined before it's used (macro files are loaded in alphabetical order)
When to Use Macros
Use macros when:
- You repeat the same SQL pattern in multiple assets
- You want to standardize certain operations across your pipeline
- You need to abstract away complexity
- You want to make your queries more readable
Don't use macros when:
- The pattern is only used once
- The macro would be more complex than the inline code
- You need very dynamic behavior better suited to Python
- Performance is critical and the macro adds overhead
Related Topics
- Templating - Learn about Jinja templating basics
- Filters - Use filters to transform variables
- Pipeline Variables - Define custom variables
- SQL Assets - Learn about SQL assets in Bruin