Explore Example Project
This tutorial ships as a single Bruin project with two pipelines side by side:
- chess-basic/: a minimal "hello world" pipeline with connections, one pipeline.yml, a couple of ingestr assets, and a single SQL report. A great first look at Bruin.
- chess-advance/: the same chess dataset expanded into a tour of Bruin's features, including sensors, seeds, Python materialization, incremental SQL, the Python SDK, column and custom checks, custom variables, scheduling, notifications, and dashboards.
Both pipelines share the root .bruin.yml (connections) and .gitignore.
chess-project/
.bruin.yml
default_environment: default
environments:
  default:
    connections:
      duckdb:
        - name: "duckdb-default"
          path: "duckdb.db"
      chess:
        - name: "chess-default"
          players:
            - "FabianoCaruana"
            - "Hikaru"
            - "MagnusCarlsen"
            - "GothamChess"
            - "DanielNaroditsky"
            - "AnishGiri"
            - "Firouzja2003"
            - "LevonAronian"
            - "WesleySo"
            - "GarryKasparov"

What's inside chess-basic/
- pipeline.yml: sets name: chess_basic, with type: ingestr as the default asset type.
- assets/raw/games.asset.yml, profiles.asset.yml: Ingestr assets that pull the Chess.com games and profiles endpoints into DuckDB.
- assets/reports/player_summary.sql: A single SQL report that joins and aggregates the raw tables into a player-level win-rate table with one column check.
That's it — enough to run an end-to-end ingest + transform flow locally.
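The join-and-aggregate step behind a player-level win-rate report can be sketched in a few lines. This is only an illustration, not the contents of player_summary.sql: it uses Python's built-in sqlite3 as a stand-in for DuckDB, and the table layout (profiles.username, games.username, games.result) is a hypothetical simplification of the Chess.com data.

```python
import sqlite3

# Hypothetical, simplified schema: one row per profile, one row per game result.
SUMMARY_SQL = """
    SELECT p.username,
           COUNT(*) AS total_games,
           SUM(CASE WHEN g.result = 'win' THEN 1 ELSE 0 END) AS wins,
           ROUND(1.0 * SUM(CASE WHEN g.result = 'win' THEN 1 ELSE 0 END)
                 / COUNT(*), 2) AS win_rate
    FROM profiles AS p
    JOIN games AS g ON g.username = p.username
    GROUP BY p.username
    ORDER BY p.username
"""

def player_summary(conn):
    """Join profiles to games and aggregate one win-rate row per player."""
    return conn.execute(SUMMARY_SQL).fetchall()

def demo():
    """Build a tiny in-memory dataset and return the aggregated summary."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE profiles (username TEXT PRIMARY KEY)")
    conn.execute("CREATE TABLE games (username TEXT, result TEXT)")
    conn.executemany(
        "INSERT INTO profiles VALUES (?)",
        [("Hikaru",), ("MagnusCarlsen",)],
    )
    conn.executemany(
        "INSERT INTO games VALUES (?, ?)",
        [
            ("Hikaru", "win"), ("Hikaru", "loss"),
            ("Hikaru", "win"), ("Hikaru", "win"),
            ("MagnusCarlsen", "win"), ("MagnusCarlsen", "draw"),
        ],
    )
    return player_summary(conn)

if __name__ == "__main__":
    for row in demo():
        print(row)
```

In the real pipeline Bruin runs the SQL asset against DuckDB and applies the column check afterwards; the sketch only shows the shape of the aggregation.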
What's inside chess-advance/
- pipeline.yml: sets name: chess_advance, plus pipeline-level schedule, retries, concurrency, notifications, tags, default_connections, and custom variables (min_games, rating_category).
- assets/raw/: Raw landing layer for ingestion and seeds.
  - ingestr_games.asset.yml, ingestr_profiles.asset.yml: Ingestr assets that inherit type: ingestr from pipeline defaults.
  - seed_top_players.asset.yml (+ seed_top_players.csv): duckdb.seed loading static CSV data, with an accepted_values column check.
- assets/sensor/: Readiness gates.
  - sensor_games_ready.asset.yml: duckdb.sensor.query that polls until games have landed before downstream work runs.
- assets/transformations/: Intermediate processing.
  - python_materialization_player_ratings.py: Python asset with materialize() that returns a DataFrame to be written as a table; also reads the rating_category custom variable from BRUIN_VARS.
  - incremental_sql_daily_games.sql: Incremental SQL (delete+insert keyed on game_date) showcasing the built-in {{ start_date }} / {{ end_date }} variables.
- assets/reports/: Consumer-facing outputs.
  - sql_with_checks_player_summary.sql: Aggregate table demonstrating column checks (not_null, unique, positive, min, max), custom_checks with value / blocking, and the {{ var.min_games }} custom variable.
  - python_sdk_rating_insights.py: Bruin Python SDK and Python materialization combined. Uses from bruin import query, context to pull data from the warehouse and returns the enriched DataFrame from materialize() so Bruin writes it back as a table. Reads both built-in (context.start_date) and custom (context.vars["min_games"]) variables.
  - dashboard_chess_overview.asset.yml: tableau dashboard asset that appears as a terminal node in lineage.
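To make the delete+insert strategy mentioned for incremental_sql_daily_games.sql more concrete, here is a minimal sketch of the idea: rows in the target whose key (game_date) appears in the fresh batch are deleted, then the batch is inserted, so re-running an interval replaces it instead of duplicating it. It is an illustration under assumptions, not Bruin's implementation: sqlite3 stands in for DuckDB, and the daily_games table and its games_played column are hypothetical.

```python
import sqlite3

def delete_insert(conn, new_rows):
    """Replace target rows whose game_date appears in new_rows, then insert them.

    new_rows is a list of (game_date, games_played) tuples (hypothetical shape).
    """
    keys = sorted({game_date for game_date, _ in new_rows})
    with conn:  # single transaction: both steps commit together or roll back
        conn.executemany(
            "DELETE FROM daily_games WHERE game_date = ?",
            [(key,) for key in keys],
        )
        conn.executemany(
            "INSERT INTO daily_games (game_date, games_played) VALUES (?, ?)",
            new_rows,
        )

def demo():
    """Load two days, then re-run an overlapping window and return the table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE daily_games (game_date TEXT, games_played INTEGER)"
    )
    conn.executemany(
        "INSERT INTO daily_games VALUES (?, ?)",
        [("2024-01-01", 5), ("2024-01-02", 7)],
    )
    conn.commit()
    # The second run covers 2024-01-02 again (with a corrected count) plus a new day.
    delete_insert(conn, [("2024-01-02", 9), ("2024-01-03", 4)])
    return conn.execute(
        "SELECT game_date, games_played FROM daily_games ORDER BY game_date"
    ).fetchall()

if __name__ == "__main__":
    for row in demo():
        print(row)
```

The untouched day (2024-01-01) survives, the overlapping day is replaced, and the new day is appended, which is the behavior that makes interval-based re-runs safe to repeat.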
Setup
Fill in .bruin.yml with your connections. The Bruin connections documentation describes the available options.
Both pipelines share the pre-configured list of 10 popular Chess.com players. You can modify the players list in your chess connection to track different players.
Running a Pipeline
Run a whole pipeline by pointing at its folder:
shell
bruin run chess-basic
# or
bruin run chess-advance
Or run a single asset:
shell
bruin run chess-advance/assets/reports/sql_with_checks_player_summary.sql
Override a custom variable for a single run (chess-advance only):
shell
bruin run --var min_games=250 --var rating_category=blitz chess-advance
You can optionally pass a --downstream flag to run an asset with all of its downstreams.
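For a sense of how the --var overrides above might surface inside a Python asset, the sketch below reads custom variables from the BRUIN_VARS environment variable. It assumes BRUIN_VARS holds a JSON object of variable names to values; the helper name read_bruin_vars and the fake environment used in the demo are hypothetical and exist only for illustration.

```python
import json
import os

def read_bruin_vars(environ=os.environ):
    """Parse custom pipeline variables from BRUIN_VARS.

    Assumption: BRUIN_VARS is a JSON object string. Returns an empty dict
    when the variable is unset or empty (for example, outside a Bruin run).
    """
    raw = environ.get("BRUIN_VARS", "")
    return json.loads(raw) if raw else {}

def demo():
    """Simulate the environment a run with two --var overrides might produce."""
    fake_env = {
        "BRUIN_VARS": json.dumps(
            {"min_games": 250, "rating_category": "blitz"}
        )
    }
    return read_bruin_vars(fake_env)

if __name__ == "__main__":
    variables = demo()
    print(variables["min_games"], variables["rating_category"])
```

Passing the environment in as a parameter keeps the helper easy to test without touching the real process environment.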