Chess Data to DuckDB
Build your first Bruin pipeline by ingesting chess API data and storing it in DuckDB - no credentials needed.
What is this? A beginner-friendly tutorial where you build a complete data pipeline that pulls chess game data from a public API and loads it into DuckDB for analysis. No API keys or database credentials needed - the chess API is completely open and DuckDB runs locally.
What you'll learn: How to initialize a Bruin project from a template, configure environments and connections, understand asset types (ingestr and SQL), and run a pipeline end-to-end.
What you'll build: A pipeline that ingests chess games and player profiles for top grandmasters, then creates a summary table with player statistics including total games and win rates.
Full tutorial
Below is the complete tutorial you can read through, or use the step-by-step version above.
Initialize the project
Run the following command to scaffold a new project using the built-in chess template:
bruin init chess
This creates a folder structure with pre-configured assets and pipeline files:
chess/
├── assets/
│   ├── chess_games.asset.yml
│   ├── chess_profiles.asset.yml
│   └── player_summary.sql
├── .bruin.yml
├── pipeline.yml
└── .gitignore
Configure the environment
Open .bruin.yml and configure your environment with DuckDB and Chess API connections. Specify the list of chess players you want to track:
environments:
  default:
    connections:
      duckdb:
        - name: "duckdb-default"
          path: "chess.db"
      chess:
        - name: "chess-default"
          players:
            - "FabianoCaruana"
            - "Hikaru"
            - "MagnusCarlsen"
            - "GarryKasparov"
            - "Firouzja2003"
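Each entry in players is a Chess.com username. As a quick sanity check before running the pipeline, the sketch below builds the public Chess.com API URLs for those usernames so you can open them in a browser. The helper names profile_url and archives_url are hypothetical, and it is an assumption (not something the template states) that the ingestr source reads from these same endpoints.

```python
from urllib.parse import quote

# Base URL of the public Chess.com API (no API key required).
PUBAPI_BASE = "https://api.chess.com/pub"

# Same usernames as the `players` list in .bruin.yml.
PLAYERS = [
    "FabianoCaruana",
    "Hikaru",
    "MagnusCarlsen",
    "GarryKasparov",
    "Firouzja2003",
]


def profile_url(username: str) -> str:
    """Return the public profile URL for a username (hypothetical helper)."""
    # Usernames are lowercased before being placed in the URL path.
    return f"{PUBAPI_BASE}/player/{quote(username.lower())}"


def archives_url(username: str) -> str:
    """Return the URL that lists a player's monthly game archives (hypothetical helper)."""
    return f"{profile_url(username)}/games/archives"


if __name__ == "__main__":
    for name in PLAYERS:
        print(profile_url(name))
```

If a profile URL returns a "not found" response, the username is probably misspelled and worth fixing in .bruin.yml before you run the pipeline.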
Review the assets
The template includes three pre-configured assets:
- chess_games.asset.yml — An ingestr asset that fetches game data for each player from the Chess.com API.
- chess_profiles.asset.yml — An ingestr asset that fetches player profile information.
- player_summary.sql — A SQL asset that joins games and profiles to create a summary table with statistics like total games and win rates.
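To make the idea of a summary table concrete, here is a minimal sketch of the kind of aggregation a SQL asset like player_summary.sql could perform. It is not the template's actual query: the games table, its white, black, and winner columns, and the sample rows are all hypothetical, and Python's built-in sqlite3 module stands in for DuckDB so the example runs without extra packages.

```python
import sqlite3

# In-memory SQLite stands in for DuckDB so this runs with the standard library only.
conn = sqlite3.connect(":memory:")

# Hypothetical schema: one row per game, `winner` is NULL for a draw.
conn.execute("CREATE TABLE games (white TEXT, black TEXT, winner TEXT)")
conn.executemany(
    "INSERT INTO games VALUES (?, ?, ?)",
    [
        ("hikaru", "magnuscarlsen", "hikaru"),
        ("magnuscarlsen", "hikaru", "magnuscarlsen"),
        ("hikaru", "fabianocaruana", None),  # draw
        ("fabianocaruana", "magnuscarlsen", "magnuscarlsen"),
    ],
)

# Count every appearance of a player (as white or black), then aggregate per player.
SUMMARY_SQL = """
WITH appearances AS (
    SELECT white AS username, winner FROM games
    UNION ALL
    SELECT black AS username, winner FROM games
)
SELECT
    username,
    COUNT(*) AS total_games,
    SUM(CASE WHEN winner = username THEN 1 ELSE 0 END) AS wins,
    ROUND(100.0 * SUM(CASE WHEN winner = username THEN 1 ELSE 0 END) / COUNT(*), 1) AS win_rate
FROM appearances
GROUP BY username
ORDER BY username
"""

summary = conn.execute(SUMMARY_SQL).fetchall()
for row in summary:
    print(row)
```

The UNION ALL step matters because each game involves two players. Without it, a player's games as black would be missing from total_games and the win rate would be skewed.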
Examine the pipeline
The pipeline.yml file defines the pipeline name and default connections:
name: chess
default_connections:
  duckdb: "duckdb-default"
  chess: "chess-default"
Run the pipeline
From the directory that contains the chess/ folder, execute the pipeline to ingest the data and build the summary table:
bruin run ./chess/pipeline.yml
Query the results
Once the pipeline completes, query the results to verify everything worked:
bruin query --c duckdb-default --q "SELECT * FROM chess_playground.player_summary LIMIT 10;"
You should see a table with player statistics including usernames, total games, wins, losses, and win rates.

Before you start
- Bruin CLI installed