Open-source data tools.
Developer-first CLIs for ingestion and pipelines. Build locally, run anywhere.
Data Ingestion
Multi-Language
Pipeline Orchestration
Quality Checks
Bruin MCP
AI Development
PIPELINES & LINEAGE
Build end-to-end pipelines
Transform your data using SQL, Python, or custom scripts. Bruin CLI automatically extracts dependencies and column-level lineage from your code, building a complete view of your data flow.
- Multi-language support: SQL, Python, and custom scripts all in one pipeline.
- Automatic dependency resolution: no manual DAG configuration; dependencies are extracted from your code.
- Column-level lineage: track data from source to destination at the column level.
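The dependency extraction described above can be sketched as scanning SQL for upstream table references. This is a toy illustration of the idea, not Bruin's actual parser:

```python
import re

def upstream_tables(sql: str) -> set:
    """Toy dependency extraction: collect table names after FROM/JOIN."""
    return set(re.findall(r"(?:FROM|JOIN)\s+([\w.]+)", sql, flags=re.IGNORECASE))

sql = "SELECT * FROM raw.Bookings b JOIN raw.Sessions s ON b.SessionId = s.Id"
print(sorted(upstream_tables(sql)))  # ['raw.Bookings', 'raw.Sessions']
```

In Bruin, the result of this kind of analysis becomes the pipeline DAG, so assets run only after their upstreams finish.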
/* @bruin
name: dashboard.bookings
owner: [email protected]
materialization:
  type: table
@bruin */

SELECT
bookings.Id AS BookingId,
sessions.Name AS SessionName,
bookings.SessionType AS SessionType
FROM raw.Bookings AS bookings
INNER JOIN raw.Sessions AS sessions
ON bookings.SessionId = sessions.Id
WHERE updated_at BETWEEN '{{ start_date }}' AND '{{ end_date }}'
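The start_date and end_date placeholders in the query above are filled in at run time from the run's date interval. The following stand-in renderer only mimics that substitution; it is not Bruin's actual Jinja-based templating:

```python
import re

def render(sql: str, variables: dict) -> str:
    """Replace {{ name }} placeholders (inner spaces optional) with values."""
    return re.sub(
        r"\{\{\s*(\w+)\s*\}\}",
        lambda m: str(variables[m.group(1)]),
        sql,
    )

query = "WHERE updated_at BETWEEN '{{ start_date }}' AND '{{ end_date }}'"
print(render(query, {"start_date": "2024-01-01", "end_date": "2024-01-02"}))
# → WHERE updated_at BETWEEN '2024-01-01' AND '2024-01-02'
```

Because the interval comes from the scheduler, the same asset definition works for backfills and regular runs alike.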
DATA INGESTION
Move data from any source
Copy data between databases, apps, and data warehouses with a single command. Ingestr automatically handles data updates and keeps everything in sync.
- Multiple sources & destinations: Postgres, MySQL, MongoDB, BigQuery, Snowflake, Shopify, Stripe, Salesforce, and more.
- Incremental loading: efficient data syncing with snapshot, incremental, and CDC patterns.
- Schema evolution: destination schemas are updated automatically to match the source.
- CLI-friendly & scriptable: use it in bash scripts, CI pipelines, or Bruin CLI workflows.
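The incremental-loading pattern mentioned above can be sketched as a high-watermark sync: copy only rows newer than the last value seen, then advance the watermark. This is a hypothetical illustration of the idea, not ingestr's implementation:

```python
def incremental_sync(source_rows, destination, watermark):
    """Copy rows with updated_at past the watermark; return the new watermark."""
    new_rows = [r for r in source_rows if r["updated_at"] > watermark]
    destination.extend(new_rows)
    # Advance the watermark so the next run skips already-copied rows.
    return max((r["updated_at"] for r in new_rows), default=watermark)

source = [
    {"id": 1, "updated_at": "2024-01-01"},
    {"id": 2, "updated_at": "2024-01-03"},
]
dest: list = []
wm = incremental_sync(source, dest, "2024-01-02")
# Only id 2 is newer than the watermark, so only it is copied.
```

Snapshot and CDC strategies differ in how changes are detected, but all three avoid re-copying the full table on every run.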
name: raw.users
type: ingestr
parameters:
  source_connection: postgres
  source_table: 'public.users'
  destination: bigquery

ORCHESTRATION
Run pipelines on schedule
Define pipeline schedules, variables, and connections in YAML. Set up cron expressions, type-safe parameters, and environment-specific configurations—all in one file.
- Flexible scheduling: daily, hourly, or custom cron expressions for automated pipeline runs.
- Typed variables: define variables with type validation, enums, and default values.
- Default connections: centralize connection configs and reference them across all assets.
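Typed variables behave roughly like this sketch: a value is checked against its declared type and enum, and the default is used when none is given. This is illustrative only, not Bruin's validator:

```python
def resolve_variable(spec: dict, value=None):
    """Return the given value if it passes the spec, else the spec's default."""
    if value is None:
        return spec.get("default")
    if spec.get("type") == "string" and not isinstance(value, str):
        raise TypeError(f"expected string, got {type(value).__name__}")
    if "enum" in spec and value not in spec["enum"]:
        raise ValueError(f"{value!r} not in {spec['enum']}")
    return value

spec = {
    "type": "string",
    "enum": ["self_serve", "enterprise", "partner"],
    "default": "enterprise",
}
print(resolve_variable(spec))                # falls back to "enterprise"
print(resolve_variable(spec, "self_serve"))  # passes validation
```

Validating at run start means a typo in a variable value fails the run immediately instead of producing a silently empty result set downstream.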
name: analytics-daily
schedule: daily
start_date: "2024-01-01"
default_connections:
  snowflake: "sf-default"
  postgres: "pg-default"
  slack: "alerts-slack"
tags: [ "daily", "analytics" ]
domains: [ "marketing" ]
default:
  interval_modifiers:
    start: "-1d"
    end: "-1d"
variables:
  target_segment:
    type: string
    enum: ["self_serve", "enterprise", "partner"]
    default: "enterprise"
  channel_overrides:
    type: object
    properties:
      email:
        type: array
        items:
          type: string
    default:
      email: ["enterprise_newsletter"]

QUALITY CHECKS
Catch issues before production
Define data quality checks alongside your transformations. Use built-in validators or write custom SQL checks for business rules. Tests run automatically and fail fast when expectations aren't met.
- Built-in checks: pre-built checks for common patterns such as not_null, unique, and accepted_values.
- Custom SQL validators: write your own validation logic in SQL for complex business rules.
- Column-level tests: define checks at the column level to ensure data quality where it matters.
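The built-in checks boil down to simple predicates over a column's values. A minimal sketch of the idea (not Bruin's checker):

```python
def not_null(values) -> bool:
    """Pass only if no value in the column is missing."""
    return all(v is not None for v in values)

def accepted_values(values, allowed) -> bool:
    """Pass only if every non-null value is in the allowed set."""
    return all(v in allowed for v in values if v is not None)

status = ["active", "inactive", "active"]
print(not_null(status))                                        # True
print(accepted_values(status, {"active", "inactive", "deleted"}))  # True
```

A failing check stops the run before downstream assets consume bad data, which is the "fail fast" behavior described above.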
name: raw.users
type: ingestr
parameters:
  source_connection: postgresql
  source_table: 'public.users'
  destination: bigquery
columns:
  # Define columns along with their quality checks
  - name: status
    checks:
      - name: not_null
      - name: accepted_values
        values:
          - active
          - inactive
          - deleted

# You can also define custom quality checks in SQL
custom_checks:
  - name: new user count is greater than 1000
    query: |
      SELECT COUNT(*) > 1000
      FROM raw.users
      WHERE status = 'active'
        AND created_at BETWEEN "{{start_date}}" AND "{{end_date}}"

GET STARTED
Install in seconds
Both tools are available on macOS, Linux, and Windows. Install via package managers or download directly from GitHub.
Bruin CLI
Bruin CLI is the tool for building data pipelines and transformations in SQL & Python.
curl:
curl -LsSf https://getbruin.com/install/cli | sh

wget:
wget -qO- https://getbruin.com/install/cli | sh

Ingestr
Ingestr is the tool for copying data between databases, apps, and warehouses.
uv is the recommended way to install Ingestr:
uv tool install ingestr

Don't have uv? Install it first:
pip install uv

Ready to get started?
Join our Slack community to connect with other users, get help, and stay updated with the latest developments.