Ingest data from anywhere

Ingest data from any source into your data lake or data warehouse with no code required, and extend with custom code when you need to.

Trusted by forward-thinking teams

Internet Society
Formo
Karaca
Papara
Obilet
Workhy
Buluttan
Lessmore
Spektra
Fomo Games
Rotatelab
Talemonster
# A Bruin asset definition: ingest the public.users table
# from a Postgres connection into BigQuery.
name: raw.users
type: ingestr
parameters:
  source_connection: postgres
  source_table: 'public.users'
  destination: bigquery

Build data pipelines faster

Built-in connectors, defined with YAML

Bruin is a code-based platform: everything you do lives in a Git repo and is versioned. Every data ingestion is defined in code and version-controlled alongside the rest of your project.

Multiple platforms
Bruin ships with built-in connectors for 100+ sources and destinations, so you can ingest data from AWS, Azure, GCP, Snowflake, Notion, and more.
Built on open-source
Bruin's ingestion engine is built on ingestr, an open-source data ingestion tool.
Custom sources & destinations
Bruin also runs pure Python assets, so you can write your own ingestion code for anything that isn't covered by a built-in connector.
Incremental loading
Bruin supports incremental loading, so each run ingests only the new data rather than the entire dataset; see the sketch below.
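As an illustration, an incremental ingestr asset could look roughly like the sketch below. The incremental_strategy and incremental_key parameter names are assumptions modelled on ingestr's incremental-loading options, so check the docs for the exact spelling in your version.

name: raw.users
type: ingestr

parameters:
  source_connection: postgres
  source_table: 'public.users'
  destination: bigquery

  # Assumed incremental settings: append only the rows whose
  # updated_at value is newer than the last successful run.
  incremental_strategy: append
  incremental_key: updated_at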

Build safer

End-to-end quality in raw data

Bruin's built-in data quality capabilities ensure that the data you ingest always matches your expectations, from the moment it lands.

Built-in quality checks
Bruin ships with built-in quality checks such as not_null, accepted_values, and more, ready to use on any asset.
Custom quality checks
Bruin lets you write custom quality checks in SQL, so you can encode your own quality standards.
Templating in quality checks
Bruin supports templating in quality checks, so you can use variables such as the run's start and end dates and check only the incremental period.
Automated alerting
Failing quality checks automatically send alerts to your configured channels, so you are always aware of data quality issues.
name: raw.users
type: ingestr

parameters:
  source_connection: postgres
  source_table: 'public.users'
  destination: bigquery

columns:

  # Define columns along with their quality checks
  - name: status
    checks:
      - name: not_null
      - name: accepted_values
        values:
          - active
          - inactive
          - deleted

# You can also define custom quality checks in SQL        
custom_checks:
  - name: new user count is greater than 1000
    query: |
      SELECT COUNT(*) > 1000 
      FROM raw.users 
      WHERE status = 'active' 
        AND created_at BETWEEN "{{start_date}}" AND "{{end_date}}"

Seamless data ingestion

100+ Sources and Destinations. Infinite Possibilities.

Built on ingestr, our open-source engine. Connect your entire data stack with a single command.
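As a rough sketch, a single ingestr command that copies a Postgres table into BigQuery looks like the following; the connection URIs are placeholders, and the exact URI formats depend on your source and destination.

ingestr ingest \
  --source-uri 'postgresql://user:password@localhost:5432/mydb' \
  --source-table 'public.users' \
  --dest-uri 'bigquery://my-project?credentials_path=service_account.json' \
  --dest-table 'raw.users'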

1-Week Implementation Guarantee

Need a source that's not listed? Share testing credentials and we'll implement it within 7 days.


Ready to ship reliable data?

Production-ready pipelines without the complexity. Deploy today.