Pipelines

Learn how pipelines organize your assets, define schedules, manage connections, and scope credentials within a Bruin project.

What is a pipeline?

A pipeline is a folder within your project that groups related assets together. Each pipeline has its own pipeline.yml configuration file that defines:

  • Name - a human-readable identifier
  • Schedule - a cron expression or a shorthand such as hourly, daily, or monthly
  • Default connections - which connections from .bruin.yml this pipeline uses
  • Custom variables - pipeline-scoped variables you can inject into assets

Why separate pipelines?

Each pipeline has a single schedule, so the best practice is to group assets by how often they need to run: assets that refresh hourly belong in a different pipeline from assets that refresh daily. A pipeline also scopes which credentials are available, which matters for security in larger organizations where teams should not have access to each other's database credentials.
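
As a rough illustration of grouping by cadence, a project might hold two pipelines, each with its own schedule and its own connection. This is only a sketch: the pipeline names, folder paths, and connection names are hypothetical, and it uses only the keys covered in this lesson (name, schedule, default_connections).

```yaml
# marketing_hourly/pipeline.yml (hypothetical pipeline for fast-moving data)
name: marketing_hourly
schedule: hourly
default_connections:
  duckdb: duckdb_marketing   # hypothetical connection name defined in .bruin.yml
---
# finance_daily/pipeline.yml (hypothetical pipeline for once-a-day reporting)
name: finance_daily
schedule: daily
default_connections:
  duckdb: duckdb_finance     # a different connection, so credentials stay separate
```

The two definitions are shown together, separated by ---, only for comparison. In a real project each one would be the pipeline.yml inside its own pipeline folder.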

The pipeline.yml file

name: nyc_taxi
schedule: daily
default_connections:
  duckdb: duckdb_default
variables:
  - name: taxi_types
    type: array
    default:
      - yellow
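
The duckdb_default name above refers to a connection defined in the project's .bruin.yml. A minimal sketch of what that file might contain, assuming a local DuckDB file, is shown below. The environment name and the path value are illustrative assumptions, so check the Bruin documentation for the exact schema before relying on it.

```yaml
# .bruin.yml at the project root (illustrative sketch, not a complete file)
default_environment: default
environments:
  default:
    connections:
      duckdb:
        - name: duckdb_default   # the name referenced by default_connections in pipeline.yml
          path: duckdb.db        # assumed path to a local DuckDB database file
```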

Key points

  • Each pipeline lives in its own folder with a pipeline.yml
  • Only connections explicitly listed in default_connections are initialized at runtime, so the pipeline is never handed credentials it does not need
  • Custom variables can be defined with defaults and overridden at runtime
  • Assets live inside an assets/ subfolder within the pipeline directory