HTTP + Bruin

Ingest HTTP data into your warehouse with incremental loading, quality checks, and full lineage. Defined in YAML, version-controlled in Git.

For business teams

What you get

  • API data, on schedule

    HTTP data lands in your warehouse automatically. No scripts to maintain, no pagination to handle.

  • Only fetch what changed

    Incremental sync means no re-processing. Bruin tracks watermarks so you only get new and updated records.

  • Catch API changes early

    Quality checks validate response data on every sync. Schema changes or missing fields get caught before they break models.

  • Transform in the same pipeline

    Reshape HTTP API data with SQL or Python. Compute metrics, normalize schemas, and build models — all version-controlled (see the sketch below).
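
    For illustration, a downstream SQL asset in the same pipeline might compute a daily record count from the raw table. This is a sketch: the dataset and asset names are placeholders, and the header follows Bruin's convention for a BigQuery SQL asset.

/* @bruin
name: reports.http_daily
type: bq.sql
materialization:
  type: table
@bruin */

-- Daily record counts from the ingested HTTP data
SELECT
  DATE(fetched_at) AS day,
  COUNT(*) AS records
FROM raw.http_data
GROUP BY 1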

For data & engineering teams

How it works

  • Managed pagination & retries

    Bruin handles HTTP API pagination, rate limiting, and retries. You define the source — Bruin does the rest.

  • YAML-defined, Git-versioned

    Your HTTP pipeline is a YAML file. Review in PRs, deploy with CI/CD, roll back with git revert.

  • Incremental with watermarks

    Bruin tracks cursor positions and watermarks. Only new and updated HTTP records get fetched on each run (see the sketch after this list).

  • Schema validation on responses

    Quality checks validate HTTP API response structure on every sync. Catch breaking API changes early.
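
A hedged sketch of what that looks like on the ingestr asset from this page (the incremental_key and incremental_strategy parameter names assume ingestr-style options; check your Bruin version for the exact spelling):

name: raw.http_data
type: ingestr

parameters:
  source_connection: http
  source_table: 'data'
  destination: bigquery

  # Assumed ingestr-style options:
  incremental_key: updated_at    # column used as the watermark
  incremental_strategy: merge    # merge new/changed rows instead of reloading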

Before you start

API endpoint URL
Authentication credentials (if required)

Step 1

Add your HTTP connection

Connect to any HTTP/REST API endpoint. Add this to your Bruin environment file — credentials are stored securely and referenced by name in your pipeline YAML.

Parameters

  • url: The HTTP(S) endpoint URL
  • headers: Custom headers as JSON (optional)
  • auth_type: Authentication type (bearer, basic, etc.)

connections:
  http:
    type: http
    uri: "http://?url=<api_url>&headers=<headers>&auth_type=<auth_type>"

Step 2

Create your pipeline

Define a YAML asset that tells Bruin what to pull from HTTP and where to land it. This file lives in your Git repo — reviewable, version-controlled, and deployable with CI/CD.

name: raw.http_data
type: ingestr

parameters:
  source_connection: http
  source_table: 'data'
  destination: bigquery
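
The asset file sits inside a pipeline folder. A minimal pipeline.yml next to it might look like this (a sketch; the cron-style schedule value is an assumed example):

# pipeline.yml
name: http_ingestion
schedule: "0 6 * * *"   # assumed example: run daily at 06:00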

Step 3

Add quality checks

Add column-level and custom SQL checks to your HTTP data. If a check fails, the pipeline stops — bad data never reaches downstream models or dashboards.

Validate API data freshness on every sync
Ensure record IDs are unique across fetches
Catch missing fields from API response changes
columns:
  - name: id
    checks:
      - name: not_null
      - name: unique
  - name: fetched_at
    checks:
      - name: not_null

custom_checks:
  - name: API data is fresh
    query: |
      SELECT MAX(fetched_at) >
        TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 24 HOUR)
      FROM raw.http_data
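
If your warehouse returns a boolean rather than a number for the query above, an equivalent formulation compares the query result against an explicit expected value (a sketch; the value field assumes Bruin custom checks compare the query's single result to it):

custom_checks:
  - name: API data is fresh
    value: 1
    query: |
      -- Returns 1 when the newest record is under 24 hours old
      SELECT CASE
        WHEN MAX(fetched_at) >
          TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 24 HOUR)
        THEN 1 ELSE 0 END
      FROM raw.http_data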

Step 4

Run it

One command. Bruin connects to HTTP, pulls data incrementally, runs your quality checks, and lands clean data in your warehouse. If a check fails, the pipeline stops — bad data never reaches downstream.

Backfill historical data with --start-date
Schedule with cron or trigger from CI/CD
Full lineage from HTTP to your dashboards
$ bruin run .
Running pipeline...

  http_data
    ✓ Fetched 2,847 new records
    ✓ Quality: id not_null              PASSED
    ✓ Quality: id unique                PASSED
    ✓ Quality: fetched_at not_null      PASSED
    ✓ Quality: API data is fresh        PASSED
    ✓ Loaded into bigquery

  Completed in 12s
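
The backfill mentioned above is a flag on the same command (the --start-date flag is referenced above; --end-date and the date format are assumptions, so check bruin run --help):

$ bruin run . --start-date 2024-01-01 --end-date 2024-01-31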

Ready to connect HTTP?

Start for free, or book a demo to see how Bruin handles ingestion, quality, lineage, and scheduling for your entire data stack.