HTTP + Bruin

Source

Ingest HTTP data into your warehouse with incremental loading, quality checks, and full lineage. Defined in YAML, version-controlled in Git.

For business teams

What you get

  • API data, on schedule

    HTTP data lands in your warehouse automatically. No scripts to maintain, no pagination to handle.

  • Only fetch what changed

    Incremental sync means no re-processing. Bruin tracks watermarks so you only get new and updated records.

  • Catch API changes early

    Quality checks validate response data on every sync. Schema changes or missing fields get caught before they break models.

  • Transform in the same pipeline

    Reshape HTTP API data with SQL or Python. Compute metrics, normalize schemas, and build models, all version-controlled (see the sketch after this list).
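
For example, a downstream SQL asset in the same pipeline might look like the following. This is a minimal sketch, assuming a BigQuery warehouse and Bruin's SQL asset format with an embedded @bruin header; the asset name and columns are illustrative.

/* @bruin
name: mart.http_daily_summary
type: bq.sql
depends:
  - raw.http_data
@bruin */

SELECT
  DATE(fetched_at) AS day,
  COUNT(*)         AS records
FROM raw.http_data
GROUP BY 1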

For data & engineering teams

How it works

  • Managed pagination & retries

    Bruin handles HTTP API pagination, rate limiting, and retries. You define the source, Bruin does the rest.

  • YAML-defined, Git-versioned

    Your HTTP pipeline is a YAML file. Review in PRs, deploy with CI/CD, roll back with git revert (see the sketch after this list).

  • Incremental with watermarks

    Bruin tracks cursor positions and watermarks. Only new and updated HTTP records get fetched on each run.

  • Schema validation on responses

    Quality checks validate HTTP API response structure on every sync. Catch breaking API changes early.
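
A minimal sketch of what the pipeline-level YAML might look like, assuming a pipeline.yml at the pipeline root with name, schedule, and start_date fields (the values are illustrative and the exact field set should be checked against Bruin's docs):

name: http_ingestion
schedule: daily
start_date: "2024-01-01"

The asset files described in Step 2 live alongside this file and are versioned and reviewed the same way.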

Before you start

  • API endpoint URL
  • Authentication credentials (if required)

Step 1

Add your HTTP connection

Connect to any HTTP/REST API endpoint. Add this to your Bruin environment file; credentials are stored securely and referenced by name in your pipeline YAML.

Parameters

  • url: The HTTP(S) endpoint URL
  • headers: Custom headers as JSON (optional)
  • auth_type: Authentication type (bearer, basic, etc.)
connections:
  http:
    type: http
    uri: "http://?url=<api_url>&headers=<headers>&auth_type=<auth_type>"

Step 2

Create your pipeline

Define a YAML asset that tells Bruin what to pull from HTTP and where to land it. This file lives in your Git repo: reviewable, version-controlled, and deployable with CI/CD.

name: raw.http_data
type: ingestr

parameters:
  source_connection: http
  source_table: 'data'
  destination: bigquery
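
To fetch only new and changed records (the watermark behaviour described under "How it works"), the same asset can carry incremental settings. A minimal sketch, assuming the ingestr asset type accepts incremental_strategy and incremental_key parameters and that the API exposes an updated_at field; both are assumptions to verify for your source:

name: raw.http_data
type: ingestr

parameters:
  source_connection: http
  source_table: 'data'
  destination: bigquery
  incremental_strategy: merge
  incremental_key: updated_at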

Step 3

Add quality checks

Add column-level and custom SQL checks to your HTTP data. If a check fails, the pipeline stops; bad data never reaches downstream models or dashboards.

  • Validate API data freshness on every sync
  • Ensure record IDs are unique across fetches
  • Catch missing fields from API response changes
columns:
  - name: id
    checks:
      - name: not_null
      - name: unique
  - name: fetched_at
    checks:
      - name: not_null

custom_checks:
  - name: API data is fresh
    query: |
      SELECT MAX(fetched_at) >
        TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 24 HOUR)
      FROM raw.http_data

Step 4

Run it

One command. Bruin connects to HTTP, pulls data incrementally, runs your quality checks, and lands clean data in your warehouse. If a check fails, the pipeline stops; bad data never reaches downstream.

  • Backfill historical data with --start-date
  • Schedule with cron or trigger from CI/CD
  • Full lineage from HTTP to your dashboards
$ bruin run .
Running pipeline...

  http_data
    ✓ Fetched 2,847 new records
    ✓ Quality: id not_null            PASSED
    ✓ Quality: id unique              PASSED
    ✓ Quality: fetched_at not_null    PASSED
    ✓ Quality: API data is fresh      PASSED
    ✓ Loaded into bigquery

  Completed in 12s
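
To backfill history, pass a window to the same command, as noted above (the date is illustrative):

$ bruin run --start-date 2024-01-01 .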

Ready to connect HTTP?

Start for free, or book a demo to see how Bruin handles ingestion, quality, lineage, and scheduling for your entire data stack.