
Couchbase + Bruin

Source

Ingest Couchbase data into your warehouse with incremental loading, quality checks, and full lineage. Defined in YAML, version-controlled in Git.

For business teams

What you get

  • Real-time warehouse sync

    Couchbase tables replicate to your warehouse continuously. Analytics teams work with fresh data, not yesterday's export.

  • Catch issues at the source

    Quality checks validate Couchbase data as it replicates. Null IDs, duplicate records, and schema drift get caught early.

  • Multi-source joins

    Combine Couchbase with SaaS data, APIs, and other databases in your warehouse. One Bruin pipeline handles it all.

  • No untracked scripts

    Replication is defined in YAML, reviewed in PRs, and deployed with CI/CD. No more mystery cron jobs.

For data & engineering teams

How it works

  • CDC with merge strategy

    Bruin handles change data capture from Couchbase with deduplication. Schema changes are detected and handled automatically.

  • YAML-defined, Git-versioned

    Your Couchbase replication is a YAML file. Review in PRs, deploy with CI/CD. No more untracked database scripts.

  • Row-level quality checks

    Validate primary keys, foreign keys, and referential integrity on every sync. Catch corruption at the source.

  • Multi-source pipelines

    Combine Couchbase with SaaS APIs and other databases in one pipeline. Bruin resolves cross-source dependencies.
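The merge strategy described above can be pictured as an upsert keyed on the primary key, where the latest change event wins. A minimal sketch of that semantics (illustrative only — not Bruin's actual CDC implementation):

```python
# Simplified sketch of CDC-with-merge: apply an ordered batch of change
# events to a table keyed by primary key, deduplicating so the latest
# event for each key wins. Illustrates the semantics, nothing more.

def merge_changes(table: dict, events: list[dict]) -> dict:
    """table: {pk: row}; events: change records (oldest to newest) with an 'id' key."""
    for event in events:
        table[event["id"]] = event  # upsert: later events overwrite earlier ones
    return table

warehouse = {1: {"id": 1, "status": "old"}}
changes = [
    {"id": 1, "status": "updated"},    # update to an existing row
    {"id": 2, "status": "new"},        # fresh insert
    {"id": 2, "status": "corrected"},  # duplicate event for the same key; latest wins
]
merged = merge_changes(warehouse, changes)
```

After the merge, the table holds exactly one row per key, reflecting the newest event for each.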

Before you start

Couchbase cluster or Capella account
Database user credentials
Bucket access permissions

Step 1

Add your Couchbase connection

Connect using Couchbase credentials and cluster configuration. Add this to your Bruin environment file — credentials are stored securely and referenced by name in your pipeline YAML.

Parameters

  • username: Couchbase username with cluster access
  • password: Password for authentication
  • host: Couchbase server host address
  • bucket: Bucket name (optional)
  • ssl: Enable SSL for cloud deployments (true for Capella, false for self-hosted)

connections:
  couchbase:
    type: couchbase
    uri: "couchbase://?username=<username>&password=<password>&host=<host>&bucket=<bucket>&ssl=<true|false>"
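Filled in, a Capella connection might look like the following — the username, password, host, and bucket here are placeholders, not real values:

```yaml
connections:
  couchbase:
    type: couchbase
    uri: "couchbase://?username=analytics_user&password=<your-password>&host=cb.example.cloud.couchbase.com&bucket=travel-sample&ssl=true"
```

For a self-hosted cluster, set ssl=false and point host at your own server.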

Step 2

Create your pipeline

Define a YAML asset that tells Bruin what to pull from Couchbase and where to land it. This file lives in your Git repo — reviewable, version-controlled, and deployable with CI/CD.

Available tables

  • bucket.scope.collection (for example, bucket._default._default)

name: raw.couchbase_bucket.scope.collection
type: ingestr

parameters:
  source_connection: couchbase
  source_table: 'bucket.scope.collection'
  destination: bigquery
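If your documents carry a reliable update timestamp, the load can typically be made incremental. The parameter names below follow ingestr's conventions (incremental_strategy, incremental_key) and the updated_at field is a hypothetical example — check them against the current Bruin and ingestr docs for your version:

```yaml
name: raw.couchbase_bucket.scope.collection
type: ingestr

parameters:
  source_connection: couchbase
  source_table: 'bucket.scope.collection'
  destination: bigquery
  incremental_strategy: merge   # upsert by key instead of reloading everything
  incremental_key: updated_at   # hypothetical timestamp field on your documents
```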

Step 3

Add quality checks

Add column-level and custom SQL checks to your Couchbase data. If a check fails, the pipeline stops — bad data never reaches downstream models or dashboards.

Validate row counts are within expected range
Ensure primary keys are unique and not null
Catch schema drift with freshness checks
columns:
  - name: id
    checks:
      - name: not_null
      - name: unique
  - name: created_at
    checks:
      - name: not_null

custom_checks:
  - name: row count within expected range
    query: |
      SELECT COUNT(*) BETWEEN 1 AND 10000000
      FROM raw.couchbase_bucket.scope.collection

Step 4

Run it

One command. Bruin connects to Couchbase, pulls data incrementally, runs your quality checks, and lands clean data in your warehouse.

Backfill historical data with --start-date
Schedule with cron or trigger from CI/CD
Full lineage from Couchbase to your dashboards
$ bruin run .
Running pipeline...

  couchbase_bucket.scope.collection
    ✓ Fetched 2,847 new records
    ✓ Quality: id not_null              PASSED
    ✓ Quality: id unique                PASSED
    ✓ Quality: created_at not_null      PASSED
    ✓ Loaded into bigquery

  Completed in 12s

Ready to connect Couchbase?

Start for free, or book a demo to see how Bruin handles ingestion, quality, lineage, and scheduling for your entire data stack.