Indeed + Bruin

Source

Ingest Indeed data into your warehouse with incremental loading, quality checks, and full lineage. Defined in YAML, version-controlled in Git.

For business teams

What you get

  • People analytics beyond HR tools

    Join Indeed data with finance and project data. See fully-loaded team cost, hiring ROI, and attrition trends.

  • Headcount planning with real data

    Combine Indeed org data with budget and project data. Plan headcount based on actual numbers, not estimates.

  • Compliance-ready data

    Quality checks validate that required fields are present, records are consistent, and org hierarchy is valid.

  • Faster reporting cycles

    Indeed data syncs automatically. HR and finance get fresh data without waiting for someone to pull a report.

For data & engineering teams

How it works

  • Automatic schema handling

    Bruin detects Indeed schema changes and handles them automatically. No manual migration scripts.

  • YAML-defined, Git-versioned

    Your Indeed pipeline is a YAML file. Review in PRs, deploy with CI/CD, roll back with git revert.

  • Hierarchy validation

    Custom SQL checks validate manager-employee relationships and catch orphaned records in Indeed org data.

  • Incremental sync

    Only sync new and changed Indeed records. Full org structure stays in sync without re-processing everything.
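The idea behind incremental sync can be illustrated with a small watermark-based upsert. This is a conceptual sketch only, not a description of how Bruin implements it: the `id` and `updated_at` fields and the `incremental_merge` helper are hypothetical names chosen for the example.

```python
from datetime import datetime
from typing import Optional


def incremental_merge(
    existing: dict, incoming: list, watermark: Optional[datetime]
) -> Optional[datetime]:
    """Upsert rows newer than `watermark` into `existing`; return the new watermark.

    `existing` maps a record id to its latest row. The `id` and `updated_at`
    keys are assumptions made for this sketch.
    """
    new_watermark = watermark
    for row in incoming:
        # Skip rows that were already processed in an earlier sync.
        if watermark is not None and row["updated_at"] <= watermark:
            continue
        existing[row["id"]] = row  # insert a new record or overwrite a changed one
        if new_watermark is None or row["updated_at"] > new_watermark:
            new_watermark = row["updated_at"]
    return new_watermark


if __name__ == "__main__":
    store = {}
    first_batch = [
        {"id": 1, "updated_at": datetime(2024, 1, 1)},
        {"id": 2, "updated_at": datetime(2024, 1, 2)},
    ]
    mark = incremental_merge(store, first_batch, None)
    print(len(store), mark)
```

Rows whose timestamp equals the watermark are skipped in this sketch; a production sync often re-reads a small overlap window to avoid missing records written at the boundary.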

Before you start

  • Indeed Developer account
  • OAuth credentials (client_id, client_secret)
  • Employer ID from Indeed

Step 1

Add your Indeed connection

Connect using your Indeed OAuth credentials and employer ID. Add the connection below to your Bruin environment file. Credentials are stored securely and referenced by name in your pipeline YAML.

Parameters

  • client_id: OAuth client ID for Indeed API authentication
  • client_secret: OAuth client secret for Indeed API authentication
  • employer_id: The employer ID associated with your Indeed account

connections:
  indeed:
    type: indeed
    uri: "indeed://?client_id=<client_id>&client_secret=<client_secret>&employer_id=<employer_id>"
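Because the credentials are passed as URI query parameters, a secret containing characters such as `&` or `=` would split the query string if pasted in raw. The sketch below builds the same URI shape with percent-encoded values. It assumes the URI is parsed as a standard URL query string, which is worth confirming against the Bruin documentation; `build_indeed_uri` is a hypothetical helper, not part of Bruin.

```python
from urllib.parse import urlencode


def build_indeed_uri(client_id: str, client_secret: str, employer_id: str) -> str:
    """Return an indeed:// connection URI with percent-encoded query values."""
    query = urlencode(
        {
            "client_id": client_id,
            "client_secret": client_secret,
            "employer_id": employer_id,
        }
    )
    return f"indeed://?{query}"


if __name__ == "__main__":
    # Placeholder values for illustration only; never commit real secrets.
    print(build_indeed_uri("abc123", "s3cr&t=value", "emp-42"))
    # indeed://?client_id=abc123&client_secret=s3cr%26t%3Dvalue&employer_id=emp-42
```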

Step 2

Create your pipeline

Define a YAML asset that tells Bruin what to pull from Indeed and where to land it. This file lives in your Git repo, where it is reviewable, version-controlled, and deployable with CI/CD.

Available tables

  • campaigns
  • campaign_details
  • campaign_budget
  • campaign_jobs
  • campaign_properties
  • campaign_stats
  • account
  • traffic_stats

name: raw.indeed_campaigns
type: ingestr

parameters:
  source_connection: indeed
  source_table: 'campaigns'
  destination: bigquery
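If you want one asset per available table, the definition above can be templated. The following sketch renders the same YAML for each table name listed on this page. The `raw.indeed_<table>` naming, the `assets` output directory, and the `.asset.yml` file suffix are assumptions made for illustration; adjust them to your project layout.

```python
from pathlib import Path

# Table names as listed under "Available tables".
TABLES = [
    "campaigns",
    "campaign_details",
    "campaign_budget",
    "campaign_jobs",
    "campaign_properties",
    "campaign_stats",
    "account",
    "traffic_stats",
]


def render_asset(table: str, destination: str = "bigquery") -> str:
    """Render an ingestr asset definition mirroring the example on this page."""
    return (
        f"name: raw.indeed_{table}\n"
        "type: ingestr\n"
        "\n"
        "parameters:\n"
        "  source_connection: indeed\n"
        f"  source_table: '{table}'\n"
        f"  destination: {destination}\n"
    )


def write_assets(out_dir: Path, tables=TABLES) -> list:
    """Write one asset file per table and return the created paths."""
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for table in tables:
        path = out_dir / f"indeed_{table}.asset.yml"  # assumed naming convention
        path.write_text(render_asset(table), encoding="utf-8")
        paths.append(path)
    return paths


if __name__ == "__main__":
    print(render_asset("campaign_stats"))
```

Calling `write_assets(Path("assets"))` would create eight files, one per table, each reviewable in a pull request like any hand-written asset.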

Step 3

Add quality checks

Add column-level and custom SQL checks to your Indeed data. If a check fails, the pipeline stops, so bad data never reaches downstream models or dashboards.

  • Validate manager-employee hierarchy is valid
  • Catch employees with null departments
  • Ensure employee IDs are unique across syncs

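# Column-level checks. The column names are illustrative; use columns that exist in the table you ingest.
columns:
  - name: employee_id
    checks:
      - name: not_null
      - name: unique
  - name: status
    checks:
      - name: accepted_values
        value: ['active', 'inactive', 'terminated', 'on_leave']

# Custom SQL check: the query is true only when no manager_id is missing a matching employee_id.
custom_checks:
  - name: valid manager hierarchy
    query: |
      SELECT COUNT(*) = 0
      FROM raw.indeed_campaigns
      WHERE manager_id IS NOT NULL
        AND manager_id NOT IN (SELECT employee_id FROM raw.indeed_campaigns)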
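To see what the custom hierarchy check computes, the same query shape can be run against an in-memory SQLite database. This is a local illustration only: the `records` table and the sample rows are hypothetical, and SQLite stands in for your warehouse.

```python
import sqlite3

# Same logic as the custom check, pointed at a hypothetical `records` table.
CHECK_SQL = """
SELECT COUNT(*) = 0
FROM records
WHERE manager_id IS NOT NULL
  AND manager_id NOT IN (SELECT employee_id FROM records)
"""


def hierarchy_is_valid(rows) -> bool:
    """Load (employee_id, manager_id) rows into SQLite and evaluate the check."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE records (employee_id INTEGER, manager_id INTEGER)")
        conn.executemany("INSERT INTO records VALUES (?, ?)", rows)
        (passed,) = conn.execute(CHECK_SQL).fetchone()
        return bool(passed)
    finally:
        conn.close()


if __name__ == "__main__":
    print(hierarchy_is_valid([(1, None), (2, 1), (3, 2)]))  # every manager exists
    print(hierarchy_is_valid([(1, None), (2, 99)]))         # manager 99 is missing
```

One caveat with `NOT IN`: if the subquery returns any NULL `employee_id`, the comparison evaluates to NULL rather than true, so orphaned rows are filtered out and the check passes silently. Pairing it with the `not_null` check on `employee_id`, as in the column checks above, guards against that.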

Step 4

Run it

One command. Bruin connects to Indeed, pulls data incrementally, runs your quality checks, and lands clean data in your warehouse. If a check fails, the pipeline stops before bad data reaches anything downstream.

  • Backfill historical data with --start-date
  • Schedule with cron or trigger from CI/CD
  • Full lineage from Indeed to your dashboards

$ bruin run .
Running pipeline...

  indeed_campaigns
    ✓ Fetched 2,847 new records
    ✓ Quality: campaign_id not_null     PASSED
    ✓ Quality: spend not_null           PASSED
    ✓ Quality: no negative ad spend     PASSED
    ✓ Loaded into bigquery

  Completed in 12s
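For scheduled or CI-triggered runs, a thin wrapper can assemble the command and surface the exit status. The sketch below uses only the `bruin run` command and the `--start-date` flag mentioned on this page. The `YYYY-MM-DD` date format and the assumption that a failed run exits with a non-zero status are not confirmed here, so check them against the CLI help; both helper names are hypothetical.

```python
import subprocess
from typing import List, Optional


def bruin_run_command(path: str = ".", start_date: Optional[str] = None) -> List[str]:
    """Build the argument list for `bruin run`, optionally with a backfill start date."""
    cmd = ["bruin", "run"]
    if start_date:
        # Flags are placed before the positional pipeline path.
        cmd += ["--start-date", start_date]
    cmd.append(path)
    return cmd


def run_pipeline(path: str = ".", start_date: Optional[str] = None) -> int:
    """Run the pipeline and return the process exit code (requires bruin on PATH)."""
    return subprocess.run(bruin_run_command(path, start_date)).returncode


if __name__ == "__main__":
    # Print the command instead of executing it, so the demo has no side effects.
    print(" ".join(bruin_run_command(".", start_date="2024-01-01")))
```

A cron job or CI step could call `run_pipeline()` and fail the job when the returned code is non-zero.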

Ready to connect Indeed?

Start for free, or book a demo to see how Bruin handles ingestion, quality, lineage, and scheduling for your entire data stack.