Step 3 · 14 min

Ingest Google Ads, Klaviyo, and GA4

Have the AI agent write ingestr assets for three marketing data sources and pull ad spend, email engagement, and web analytics into DuckDB.

What you'll do

Get credentials for Google Ads, Klaviyo, and GA4, add them to your project config, and have the AI agent scaffold three ingestr assets. Then run the pipeline once to populate DuckDB.

Why this step matters

This is the entire data-import layer of a marketing analyst's stack, in one step:

  • Google Ads tells you what you paid to acquire traffic - campaigns, clicks, cost, conversions
  • Klaviyo tells you what happens after you have someone's email - engagement, flow performance, revenue attribution
  • GA4 tells you what they did on your site - sessions, source/medium, conversion events

Most marketing dashboards show each of these in isolation. With all three in the same database you can finally ask cross-channel questions - which is where the real insight lives (and what your vendor BI tool usually won't let you do without a custom connector).

1. Gather credentials for each source

Google Ads is the fussiest of the three - for it you need three things:

  • Developer token - from your Google Ads Manager account: Tools → API Center. Apply for one if you don't have it (usually approved same day for "test account" tier).
  • Customer ID - the 10-digit number shown in the top-right of the Google Ads UI (remove the dashes).
  • OAuth credentials - create a client ID + secret in Google Cloud Console, then follow the ingestr Google Ads guide to get a refresh token.

Klaviyo and GA4 are simpler: Klaviyo needs a private API key (created in your Klaviyo account settings), and GA4 needs a service account JSON file whose account has Viewer access on the property.

Shortcut: if this is your first API integration, ask your AI agent: "Walk me through getting a Google Ads refresh token for ingestr, step by step." It knows the whole flow.

2. Add the source connections

Prompt the agent:

AI Prompt

Add three new connections to .bruin.yml under environments.default.connections:

  1. A google_ads connection named google-ads-default with the developer token, customer ID, client ID, client secret, and refresh token I'll paste below
  2. A klaviyo connection named klaviyo-default with my Klaviyo private API key
  3. A google_analytics connection named ga4-default with my GA4 property ID and path to the service account JSON file

Use the ingestr docs at https://bruin-data.github.io/ingestr/supported-sources.html for the exact field names per source. Then run bruin connections test for each and show me the output.

Paste your credentials here - or point the agent at where they're stored.

Don't want to hand the agent your keys? Ask it instead to scaffold the connections in .bruin.yml with placeholder values (e.g. <YOUR_GOOGLE_ADS_DEV_TOKEN>, <YOUR_KLAVIYO_KEY>, <YOUR_GA4_PROPERTY_ID>). Then open .bruin.yml yourself and paste the real keys in. The file is git-ignored by default, so they stay local.
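If you go the placeholder route, the scaffold will look roughly like this. The key names below are illustrative - the ingestr docs linked in the prompt have the exact spelling per source:

```yaml
environments:
  default:
    connections:
      google_ads:
        - name: google-ads-default
          developer_token: <YOUR_GOOGLE_ADS_DEV_TOKEN>
          # plus customer ID and OAuth client ID / secret / refresh token
      klaviyo:
        - name: klaviyo-default
          api_key: <YOUR_KLAVIYO_KEY>
      google_analytics:
        - name: ga4-default
          property_id: <YOUR_GA4_PROPERTY_ID>
          # plus the path to the service account JSON file
```

Swap in the real values, save, and re-run the connection tests.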

When the agent finishes, .bruin.yml will have four connections total: your DuckDB destination from Step 2 plus the three new sources.

Security note. .bruin.yml is added to .gitignore by default - your keys won't end up in git. If you're sharing the project, use environment variables instead; the docs show how.

3. Have the agent write three ingestr assets

Now the good part. Prompt:

AI Prompt

In marketing-analyst-101/assets/, create three ingestr assets using the ingestr documentation as reference:

google_ads_campaigns.asset.yml - pull campaign performance (campaign_id, campaign_name, date, impressions, clicks, cost_micros, conversions, conversion_value). Pull the last 90 days. Destination: duckdb-default, schema raw, table google_ads_campaigns. Use merge strategy on campaign_id + date.

klaviyo_campaigns.asset.yml - pull campaign-level email metrics (campaign_id, send_time, recipients, opens, clicks, unsubscribes, revenue). Pull the last 90 days. Destination: duckdb-default, schema raw, table klaviyo_campaigns. Use merge strategy on campaign_id.

ga4_traffic.asset.yml - pull daily sessions by source/medium (date, sessionSource, sessionMedium, sessions, totalUsers, conversions, purchaseRevenue). Pull the last 90 days. Destination: duckdb-default, schema raw, table ga4_traffic. Use merge strategy on date + sessionSource + sessionMedium.

Before writing the files, show me each YAML and explain what it does. Don't run anything yet.

The agent will produce three YAML files. Read them - they should look roughly like:

name: raw.google_ads_campaigns
type: ingestr
parameters:
  source_connection: google-ads-default
  source_table: "campaign"
  destination: duckdb
  destination_connection: duckdb-default
  incremental_strategy: merge
  incremental_key: "campaign_id,date"
  interval_start: "2026-01-24"

Field names vary by source - the ingestr docs and the Shopify Data Pipeline ingest step show real working examples for all three. Trust what the agent writes but glance at the YAML to make sure the destination, schema, and table name match what you asked for.
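For comparison, the Klaviyo asset should follow the same shape. The `source_table` value and field names here are a sketch - check the ingestr Klaviyo docs for the exact table names it exposes:

```yaml
name: raw.klaviyo_campaigns
type: ingestr
parameters:
  source_connection: klaviyo-default
  source_table: "campaigns"   # exact name per the ingestr Klaviyo docs
  destination: duckdb
  destination_connection: duckdb-default
  incremental_strategy: merge
  incremental_key: "campaign_id"
```

The GA4 asset differs only in its connection, table, and the three-column merge key from the prompt.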

4. Run the pipeline

One more prompt:

AI Prompt

Run bruin run marketing-analyst-101/ and stream the output. Tell me which assets succeeded and any that failed.

This kicks off all three ingestions in parallel. First run typically takes 3–8 minutes depending on how much history each source returns. When it finishes, you'll have three tables in raw.* inside marketing.duckdb.

5. Peek at what landed

Prompt:

AI Prompt

Using bruin query, show me:

  1. SELECT COUNT(*), MIN(date), MAX(date), SUM(cost_micros)/1e6 AS total_cost_usd FROM raw.google_ads_campaigns
  2. SELECT COUNT(*), SUM(recipients), SUM(opens), SUM(revenue) FROM raw.klaviyo_campaigns
  3. SELECT COUNT(*), SUM(sessions), SUM(conversions) FROM raw.ga4_traffic

You should see ~90 days of data in each table. Sanity-check the totals against what you see in the native UI of each tool - they should match within a few percent (small discrepancies are normal; timezone offsets and attribution windows differ between tools).
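With all three tables in one database, you can already ask a cross-channel question the native UIs can't answer - for example, blended ad cost per session by day. This query assumes the column names requested in the prompts above; adjust to what actually landed:

```sql
WITH spend AS (
  SELECT date, SUM(cost_micros) / 1e6 AS ad_spend
  FROM raw.google_ads_campaigns
  GROUP BY date
),
traffic AS (
  SELECT date, SUM(sessions) AS sessions
  FROM raw.ga4_traffic
  GROUP BY date
)
SELECT s.date,
       s.ad_spend,
       t.sessions,
       s.ad_spend / NULLIF(t.sessions, 0) AS cost_per_session
FROM spend s
JOIN traffic t USING (date)
ORDER BY s.date;
```

Run it through the same bruin query prompt pattern as above. Aggregating in the CTEs first matters: joining raw campaign rows to raw traffic rows on date would multiply the sums.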

Troubleshooting

  • Google Ads "Customer not found" - the Customer ID needs no dashes. 123-456-7890 becomes 1234567890.
  • Klaviyo "401 Unauthorized" - private API keys have scopes attached; make sure you granted reads for Campaigns, Flows, Metrics, Profiles.
  • GA4 "Permission denied" - the service account needs Viewer access on the GA4 property itself (not just the Google Cloud project).
  • DuckDB file locked - close any DuckDB desktop app that might have the file open.

What just happened

You just built a three-source marketing data stack on a laptop, with zero SQL written by you. Every byte of it lives in marketing.duckdb - one file, portable, inspectable with any SQL tool. Next step: teach the AI agent what these tables mean so it can reason about attribution and CAC accurately.