Ingest Google Ads, Klaviyo, and GA4
Have the AI agent write ingestr assets for three marketing data sources and pull ad spend, email engagement, and web analytics into DuckDB.
What you'll do
Get credentials for Google Ads, Klaviyo, and GA4, add them to your project config, and have the AI agent scaffold three ingestr assets. Then run the pipeline once to populate DuckDB.
Why this step matters
This is the entire data-import layer of a marketing analyst's stack, in one step:
- Google Ads tells you what you paid to acquire traffic - campaigns, clicks, cost, conversions
- Klaviyo tells you what happens after you have someone's email - engagement, flow performance, revenue attribution
- GA4 tells you what they did on your site - sessions, source/medium, conversion events
Most marketing dashboards show each of these in isolation. With all three in the same database you can finally ask cross-channel questions - which is where the real insight lives (and what your vendor BI tool usually won't let you do without a custom connector).
1. Gather credentials for each source
Google Ads
You need three things:
- Developer token - from your Google Ads Manager account: Tools → API Center. Apply for one if you don't have it (usually approved same day for "test account" tier).
- Customer ID - the 10-digit number shown in the top-right of the Google Ads UI (remove the dashes).
- OAuth credentials - create a client ID + secret in Google Cloud Console, then follow the ingestr Google Ads guide to get a refresh token.
Shortcut: if this is your first API integration, ask your AI agent: "Walk me through getting a Google Ads refresh token for ingestr, step by step." It knows the whole flow.
Klaviyo
You need one thing:
- Private API key - create one in Klaviyo under Settings → API Keys. Grant it read access to Campaigns, Flows, Metrics, and Profiles (you'll need those scopes later).
GA4
You need two things:
- Property ID - the numeric ID shown in GA4 under Admin → Property Settings.
- Service account JSON - create a service account in Google Cloud Console, download its JSON key file, and give the service account Viewer access on the GA4 property itself (Admin → Property Access Management).
2. Add the source connections
Prompt the agent:
Add three new connections to .bruin.yml under environments.default.connections:
- A `google_ads` connection named `google-ads-default` with the developer token, customer ID, client ID, client secret, and refresh token I'll paste below
- A `klaviyo` connection named `klaviyo-default` with my Klaviyo private API key
- A `google_analytics` connection named `ga4-default` with my GA4 property ID and path to the service account JSON file
Use the ingestr docs at https://bruin-data.github.io/ingestr/supported-sources.html for the exact field names per source. Then run bruin connections test for each and show me the output.
[paste your credentials here - or point the agent at where they're stored]
Don't want to hand the agent your keys? Ask it instead to scaffold the connections in `.bruin.yml` with placeholder values (e.g. `<YOUR_GOOGLE_ADS_DEV_TOKEN>`, `<YOUR_KLAVIYO_KEY>`, `<YOUR_GA4_PROPERTY_ID>`). Then open `.bruin.yml` yourself and paste the real keys in. The file is git-ignored by default, so they stay local.
When the agent finishes, .bruin.yml will have four connections total: your DuckDB destination from Step 2 plus the three new sources.
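If you want to see the target shape before prompting, a `.bruin.yml` sketch with placeholder values might look like the following. The field names (`dev_token`, `api_key`, `service_account_file`, and so on) are illustrative - confirm the exact names per source in the ingestr docs linked above:

```yaml
environments:
  default:
    connections:
      duckdb:
        - name: duckdb-default
          path: marketing.duckdb
      google_ads:
        - name: google-ads-default
          dev_token: "<YOUR_GOOGLE_ADS_DEV_TOKEN>"
          customer_id: "1234567890"   # 10 digits, no dashes
          client_id: "<YOUR_OAUTH_CLIENT_ID>"
          client_secret: "<YOUR_OAUTH_CLIENT_SECRET>"
          refresh_token: "<YOUR_REFRESH_TOKEN>"
      klaviyo:
        - name: klaviyo-default
          api_key: "<YOUR_KLAVIYO_KEY>"
      google_analytics:
        - name: ga4-default
          property_id: "<YOUR_GA4_PROPERTY_ID>"
          service_account_file: "/path/to/service-account.json"
```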
Security note: `.bruin.yml` is added to `.gitignore` by default - your keys won't end up in git. If you're sharing the project, use environment variables instead; the docs show how.
3. Have the agent write three ingestr assets
Now the good part. Prompt:
In marketing-analyst-101/assets/, create three ingestr assets using the ingestr documentation as reference:
google_ads_campaigns.asset.yml - pull campaign performance (campaign_id, campaign_name, date, impressions, clicks, cost_micros, conversions, conversion_value). Pull the last 90 days. Destination: duckdb-default, schema raw, table google_ads_campaigns. Use merge strategy on campaign_id + date.
klaviyo_campaigns.asset.yml - pull campaign-level email metrics (campaign_id, send_time, recipients, opens, clicks, unsubscribes, revenue). Pull the last 90 days. Destination: duckdb-default, schema raw, table klaviyo_campaigns. Use merge strategy on campaign_id.
ga4_traffic.asset.yml - pull daily sessions by source/medium (date, sessionSource, sessionMedium, sessions, totalUsers, conversions, purchaseRevenue). Pull the last 90 days. Destination: duckdb-default, schema raw, table ga4_traffic. Use merge strategy on date + sessionSource + sessionMedium.
Before writing the files, show me each YAML and explain what it does. Don't run anything yet.
The agent will produce three YAML files. Read them - they should look roughly like:
name: raw.google_ads_campaigns
type: ingestr
parameters:
  source_connection: google-ads-default
  source_table: "campaign"
  destination: duckdb
  destination_connection: duckdb-default
  incremental_strategy: merge
  incremental_key: "campaign_id,date"
  interval_start: "2026-01-24"
Field names vary by source - the ingestr docs and the Shopify Data Pipeline ingest step show real working examples for all three. Trust what the agent writes but glance at the YAML to make sure the destination, schema, and table name match what you asked for.
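For comparison, the Klaviyo asset should follow the same shape as the Google Ads example above. This is a sketch - the `source_table` name and field values are assumptions the agent will resolve against the ingestr docs:

```yaml
name: raw.klaviyo_campaigns
type: ingestr
parameters:
  source_connection: klaviyo-default
  source_table: "campaigns"
  destination: duckdb
  destination_connection: duckdb-default
  incremental_strategy: merge
  incremental_key: "campaign_id"
  interval_start: "2026-01-24"
```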
4. Run the pipeline
One more prompt:
Run bruin run marketing-analyst-101/ and stream the output. Tell me which assets succeeded and any that failed.
This kicks off all three ingestions in parallel. First run typically takes 3–8 minutes depending on how much history each source returns. When it finishes, you'll have three tables in raw.* inside marketing.duckdb.
5. Peek at what landed
Prompt:
Using bruin query, show me:
SELECT COUNT(*), MIN(date), MAX(date), SUM(cost_micros)/1e6 AS total_cost_usd FROM raw.google_ads_campaigns
SELECT COUNT(*), SUM(recipients), SUM(opens), SUM(revenue) FROM raw.klaviyo_campaigns
SELECT COUNT(*), SUM(sessions), SUM(conversions) FROM raw.ga4_traffic
You should see ~90 days of data in each table. Sanity-check the totals against what you see in the native UI of each tool - they should match within a few percent (small discrepancies are normal due to timezone rounding).
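This is also the first point where a cross-channel query becomes possible. A sketch, assuming the column names from the assets above and daily granularity in both tables (pre-aggregate each side first so the join doesn't multiply rows):

```sql
-- Daily paid spend next to site-wide sessions and conversions,
-- with a blended cost-per-conversion
WITH spend AS (
  SELECT date, SUM(cost_micros) / 1e6 AS ad_spend
  FROM raw.google_ads_campaigns
  GROUP BY date
),
traffic AS (
  SELECT date, SUM(sessions) AS sessions, SUM(conversions) AS conversions
  FROM raw.ga4_traffic
  GROUP BY date
)
SELECT s.date, s.ad_spend, t.sessions, t.conversions,
       s.ad_spend / NULLIF(t.conversions, 0) AS blended_cpa
FROM spend s
JOIN traffic t USING (date)
ORDER BY s.date;
```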
Troubleshooting
- Google Ads "Customer not found" - the Customer ID needs no dashes: `123-456-7890` becomes `1234567890`.
- Klaviyo "401 Unauthorized" - private API keys have scopes attached; make sure you granted read access for Campaigns, Flows, Metrics, and Profiles.
- GA4 "Permission denied" - the service account needs Viewer access on the GA4 property itself (not just the Google Cloud project).
- DuckDB file locked - close any DuckDB desktop app that might have the file open.
What just happened
You just built a three-source marketing data stack on a laptop, with zero SQL written by you. Every byte of it lives in marketing.duckdb - one file, portable, inspectable with any SQL tool. Next step: teach the AI agent what these tables mean so it can reason about attribution and CAC accurately.