Exporting Adjust Raw Data to Google Cloud Storage
Every time we onboard a new company at Bruin, or every time an existing company ships a new game, someone has to set up Adjust's raw data export to GCS. The steps don't change. The pitfalls don't either. So here's the version I wish I could just send people.
Before any of the steps below: if you don't already have a unified GCP project for data (something like {company}-data), create one. The Adjust bucket, the BigQuery warehouse, and the dashboards on top all belong in there together. Most companies ship more than one game eventually, and a single data project means cross-game reporting is a regular SQL query instead of a cross-project IAM exercise. If you already have one, use that.
The Four Steps
1. Create a service account in GCloud. Adjust will use this to write files into your bucket.
- In the GCP Console, go to IAM & Admin → Service Accounts → Create service account.
- Give it a name like adjust-export. Skip the optional "grant access" step, you'll grant access on the bucket directly in step 2.
- After it's created, open the service account, go to the Keys tab, and click Add Key → Create new key → JSON. The browser downloads a JSON file. Hold onto it, you'll paste its contents in step 4.
In GCloud, one service account per company is enough. If you've already done this for a previous game at the same company, reuse it, you don't need a new one per game. Create a new one only if you want each game's data isolated from the others.
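If you'd rather script this than click through the console, here's a minimal sketch using the google-api-python-client IAM API. The project ID, account name, and key file name are placeholders, and it assumes your own credentials (Application Default Credentials) are allowed to manage service accounts:

```python
import base64

from googleapiclient import discovery

PROJECT_ID = "acme-data"  # placeholder: your unified data project

iam = discovery.build("iam", "v1")

# Create the service account Adjust will write with.
account = iam.projects().serviceAccounts().create(
    name=f"projects/{PROJECT_ID}",
    body={
        "accountId": "adjust-export",
        "serviceAccount": {"displayName": "Adjust raw data export"},
    },
).execute()

# Create a JSON key -- this is what you'll paste into Adjust in step 4.
key = iam.projects().serviceAccounts().keys().create(
    name=account["name"], body={}
).execute()

# privateKeyData is base64-encoded; decode it to get the JSON key file.
with open("adjust-export-key.json", "wb") as f:
    f.write(base64.b64decode(key["privateKeyData"]))
```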
2. Create a GCS bucket and grant Storage Admin to that service account. The naming convention we use is {company_name}-adjust-{game}. One bucket per game, because Adjust writes everything to the bucket root and there is no subfolder option. That also means when you have three games and need to wipe one, you don't take the others with it.
- In the GCP Console, go to Cloud Storage → Buckets → Create.
- Name it using the convention above. Region: pick one close to where your downstream warehouse lives (BigQuery dataset region is the usual answer). Storage class: Nearline. Access control: Uniform. Leave public access prevention on.
- Open the bucket, go to the Permissions tab, and click Grant access. Paste the service account email from step 1 as the principal, assign the role Storage Admin, and save.
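Those console clicks boil down to a few lines, so if you're scripting the setup anyway, here's a sketch with the google-cloud-storage client. The names and region are placeholders; pick whatever matches your warehouse:

```python
from google.cloud import storage

PROJECT_ID = "acme-data"                                      # placeholder
BUCKET_NAME = "acme-adjust-skyforge"                          # {company_name}-adjust-{game}
SA_EMAIL = "adjust-export@acme-data.iam.gserviceaccount.com"  # from step 1

client = storage.Client(project=PROJECT_ID)

# Nearline, uniform access, public access prevention on -- the same
# settings as the console walkthrough above.
bucket = storage.Bucket(client, name=BUCKET_NAME)
bucket.storage_class = "NEARLINE"
bucket.iam_configuration.uniform_bucket_level_access_enabled = True
bucket.iam_configuration.public_access_prevention = "enforced"
bucket = client.create_bucket(bucket, location="europe-west1")  # pick your region

# Grant the Adjust service account Storage Admin on this bucket only.
policy = bucket.get_iam_policy(requested_policy_version=3)
policy.bindings.append({
    "role": "roles/storage.admin",
    "members": {f"serviceAccount:{SA_EMAIL}"},
})
bucket.set_iam_policy(policy)
```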
Why Nearline instead of Standard?
This bucket gets read once. Bruin (or whatever pipeline you've wired up) imports each file into BigQuery shortly after Adjust drops it, then the file just sits in GCS as a raw archive. That's exactly the access pattern Nearline is priced for: about half the per-GB-month storage cost of Standard, in exchange for a small per-GB retrieval fee that the storage savings recoup within about a month.
The 30-day minimum storage charge does not bite here, since you're not deleting files. Coldline is cheaper still but locks you in for 90 days, which is more commitment than you need on a fresh bucket. If you want to push savings further later on, add a lifecycle rule that bumps files to Coldline after 90 days.
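If you do add that rule later, it's a couple of lines with the same client library (bucket name is a placeholder):

```python
from google.cloud import storage

client = storage.Client()
bucket = client.get_bucket("acme-adjust-skyforge")  # placeholder

# Move objects to Coldline once they're 90 days old. Reads get a bit
# pricier after that, but by then each file has long been imported.
bucket.add_lifecycle_set_storage_class_rule("COLDLINE", age=90)
bucket.patch()
```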
3. Go to Adjust and create a raw data export. Select all the event triggers and all the placeholders. Every checkbox, every column. Having unnecessary columns is better than losing data. If you skip a column today and need it in two months, there is no backfill. That data is gone.
- In Adjust, open the app and go to All Settings → Raw Data Exports → New raw data export.
- Pick Google Cloud Storage as the destination.
- Enter the bucket name from step 2.
- Tick every event trigger. Tick every placeholder. Don't try to be selective.
4. Paste the service account JSON into the Adjust export screen and save.
- Open the JSON key file from step 1 in a text editor, select all, copy the full contents.
- Paste it into the credentials field on the Adjust export form.
- Save and activate the export. Files start landing in the bucket within the hour.
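One thing worth doing before you paste: prove the key can actually write to the bucket, so a permissions mistake surfaces now instead of as a silently empty bucket tomorrow. A quick check with the google-cloud-storage client, using the key file and bucket from the earlier steps:

```python
from google.cloud import storage

# Authenticate as the Adjust service account itself, not as your own user.
client = storage.Client.from_service_account_json("adjust-export-key.json")
bucket = client.bucket("acme-adjust-skyforge")  # placeholder

# Write and delete a marker object. If this works, Adjust's writes will too.
blob = bucket.blob("_access_check")
blob.upload_from_string("ok")
blob.delete()
print("service account can write to the bucket")
```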
The IDs You Should Be Sending to Adjust
Attribution is a probability game. Every ID you give Adjust is one more match key it can use to tie an install or an event to the right user, the right device, the right campaign. So send everything you have.
- If you have a user ID, send it.
- Firebase pseudo ID? Send it.
- Other analytics IDs (Mixpanel, Amplitude, your in-house user ID)? Send them.
- AppLovin MAX ID? Send it.
This one change, sending the IDs you already have, often moves your attribution rate by a few percentage points. There is no good reason not to.
How the Export Actually Behaves
Adjust drops files into the bucket hourly. Multiple files per hour, not one. There is no _SUCCESS marker, no manifest, nothing that tells you "this hour is done."
Two consequences for anyone consuming the GCS bucket downstream:
- Don't trust an hour as complete until the next hour has started. The current hour is always still being written to.
- Always scan the last two hours on every import run. As long as the import is idempotent, re-running won't double-count, and you stop missing late-arriving data. There's a sketch of that loop below.
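Here's roughly what that two-hour scan looks like with the google-cloud-storage client. It filters on the upload time GCS records for each object rather than parsing anything out of Adjust's file names, and import_csv is a hypothetical stand-in for your own idempotent loader:

```python
from datetime import datetime, timedelta, timezone

from google.cloud import storage


def import_csv(blob: storage.Blob) -> None:
    """Hypothetical stand-in for your loader. Must be idempotent:
    e.g. MERGE on a natural key, or replace the file's partition."""
    raise NotImplementedError


def import_recent_files(bucket_name: str) -> None:
    """Import every file Adjust uploaded in the last two hours."""
    client = storage.Client()
    cutoff = datetime.now(timezone.utc) - timedelta(hours=2)

    for blob in client.list_blobs(bucket_name):
        # blob.updated is the timestamp GCS recorded for the upload.
        if blob.updated >= cutoff:
            import_csv(blob)
```

If the bucket gets big, listing everything each run will slow down; a name prefix filter helps if the file names carry a date, but check what your export actually produces before relying on that.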
The Decision You Should Not Get Wrong
Once the Adjust export is live, do not change the configuration. Don't add a placeholder, don't remove one, don't toggle an event trigger. Here's why.
When you change the config, Adjust changes the file path. Not at the end, in the middle. BigQuery external tables only support a single * wildcard, so you cannot paper over a path that varies in two places. And because the export format is CSV, you also cannot lean on Parquet's schema evolution to absorb a column-order change.
So the order of operations matters. Turn on every event trigger and every placeholder in Adjust up front, then leave it alone. Adding a column you'll never query is free. Reshuffling the path two months later is not.
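To make the single-wildcard constraint concrete, here's roughly what the external table definition looks like with the BigQuery Python client. Project, dataset, bucket, and the .csv suffix are all placeholders; the point is the one * in the source URI:

```python
from google.cloud import bigquery

client = bigquery.Client(project="acme-data")  # placeholder

external_config = bigquery.ExternalConfig("CSV")
# Exactly one * is allowed per source URI. A config change that reshapes
# the path in the middle leaves this wildcard with nothing to match.
external_config.source_uris = ["gs://acme-adjust-skyforge/*.csv"]
external_config.autodetect = True
external_config.options.skip_leading_rows = 1  # assuming a header row

table = bigquery.Table("acme-data.adjust.skyforge_raw")  # placeholder
table.external_data_configuration = external_config
client.create_table(table, exists_ok=True)
```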
What to Do Today
Pick a game. Make the bucket, name it {company_name}-adjust-{game}, attach a service account with Storage Admin, and paste the JSON into Adjust with every checkbox ticked. Audit which IDs your app already has and start sending the ones you aren't sending yet. Then leave the config alone.