dbt only handles transformations, leaving you with a complex stack. Bruin provides end-to-end pipelines with data ingestion, SQL & Python transformations, quality checks, and built-in orchestration—all in one open-source tool.
Burak Karakan
Co-founder & CEO
If you're building data pipelines in 2025, you've likely heard of dbt (data build tool). It's become the de facto standard for SQL transformations in the modern data stack. But here's the problem: dbt only solves one piece of the puzzle.
What if you could get ingestion, transformation, quality checks, and orchestration—all in one tool? That's exactly what Bruin delivers. Let's dive into why an end-to-end approach beats cobbling together multiple tools.
dbt focuses exclusively on transformation. It takes data that's already in your warehouse and transforms it using SQL (and limited Python). That's it. Everything else—getting data into your warehouse, scheduling pipelines, monitoring quality—requires additional tools.
A typical dbt stack looks like this:

- An ingestion tool (Fivetran or Airbyte)
- dbt for transformations
- An orchestrator (Airflow, Dagster, or Prefect)
- An observability layer (Monte Carlo or dbt Cloud)

That's 3-5 different tools to manage, configure, and integrate. Each one requires its own authentication, monitoring, and maintenance.
Bruin takes a different approach: everything you need in a single, unified framework.
With Bruin, you get:

- Data ingestion from 100+ sources
- SQL and Python transformations
- Built-in quality checks
- Built-in orchestration
- Lineage and observability

One tool. One CLI. One configuration format. One deployment.
dbt doesn't handle data ingestion. You need to solve this yourself with:

- A managed ELT service like Fivetran
- A self-hosted tool like Airbyte
- Custom ingestion scripts you write and maintain
Each solution comes with its own complexity: separate infrastructure, different authentication systems, and coordination headaches between tools.
Bruin's ingestion engine is built on ingestr, an open-source data ingestion tool with 100+ connectors. Define your ingestion with simple YAML:
name: raw.users
type: ingestr
parameters:
  source_connection: postgresql
  source_table: 'public.users'
  destination: bigquery
  incremental_strategy: merge
  incremental_key: updated_at
  primary_key: id
That's it. No separate infrastructure. No complex configuration. Just YAML.
Bruin supports multiple incremental loading strategies out of the box:
1. Append - Add new rows without touching existing data
incremental_strategy: append
incremental_key: created_at
2. Merge - Upsert based on primary keys (updates existing, inserts new)
incremental_strategy: merge
primary_key: id
incremental_key: updated_at
3. Delete+Insert - Delete matching rows, insert new ones
incremental_strategy: delete+insert
incremental_key: date
4. Replace - Full table replacement
incremental_strategy: replace
These strategies ensure you're only moving the data you need, not re-ingesting entire tables every time.
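To make the semantics concrete: a merge load behaves conceptually like a warehouse MERGE keyed on the primary key. Here's a hedged sketch in BigQuery SQL, reusing the raw.users example from above; the staging table raw.users_increment and the email column are illustrative, and this is not the exact SQL Bruin generates.

-- Conceptual equivalent of incremental_strategy: merge with primary_key: id
MERGE raw.users AS target
USING raw.users_increment AS source  -- newly extracted rows, e.g. updated_at > last run
ON target.id = source.id             -- primary_key: id
WHEN MATCHED THEN
  UPDATE SET email = source.email, updated_at = source.updated_at
WHEN NOT MATCHED THEN
  INSERT (id, email, updated_at)
  VALUES (source.id, source.email, source.updated_at)

Append, by contrast, would simply insert the new rows, and delete+insert would first delete every row whose incremental_key value appears in the new batch.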
Bonus: Need a custom connector? Bruin offers a 1-week SLA for custom connector development.
This is where the differences become stark.
dbt added Python support (dbt-py) later in its lifecycle, and it shows:

- Python models only run on specific adapters (Snowflake, Databricks, BigQuery), inside the warehouse's own runtime
- Package availability is limited to what the warehouse platform supports
- There's no local execution path for Python models

If you want to use scikit-learn, TensorFlow, or any ML library, you're often out of luck with dbt.
Bruin was built from day one with native Python support:

- Python assets run wherever the CLI runs, not inside a warehouse runtime
- Each asset gets its own isolated environment, so any PyPI package is fair game
- Python and SQL assets live in the same dependency graph

Here's a real example:
# assets/ml_predictions.py
"""
@bruin
name: analytics.ml_predictions
type: python
depends:
  - analytics.user_features  # SQL asset
materialization:
  type: table
@bruin
"""
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

def main(connection):
    # Load data from the SQL upstream dependency
    df = connection.read_sql("SELECT * FROM analytics.user_features")

    # Train your model
    model = RandomForestClassifier()
    # ... model training logic ...

    # Return predictions as DataFrame
    return predictions_df
This Python asset depends on a SQL asset (analytics.user_features), trains a machine learning model, and outputs results that downstream SQL assets can use. This workflow is nearly impossible in dbt.
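For example, a downstream SQL asset can declare the Python asset as an upstream dependency and query its output like any other table. A sketch, using the same definition-block format shown in the next section; the asset name analytics.prediction_summary and the predicted_label column are illustrative:

-- assets/prediction_summary.sql
/*
@bruin
name: analytics.prediction_summary
depends:
  - analytics.ml_predictions
@bruin
*/
SELECT
  predicted_label,
  COUNT(*) as user_count
FROM analytics.ml_predictions
GROUP BY 1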
Bruin uses uv to manage isolated Python environments for each asset. This means:

- No dependency conflicts between assets
- Each asset can pin its own Python version and package versions
- Environments resolve and install fast
# asset.yml
name: analytics.ml_model
type: python
parameters:
  python_version: "3.11"
  dependencies:
    - pandas==2.0.0
    - scikit-learn==1.3.0
    - tensorflow==2.14.0
Good news if you're migrating from dbt: Bruin supports Jinja templating too.
-- assets/daily_revenue.sql
/*
@bruin
name: analytics.daily_revenue
depends:
  - raw.orders
@bruin
*/
SELECT
  DATE_TRUNC('day', created_at) as date,
  SUM(amount) as revenue
FROM raw.orders
WHERE created_at >= '{{ start_date }}'
  AND status = 'completed'
GROUP BY 1
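When the pipeline runs, Bruin renders the template before executing it. With start_date resolving to 2024-01-01, for instance, the query above would render to roughly:

SELECT
  DATE_TRUNC('day', created_at) as date,
  SUM(amount) as revenue
FROM raw.orders
WHERE created_at >= '2024-01-01'
  AND status = 'completed'
GROUP BY 1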
Bruin supports Jinja templating with:

- Built-in run variables such as {{ start_date }} and {{ end_date }}
- Standard Jinja expressions, loops, and conditionals
You're not losing familiar patterns—you're gaining capabilities.
dbt has no orchestration capabilities. You must use an external tool:
- Apache Airflow
- Dagster
- Prefect
- dbt Cloud
Each option means more infrastructure, more authentication to manage, and more tools to integrate.
Bruin has built-in orchestration. No Airflow. No Kubernetes. No complexity.
# Run your entire pipeline locally
bruin run
# Backfill historical data
bruin run --start-date 2024-01-01 --end-date 2024-12-31
Deployment options:

- Run bruin run directly on your machine
- Schedule runs with cron or your CI/CD system
- Use Bruin Cloud for a fully managed deployment

The single binary approach means you can deploy anywhere—no Docker, no Kubernetes, no complex orchestration infrastructure.
In dbt, tests are defined separately from transformations:
# schema.yml
models:
  - name: users
    columns:
      - name: email
        tests:
          - not_null
          - unique
Limitations:

- Tests live in a separate YAML file, away from the model they validate
- Tests only run when you invoke dbt test as a separate step, so a model can build successfully and still hold bad data until the next test run
In Bruin, quality checks are embedded in your asset definitions and run automatically after each transformation:
# assets/users.sql
name: analytics.users
type: sql
materialization:
  type: table
columns:
  - name: email
    type: string
    checks:
      - name: not_null
      - name: unique
  - name: revenue
    type: float
    checks:
      - name: positive
  - name: country_code
    type: string
    checks:
      - name: accepted_values
        value: ['US', 'UK', 'CA', 'DE', 'FR']
Key advantages:

- Checks are defined right next to the asset and columns they validate
- They run automatically after each transformation, not as a separate step
- A failed check stops the pipeline before bad data flows downstream
Custom checks example:
custom_checks:
  - name: revenue_matches_sum
    query: |
      SELECT COUNT(*) FROM analytics.users
      WHERE total_revenue != (
        SELECT SUM(order_amount)
        FROM analytics.orders
        WHERE user_id = users.id
      )
    value: 0
If this check fails, the pipeline stops. No bad data flows downstream.
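Conceptually, the built-in checks reduce to the same pattern: a query whose result is compared against an expected value. A sketch of what the accepted_values check on country_code boils down to (illustrative, not Bruin's actual generated SQL):

-- Rows outside the allowed set are violations, so the expected count is 0.
SELECT COUNT(*)
FROM analytics.users
WHERE country_code NOT IN ('US', 'UK', 'CA', 'DE', 'FR')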
As dbt projects grow, teams encounter:

- Slow parse and compile times on large projects
- Long feedback loops between editing a model and seeing results
- A sprawling dependency graph that's hard to reason about
Bruin is written in Go, which provides:

- A single, self-contained binary with no runtime dependencies
- Fast startup and parsing, even on large pipelines
- Native concurrency for running independent assets in parallel
Weather intelligence company Buluttan migrated from dbt to Bruin and reported significant improvements across development, orchestration, and monitoring.
"Bruin's product has effectively addressed all the challenges my team faced in developing, orchestrating, and monitoring our pipelines." — Arsalan Noorafkan, Team Lead Data Engineering at Buluttan
These aren't theoretical improvements—they're production results from a real engineering team.
The VS Code extension is a game-changer. You can:

- Preview the rendered query for any asset
- Explore lineage visually across the pipeline
- Run assets and backfills without leaving the editor

Here's how the two architectures compare side by side. A typical dbt-based stack:
┌─────────────────┐
│ Ingestion Tool │ (Fivetran/Airbyte)
└────────┬────────┘
│
┌────────▼────────┐
│ dbt │ (Transformations only)
└────────┬────────┘
│
┌────────▼────────┐
│ Orchestrator │ (Airflow/Dagster)
└────────┬────────┘
│
┌────────▼────────┐
│ Observability │ (Monte Carlo/dbt Cloud)
└────────┬────────┘
│
┌────────▼────────┐
│ Lineage/Catalog │ (Separate tool or dbt Cloud)
└─────────────────┘
Result: 3-5 tools to manage, configure, integrate, and maintain. Different authentication systems. Multiple points of failure. Complex deployment pipelines.

The Bruin approach:
┌─────────────────────────────┐
│ Bruin CLI │
│ │
│ • Ingestion (100+ sources) │
│ • SQL Transformations │
│ • Python Execution │
│ • Quality Checks │
│ • Orchestration │
│ • Lineage │
│ • Observability │
└─────────────────────────────┘
Result: Single tool. One configuration format. One CLI. One deployment. Everything works together out of the box.
To be fair, there are scenarios where dbt makes sense:

- Your team is deeply invested in dbt, with a large codebase and established expertise
- You only need SQL transformations and are happy with your existing ingestion and orchestration tools
- You rely on specific packages or integrations from the dbt ecosystem
Choose Bruin if you want:

- Ingestion, transformation, quality checks, and orchestration in one tool
- First-class Python alongside SQL
- A single binary you can run anywhere, with no extra infrastructure to maintain
Worried about switching? The migration path is straightforward:

- Your SQL models carry over largely unchanged, since Bruin supports Jinja templating
- dbt tests map onto Bruin's column-level quality checks
- ref()-style dependencies become depends: entries in the asset definition
Plus, you gain capabilities:

- Built-in ingestion from 100+ sources
- Native Python assets with isolated environments
- Built-in orchestration and scheduling
- Quality checks that run automatically on every pipeline run
The modern data stack promised modularity—pick the best tool for each job. In practice, it delivered complexity.
dbt is excellent at what it does, but what it does is limited: SQL transformations. Everything else requires additional tools, infrastructure, and integration work.
Bruin delivers the full pipeline: ingestion, transformation (SQL and Python), quality checks, and orchestration—all in a single, fast, unified tool.
If you're starting a new data project, or if your dbt stack is getting unwieldy, give Bruin a try. It's:

- Open source
- A single binary with nothing else to stand up
- Quick to evaluate locally with bruin init and bruin run
The data pipelines of the future won't be held together by duct tape and five different tools. They'll be unified, fast, and simple.
Ready to try Bruin?
Install the CLI:
curl -LsSf https://getbruin.com/install/cli | sh
Try the quickstart:
bruin init my-pipeline
cd my-pipeline
bruin run
Resources:

- Bruin on GitHub: https://github.com/bruin-data/bruin
- ingestr on GitHub: https://github.com/bruin-data/ingestr
Did you migrate from dbt to Bruin? We'd love to hear your story. Reach out on Slack or GitHub.