The Bruin Blog
Insights, ideas, and stories from the Bruin team.

How to Build a Salesforce to Snowflake Pipeline with Bruin
A simple guide to ingesting Salesforce data into Snowflake, modelling bronze, silver, and gold layers, and using Bruin agents for dashboards, Slack reports, activation, and self-healing pipelines.
Arsalan Noorafkan
8 min read

Learning AI Programming, Agentic Data Engineering, and AI Data Analysis
A practical guide to the best open and official courses for AI programming, agentic data engineering, and AI data analysis - organized by career path, experience level, and project goal.
Arsalan Noorafkan
18 min read

dlt Alternatives: Bruin, Airbyte, Sling, Meltano, and Fivetran Compared
A practical comparison of dlt alternatives for data ingestion: Bruin CLI, ingestr, Airbyte, Sling, Meltano, and Fivetran, including a MongoDB to Postgres benchmark.
Arsalan Noorafkan
10 min read

BigQuery TVFs for AI Agents: A Lightweight Semantic Layer Pattern
BigQuery table-valued functions can give AI data agents a small, governed query interface without asking them to write full SQL against raw warehouse tables.
Arsalan Noorafkan
4 min read

Best Semantic Layer Tools in 2026: dbt, Cube, Looker, Power BI, Lightdash, Bruin
Compare the best semantic layer tools in 2026: dbt Semantic Layer, Cube, Looker, Power BI, Tableau, AtScale, Lightdash, and Bruin CLI / DAC. See use cases, tradeoffs, governance models, and AI-agent fit.
Arsalan Noorafkan
10 min read

What Is a Semantic Layer?
A practical explanation of semantic layers, why metric definitions drift across dashboards, and how tools like Bruin CLI and DAC turn reusable metrics, dimensions, filters, and segments into SQL.
Arsalan Noorafkan
8 min read

Fable 5 vs Bruin for Data Analysis: Can a Frontier Model Be Your Data Analyst?
Fable 5 is one of the most capable AI models yet, and people are asking whether it can replace a data analyst. Here is an honest look at what Fable 5 does well for data analysis, where it stops, and how it compares to a purpose-built AI data analyst like Bruin.
Kateryna Kozachenko
7 min read

The 9 Best AI Dashboard Builders in 2026
An honest 2026 guide to AI dashboard tools that turn plain-English questions into live dashboards. Bruin, ThoughtSpot, Hex, Power BI Copilot, Looker, Basedash, Omni, Tableau Pulse, and Dot, with what each is good at, which connect to live company data, which have an API, and which can actually replace traditional BI.
Kateryna Kozachenko
15 min read

The Best Data Ingestion Tools in 2026
An honest 2026 guide to data ingestion tools, from Fivetran and Airbyte to dlt, Sling, Meltano, and ingestr. Which are open source, which connect to live company data, which support CDC and incremental loads, and which fit a modern AI data stack.
Kateryna Kozachenko
12 min read

Migrating from Python to Go: ingestr v1
We have rebuilt ingestr from the ground up to be faster, more reliable, and easier to use. Here is how we did it and what it means for your data pipelines.
Burak Karakan
6 min read

The Best AI BI Tools in 2026
An honest 2026 guide to AI business intelligence tools, from ThoughtSpot, Power BI Copilot, Tableau Pulse, and Looker to Snowflake Cortex, Databricks Genie, and Bruin. Which let business teams ask questions in plain English, which connect to live company data, and which can actually replace a BI stack.
Kateryna Kozachenko
12 min read

Best Reverse ETL Tools 2026: Segment vs Hightouch vs Census
Which reverse ETL tool works best with Twilio Segment, Hightouch, or Fivetran Census? Compare the top 2026 data activation tools by use case, governance, AI activation, open-source options, and destination fit.
Arsalan Noorafkan
18 min read

AI Data Analyst vs ChatGPT, Claude, and Coding Agents: What's the Difference?
Can you just use ChatGPT, Claude, or a coding agent like Codex to analyze your company data? Here is the honest difference between a general AI model and a purpose-built AI data analyst, why a model alone is not enough, and what it takes to get trustworthy answers from live company data.
Kateryna Kozachenko
10 min read

The Best Power BI Copilot Alternatives in 2026
Looking for a Power BI Copilot alternative? An honest 2026 comparison of ThoughtSpot, Tableau Pulse, Looker, Snowflake Cortex, Sigma, and Bruin, for teams that want plain-English answers on live data without per-seat licensing or DAX.
Kateryna Kozachenko
10 min read

Best Data Pipeline Tools 2026: Airflow, Dagster, Prefect, Mage, and Bruin
A practical 2026 shortlist of the best data pipeline tools. Compare Airflow, Dagster, Prefect, Mage, and Bruin by scope, ops weight, lineage, quality checks, AI-agent fit, and migration path.
Arsalan Noorafkan
9 min read

Score XGBoost models in BigQuery, no Python required
Most batch ML scoring pipelines waste DS time pulling data into Python containers to do arithmetic the warehouse already does. Here is the trick, the receipt, and a copy-paste Jinja macro that translates an XGBoost model to SQL, validated on DuckDB.
Sabri Karagonen
8 min read

How to Get Full Attribution Coverage Between Adjust and Firebase
A one-time SDK setup for game studios: cross-write Adjust ADID, Firebase user_pseudo_id, and your own user_id so installs join cleanly across both systems. AppsFlyer, Singular, and Branch follow the same pattern.
Sabri Karagonen
6 min read

Answer, Build, Act: What the Next AI Data Analyst Actually Does
AI data analysts started by answering questions. The useful ones now also build (dashboards, reports, pipelines) and act (pause bad ad spend, fix broken reports, alert the right owner). Here is why answering alone is not enough, and what it takes to act safely on live company data.
Kateryna Kozachenko
8 min read

Deterministic A/B Test Bucketing
Make A/B test bucketing a pure function of (salt, user_id) so iOS, Android, web, and BigQuery all derive the same variant for the same user, and so you can preview cohort balance in the warehouse before launch.
Sabri Karagonen
7 min read

Exporting Adjust Raw Data to Google Cloud Storage
A short setup guide for getting Adjust raw exports landing in a GCS bucket, plus the parameters you should be sending to Adjust to get attribution right.
Sabri Karagonen
5 min read

How to Run Reliable Firebase A/B Tests
Firebase counts users as 'in variant B' when the variant never actually reached their device. Here's the proxy-parameter setup that gives you a cohort you can defend.
Sabri Karagonen
7 min read

The Best Text-to-SQL Tools in 2026
An honest 2026 guide to text-to-SQL tools that turn natural language into queries, from Vanna AI, WrenAI, and Defog to Snowflake Cortex, Databricks Genie, and Bruin. Which are open source, which connect to live company data, and which go beyond generating SQL to deliver trustworthy answers.
Kateryna Kozachenko
11 min read

Why It's Reasonable to Be Skeptical About AI in Data - and Why It's Fixable
A practical framework for building an AI context layer using open-source tools, turning skepticism about AI in data engineering into a working solution with self-healing pipelines and iterative team adoption.
Arsalan Noorafkan
25 min read

The Best Analytics Tools for Mobile Gaming Studios in 2026
What tools do mobile gaming companies actually use for business analytics? An honest 2026 guide to the gaming analytics stack, from Firebase, GameAnalytics, and Amplitude to Adjust, AppsFlyer, BigQuery, Snowflake, and AI data analysts like Bruin, grouped by the job each one does.
Kateryna Kozachenko
12 min read

AI Data Analyst on WhatsApp
Most AI data analysts live in Slack or a browser. Bruin runs in WhatsApp too. Here is why field, sales, and ops teams prefer asking their data questions there, what it takes to make it actually work, and how to roll it out safely.
Kateryna Kozachenko
14 min read

The Best AI Data Analyst Tools for Slack in 2026
An honest 2026 guide to AI data analyst tools that live natively in Slack. Bruin, Dot, Querio, ThoughtSpot, Question Base, Clearfeed, and eesel AI compared, with pros, cons, and which one fits which kind of team.
Kateryna Kozachenko
14 min read

From Prompt to Dashboard: How Conversational AI Is Replacing the BI Request Queue
For 20 years, self-serve BI has meant 'learn to build your own dashboard.' In 2026, prompting replaces point-and-click, and the BI request queue dies with it. A practical look at where conversational BI works, where it does not, and how to run a data team around it.
Kateryna Kozachenko
18 min read

AI Data Analyst vs Traditional BI: How to Choose in 2026
Honest 2026 framework for picking between an AI data analyst and traditional BI tools. When each one wins, the hybrid pattern most teams land on, and how to migrate without breaking trust in your data.
Kateryna Kozachenko
12 min read

Building an AI Data Analyst Sucks
I'll teach you how to do this, and you'll get mad at me for it.
Burak Karakan
6 min read

Meet Bruin’s AI data analyst in Slack, Teams, and browser
Bruin’s AI data analyst is an AI-native BI interface for asking questions about company data and getting back answers that are fast, relevant, and usable in context.
Kateryna Kozachenko
11 min read

Go is the Best Language for AI Agents
Pull up your agents, folks, I'll convince you why Go is the best language for them.
Burak Karakan
8 min read

The 8 Best AI Data Analyst Tools in 2026
An honest 2026 guide to the AI data analyst tools worth shortlisting. Bruin, ThoughtSpot, Hex, Dot, Seek AI, Defog, Power BI Copilot, and ChatGPT with MCP - with pros, cons, pricing, and when each one actually fits across SaaS, ecommerce, gaming, and agencies.
Kateryna Kozachenko
18 min read

Bruin VS Code Extension: The Architectural Challenge of Integrating Vue.js Webviews
How we built a rich, interactive VS Code extension using Vue.js webviews, bridging Node.js extension code with a modern frontend through message passing.
Djamila Baroudi
6 min read

Introducing Bruin MCP: Your AI Agent's Data Toolkit
Bruin now supports the Model Context Protocol, letting AI agents in Cursor, Claude Code, and other editors query databases, ingest data, compare tables, and build pipelines, all through natural language.
Burak Karakan
6 min read

My 3 Month Internship Journey
My first internship experience at Bruin, where I shipped real features and learned a lot.
Mustafa Ersan
5 min read

Python vs SQL: Choosing the Right Tool
A practical guide to choosing between Python and SQL for data transformations. Learn when to use each tool, common antipatterns to avoid, and decision frameworks that work.
Burak Karakan
12 min read

dbt vs Bruin: Why End-to-End Wins Over Transformation-Only
dbt only handles transformations, leaving you with a complex stack. Bruin provides end-to-end pipelines with data ingestion, SQL & Python transformations, quality checks, and built-in orchestration, all in one open-source tool.
Burak Karakan
15 min read

The Effective LLM Multi-Tenant Security Solution
A practical pattern to secure LLM-generated SQL in multi-tenant systems by pre-filtering data with CTEs so the model never sees cross-tenant rows.
Sabri Karagonen
12 min read

Fivetran vs Bruin: Beyond Data Ingestion
Fivetran only handles data ingestion, leaving you with a complex stack. Bruin provides end-to-end pipelines with ingestion, transformations, quality checks, and Python custom connectors, all in one open-source tool.
Burak Karakan
12 min read

How I Survived (and Thrived) in the Zombie Apocalypse
A story about hunting zombie tasks in a distributed environment
Alberto Gomez
10 min read

The Hidden Costs of DIY Data Pipelines
Building your own data pipelines seems cost-effective until you do the math. Here's a detailed breakdown of what companies actually spend on homegrown solutions.
Burak Karakan
10 min read

Launch: Bruin CLI
Bruin CLI is an open-source data pipeline tool built with Go, with built-in data ingestion, transformation, and data quality checks.
Burak Karakan
8 min read

No-code data platform is a lie
A critical look at the limitations of no-code data platforms and why code-first approaches provide more flexibility and long-term value for growing data teams.
Burak Karakan
9 min read

Summarising User Behaviour: The Users Daily Table
Creating a comprehensive daily user behavior table in BigQuery using Firebase analytics data to track user engagement metrics and analyze patterns over time.
Sabri Karagonen
14 min read

Unnesting Firebase Events Table
A step-by-step guide to unnesting and transforming Firebase events data in BigQuery for easier analysis and more efficient queries.
Sabri Karagonen
12 min read

The Pains of Data Ingestion
Why is data ingestion so hard? This post explores the challenges of data ingestion and introduces ingestr, an open-source solution to simplify the process.
Burak Karakan
8 min read

Firebase Events Table
A comprehensive guide to querying and working with the Firebase events table in BigQuery, including useful functions and techniques for easier data analysis.
Sabri Karagonen
15 min read

The Mythical Data Team
How companies are approaching data teams wrong, and why a cultural shift towards treating data as a core value is needed for organizations to become truly data-driven.
Burak Karakan
6 min read

Firebase Analytics BigQuery Export: Official Docs and Settings
Use the official Firebase BigQuery export flow, then fix the settings most teams miss: region, streaming export, advertising identifiers, and the 60-day table expiry.
Sabri Karagonen
5 min read