5-minute tutorial
Migrate Databricks to Microsoft SQL Server in 60 Seconds
Learn how to copy your Databricks data to Microsoft SQL Server with a single command using ingestr - no code required.
Prerequisites
- Python 3.8 or higher installed
- Databricks workspace with SQL endpoint
- Personal access token generated
- SQL endpoint running (not terminated)
- Appropriate permissions on catalog/schema
- SQL Server instance running
- SQL Server Authentication or Windows Authentication configured
- TCP/IP protocol enabled in SQL Server Configuration Manager
- Firewall rules for port 1433
Step 1: Install ingestr
Install ingestr with uv or pip. Choose the method that works best for you:
Recommended: Using uv (fastest)
# Install uv first if you haven't already
pip install uv
# Run ingestr using uvx
uvx ingestr
Alternative: Global installation
# Install globally using uv
uv pip install --system ingestr
# Or using standard pip
pip install ingestr
Verify installation: Run ingestr --version to confirm it's installed correctly.
Step 2: Your First Migration
Let's copy a table from Databricks to Microsoft SQL Server. This example shows a complete, working command you can adapt to your needs.
Set up your connections
Databricks connection format:
databricks://token@host:port/http_path
Parameters:
- token: Personal access token (use as username)
- host: Workspace URL
- port: Port number (usually 443)
- http_path: SQL endpoint HTTP path
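As a minimal sketch, the URI can be assembled from its parts rather than typed by hand. The function name and the DATABRICKS_* variables in the usage note are illustrative assumptions, not part of ingestr; the output follows the connection format shown above.

```shell
# Assemble a Databricks source URI in the format shown above:
#   databricks://token@host:port/http_path
# build_databricks_uri is a hypothetical helper, not an ingestr command.
build_databricks_uri() {
  local token="$1" host="$2" port="${3:-443}" http_path="$4"
  # Accept a pasted workspace URL: drop the scheme and any trailing slash.
  host="${host#https://}"
  host="${host%/}"
  # Databricks displays HTTP paths with a leading slash; strip it to avoid "//".
  http_path="${http_path#/}"
  printf 'databricks://%s@%s:%s/%s\n' "$token" "$host" "$port" "$http_path"
}
```

For example, SOURCE_URI="$(build_databricks_uri "$DATABRICKS_TOKEN" "$DATABRICKS_HOST" 443 "$DATABRICKS_HTTP_PATH")" keeps the token out of your shell history.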
Microsoft SQL Server connection format:
mssql://username:password@host:port/database
Parameters:
- username: SQL Server login
- password: Login password
- host: Server name or IP
- port: Port number (default 1433)
- database: Database name
- encrypt: Use encryption (true/false)
- trustServerCertificate: Trust certificate
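Passwords that contain characters such as @, : or / will usually break a URI unless they are percent-encoded. The sketch below assumes ingestr parses the connection string like a standard URL; the helper names are hypothetical, and the encoder handles ASCII input only.

```shell
# Percent-encode a string for use in the username/password part of a URI.
# ASCII input only. urlencode and build_mssql_uri are illustrative helpers.
urlencode() {
  local s="$1" out="" c rest
  while [ -n "$s" ]; do
    rest="${s#?}"        # everything after the first character
    c="${s%"$rest"}"     # the first character itself
    case "$c" in
      [a-zA-Z0-9.~_-]) out="$out$c" ;;
      *) out="$out$(printf '%%%02X' "'$c")" ;;
    esac
    s="$rest"
  done
  printf '%s\n' "$out"
}

# Build mssql://username:password@host:port/database (port defaults to 1433).
build_mssql_uri() {
  local user="$1" password="$2" host="$3" port="${4:-1433}" database="$5"
  printf 'mssql://%s:%s@%s:%s/%s\n' \
    "$(urlencode "$user")" "$(urlencode "$password")" "$host" "$port" "$database"
}
```

The encrypt and trustServerCertificate options are left out here because the tutorial does not show how they are passed.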
Run your first copy
Copy the entire bronze.raw_data table from Databricks to Microsoft SQL Server:
ingestr ingest \
--source-uri 'databricks://<token>@<workspace-host>:443/sql/1.0/endpoints/abc123' \
--source-table 'bronze.raw_data' \
--dest-uri 'mssql://sa:MyPass123@localhost:1433/AdventureWorks' \
--dest-table 'raw.raw_data'
What this does:
- Connects to your Databricks SQL endpoint
- Reads all data from the specified table
- Creates the table in Microsoft SQL Server if needed
- Copies all rows to the destination
Command breakdown:
- --source-uri: Your source database
- --source-table: Table to copy from
- --dest-uri: Your destination
- --dest-table: Where to write data
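The same four flags can be wrapped so the connection URIs come from the environment instead of being typed inline. This is a sketch under stated assumptions: SOURCE_URI, DEST_URI, DRY_RUN, and run_copy are illustrative names, not settings or commands that ingestr defines.

```shell
# Run the copy with URIs read from the environment.
# SOURCE_URI, DEST_URI, and DRY_RUN are hypothetical variable names.
run_copy() {
  local source_table="$1" dest_table="$2"
  # Abort with a message if either URI is missing.
  : "${SOURCE_URI:?set SOURCE_URI to your Databricks URI}"
  : "${DEST_URI:?set DEST_URI to your SQL Server URI}"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    # Print the table mapping only; the URIs contain credentials.
    printf 'ingestr ingest --source-table %s --dest-table %s\n' \
      "$source_table" "$dest_table"
  else
    ingestr ingest \
      --source-uri "$SOURCE_URI" \
      --source-table "$source_table" \
      --dest-uri "$DEST_URI" \
      --dest-table "$dest_table"
  fi
}
```

With both variables exported, run_copy bronze.raw_data raw.raw_data performs the copy shown above, and DRY_RUN=1 previews it without connecting.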
Step 3: Verify your data
After the migration completes, verify your data was copied correctly:
Check row count in Microsoft SQL Server:
-- Run this in Microsoft SQL Server
SELECT COUNT(*) AS row_count
FROM raw.raw_data;
-- Check a sample of the data (SQL Server uses TOP, not LIMIT)
SELECT TOP 10 *
FROM raw.raw_data;
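The row-count check can also be scripted. The sketch below assumes the sqlcmd client is installed and that you have already obtained the expected count from Databricks (for example by running SELECT COUNT(*) on the source table in the SQL editor). The MSSQL_* variables and both function names are illustrative.

```shell
# Fetch a row count from SQL Server using sqlcmd (assumed to be installed).
# MSSQL_HOST, MSSQL_PORT, MSSQL_USER, MSSQL_PASSWORD, and MSSQL_DATABASE
# are hypothetical environment variables.
mssql_row_count() {
  local table="$1"
  sqlcmd -S "${MSSQL_HOST},${MSSQL_PORT:-1433}" \
    -U "$MSSQL_USER" -P "$MSSQL_PASSWORD" -d "$MSSQL_DATABASE" \
    -h -1 -W \
    -Q "SET NOCOUNT ON; SELECT COUNT(*) FROM ${table};" | tr -d '[:space:]'
}

# Compare the expected (source) count with the actual (destination) count.
counts_match() {
  local expected="$1" actual="$2"
  if [ "$expected" -eq "$actual" ]; then
    echo "OK: $actual rows"
  else
    echo "MISMATCH: expected $expected rows, found $actual" >&2
    return 1
  fi
}
```

A typical call would be counts_match "$SOURCE_COUNT" "$(mssql_row_count raw.raw_data)", where SOURCE_COUNT holds the number you read from Databricks.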
Advanced Patterns
Once you've mastered the basics, use these patterns for production workloads.
Only copy new or updated records since the last sync. Perfect for daily updates.
ingestr ingest \
--source-uri 'databricks://<token>@<workspace-host>:443/sql/1.0/endpoints/abc123' \
--source-table 'public.orders' \
--dest-uri 'mssql://sa:MyPass123@localhost:1433/AdventureWorks' \
--dest-table 'raw.orders' \
--incremental-strategy merge \
--incremental-key updated_at \
--primary-key order_id
How it works: The merge strategy updates existing rows and inserts new ones, matching on the primary key. Only rows whose updated_at value has changed since the last sync are processed.
Common Use Cases
Ready-to-use commands for typical Databricks to Microsoft SQL Server scenarios.
Daily Customer Data Sync
Keep your analytics warehouse updated with the latest customer information every night.
# Add this to your cron job or scheduler
ingestr ingest \
--source-uri 'databricks://<token>@<workspace-host>:443/sql/1.0/endpoints/abc123' \
--source-table 'public.customers' \
--dest-uri 'mssql://sa:MyPass123@localhost:1433/AdventureWorks' \
--dest-table 'analytics.customers' \
--incremental-strategy merge \
--incremental-key updated_at \
--primary-key customer_id
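When a nightly job is scheduled, it helps to make sure two runs never overlap. One way to sketch this is with the flock utility from util-linux; the helper name, script path, and lock file below are hypothetical.

```shell
# Run a command only if no other run holds the lock file.
# run_exclusive is an illustrative helper; flock comes from util-linux.
#
# Example crontab entry (paths are placeholders), running at 02:00 daily:
#   0 2 * * * /opt/scripts/sync_customers.sh >> /var/log/sync_customers.log 2>&1
run_exclusive() {
  local lock_file="$1"
  shift
  (
    # Hold the lock on file descriptor 9 for the lifetime of this subshell.
    flock -n 9 || { echo "another run holds $lock_file, skipping" >&2; exit 0; }
    "$@"
  ) 9>"$lock_file"
}
```

Inside the scheduled script you would call run_exclusive /tmp/sync_customers.lock followed by the full ingestr ingest command shown above.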
Historical Data Migration
One-time migration of all historical records to your data warehouse.
# One-time full table copy
ingestr ingest \
--source-uri 'databricks://<token>@<workspace-host>:443/sql/1.0/endpoints/abc123' \
--source-table 'public.transactions' \
--dest-uri 'mssql://sa:MyPass123@localhost:1433/AdventureWorks' \
--dest-table 'warehouse.transactions_historical'
Development Environment Sync
Copy a limited sample of production data to your development Microsoft SQL Server instance. The command below only caps the row count; it does not by itself exclude sensitive data.
# Copy sample data to development
ingestr ingest \
--source-uri 'databricks://<token>@<workspace-host>:443/sql/1.0/endpoints/abc123' \
--source-table 'public.products' \
--dest-uri 'mssql://sa:MyPass123@localhost:1433/AdventureWorks' \
--dest-table 'dev.products' \
--limit 1000 # Only copy 1000 rows for testing
Troubleshooting Guide
Solutions to common issues when migrating from Databricks to Microsoft SQL Server.
Connection refused or timeout errors
Check your connection details:
- Ensure SQL endpoint is running
- Verify personal access token is valid
- Check workspace URL is correct
- Confirm HTTP path matches your endpoint
- Enable TCP/IP in SQL Server Configuration Manager
- Check SQL Server Browser service is running
- Verify Windows Firewall allows SQL Server
- Ensure Mixed Mode authentication if using SQL login
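Before digging into configuration, a quick TCP reachability test can tell you whether the problem is the network or the credentials. This sketch relies on bash's /dev/tcp pseudo-device and the coreutils timeout command; port_open is an illustrative name.

```shell
# Return success if a TCP connection to host:port opens within 3 seconds.
# Requires bash (for /dev/tcp) and coreutils timeout.
port_open() {
  local host="$1" port="$2"
  timeout 3 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
}
```

For example, port_open localhost 1433 && echo "SQL Server reachable" checks the destination, and the same call with your workspace host and port 443 checks the Databricks side.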
Authentication failures
Common authentication issues:
- Verify personal access token is valid and has not expired
- Confirm the token's user has appropriate permissions on the catalog/schema
- Ensure Mixed Mode authentication if using SQL login
- Check the SQL Server username and password in the destination URI
Schema or data type mismatches
Handling data type differences:
- ingestr automatically handles most type conversions
- Databricks: Delta tables support schema evolution
- Databricks: Complex types (arrays, maps, structs) supported
- Databricks: Photon acceleration for certain operations
- Databricks: Partitioning affects query performance
- Microsoft SQL Server: NVARCHAR for Unicode support
- Microsoft SQL Server: Hierarchyid for tree structures
- Microsoft SQL Server: Spatial data types for geographic data
- Microsoft SQL Server: XML and JSON support
Performance issues with large tables
Optimize large data transfers:
- Use incremental loading to process data in chunks
- Run migrations during off-peak hours
- Split very large tables by date ranges using interval parameters
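Splitting by date range can be sketched as a loop over monthly windows. The example assumes GNU date, a start date on the first of a month, and ingestr's --interval-start and --interval-end options combined with the merge settings used earlier in this tutorial. The function names and the SOURCE_URI and DEST_URI variables are hypothetical.

```shell
# Convert YYYY-MM-DD to YYYYMMDD so dates can be compared as integers.
date_num() { printf '%s' "$1" | tr -d '-'; }

# Print one "start end" pair per calendar month between two ISO dates
# (end exclusive). Requires GNU date; pass a first-of-month start date.
month_ranges() {
  local cur="$1" end="$2" next
  while [ "$(date_num "$cur")" -lt "$(date_num "$end")" ]; do
    next="$(date -u -d "$cur +1 month" +%F)"
    # Clamp the final window so it never runs past the requested end.
    if [ "$(date_num "$next")" -gt "$(date_num "$end")" ]; then
      next="$end"
    fi
    printf '%s %s\n' "$cur" "$next"
    cur="$next"
  done
}

# Copy one table window by window (defined here, not invoked).
copy_in_chunks() {
  local start end
  month_ranges "$1" "$2" | while read -r start end; do
    ingestr ingest \
      --source-uri "$SOURCE_URI" --source-table 'public.orders' \
      --dest-uri "$DEST_URI" --dest-table 'raw.orders' \
      --incremental-strategy merge --incremental-key updated_at \
      --primary-key order_id \
      --interval-start "$start" --interval-end "$end"
  done
}
```

Calling copy_in_chunks 2024-01-01 2024-04-01 would issue three monthly runs; merge is used so that each window adds to the destination rather than replacing it.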
Ready to scale your data pipeline?
You've learned how to migrate data from Databricks to Microsoft SQL Server with ingestr. For production workloads with monitoring, scheduling, and data quality checks, explore Bruin Cloud.