Deploying Bruin with AWS ECS
Managed Option Available
Looking for a fully managed solution? Bruin Cloud provides managed orchestration, monitoring, and scheduling without the operational overhead. Try it free!
This guide shows you how to run Bruin pipelines using AWS ECS (Elastic Container Service). ECS provides container orchestration for longer-running tasks and complex workflows.
Prerequisites
Before you begin, ensure you have:
- An AWS account with appropriate permissions
- AWS CLI installed and configured
- Docker installed
- A Bruin project ready to deploy
- Credentials for your data platforms
Overview
AWS ECS allows you to run Docker containers at scale. For Bruin, we'll use:
- ECS Fargate for serverless container execution
- ECR for storing Docker images
- EventBridge for scheduling
- Secrets Manager for storing credentials
- CloudWatch for monitoring
Step 1: Create ECR Repository
# Set your AWS account ID and region
AWS_ACCOUNT_ID=123456789012
AWS_REGION=us-east-1
# Create ECR repository
aws ecr create-repository \
--repository-name bruin-pipeline \
--region ${AWS_REGION}
# Authenticate Docker to ECR
aws ecr get-login-password --region ${AWS_REGION} | \
docker login --username AWS --password-stdin \
${AWS_ACCOUNT_ID}.dkr.ecr.${AWS_REGION}.amazonaws.com
Step 2: Create Dockerfile
Create a Dockerfile in your project root:
FROM ghcr.io/bruin-data/bruin:latest
# Copy your Bruin project
COPY . /workspace
WORKDIR /workspace
# Create entrypoint script
COPY entrypoint.sh /entrypoint.sh
RUN chmod +x /entrypoint.sh
ENTRYPOINT ["/entrypoint.sh"]
Create entrypoint.sh. The script calls the AWS CLI, so the image must include it:
#!/bin/sh
set -e
# Fetch .bruin.yml from AWS Secrets Manager when a secret name is provided
if [ -n "$AWS_SECRET_NAME" ]; then
  echo "Fetching configuration from AWS Secrets Manager..."
  BRUIN_CONFIG=$(aws secretsmanager get-secret-value \
    --secret-id "$AWS_SECRET_NAME" \
    --query SecretString \
    --output text \
    --region "${AWS_REGION:-us-east-1}")
  # printf writes the secret verbatim; echo may interpret backslashes in some shells
  printf '%s\n' "$BRUIN_CONFIG" > .bruin.yml
fi
# Default values
PIPELINE=${PIPELINE:-.}
ENVIRONMENT=${ENVIRONMENT:-production}
COMMAND=${COMMAND:-run}
# Execute Bruin command; flags go before the path argument
echo "Executing: bruin $COMMAND --environment $ENVIRONMENT $PIPELINE"
exec bruin "$COMMAND" --environment "$ENVIRONMENT" "$PIPELINE"
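The defaults in the entrypoint rely on the shell's `${VAR:-default}` expansion, which substitutes the default when the variable is unset or empty. A minimal sketch of that behavior follows; the function name `resolve_defaults` is hypothetical and only echoes the values the entrypoint would end up using.

```shell
#!/bin/sh
# Hypothetical helper (not part of Bruin): prints the values the
# entrypoint would use after applying its defaults.
resolve_defaults() {
  pipeline=${PIPELINE:-.}
  environment=${ENVIRONMENT:-production}
  command=${COMMAND:-run}
  echo "command=$command pipeline=$pipeline environment=$environment"
}
```

Because the defaults also apply to empty values, setting `PIPELINE=""` in a container override behaves the same as leaving it out.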
Step 3: Build and Push Docker Image
# Build image
docker build -t bruin-pipeline:latest .
# Tag image for ECR
docker tag bruin-pipeline:latest \
${AWS_ACCOUNT_ID}.dkr.ecr.${AWS_REGION}.amazonaws.com/bruin-pipeline:latest
# Push to ECR
docker push ${AWS_ACCOUNT_ID}.dkr.ecr.${AWS_REGION}.amazonaws.com/bruin-pipeline:latest
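Pushing only `latest` makes it hard to tell which build a given task ran. One option is to push a second, unique tag alongside it. The sketch below assumes the project is a git checkout; `ecr_image_uri` is a hypothetical helper, not an AWS or Bruin command.

```shell
#!/bin/sh
# Hypothetical helper: builds the full ECR image URI for a given tag.
ecr_image_uri() {
  # $1 = account ID, $2 = region, $3 = repository, $4 = tag
  echo "$1.dkr.ecr.$2.amazonaws.com/$3:$4"
}

# Example usage (assumes git is available and the directory is a repository):
# TAG=$(git rev-parse --short HEAD)
# IMAGE=$(ecr_image_uri "$AWS_ACCOUNT_ID" "$AWS_REGION" bruin-pipeline "$TAG")
# docker build -t "$IMAGE" . && docker push "$IMAGE"
```

A task definition that references the unique tag then pins each revision to one specific image.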
Step 4: Store Credentials in AWS Secrets Manager
# Create secret with your .bruin.yml content
aws secretsmanager create-secret \
--name bruin-config \
--description "Bruin pipeline configuration" \
--secret-string file://.bruin.yml \
--region ${AWS_REGION}
Step 5: Create ECS Task Definition
Create task-definition.json:
{
"family": "bruin-pipeline",
"networkMode": "awsvpc",
"requiresCompatibilities": ["FARGATE"],
"cpu": "1024",
"memory": "2048",
"executionRoleArn": "arn:aws:iam::123456789012:role/ecsTaskExecutionRole",
"taskRoleArn": "arn:aws:iam::123456789012:role/bruinTaskRole",
"containerDefinitions": [
{
"name": "bruin",
"image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/bruin-pipeline:latest",
"essential": true,
"environment": [
{
"name": "AWS_SECRET_NAME",
"value": "bruin-config"
},
{
"name": "AWS_REGION",
"value": "us-east-1"
},
{
"name": "PIPELINE",
"value": "."
},
{
"name": "ENVIRONMENT",
"value": "production"
}
],
"logConfiguration": {
"logDriver": "awslogs",
"options": {
"awslogs-group": "/ecs/bruin-pipeline",
"awslogs-region": "us-east-1",
"awslogs-stream-prefix": "bruin"
}
}
}
]
}
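The task definition above hard-codes the sample account ID (123456789012) and region (us-east-1). A minimal sketch for swapping in your own values is shown below; `render_task_definition` is a hypothetical helper, and the plain text substitution assumes those two sample strings appear nowhere else in the file with a different meaning.

```shell
#!/bin/sh
# Hypothetical helper: reads a template on stdin and replaces the sample
# account ID and region used in this guide with the given values.
render_task_definition() {
  # $1 = account ID, $2 = region
  sed -e "s/123456789012/$1/g" -e "s/us-east-1/$2/g"
}

# Example usage:
# render_task_definition "$AWS_ACCOUNT_ID" "$AWS_REGION" \
#   < task-definition.json > task-definition.rendered.json
```

The rendered file can then be passed to the register step in place of the original.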
Step 6: Create IAM Roles
Task Execution Role
This role is used by ECS to pull images and write logs:
# Create trust policy
cat > ecs-trust-policy.json <<EOF
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"Service": "ecs-tasks.amazonaws.com"
},
"Action": "sts:AssumeRole"
}
]
}
EOF
# Create execution role
aws iam create-role \
--role-name ecsTaskExecutionRole \
--assume-role-policy-document file://ecs-trust-policy.json
# Attach AWS managed policy
aws iam attach-role-policy \
--role-name ecsTaskExecutionRole \
--policy-arn arn:aws:iam::aws:policy/service-role/AmazonECSTaskExecutionRolePolicy
Task Role
This role is used by your container to access AWS services:
# Create task role
aws iam create-role \
--role-name bruinTaskRole \
--assume-role-policy-document file://ecs-trust-policy.json
# Create policy for Secrets Manager access
cat > bruin-task-policy.json <<EOF
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"secretsmanager:GetSecretValue"
],
"Resource": "arn:aws:secretsmanager:${AWS_REGION}:${AWS_ACCOUNT_ID}:secret:bruin-config-*"
},
{
"Effect": "Allow",
"Action": [
"logs:CreateLogGroup",
"logs:CreateLogStream",
"logs:PutLogEvents"
],
"Resource": "*"
}
]
}
EOF
aws iam put-role-policy \
--role-name bruinTaskRole \
--policy-name BruinTaskPolicy \
--policy-document file://bruin-task-policy.json
Step 7: Create CloudWatch Log Group
aws logs create-log-group \
--log-group-name /ecs/bruin-pipeline \
--region ${AWS_REGION}
Step 8: Register Task Definition
aws ecs register-task-definition \
--cli-input-json file://task-definition.json \
--region ${AWS_REGION}
Step 9: Create ECS Cluster
aws ecs create-cluster \
--cluster-name bruin-cluster \
--region ${AWS_REGION}
Running Tasks
Run Task Manually
aws ecs run-task \
--cluster bruin-cluster \
--task-definition bruin-pipeline \
--launch-type FARGATE \
--network-configuration "awsvpcConfiguration={subnets=[subnet-12345678],securityGroups=[sg-12345678],assignPublicIp=ENABLED}" \
--region ${AWS_REGION}
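`run-task` returns as soon as the task is scheduled, not when the pipeline finishes. The sketch below shows one way to block until the task stops and read the container's exit code. The function names are hypothetical, the subnet and security group IDs are the placeholders used in this guide, and the waiter stops polling after a limited number of attempts, so very long pipelines may need their own polling loop.

```shell
#!/bin/sh
# Extracts the task ID (the last path segment) from a task ARN.
task_id_from_arn() {
  echo "${1##*/}"
}

# Sketch: start the task, wait for it to stop, print the exit code.
# Requires AWS credentials and the resources created in the steps above.
run_and_wait() {
  task_arn=$(aws ecs run-task \
    --cluster bruin-cluster \
    --task-definition bruin-pipeline \
    --launch-type FARGATE \
    --network-configuration "awsvpcConfiguration={subnets=[subnet-12345678],securityGroups=[sg-12345678],assignPublicIp=ENABLED}" \
    --query 'tasks[0].taskArn' --output text \
    --region "${AWS_REGION}") || return 1
  echo "Started task $(task_id_from_arn "$task_arn")"
  aws ecs wait tasks-stopped \
    --cluster bruin-cluster --tasks "$task_arn" --region "${AWS_REGION}"
  aws ecs describe-tasks \
    --cluster bruin-cluster --tasks "$task_arn" \
    --query 'tasks[0].containers[0].exitCode' --output text \
    --region "${AWS_REGION}"
}
```

An exit code of 0 indicates the Bruin command succeeded; any other value means the logs are worth checking.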
Run Specific Pipeline
Override environment variables:
aws ecs run-task \
--cluster bruin-cluster \
--task-definition bruin-pipeline \
--launch-type FARGATE \
--network-configuration "awsvpcConfiguration={subnets=[subnet-12345678],securityGroups=[sg-12345678],assignPublicIp=ENABLED}" \
--overrides '{
"containerOverrides": [{
"name": "bruin",
"environment": [
{"name": "PIPELINE", "value": "pipelines/analytics"},
{"name": "ENVIRONMENT", "value": "production"}
]
}]
}' \
--region ${AWS_REGION}
Scheduling with EventBridge
Create EventBridge Rule
# Create rule for daily execution at 3 AM UTC
aws events put-rule \
--name bruin-daily-run \
--description "Run Bruin pipeline daily at 3 AM" \
--schedule-expression "cron(0 3 * * ? *)" \
--region ${AWS_REGION}
# Create IAM role for EventBridge to run ECS tasks
cat > eventbridge-trust-policy.json <<EOF
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"Service": "events.amazonaws.com"
},
"Action": "sts:AssumeRole"
}
]
}
EOF
aws iam create-role \
--role-name ecsEventsRole \
--assume-role-policy-document file://eventbridge-trust-policy.json
cat > eventbridge-policy.json <<EOF
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"ecs:RunTask"
],
"Resource": "arn:aws:ecs:${AWS_REGION}:${AWS_ACCOUNT_ID}:task-definition/bruin-pipeline:*"
},
{
"Effect": "Allow",
"Action": "iam:PassRole",
"Resource": [
"arn:aws:iam::${AWS_ACCOUNT_ID}:role/ecsTaskExecutionRole",
"arn:aws:iam::${AWS_ACCOUNT_ID}:role/bruinTaskRole"
]
}
]
}
EOF
aws iam put-role-policy \
--role-name ecsEventsRole \
--policy-name ECSEventsPolicy \
--policy-document file://eventbridge-policy.json
# Add target
aws events put-targets \
--rule bruin-daily-run \
--targets "Id"="1",\
"Arn"="arn:aws:ecs:${AWS_REGION}:${AWS_ACCOUNT_ID}:cluster/bruin-cluster",\
"RoleArn"="arn:aws:iam::${AWS_ACCOUNT_ID}:role/ecsEventsRole",\
"EcsParameters"="{TaskDefinitionArn=arn:aws:ecs:${AWS_REGION}:${AWS_ACCOUNT_ID}:task-definition/bruin-pipeline,TaskCount=1,LaunchType=FARGATE,NetworkConfiguration={awsvpcConfiguration={Subnets=[subnet-12345678],SecurityGroups=[sg-12345678],AssignPublicIp=ENABLED}}}" \
--region ${AWS_REGION}
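EventBridge cron expressions have six fields (minutes, hours, day of month, month, day of week, year), are evaluated in UTC, and require a `?` in either the day-of-month or the day-of-week field. The helpers below are hypothetical and only illustrate how the expression used above is put together.

```shell
#!/bin/sh
# Hypothetical helper: daily run at the given UTC hour and minute.
daily_cron() {
  # $1 = hour (0-23), $2 = minute (0-59)
  echo "cron($2 $1 * * ? *)"
}

# Hypothetical helper: Monday-to-Friday run at the given UTC hour and minute.
weekday_cron() {
  # $1 = hour (0-23), $2 = minute (0-59)
  echo "cron($2 $1 ? * MON-FRI *)"
}

# Example usage:
# aws events put-rule --name bruin-daily-run \
#   --schedule-expression "$(daily_cron 3 0)" --region "${AWS_REGION}"
```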
Using Step Functions for Complex Workflows
Create Step Functions State Machine
Create state-machine.json:
{
"Comment": "Bruin pipeline orchestration with ECS",
"StartAt": "Validate",
"States": {
"Validate": {
"Type": "Task",
"Resource": "arn:aws:states:::ecs:runTask.sync",
"Parameters": {
"LaunchType": "FARGATE",
"Cluster": "arn:aws:ecs:us-east-1:123456789012:cluster/bruin-cluster",
"TaskDefinition": "arn:aws:ecs:us-east-1:123456789012:task-definition/bruin-pipeline",
"NetworkConfiguration": {
"AwsvpcConfiguration": {
"Subnets": ["subnet-12345678"],
"SecurityGroups": ["sg-12345678"],
"AssignPublicIp": "ENABLED"
}
},
"Overrides": {
"ContainerOverrides": [
{
"Name": "bruin",
"Environment": [
{"Name": "COMMAND", "Value": "validate"},
{"Name": "PIPELINE", "Value": "."}
]
}
]
}
},
"Next": "Ingestion"
},
"Ingestion": {
"Type": "Task",
"Resource": "arn:aws:states:::ecs:runTask.sync",
"Parameters": {
"LaunchType": "FARGATE",
"Cluster": "arn:aws:ecs:us-east-1:123456789012:cluster/bruin-cluster",
"TaskDefinition": "arn:aws:ecs:us-east-1:123456789012:task-definition/bruin-pipeline",
"NetworkConfiguration": {
"AwsvpcConfiguration": {
"Subnets": ["subnet-12345678"],
"SecurityGroups": ["sg-12345678"],
"AssignPublicIp": "ENABLED"
}
},
"Overrides": {
"ContainerOverrides": [
{
"Name": "bruin",
"Environment": [
{"Name": "PIPELINE", "Value": "pipelines/ingestion"}
]
}
]
}
},
"Next": "ParallelProcessing"
},
"ParallelProcessing": {
"Type": "Parallel",
"Branches": [
{
"StartAt": "Analytics",
"States": {
"Analytics": {
"Type": "Task",
"Resource": "arn:aws:states:::ecs:runTask.sync",
"Parameters": {
"LaunchType": "FARGATE",
"Cluster": "arn:aws:ecs:us-east-1:123456789012:cluster/bruin-cluster",
"TaskDefinition": "arn:aws:ecs:us-east-1:123456789012:task-definition/bruin-pipeline",
"NetworkConfiguration": {
"AwsvpcConfiguration": {
"Subnets": ["subnet-12345678"],
"SecurityGroups": ["sg-12345678"],
"AssignPublicIp": "ENABLED"
}
},
"Overrides": {
"ContainerOverrides": [
{
"Name": "bruin",
"Environment": [
{"Name": "PIPELINE", "Value": "pipelines/analytics"}
]
}
]
}
},
"End": true
}
}
},
{
"StartAt": "Reporting",
"States": {
"Reporting": {
"Type": "Task",
"Resource": "arn:aws:states:::ecs:runTask.sync",
"Parameters": {
"LaunchType": "FARGATE",
"Cluster": "arn:aws:ecs:us-east-1:123456789012:cluster/bruin-cluster",
"TaskDefinition": "arn:aws:ecs:us-east-1:123456789012:task-definition/bruin-pipeline",
"NetworkConfiguration": {
"AwsvpcConfiguration": {
"Subnets": ["subnet-12345678"],
"SecurityGroups": ["sg-12345678"],
"AssignPublicIp": "ENABLED"
}
},
"Overrides": {
"ContainerOverrides": [
{
"Name": "bruin",
"Environment": [
{"Name": "PIPELINE", "Value": "pipelines/reporting"}
]
}
]
}
},
"End": true
}
}
}
],
"End": true
}
}
}
Deploy the state machine:
# Create role for Step Functions
cat > sfn-trust-policy.json <<EOF
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"Service": "states.amazonaws.com"
},
"Action": "sts:AssumeRole"
}
]
}
EOF
aws iam create-role \
--role-name StepFunctionsECSRole \
--assume-role-policy-document file://sfn-trust-policy.json
cat > sfn-policy.json <<EOF
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"ecs:RunTask",
"ecs:StopTask",
"ecs:DescribeTasks"
],
"Resource": "*"
},
{
"Effect": "Allow",
"Action": [
"iam:PassRole"
],
"Resource": [
"arn:aws:iam::${AWS_ACCOUNT_ID}:role/ecsTaskExecutionRole",
"arn:aws:iam::${AWS_ACCOUNT_ID}:role/bruinTaskRole"
]
},
{
"Effect": "Allow",
"Action": [
"events:PutTargets",
"events:PutRule",
"events:DescribeRule"
],
"Resource": "*"
}
]
}
EOF
aws iam put-role-policy \
--role-name StepFunctionsECSRole \
--policy-name StepFunctionsECSPolicy \
--policy-document file://sfn-policy.json
# Create state machine
aws stepfunctions create-state-machine \
--name bruin-pipeline-orchestrator \
--definition file://state-machine.json \
--role-arn arn:aws:iam::${AWS_ACCOUNT_ID}:role/StepFunctionsECSRole \
--region ${AWS_REGION}
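Once the state machine exists, an execution can be started from the CLI. The sketch below assumes the state machine name used above; `state_machine_arn` and `start_pipeline` are hypothetical helper names.

```shell
#!/bin/sh
# Builds the ARN of a Step Functions state machine.
state_machine_arn() {
  # $1 = account ID, $2 = region, $3 = state machine name
  echo "arn:aws:states:$2:$1:stateMachine:$3"
}

# Sketch: start one execution of the orchestrator.
# Requires AWS credentials and the state machine created above.
start_pipeline() {
  aws stepfunctions start-execution \
    --state-machine-arn "$(state_machine_arn "$AWS_ACCOUNT_ID" "$AWS_REGION" bruin-pipeline-orchestrator)" \
    --region "${AWS_REGION}"
}
```

The command returns an execution ARN, which can be passed to `aws stepfunctions describe-execution` to check progress.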
Using ECS Service for Long-Running Tasks
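For continuously running pipelines (e.g., streaming). An ECS service restarts the task whenever it exits, so use this only for commands that are meant to run indefinitely: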
aws ecs create-service \
--cluster bruin-cluster \
--service-name bruin-streaming \
--task-definition bruin-pipeline \
--desired-count 1 \
--launch-type FARGATE \
--network-configuration "awsvpcConfiguration={subnets=[subnet-12345678],securityGroups=[sg-12345678],assignPublicIp=ENABLED}" \
--region ${AWS_REGION}
Monitoring and Logging
View Logs
# List log streams
aws logs describe-log-streams \
--log-group-name /ecs/bruin-pipeline \
--order-by LastEventTime \
--descending \
--max-items 10 \
--region ${AWS_REGION}
# View logs
aws logs tail /ecs/bruin-pipeline --follow --region ${AWS_REGION}
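With the `awslogs` driver, each task writes to a log stream named `prefix/container-name/task-id`. Given the task definition above, that is `bruin/bruin/<task-id>`. A minimal sketch for tailing the logs of a single task follows; the function names are hypothetical.

```shell
#!/bin/sh
# Builds the log stream name the awslogs driver uses for a task.
log_stream_name() {
  # $1 = awslogs-stream-prefix, $2 = container name, $3 = task ID
  echo "$1/$2/$3"
}

# Sketch: show the logs of one task. Requires AWS credentials.
tail_task_logs() {
  # $1 = task ID
  aws logs tail /ecs/bruin-pipeline \
    --log-stream-names "$(log_stream_name bruin bruin "$1")" \
    --region "${AWS_REGION}"
}
```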
Create CloudWatch Dashboard
cat > dashboard.json <<EOF
{
"widgets": [
{
"type": "metric",
"properties": {
"metrics": [
["AWS/ECS", "CPUUtilization", "ClusterName", "bruin-cluster", {"stat": "Average"}],
[".", "MemoryUtilization", ".", ".", {"stat": "Average"}]
],
"period": 300,
"stat": "Average",
"region": "${AWS_REGION}",
"title": "ECS Resource Utilization"
}
}
]
}
EOF
aws cloudwatch put-dashboard \
--dashboard-name BruinPipeline \
--dashboard-body file://dashboard.json \
--region ${AWS_REGION}
Auto Scaling
Configure auto scaling for your ECS service:
# Register scalable target
aws application-autoscaling register-scalable-target \
--service-namespace ecs \
--resource-id service/bruin-cluster/bruin-streaming \
--scalable-dimension ecs:service:DesiredCount \
--min-capacity 1 \
--max-capacity 10 \
--region ${AWS_REGION}
# Create scaling policy
aws application-autoscaling put-scaling-policy \
--service-namespace ecs \
--resource-id service/bruin-cluster/bruin-streaming \
--scalable-dimension ecs:service:DesiredCount \
--policy-name bruin-cpu-scaling \
--policy-type TargetTrackingScaling \
--target-tracking-scaling-policy-configuration '{
"TargetValue": 75.0,
"PredefinedMetricSpecification": {
"PredefinedMetricType": "ECSServiceAverageCPUUtilization"
},
"ScaleInCooldown": 300,
"ScaleOutCooldown": 60
}' \
--region ${AWS_REGION}
Best Practices
1. Use Fargate Spot for Cost Savings
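For non-critical workloads that can tolerate interruption. The cluster must have the FARGATE_SPOT capacity provider associated with it (for example, via aws ecs put-cluster-capacity-providers):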
aws ecs run-task \
--cluster bruin-cluster \
--task-definition bruin-pipeline \
--capacity-provider-strategy capacityProvider=FARGATE_SPOT,weight=1 \
--network-configuration "awsvpcConfiguration={subnets=[subnet-12345678],securityGroups=[sg-12345678],assignPublicIp=ENABLED}"
2. Set Resource Limits
Adjust CPU and memory in the task definition to fit the workload. Fargate accepts only specific CPU/memory combinations:
{
"cpu": "2048",
"memory": "4096"
}
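Fargate rejects task definitions whose CPU and memory values do not form a supported pair. The sketch below checks a pair before registration. The ranges reflect the combinations AWS has documented for Fargate as far as I am aware, so verify them against the current documentation; `is_valid_fargate_size` is a hypothetical helper.

```shell
#!/bin/sh
# Returns 0 if the CPU (units) and memory (MiB) pair is a supported
# Fargate task size, 1 otherwise. Ranges are assumptions to verify.
is_valid_fargate_size() {
  cpu=$1
  mem=$2
  case "$cpu" in
    256)
      # 0.25 vCPU supports only these three memory values
      case "$mem" in
        512|1024|2048) return 0 ;;
        *) return 1 ;;
      esac
      ;;
    512)   min=1024;  max=4096;   step=1024 ;;
    1024)  min=2048;  max=8192;   step=1024 ;;
    2048)  min=4096;  max=16384;  step=1024 ;;
    4096)  min=8192;  max=30720;  step=1024 ;;
    8192)  min=16384; max=61440;  step=4096 ;;
    16384) min=32768; max=122880; step=8192 ;;
    *) return 1 ;;
  esac
  [ "$mem" -ge "$min" ] && [ "$mem" -le "$max" ] \
    && [ $(( (mem - min) % step )) -eq 0 ]
}

# Example usage:
# is_valid_fargate_size 2048 4096 && echo "ok"
```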
3. Use Task Metadata Endpoint
Access task metadata from within containers:
TASK_METADATA=$(curl -s "${ECS_CONTAINER_METADATA_URI_V4}/task")
4. Implement Health Checks
Add health checks to your task definition:
{
"healthCheck": {
"command": ["CMD-SHELL", "bruin validate . || exit 1"],
"interval": 30,
"timeout": 5,
"retries": 3,
"startPeriod": 60
}
}
5. Tag Resources
aws ecs tag-resource \
--resource-arn arn:aws:ecs:${AWS_REGION}:${AWS_ACCOUNT_ID}:task-definition/bruin-pipeline:1 \
--tags key=Project,value=Bruin key=Environment,value=Production
Troubleshooting
Task Fails to Start
Check why the task stopped:
aws ecs describe-tasks \
--cluster bruin-cluster \
--tasks TASK_ID \
--query 'tasks[0].stoppedReason' \
--region ${AWS_REGION}
Cannot Pull ECR Image
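The AmazonECSTaskExecutionRolePolicy attached in Step 6 already grants the permissions needed to pull from ECR. If the execution role was created without it, attach read-only ECR access: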
aws iam attach-role-policy \
--role-name ecsTaskExecutionRole \
--policy-arn arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly
Network Issues
Verify:
- Subnets have a route to the internet (an Internet Gateway for public subnets, a NAT Gateway for private subnets)
- The security group allows outbound traffic
- The task has assignPublicIp=ENABLED when it runs in a public subnet without a NAT Gateway
High Costs
- Use Fargate Spot for non-critical tasks
- Right-size CPU and memory
- Set a timeout so stuck tasks do not run indefinitely (for example, TimeoutSeconds on Step Functions task states)
- Use auto-scaling to scale down when idle
Next Steps
- Explore Bruin Cloud for managed orchestration
- Learn about AWS Lambda for shorter tasks
- Review quality checks to add validation