S3 Platform
Bruin supports Amazon S3 (Simple Storage Service) both for data ingestion and as a sensor target for monitoring object availability.
S3 Sensors
S3 sensors allow you to monitor for the existence of specific objects in S3 buckets. The sensor waits for a file or object to become available before allowing downstream assets to proceed.
Connection Configuration
Add an AWS connection to your bruin.yml file:
connections:
  aws:
    - name: "aws-default"
      access_key: "your-access-key"
      secret_key: "your-secret-key"
      region: "us-east-1" # Optional - will be auto-discovered from bucket if not provided
Sensor Configuration
Create a sensor asset in your pipeline:
name: "wait_for_s3_file"
type: s3.sensor.key_sensor
connection: aws-default
parameters:
  bucket_name: "my-data-bucket"
  bucket_key: "path/to/expected/file.csv"
Parameters
- bucket_name (required): The name of the S3 bucket to monitor
- bucket_key (required): The key/path of the object to wait for
Sensor Modes
The sensor supports different modes, controlled via the --sensor-mode flag when running:
- once (default): Check once and fail if the object doesn't exist
- wait: Continuously poll until the object is found (24-hour timeout)
- skip: Skip sensor execution entirely
Running the Sensor
Execute the sensor using the bruin run command:
bruin run path/to/your/sensor.asset.yml --sensor-mode wait
Behavior
- If the region is not specified in the connection, it will be auto-discovered from the bucket
- In wait mode, the sensor polls every few seconds (configurable via the poke_interval parameter)
- Maximum timeout is 24 hours for continuous polling
- Returns an error if the object is not found in once mode
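The three modes and the polling behavior described above can be pictured with a minimal Python sketch. This is not Bruin's implementation: `run_sensor`, `object_exists`, and the injected `sleep` and `clock` hooks are hypothetical names used only for illustration, and the 30-second default interval is an assumed value, since the text only says "every few seconds". The existence check is passed in as a callable, so the sketch makes no claim about how Bruin actually queries S3.

```python
import time
from typing import Callable

MAX_WAIT_SECONDS = 24 * 60 * 60  # 24-hour ceiling for wait mode, as described above


def run_sensor(
    object_exists: Callable[[], bool],  # hypothetical check, e.g. "does the key exist?"
    mode: str = "once",
    poke_interval: float = 30.0,  # assumed default; not taken from Bruin
    timeout: float = MAX_WAIT_SECONDS,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Return "skipped" or "found"; raise if the object is not available."""
    if mode == "skip":
        # skip: the check is never performed
        return "skipped"
    if mode == "once":
        # once: a single check; a missing object is an error
        if object_exists():
            return "found"
        raise FileNotFoundError("object not found (once mode)")
    if mode == "wait":
        # wait: poll every poke_interval seconds until found or timed out
        deadline = clock() + timeout
        while True:
            if object_exists():
                return "found"
            if clock() >= deadline:
                raise TimeoutError("object not found before timeout")
            sleep(poke_interval)
    raise ValueError(f"unknown sensor mode: {mode!r}")


# Simulated bucket: the object appears on the third check.
attempts = iter([False, False, True])
print(run_sensor(lambda: next(attempts), mode="wait", poke_interval=0))  # prints "found"
```

Injecting `sleep` and `clock` keeps the sketch easy to exercise without real waiting. The practical point is the difference in failure behavior: once mode fails immediately when the object is missing, while wait mode fails only after the timeout elapses.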
S3 for Data Ingestion
Bruin also supports S3 as a data source and destination for ingestion workflows. For comprehensive documentation on using S3 for data ingestion, including reading from and writing to S3 buckets, see the S3 Ingestion Guide.