CrateDB
CrateDB is a distributed SQL database for real-time search and analytics workloads.
Bruin supports CrateDB as a source for Ingestr assets, and you can use it to ingest data from CrateDB into your data warehouse.
To set up a CrateDB connection, add a configuration item to the .bruin.yml file and to the asset file.
Follow the steps below to correctly set up CrateDB as a data source and run ingestion.
Configuration
Step 1: Add a connection to the .bruin.yml file
To connect to CrateDB, add a cratedb connection to the connections section of the .bruin.yml file. CrateDB uses the PostgreSQL wire protocol on port 5432 by default.
connections:
  cratedb:
    - name: "cratedb"
      username: "crate"
      password: ""
      host: "localhost"
      port: 5432
      ssl_mode: "disable"

- name: The name to identify this CrateDB connection.
- username: The CrateDB username.
- password: The password for the specified username.
- host: The host address of the CrateDB server.
- port: The port number the database server is listening on. Defaults to 5432.
- ssl_mode: Optional PostgreSQL SSL mode. Supported values are disable, allow, prefer, require, verify-ca, and verify-full.
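As a quick sanity check before running Bruin, the field rules above can be sketched in Python. This is a minimal illustration only: `check_cratedb_connection` is a hypothetical helper, not part of Bruin, and treating name, username, and host as required is an assumption based on the field descriptions on this page.

```python
# Hypothetical helper, not part of Bruin: checks a CrateDB connection entry
# (already loaded into a dict) against the field descriptions on this page.
SSL_MODES = {"disable", "allow", "prefer", "require", "verify-ca", "verify-full"}
DEFAULT_PORT = 5432


def check_cratedb_connection(conn: dict) -> dict:
    """Return a copy of conn with the port default applied, or raise ValueError."""
    # Assumption: name, username and host must be present and non-empty.
    for key in ("name", "username", "host"):
        if not conn.get(key):
            raise ValueError(f"missing required field: {key}")

    checked = dict(conn)
    # The port defaults to 5432 when it is omitted.
    checked.setdefault("port", DEFAULT_PORT)
    port = checked["port"]
    if not isinstance(port, int) or not 1 <= port <= 65535:
        raise ValueError(f"invalid port: {port!r}")

    # ssl_mode is optional; when present it must be one of the supported values.
    ssl_mode = checked.get("ssl_mode")
    if ssl_mode is not None and ssl_mode not in SSL_MODES:
        raise ValueError(f"unsupported ssl_mode: {ssl_mode!r}")
    return checked


local = check_cratedb_connection(
    {"name": "cratedb", "username": "crate", "password": "", "host": "localhost"}
)
print(local["port"])  # prints 5432
```

An empty password is accepted here because the local example above uses one. Whether Bruin itself rejects any of these cases is not something this sketch asserts.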
For CrateDB Cloud, use the cluster hostname and ssl_mode: "require":
connections:
  cratedb:
    - name: "cratedb-cloud"
      username: "admin"
      password: "<PASSWORD>"
      host: "<CLUSTERNAME>.eks1.eu-west-1.aws.cratedb.net"
      port: 5432
      ssl_mode: "require"

Step 2: Create an asset file for data ingestion
To ingest data from CrateDB, create an asset configuration file. This file defines the data flow from the source to the destination. Create a YAML file (e.g., cratedb_ingestion.yml) inside the assets folder and add the following content:
name: public.cratedb_summits
type: ingestr
connection: postgres

parameters:
  source_connection: cratedb
  source_table: "sys.summits"
  destination: postgres

- name: The name of the asset.
- type: Specifies the type of the asset. Set this to ingestr to use the ingestr data pipeline.
- connection: This is the destination connection, which defines where the data should be stored. For example, postgres indicates that the ingested data will be stored in a Postgres database.
- source_connection: The name of the CrateDB connection defined in .bruin.yml.
- source_table: The schema-qualified table in CrateDB that you want to ingest, for example sys.summits.
- destination: The destination platform where the data will be ingested.
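The asset fields described above can also be checked with a short Python sketch before running the pipeline. `check_ingestr_asset` is a hypothetical helper, not part of Bruin or ingestr, and the rule that a schema-qualified name contains exactly one dot is an assumption made for illustration.

```python
# Hypothetical helper, not part of Bruin: checks an ingestr asset definition
# (already loaded into a dict) against the field descriptions on this page.
def check_ingestr_asset(asset: dict) -> tuple[str, str]:
    """Return (schema, table) parsed from source_table, or raise ValueError."""
    if asset.get("type") != "ingestr":
        raise ValueError("type must be 'ingestr'")
    if not asset.get("connection"):
        raise ValueError("connection (the destination connection) is required")

    params = asset.get("parameters") or {}
    for key in ("source_connection", "source_table", "destination"):
        if not params.get(key):
            raise ValueError(f"missing parameter: {key}")

    # Assumption: a schema-qualified name has exactly one dot, as in sys.summits.
    schema, sep, table = params["source_table"].partition(".")
    if not sep or not schema or not table or "." in table:
        raise ValueError("source_table must look like <schema>.<table>")
    return schema, table


asset = {
    "name": "public.cratedb_summits",
    "type": "ingestr",
    "connection": "postgres",
    "parameters": {
        "source_connection": "cratedb",
        "source_table": "sys.summits",
        "destination": "postgres",
    },
}
schema, table = check_ingestr_asset(asset)
print(schema, table)  # prints: sys summits
```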
Step 3: Run the asset to ingest data
bruin run assets/cratedb_ingestion.yml

As a result of this command, Bruin will ingest data from the given CrateDB table into your destination database.
For more information, see the ingestr CrateDB documentation.