MySQL
MySQL is an open source relational database management system, known for its speed and reliability.
ingestr supports MySQL as a source and a destination.
URI format
The URI format for MySQL is as follows:
mysql://user:password@host:port/dbnameURI parameters:
user: the user name to connect to the databasepassword: the password for the userhost: the host address of the database serverport: the port number the database server is listening on, the default is 3306dbname: the name of the database to connect to
The same URI structure and table can be used both for sources and destinations. You can read more about SQLAlchemy's MySQL dialect here.
Change data capture
MySQL CDC is supported with the mysql+cdc://, mysql+pymysql+cdc://, and mariadb+cdc:// URI schemes. It reads a consistent snapshot first, then resumes from the destination table's maximum _cdc_lsn on subsequent runs.
If the saved _cdc_lsn is invalid or no longer available in MySQL binary logs, the run fails instead of taking a partial snapshot. Run with --full-refresh to rebuild the destination from a fresh snapshot.
Incremental CDC runs use the merge strategy even if --incremental-strategy=replace is supplied, so updates and deletes can be applied by primary key. Use --full-refresh to rebuild the destination from a fresh snapshot.
Example:
ingestr ingest \
--source-uri "mysql+cdc://user:password@host:3306/dbname?mode=batch&server_id=18888" \
--dest-uri "sqlite:///tmp/mysql_cdc.db" \
--source-table "orders" \
--dest-table "orders"Requirements:
- Binary logging must be enabled with
log_bin=ON. binlog_formatmust beROW.binlog_row_imagemust beFULL.binlog_row_value_optionsmust not includePARTIAL_JSON.- Source tables must have primary keys, or
--primary-keymust be provided. - Source tables must not contain
ENUM,SET, orBITcolumns. - The source user needs normal read access, permission to run
FLUSH TABLES WITH READ LOCKfor the initial snapshot, and replication privileges required to stream binary logs.
CDC URI parameters:
mode:batch; defaults tobatch.server_id: optional positive uint32 replication server id; generated automatically when omitted. Pin a unique value for scheduled or overlapping CDC runs.dest_schema: optional destination schema for multi-table CDC runs.flavor:mysqlormariadb; inferred from the URI scheme unless overridden.
Multi-table CDC snapshots each selected table independently and then stream each table from its own snapshot position. Each table is consistent on its own, but a multi-table run is not a single global point-in-time snapshot across all tables.