The configuration file

The file config-example.yaml is stored in ~/.pg_chameleon/configuration and should be used as template for the other configuration files. The configuration consists of three configuration groups.

Global settings

1# global settings
2pid_dir: '~/.pg_chameleon/pid/'
3log_dir: '~/.pg_chameleon/logs/'
4log_dest: file
5log_level: info
6log_days_keep: 10
7rollbar_key: ''
8rollbar_env: ''
  • pid_dir directory where the process pids are saved.

  • log_dir directory where the logs are stored.

  • log_dest log destination. stdout for debugging purposes, file for the normal activity.

  • log_level logging verbosity. allowed values are debug, info, warning, error.

  • log_days_keep configure the retention in days for the daily rotate replica logs.

  • rollbar_key: the optional rollbar key

  • rollbar_env: the optional rollbar environment

If both rollbar_key and rollbar_env are configured some messages are sent to the rollbar conf

type override

1# type_override allows the user to override the default type conversion
2# into a different one.
3
4type_override:
5  "tinyint(1)":
6    override_to: boolean

The type_override allows the user to override the default type conversion into a different one. Each type key should be named exactly like the mysql type to override including the dimensions. Each type key needs two subkeys.

  • override_to specifies the destination type which must be a postgresql type and the type cast should be possible

  • override_tables is a yaml list which specifies to which tables the override applies. If the first list item is set to “*” then the override is applied to all tables in the replicated schemas.

The override is applied when running the init_replica,refresh_schema andsync_tables process. The override is also applied for each matching DDL (create table/alter table) if the table name matches the override_tables values.

PostgreSQL target connection

1# postgres  destination connection
2pg_conn:
3  host: "localhost"
4  port: "5432"
5  user: "usr_replica"
6  password: "never_commit_password"
7  database: "db_replica"
8  charset: "utf8"

The pg_conn key maps the target database connection string.

sources configuration

 1sources:
 2  mysql:
 3    db_conn:
 4      host: "localhost"
 5      port: "3306"
 6      user: "usr_replica"
 7      password: "never_commit_passwords"
 8      charset: 'utf8'
 9      connect_timeout: 10
10    schema_mappings:
11      delphis_mediterranea: loxodonta_africana
12    limit_tables:
13      - delphis_mediterranea.foo
14    skip_tables:
15      - delphis_mediterranea.bar
16    grant_select_to:
17      - usr_readonly
18    lock_timeout: "120s"
19    my_server_id: 100
20    replica_batch_size: 10000
21    replay_max_rows: 10000
22    batch_retention: '1 day'
23    copy_max_memory: "300M"
24    copy_mode: 'file'
25    out_dir: /tmp
26    sleep_loop: 1
27    on_error_replay: continue
28    on_error_read: continue
29    auto_maintenance: "disabled"
30    gtid_enable: false
31    type: mysql
32    skip_events:
33      insert:
34        - delphis_mediterranea.foo  # skips inserts on delphis_mediterranea.foo
35      delete:
36        - delphis_mediterranea  # skips deletes on schema delphis_mediterranea
37      update:
38    keep_existing_schema: No
39
40  pgsql:
41    db_conn:
42      host: "localhost"
43      port: "5432"
44      user: "usr_replica"
45      password: "never_commit_passwords"
46      database: "db_replica"
47      charset: 'utf8'
48      connect_timeout: 10
49      schema_mappings:
50        loxodonta_africana: elephas_maximus
51      limit_tables:
52        - loxodonta_africana.foo
53      skip_tables:
54        - loxodonta_africana.bar
55      copy_max_memory: "300M"
56      grant_select_to:
57        - usr_readonly
58      lock_timeout: "10s"
59      my_server_id: 100
60      replica_batch_size: 3000
61      replay_max_rows: 10000
62      sleep_loop: 5
63      batch_retention: '1 day'
64      copy_mode: 'file'
65      out_dir: /tmp
66      type: pgsql

The key sources allow to setup multiple replica sources writing on the same postgresql database. The key name myst be unique within the replica configuration.

The following remarks apply only to the mysql source type.

For the postgresql source type. See the last section for the description and the limitations.

Database connection

 1sources:
 2  mysql:
 3    db_conn:
 4      host: "localhost"
 5      port: "3306"
 6      user: "usr_replica"
 7      password: "never_commit_passwords"
 8      charset: 'utf8'
 9      connect_timeout: 10
10    schema_mappings:
11      delphis_mediterranea: loxodonta_africana
12    limit_tables:
13      - delphis_mediterranea.foo
14    skip_tables:
15      - delphis_mediterranea.bar
16    grant_select_to:
17      - usr_readonly
18    lock_timeout: "120s"
19    my_server_id: 100
20    replica_batch_size: 10000
21    replay_max_rows: 10000
22    batch_retention: '1 day'
23    copy_max_memory: "300M"
24    copy_mode: 'file'
25    out_dir: /tmp
26    sleep_loop: 1
27    on_error_replay: continue
28    on_error_read: continue
29    auto_maintenance: "disabled"
30    gtid_enable: false
31    type: mysql
32    skip_events:
33      insert:
34        - delphis_mediterranea.foo  # skips inserts on delphis_mediterranea.foo
35      delete:
36        - delphis_mediterranea  # skips deletes on schema delphis_mediterranea
37      update:
38    keep_existing_schema: No

The db_conn key maps the target database connection string. Within the connection is possible to configure the connect_timeout which is 10 seconds by default. Larger values could help the tool working better on slow networks. Low values can cause the connection to fail before any action is performed.

Schema mappings

 1sources:
 2  mysql:
 3    db_conn:
 4      host: "localhost"
 5      port: "3306"
 6      user: "usr_replica"
 7      password: "never_commit_passwords"
 8      charset: 'utf8'
 9      connect_timeout: 10
10    schema_mappings:
11      delphis_mediterranea: loxodonta_africana
12    limit_tables:
13      - delphis_mediterranea.foo
14    skip_tables:
15      - delphis_mediterranea.bar
16    grant_select_to:
17      - usr_readonly
18    lock_timeout: "120s"
19    my_server_id: 100
20    replica_batch_size: 10000
21    replay_max_rows: 10000
22    batch_retention: '1 day'
23    copy_max_memory: "300M"
24    copy_mode: 'file'
25    out_dir: /tmp
26    sleep_loop: 1
27    on_error_replay: continue
28    on_error_read: continue
29    auto_maintenance: "disabled"
30    gtid_enable: false
31    type: mysql
32    skip_events:
33      insert:
34        - delphis_mediterranea.foo  # skips inserts on delphis_mediterranea.foo
35      delete:
36        - delphis_mediterranea  # skips deletes on schema delphis_mediterranea
37      update:
38    keep_existing_schema: No

The key schema mappings is a dictionary. Each key is a MySQL database that needs to be replicated in PostgreSQL. Each value is the destination schema in the PostgreSQL database. In the example provided the MySQL database delphis_mediterranea is replicated into the schema loxodonta_africana stored in the database specified in the pg_conn key (db_replica).

Limit and skip tables

 1sources:
 2  mysql:
 3    db_conn:
 4      host: "localhost"
 5      port: "3306"
 6      user: "usr_replica"
 7      password: "never_commit_passwords"
 8      charset: 'utf8'
 9      connect_timeout: 10
10    schema_mappings:
11      delphis_mediterranea: loxodonta_africana
12    limit_tables:
13      - delphis_mediterranea.foo
14    skip_tables:
15      - delphis_mediterranea.bar
16    grant_select_to:
17      - usr_readonly
18    lock_timeout: "120s"
19    my_server_id: 100
20    replica_batch_size: 10000
21    replay_max_rows: 10000
22    batch_retention: '1 day'
23    copy_max_memory: "300M"
24    copy_mode: 'file'
25    out_dir: /tmp
26    sleep_loop: 1
27    on_error_replay: continue
28    on_error_read: continue
29    auto_maintenance: "disabled"
30    gtid_enable: false
31    type: mysql
32    skip_events:
33      insert:
34        - delphis_mediterranea.foo  # skips inserts on delphis_mediterranea.foo
35      delete:
36        - delphis_mediterranea  # skips deletes on schema delphis_mediterranea
37      update:
38    keep_existing_schema: No
  • limit_tables list with the tables to replicate. If the list is empty then the entire mysql database is replicated.

  • skip_tables list with the tables to exclude from the replica.

The table’s names should be in the form SCHEMA_NAME.TABLE_NAME.

Grant select to option

 1sources:
 2  mysql:
 3    db_conn:
 4      host: "localhost"
 5      port: "3306"
 6      user: "usr_replica"
 7      password: "never_commit_passwords"
 8      charset: 'utf8'
 9      connect_timeout: 10
10    schema_mappings:
11      delphis_mediterranea: loxodonta_africana
12    limit_tables:
13      - delphis_mediterranea.foo
14    skip_tables:
15      - delphis_mediterranea.bar
16    grant_select_to:
17      - usr_readonly
18    lock_timeout: "120s"
19    my_server_id: 100
20    replica_batch_size: 10000
21    replay_max_rows: 10000
22    batch_retention: '1 day'
23    copy_max_memory: "300M"
24    copy_mode: 'file'
25    out_dir: /tmp
26    sleep_loop: 1
27    on_error_replay: continue
28    on_error_read: continue
29    auto_maintenance: "disabled"
30    gtid_enable: false
31    type: mysql
32    skip_events:
33      insert:
34        - delphis_mediterranea.foo  # skips inserts on delphis_mediterranea.foo
35      delete:
36        - delphis_mediterranea  # skips deletes on schema delphis_mediterranea
37      update:
38    keep_existing_schema: No

This key allows to specify a list of database roles which will get select access on the replicate tables.

Source configuration parameters

 1sources:
 2  mysql:
 3    db_conn:
 4      host: "localhost"
 5      port: "3306"
 6      user: "usr_replica"
 7      password: "never_commit_passwords"
 8      charset: 'utf8'
 9      connect_timeout: 10
10    schema_mappings:
11      delphis_mediterranea: loxodonta_africana
12    limit_tables:
13      - delphis_mediterranea.foo
14    skip_tables:
15      - delphis_mediterranea.bar
16    grant_select_to:
17      - usr_readonly
18    lock_timeout: "120s"
19    my_server_id: 100
20    replica_batch_size: 10000
21    replay_max_rows: 10000
22    batch_retention: '1 day'
23    copy_max_memory: "300M"
24    copy_mode: 'file'
25    out_dir: /tmp
26    sleep_loop: 1
27    on_error_replay: continue
28    on_error_read: continue
29    auto_maintenance: "disabled"
30    gtid_enable: false
31    type: mysql
32    skip_events:
33      insert:
34        - delphis_mediterranea.foo  # skips inserts on delphis_mediterranea.foo
35      delete:
36        - delphis_mediterranea  # skips deletes on schema delphis_mediterranea
37      update:
38    keep_existing_schema: No
  • lock_timeout the max time in seconds that the target postgresql connections should wait for acquiring a lock. This parameter applies to init_replica,refresh_schema and sync_tables when performing the relation’s swap.

  • my_server_id the server id for the mysql replica. must be unique within the replica cluster

  • replica_batch_size the max number of rows that are pulled from the mysql replica before a write on the postgresql database is performed. See caveats in README for a complete explanation.

  • batch_retention the max retention for the replayed batches rows in t_replica_batch. The field accepts any valid interval accepted by PostgreSQL

  • copy_max_memory the max amount of memory to use when copying the table in PostgreSQL. Is possible to specify the value in (k)ilobytes, (M)egabytes, (G)igabytes adding the suffix (e.g. 300M).

  • copy_mode the allowed values are ‘file’ and ‘direct’. With direct the copy happens on the fly. With file the table is first dumped in a csv file then reloaded in PostgreSQL.

  • out_dir the directory where the csv files are dumped during the init_replica process if the copy mode is file.

  • sleep_loop seconds between a two replica batches.

  • on_error_replay specifies whether the replay process should exit or continue if any error during the replay happens. If continue is specified the offending tables are removed from the replica.

  • on_error_read specifies whether the read process should exit or continue if a connection error during the read process happens. If continue is specified the process emits a warning and waits for the connection to come back. If the parameter is omitted the default is exit which cause the replica process to stop with error.

  • auto_maintenance specifies the timeout after an automatic maintenance is triggered. The parameter accepts values valid for the PostgreSQL interval data type (e.g. 1 day). If the value is set to disabled the automatic maintenance doesn’t run. If the parameter is omitted the default is disabled.

  • gtid_enable (EXPERIMENTAL) Specifies whether to use the gtid to auto position the replica stream. This parameter have effect only on MySQL and only if the server is configured with the GTID.

  • type specifies the source database type. The system supports mysql or pgsql. See below for the pgsql limitations.

Skip events configuration

 1sources:
 2  mysql:
 3    db_conn:
 4      host: "localhost"
 5      port: "3306"
 6      user: "usr_replica"
 7      password: "never_commit_passwords"
 8      charset: 'utf8'
 9      connect_timeout: 10
10    schema_mappings:
11      delphis_mediterranea: loxodonta_africana
12    limit_tables:
13      - delphis_mediterranea.foo
14    skip_tables:
15      - delphis_mediterranea.bar
16    grant_select_to:
17      - usr_readonly
18    lock_timeout: "120s"
19    my_server_id: 100
20    replica_batch_size: 10000
21    replay_max_rows: 10000
22    batch_retention: '1 day'
23    copy_max_memory: "300M"
24    copy_mode: 'file'
25    out_dir: /tmp
26    sleep_loop: 1
27    on_error_replay: continue
28    on_error_read: continue
29    auto_maintenance: "disabled"
30    gtid_enable: false
31    type: mysql
32    skip_events:
33      insert:
34        - delphis_mediterranea.foo  # skips inserts on delphis_mediterranea.foo
35      delete:
36        - delphis_mediterranea  # skips deletes on schema delphis_mediterranea
37      update:
38    keep_existing_schema: No

The skip_events variable allows to tell pg_chameleon to skip events for tables or entire schemas. The example provided with configuration-example.ym disables the inserts on the table delphis_mediterranea.foo and disables the deletes on the entire schema delphis_mediterranea.

Keep existing schema

 1sources:
 2  mysql:
 3    db_conn:
 4      host: "localhost"
 5      port: "3306"
 6      user: "usr_replica"
 7      password: "never_commit_passwords"
 8      charset: 'utf8'
 9      connect_timeout: 10
10    schema_mappings:
11      delphis_mediterranea: loxodonta_africana
12    limit_tables:
13      - delphis_mediterranea.foo
14    skip_tables:
15      - delphis_mediterranea.bar
16    grant_select_to:
17      - usr_readonly
18    lock_timeout: "120s"
19    my_server_id: 100
20    replica_batch_size: 10000
21    replay_max_rows: 10000
22    batch_retention: '1 day'
23    copy_max_memory: "300M"
24    copy_mode: 'file'
25    out_dir: /tmp
26    sleep_loop: 1
27    on_error_replay: continue
28    on_error_read: continue
29    auto_maintenance: "disabled"
30    gtid_enable: false
31    type: mysql
32    skip_events:
33      insert:
34        - delphis_mediterranea.foo  # skips inserts on delphis_mediterranea.foo
35      delete:
36        - delphis_mediterranea  # skips deletes on schema delphis_mediterranea
37      update:
38    keep_existing_schema: No

When set to Yes init_replica,refresh_schema and sync_tables do not recreate the affected tables using the data from the MySQL source.

Instead the existing tables are truncated and the data is reloaded. A REINDEX TABLE is executed in order to have the indices in good shape after the reload.

When keep_existing_schema is set to Yes the parameter grant_select_to have no effect.

PostgreSQL source type (EXPERIMENTAL)

pg_chameleon 2.0 has an experimental support for the postgresql source type. When set to pgsql the system expects a postgresql source database rather a mysql. The following limitations apply.

  • There is no support for real time replica

  • The data copy happens always with file method

  • The copy_max_memory doesn’t apply

  • The type override doesn’t apply

  • Only init_replica is currently supported

  • The source connection string requires a database name

  • In the show_status detailed command the replicated tables counters are always zero

 1  pgsql:
 2    db_conn:
 3      host: "localhost"
 4      port: "5432"
 5      user: "usr_replica"
 6      password: "never_commit_passwords"
 7      database: "db_replica"
 8      charset: 'utf8'
 9      connect_timeout: 10
10      schema_mappings:
11        loxodonta_africana: elephas_maximus
12      limit_tables:
13        - loxodonta_africana.foo
14      skip_tables:
15        - loxodonta_africana.bar
16      copy_max_memory: "300M"
17      grant_select_to:
18        - usr_readonly
19      lock_timeout: "10s"
20      my_server_id: 100
21      replica_batch_size: 3000
22      replay_max_rows: 10000
23      sleep_loop: 5
24      batch_retention: '1 day'
25      copy_mode: 'file'
26      out_dir: /tmp
27      type: pgsql

Fillfactor

The dictionary fillfactor is used to set the fillfactor for tables that are expected to work with large updates. The key name defines the fillfactor level (The allowed values range is 10 to 100). If key name is set to “*” then the fillfactor applies to all tables in the replicated schema. If the table appears multiple times, then only the last matched value will be applied

1fillfactor:
2  "30":
3    tables: 
4      - "foo.bar"
5  "40":
6    tables: 
7      - "foo.foobar"