Time-based replication using SymmetricDS

Do you need to move data from a read-only source as it changes to a new target? Read-only or time-based replication can be used to achieve this use case. There are positives and negatives, but sometimes this is a necessary option. This blog will review the advantages and show how they can be achieved using SymmetricDS.

When is read-only / time-based replication needed?

Ownership is by a third party. Ability to capture changes through triggers or logs is not possible.
Permissions are reduced to select/read-only access on the source data.
Table structures already designed with a last updated column or only inserts occur on a table.

How are changes captured?

In a read-only / time-based capture scenario there are three options used to stream data from the source to the target.

Timestamp Column
Numeric Primary Key
Data Refresh

1. Timestamp Column

Your table has a timestamp column that represents a last updated time for the row. Each table can be configured with a specific timestamp column to be used. This does not have to be the same column name across all tables.

2. Numeric Primary Key

When your table(s) do not have a last update column to represent changes the primary key may be an option. For tables that only have data inserted, a numeric primary key may be used. If this numeric key always increases it can be used to drive change capture on the table.

2. Data Refresh

The data refresh job is the final approach if a timestamp column or an incremental primary key are not feasible. This approach may not be ideal for large tables, but might be the only option for a read-only source without any other hooks to obtain changes. This job can be scheduled to run independently based on your requirements.

Scheduling?

Tables using a timestamp column or a numeric primary key will run on the same schedule as the log miner job. By default this job will run every 10 seconds, but can be configured to meet your needs. When this job runs, it will store a value for all of the time-based configured tables. By storing the most recent time or the last read primary key value for a given table, the job will pick up where it left off from the previous run to ensure all data is replicated.

The data refresh tables have a dedicated job (called Data Refresh Job) that can be scheduled independently of the timestamp and numeric primary key capture tables. These tables are generally not fully refreshed as often since they send the entire data set for the table rather than incremental changes.

NOTE: The data refresh job will use the new compare and repair component if licensed as a more effective approach to send only changes rather than the full data set. If this is not available a full table reload will be sent.

Setup Time-Based Replication With SymmetricDS

Replication Mode

While going through the add database wizard there is an option to select the capture mode (Trigger, Log, Time-Based). Select time-based as the replication mode.

2. Adding Tables

While adding tables to the replication through the add table wizard, it will provide a new screen for a time-based replication setup. This new screen will analyze all selected tables to automatically map the column to be used for time-based replication.

Results of scanning all selected tables can contain the following options.

Single timestamp column
Multiple timestamp columns (user must select which to use)
Numeric primary key

No timestamp columns and no numeric primary key present (data refresh job will full load this table each time it runs).

3. Data Refresh Job

By default all tables that do not have a column to use for time-based replication will be setup to use the data refresh job. The data refresh job uses the setting on the configure->table routing screen. The data refresh accepts four possible values:

Null – Will not run against this table
Auto – Will determine if the reload or the compare and repair (if licensed) will be most efficient
Differences – Utilizes compare and repair feature (if licensed).
Full – Utilizes a full reload of the table

Verify which tables are setup for data refresh

Change the data refresh option for a table

For a deeper dive into setting up and managing time-based replication with SymmetricDS, reach out for a demo and explore the advanced features available to enhance your replication strategy.

Read-Only Replication (Time-Based Replication)