Copyright © 2007-2020 JumpMind, Inc

Version 3.12.x-SNAPSHOT

Permission to use, copy, modify, and distribute this SymmetricDS User Guide for any purpose and without fee is hereby granted in perpetuity, provided that the above copyright notice and this paragraph appear in all copies.

Preface

This user guide introduces SymmetricDS and its features for data synchronization. It is intended for users, developers, and administrators who want to install the software, configure synchronization, and manage its operation. Thank you to all the members of the open source community whose feedback and contributions helped us build better software and documentation. This version of the guide was generated on 2020-02-04.

1. Introduction

SymmetricDS is open source software for database and file synchronization, with support for multi-master replication, filtered synchronization, and transformation. It uses web and database technologies to replicate change data as a scheduled or near real-time operation, and it includes an initial load feature for full data loads. The software was designed to scale for a large number of nodes, work across low-bandwidth connections, and withstand periods of network outage.

1.1. System Requirements

SymmetricDS is written in Java and requires a Java Runtime Environment (JRE) Standard Edition (SE) or Java Development Kit (JDK) Standard Edition (SE) version 8.0 or above. Most major operating systems and databases are supported. See the list of supported databases in the Database Compatibility section. The minimum operating system requirements are:

  • Java SE Runtime Environment 8 or above

  • Memory - 64 MB available

  • Disk - 256 MB available

The memory, disk, and CPU requirements increase with the number of connected clients and the amount of data being synchronized. The best way to size a server is to simulate synchronization in a lower environment and benchmark data loading. However, a rule of thumb for servers is one server-class CPU with 2 GB of memory for every 500 MB/hour of data transfer and 350 clients. Multiple servers can be used as a cluster behind a load balancer to achieve better performance and availability.
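The rule of thumb above can be turned into a quick back-of-the-envelope calculation. The following is a minimal sketch, not part of SymmetricDS: the class and method names are hypothetical, and it assumes the 500 MB/hour and 350 client figures are independent per-server ceilings, so the larger of the two server counts is used. Benchmarking in a lower environment remains the better guide.

```java
// Rough server sizing based on the rule of thumb in this guide.
// Assumption: throughput and client count are treated as independent
// per-server ceilings, and the larger resulting server count wins.
final class ServerSizingEstimate {

    static final double MB_PER_HOUR_PER_SERVER = 500.0;
    static final int CLIENTS_PER_SERVER = 350;
    static final int MEMORY_GB_PER_SERVER = 2;

    /** Returns the estimated number of server-class CPUs (at least 1). */
    static int estimateServers(double mbPerHour, int clients) {
        int byThroughput = (int) Math.ceil(mbPerHour / MB_PER_HOUR_PER_SERVER);
        int byClients = (int) Math.ceil(clients / (double) CLIENTS_PER_SERVER);
        return Math.max(1, Math.max(byThroughput, byClients));
    }

    /** Returns the estimated total memory in GB across those servers. */
    static int estimateMemoryGb(double mbPerHour, int clients) {
        return estimateServers(mbPerHour, clients) * MEMORY_GB_PER_SERVER;
    }

    public static void main(String[] args) {
        // Hypothetical workload: 1200 MB/hour of data transfer and 400 clients.
        System.out.println(estimateServers(1200.0, 400) + " server(s), "
                + estimateMemoryGb(1200.0, 400) + " GB memory");
    }
}
```

When the estimate is more than one server, the servers would be run as a cluster behind a load balancer, as described in the Clustering section.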

SymmetricDS Pro is accessed from a web console, which requires one of the following supported web browsers:

  • Google Chrome 23 or newer

  • Internet Explorer 8 or newer

  • Mozilla Firefox 17 or newer

  • Safari 6 or newer

1.2. Overview

A node is responsible for synchronizing the data from a database or file system with other nodes in the network using HTTP. Nodes are assigned to one of the node Groups that are configured together as a unit. The node groups are linked together with Group Links to define either a push or pull communication. A pull causes one node to connect with other nodes and request changes that are waiting, while a push causes one node to connect with other nodes when it has changes to send.
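To make the difference between push and pull concrete, the sketch below models which side opens the connection for a link that moves data from a source group to a target group. It is illustrative only: the class, enum, and method names are hypothetical and are not the SymmetricDS API.

```java
// Illustrative model of push versus pull on a group link.
// All names here are hypothetical; this is not the SymmetricDS API.
final class GroupLinkSketch {

    enum LinkAction { PUSH, PULL }

    /**
     * For data flowing from sourceGroup to targetGroup, returns the group
     * whose nodes initiate the connection.
     * PUSH: the source connects when it has changes to send.
     * PULL: the target connects and requests changes that are waiting.
     */
    static String initiator(String sourceGroup, String targetGroup, LinkAction action) {
        return action == LinkAction.PUSH ? sourceGroup : targetGroup;
    }

    public static void main(String[] args) {
        // Hypothetical groups: stores push their changes to corp,
        // and stores pull changes that corp has waiting for them.
        System.out.println(initiator("store", "corp", LinkAction.PUSH));
        System.out.println(initiator("corp", "store", LinkAction.PULL));
    }
}
```

In both calls the store group initiates the connection, which is a common arrangement when the central server cannot reach the remote nodes directly.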

Each node is connected to a database with a Java Database Connectivity (JDBC) driver using a connection URL, username, and password. While nodes can be separated across wide area networks, the database a node is connected to should be located nearby on a local area network for the best performance. Using its database connection, a node creates tables as a Data Model for configuration settings and runtime operations. The user populates configuration tables to define the synchronization and the runtime tables capture changes and track activity. The tables to sync can be located in any Catalog and Schema that are accessible from the connection, while the files to sync can be located in any directory that is accessible on the local server.


At startup, SymmetricDS looks for Node Properties Files and starts a node for each file it finds, which allows multiple nodes to run in the same instance and share resources. The property file for a node contains its external ID, node group, registration server URL, and database connection information. The external ID is the name for a node used to identify it from other nodes. One node is configured as the registration server where the master configuration is stored. When a node is started for the first time, it contacts the registration server using a registration process that sends its external ID and node group. In response, the node receives its configuration and a node password that must be sent as authentication during synchronization with other nodes.
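As an illustration of what a node properties file carries, the sketch below assembles the settings named above with java.util.Properties and prints them in properties-file format. The class name and all values (external ID, group, URLs, credentials) are placeholders, and the property keys should be checked against the Node Properties File section for your version.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.util.Properties;

// Builds the contents of a node properties file in memory.
// All values are placeholders; verify the keys against your SymmetricDS version.
final class NodePropertiesSketch {

    static Properties buildNodeProperties(String externalId, String groupId,
                                          String registrationUrl) {
        Properties p = new Properties();
        p.setProperty("external.id", externalId);           // name that identifies this node
        p.setProperty("group.id", groupId);                 // node group this node belongs to
        p.setProperty("registration.url", registrationUrl); // where to register on first start
        // Database connection information (placeholder embedded H2 database).
        p.setProperty("db.driver", "org.h2.Driver");
        p.setProperty("db.url", "jdbc:h2:file:./data/store001");
        p.setProperty("db.user", "symmetric");
        p.setProperty("db.password", "changeit");
        return p;
    }

    public static void main(String[] args) throws IOException {
        Properties p = buildNodeProperties("store001", "store",
                "http://central.example.com:31415/sync/corp-000");
        StringWriter out = new StringWriter();
        p.store(out, "Hypothetical node properties file");
        System.out.print(out);
    }
}
```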

1.3. Architecture

Each subsystem in the node is responsible for part of the data movement and is controlled through configuration. Data flows through the system in the following steps:

  1. Capture into a runtime table at the source database

  2. Route for delivery to target nodes and group into batches

  3. Extract and transform into the rows, columns, and values needed for the outgoing batch

  4. Send the outgoing batch to target nodes

  5. Receive the incoming batch at the target node

  6. Transform into the rows, columns, and values needed for the incoming batch

  7. Load data and return an acknowledgment to the source node

Capture

Change Data Capture (CDC) for tables uses database triggers that fire and record changes as comma-separated values into a runtime table called DATA. For file sync, a similar mechanism is used, except changes to the metadata about files are captured. The changes are recorded as insert, update, and delete event types. The subsystem installs and maintains triggers on tables based on the configuration provided by the user, and it can automatically detect schema changes on tables and regenerate triggers.
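The idea of recording a row change as comma-separated values can be sketched as follows. This is a simplified illustration with hypothetical names. It does not reproduce the exact format or escaping that SymmetricDS triggers write to the DATA table.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Simplified illustration of capturing a row change as comma-separated values.
// The quoting rules here are an assumption, not the actual trigger output format.
final class CaptureSketch {

    enum EventType { INSERT, UPDATE, DELETE }

    /** Quotes each value, doubles embedded quotes, and leaves null values empty. */
    static String toCsv(List<String> values) {
        return values.stream()
                .map(v -> v == null ? "" : "\"" + v.replace("\"", "\"\"") + "\"")
                .collect(Collectors.joining(","));
    }

    /** Formats a captured change as: table, event type, then the row data. */
    static String capture(String table, EventType type, List<String> rowValues) {
        return table + " " + type + " " + toCsv(rowValues);
    }

    public static void main(String[] args) {
        // Hypothetical row from a hypothetical "item" table; the last column is null.
        System.out.println(capture("item", EventType.INSERT,
                Arrays.asList("1001", "Widget, large", null)));
    }
}
```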

Route

Routers run across new changes to determine which target nodes will receive the data. The user configures which routers to use and what criteria are used to match data, creating subsets of rows if needed. Changes are grouped into batches and assigned to target nodes in the DATA_EVENT and OUTGOING_BATCH tables.
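The routing step can be pictured as a function from each change to the nodes that should receive it, followed by grouping into batches per node. The sketch below is an in-memory approximation with hypothetical names and a hypothetical batch size limit. The real routers and the DATA_EVENT and OUTGOING_BATCH bookkeeping are driven by configuration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// In-memory approximation of routing changes to nodes and grouping them into batches.
final class RouterSketch {

    /**
     * Applies the router to each change ID and appends the change to the
     * current batch of every matching node, starting a new batch for a node
     * whenever its current batch has reached maxBatchSize.
     */
    static Map<String, List<List<Long>>> route(List<Long> changeIds,
            Function<Long, List<String>> router, int maxBatchSize) {
        Map<String, List<List<Long>>> batchesByNode = new LinkedHashMap<>();
        for (Long id : changeIds) {
            for (String nodeId : router.apply(id)) {
                List<List<Long>> batches =
                        batchesByNode.computeIfAbsent(nodeId, k -> new ArrayList<>());
                if (batches.isEmpty()
                        || batches.get(batches.size() - 1).size() >= maxBatchSize) {
                    batches.add(new ArrayList<>());
                }
                batches.get(batches.size() - 1).add(id);
            }
        }
        return batchesByNode;
    }

    public static void main(String[] args) {
        // Hypothetical rule: even change IDs go to both stores, odd IDs only to store-002.
        Function<Long, List<String>> router = id -> id % 2 == 0
                ? Arrays.asList("store-001", "store-002")
                : Collections.singletonList("store-002");
        System.out.println(route(Arrays.asList(1L, 2L, 3L, 4L, 5L), router, 2));
    }
}
```

A router that returns only some nodes for a given change is what produces the subsets of rows mentioned above.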

Extract

Changes are extracted from the runtime tables and prepared to be sent as an outgoing batch. If large objects are configured for streaming instead of capture, they are queried from the table. Special event types like "reload" for Initial Loads are also processed.

Transform

If transformations are configured, they operate on the change data either during the extract phase at the source node or the load phase at the target node. The node’s database can be queried to enhance the data. Data is transformed into the tables, rows, columns, and values needed for either the outgoing or incoming batch.

Outgoing

The synchronization sends batches to target nodes to be loaded. Multiple batches can be configured to send during a single synchronization. The status of the batch is updated on the OUTGOING_BATCH table as it processes. An acknowledgment is received from target nodes and recorded on the batch.

Incoming

The synchronization receives batches from remote nodes and the data is loaded. The status of the batch is updated on the INCOMING_BATCH table as it processes. The resulting status of the batch is returned to the source node in an acknowledgment.

1.4. Features

SymmetricDS offers a rich set of features with flexible configuration for large scale deployment in a mixed environment with multiple systems.

  • Web UI - The web console provides easy configuration, management, and troubleshooting.

  • Data Synchronization - Change data capture for relational databases and file synchronization for file systems can be periodic or near real-time, with an initial load feature to fully populate a node.

  • Central Management - Configure, monitor, and troubleshoot synchronization from a central location where conflicts and errors can be investigated and resolved.

  • Automatic Recovery - Data delivery is durable and low maintenance, withstanding periods of downtime and automatically recovering from a network outage.

  • Secure and Efficient - Communication uses a data protocol designed for low bandwidth networks and streamed over HTTPS for encrypted transfer.

  • Transformation - Manipulate data at multiple points to filter, subset, translate, merge, and enrich the data.

  • Conflict Management - Enforce consistency of two-way synchronization by configuring rules for automatic and manual resolution.

  • Extendable - Scripts and Java code can be configured to handle events, transform data, and create customized behavior.

  • Deployment Options - The software can be installed as a self-contained server that stands alone, deployed to a web application server, or embedded within an application.

1.5. Why SymmetricDS?

SymmetricDS is a feature-rich data synchronization solution that focuses on ease of use, openness, and flexibility. The software encourages interoperability and accessibility for users and developers with the availability of source code, an application programming interface (API), and a data model supported by documentation. Configuration includes a powerful set of options to define node topology, communication direction, transformation of data, and integration with external systems. Through scripts and Java code, the user can also extend functionality with custom behavior. With a central database for setup and runtime information, the user has one place to configure, manage, and troubleshoot synchronization, with changes taking immediate effect across the network.

The trigger-based data capture system is easy to understand and widely supported by database systems. Table synchronization can be set up by users and application developers without requiring a database administrator to modify the server. Triggers are database objects written in a procedural language, so they are open for examination, and include flexible configuration options for conditions and customization. Some overhead is associated with triggers, but they perform well for applications of online transaction processing, and their benefits of flexibility and maintenance outweigh the cost for most scenarios.

Using an architecture based on web server technology, many simultaneous requests can be handled at a central server, with proven deployments in production supporting more than ten thousand client nodes. Large networks of nodes can be grouped into tiers for more control and efficiency, with each group synchronizing data to the next tier. Data loading is durable and reliable because batches are tracked in transactions and faults are retried for automatic recovery, making it a low maintenance system.

1.6. License

SymmetricDS Pro is commercial software that is licensed, not sold. It is subject to the terms of the End User License Agreement (EULA) and any accompanying JumpMind Support Contract. See the standard SymmetricDS Pro license for reference, but your agreement with JumpMind may be different.

2. Installation

SymmetricDS at its core is a web application. A SymmetricDS instance runs within the context of a web application container like Jetty or Tomcat, and uses web based protocols like HTTP to communicate with other instances.

An instance has one of the following installation options:

  1. Standalone Installation - SymmetricDS is installed and run as a standalone process using the built-in Jetty web server. This is the simplest and recommended way to install an instance.

  2. Web Archive (WAR) - A SymmetricDS web archive (WAR) file is deployed to an existing web application container that is separately installed, maintained and run.

  3. Embedded - SymmetricDS is embedded within an existing application. In this option, a custom wrapper program is written that calls the SymmetricDS API to synchronize data.

2.1. Standalone Installation

The SymmetricDS Pro setup program is an executable JAR file that can run on any system with a Java Runtime Environment (JRE). See System Requirements for prerequisites. Download the setup program from SymmetricDS Pro Downloads.

Run the setup program:

  • From a desktop environment, double click the symmetric-pro-<version>-setup.jar file

  • If double clicking doesn’t work, use a command prompt to run: java -jar symmetric-pro-<version>-setup.jar

  • From a text-based environment, use a terminal to run: java -jar symmetric-pro-<version>-setup.jar -console


The first screen shows the SymmetricDS Pro software version. The setup program will ask a series of questions before writing files to disk.

To begin selecting options, click Next.


Carefully read the SymmetricDS Pro License Agreement.

If you accept, select I accept the terms of this license agreement and click Next.


Specify Install new software to install a new version of SymmetricDS for the first time.

For upgrading an existing installation of SymmetricDS, see Upgrading.

Click Next to continue.


Choose the installation path where SymmetricDS will either be installed or upgraded. If the directory does not already exist, it will be created for you. Make sure your user has permission to write to the file system.

After entering the directory path, click Next.


Select the packages you want to install and verify disk space requirements are met. By default, all packages are selected. Drivers for popular databases are included, but they can be unselected if you don’t plan to use them.

After selecting packages, click Next.


SymmetricDS can either be run automatically by the system or manually by the user. Select the Install service to run automatically checkbox to install a Windows service or Unix daemon that will start SymmetricDS when the computer is restarted. The service can be installed or uninstalled later using the Control Center or command line (see Running as a Service).

Select the Run server after installing checkbox to also run SymmetricDS after installation so it can be used immediately.

After selecting options, click Next.


HTTPS and HTTP/2 protocols are recommended for protecting data security. For testing without security or encryption, the HTTP protocol can be enabled. Choose an available port number to listen on, which will be validated.

Java Management eXtension (JMX) is an optional way to manage the server from third party tools like JConsole. Most installations leave it disabled and use the web console for management.

Click Next to continue.


Specify how much memory to use for sending and receiving data changes. More memory is needed to communicate with multiple clients and when data contains large objects (LOB). Estimate an extra 5 MB of memory for each client and each 500 MB/hour of data transfer.
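The memory estimate can be worked out with simple arithmetic. The sketch below is only one reading of the guideline: it assumes 5 MB is added per client and another 5 MB per started block of 500 MB/hour, and the class and method names are hypothetical.

```java
// One interpretation of the memory guideline: 5 MB per client plus
// 5 MB for each started 500 MB/hour block of data transfer (an assumption).
final class ExtraMemoryEstimate {

    static final int MB_PER_CLIENT = 5;
    static final int MB_PER_TRANSFER_BLOCK = 5;
    static final double TRANSFER_BLOCK_MB_PER_HOUR = 500.0;

    /** Returns the estimated extra memory in MB for the given workload. */
    static long extraMemoryMb(int clients, double mbPerHour) {
        long transferBlocks = (long) Math.ceil(mbPerHour / TRANSFER_BLOCK_MB_PER_HOUR);
        return (long) clients * MB_PER_CLIENT + transferBlocks * MB_PER_TRANSFER_BLOCK;
    }

    public static void main(String[] args) {
        // Hypothetical workload: 100 clients and 1200 MB/hour of data transfer.
        System.out.println(extraMemoryMb(100, 1200.0) + " MB extra");
    }
}
```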

Click Next to continue.


Specify disk space options for temporarily staging incoming and outgoing data changes. Using staging helps the overall performance of the system and minimizes use of the database. The default location is the "tmp" sub-directory of the installation directory. For Clustering, specify a common network share.

Click Next to continue.


Confirm your installation settings look correct.

Click Next to begin installing files.


The packages you selected are installed to disk.

After it finishes, click Next.


During the finish step, the setup program installs and starts the service if you selected those options.

After it finishes, click Next.


The installation is now complete. Choose if you want to open the SymmetricDS Pro Control Center where you can view the server status and open a web console.

Click Done to exit the setup program.


From the SymmetricDS Pro Control Center, you can start/stop the server, open the web console, and install/uninstall the service.

To begin configuration of SymmetricDS, check that the server is running, and then click Open Web Console.

To continue setup and configuration of SymmetricDS, refer to the Setup section.

2.2. Running as a Service

SymmetricDS can be configured to start automatically when the system boots, running as a Windows service or Linux/Unix daemon. A wrapper process starts SymmetricDS and monitors it, so it can be restarted if it runs out of memory or exits unexpectedly. The wrapper writes standard output and standard error to the logs/wrapper.log file.

For SymmetricDS Pro, you may have already installed SymmetricDS as a service during setup. This section shows how to install the service manually from the command line.

2.2.1. Running as a Windows Service

To install the service, run the following command as Administrator:

bin\sym_service.bat install

Most configuration changes do not require the service to be re-installed. To uninstall the service, run the following command as Administrator:

bin\sym_service.bat uninstall

To start and stop the service manually, run the following commands as Administrator:

bin\sym_service.bat start
bin\sym_service.bat stop

2.2.2. Running as a Linux/Unix daemon

An init script is written to the system /etc/init.d directory. Symbolic links are created for starting on run levels 2, 3, and 5 and stopping on run levels 0, 1, and 6. To install the script, run the following command as root:

bin/sym_service install

Most configuration changes do not require the service to be re-installed. To uninstall the service, run the following command as root:

bin/sym_service uninstall

To start and stop the service manually, run the following commands:

bin/sym_service start
bin/sym_service stop

2.3. Clustering

A single SymmetricDS node may be deployed across a series of servers to cooperate as a cluster. A node can be clustered to provide load balancing and high availability.

When using clustering, a hardware load balancer is typically used, but a software load balancer, such as a reverse proxy, can also be used.

For clustered nodes running SymmetricDS 3.8 and later, the recommended approach is to configure the load balancer to use sticky sessions and ensure that the staging directory for all nodes in the cluster is on a shared network drive. Sticky sessions are needed to support reservation requests, which allow nodes to connect and obtain a reservation before connecting again and pushing their changes. The shared staging directory is needed to support background extraction of the initial load, which is extracted by one node but served by different nodes in the cluster. If the start.initial.load.extract.job property is disabled, then shared staging is not required, but the performance of the initial load may be degraded.

For clustered nodes running SymmetricDS 3.7 and earlier, it is recommended to round robin client requests to the cluster and configure the load balancer for stateless connections.

Also, the sync.url (discussed in Registration URL) SymmetricDS property should be set to the URL of the load balancer.

If the cluster will be running any of the SymmetricDS jobs, then the cluster.lock.enabled property should be set to true. By setting this property to true, SymmetricDS will use a row in the LOCK table as a semaphore to make sure that only one instance at a time runs a job. When a lock is acquired, a row is updated in the lock table with the time of the lock and the server id of the locking job. The lock time is set back to null when the job is finished running. Another instance of SymmetricDS cannot acquire a lock until the locking instance (according to the server id) releases the lock. If an instance is terminated while the lock is still held, an instance with the same server id is allowed to reacquire the lock. If the locking instance remains down, the lock can be broken after a period of time, specified by the cluster.lock.timeout.ms property, has expired. Note that if the job is still running and the lock expires, two jobs could be running at the same time which could cause database deadlocks.
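The locking rules described above can be summarized in a small in-memory model. This is a sketch of the described behavior only, with hypothetical class and method names. SymmetricDS itself keeps this state in a row of the LOCK table, and the timeout corresponds to the cluster.lock.timeout.ms property.

```java
// In-memory model of the cluster lock rules described in this section.
// Hypothetical names; the real lock state lives in a row of the LOCK table.
final class ClusterLockSketch {

    private String lockingServerId; // server id of the last instance to take the lock
    private Long lockTimeMs;        // null means the lock is free
    private final long timeoutMs;   // stands in for cluster.lock.timeout.ms

    ClusterLockSketch(long timeoutMs) {
        this.timeoutMs = timeoutMs;
    }

    /** Tries to take the lock for serverId at time nowMs. */
    synchronized boolean acquire(String serverId, long nowMs) {
        boolean free = lockTimeMs == null;
        // An instance with the same server id may reacquire a lock it still holds.
        boolean sameServer = serverId.equals(lockingServerId);
        // A lock held longer than the timeout can be broken by another instance.
        boolean expired = !free && nowMs - lockTimeMs > timeoutMs;
        if (free || sameServer || expired) {
            lockingServerId = serverId;
            lockTimeMs = nowMs;
            return true;
        }
        return false;
    }

    /** Sets the lock time back to null if serverId is the current holder. */
    synchronized boolean release(String serverId) {
        if (lockTimeMs != null && serverId.equals(lockingServerId)) {
            lockTimeMs = null;
            return true;
        }
        return false;
    }
}
```

The expired branch is where the risk noted above comes from: if the original job is still running when the timeout passes, a second instance can take the lock and run the same job concurrently.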

By default, the locking server id is the hostname of the server. If two clustered instances are running on the same server, then the cluster.server.id property may be set to indicate the name that the instance should use for its server id.

When deploying SymmetricDS to an application server like Tomcat or JBoss, no special session clustering needs to be configured for the application server.

2.4. Other Deployment Options

It is recommended that SymmetricDS be installed as a standalone service; however, there are two other deployment options.

2.4.1. Web Archive (WAR)

This option means packaging a WAR file and deploying to your favorite web server, like Apache Tomcat. It’s a little more work, but you can configure the web server to do whatever you need. SymmetricDS can also be embedded in an existing web application, if desired. As a web application archive, a WAR is deployed to an application server, such as Tomcat, Jetty, or JBoss. The structure of the archive will have a web.xml file in the WEB-INF folder, an appropriately configured symmetric.properties file in the WEB-INF/classes folder, and the required JAR files in the WEB-INF/lib folder.

symmetric war
Figure 1. War

A war file can be generated using the standalone installation’s symadmin utility and the create-war subcommand. The command requires the name of the war file to generate. It essentially packages up the web directory, the conf directory and includes an optional properties file. Note that if a properties file is included, it will be copied to WEB-INF/classes/symmetric.properties. This is the same location conf/symmetric.properties would have been copied to. The generated war distribution uses the same web.xml as the standalone deployment.

bin/symadmin -p my-symmetric-ds.properties create-war /some/path/to/symmetric-ds.war

2.4.2. Embedded

This option means you must write a wrapper Java program that runs SymmetricDS. You would probably use Jetty web server, which is also embeddable. You could bring up an embedded database like Derby or H2. You could configure the web server, database, or SymmetricDS to do whatever you needed, but it’s also the most work of the three options discussed thus far.

The deployment model you choose depends on how much flexibility you need versus how easy you want it to be. Both Jetty and Tomcat are excellent, scalable web servers that compete with each other and have great performance. Most people choose either the Standalone or Web Archive with Tomcat 5.5 or 6. Deploying to Tomcat is a good middle-of-the-road decision that requires a little more work for more flexibility.

A Java application with the SymmetricDS Java Archive (JAR) library on its classpath can use the SymmetricWebServer to start the server.

import org.jumpmind.symmetric.SymmetricWebServer;

public class StartSymmetricEngine {

    public static void main(String[] args) throws Exception {

        // first argument: properties file, second argument: web directory
        SymmetricWebServer node = new SymmetricWebServer(
                                   "classpath://my-application.properties", "conf/web_dir");

        // this will create the database, sync triggers, start jobs running
        node.start(8080);

        // this will stop the node
        node.stop();
    }
}

This example starts the SymmetricDS server on port 8080. The configuration properties file, my-application.properties, is packaged in the application to provide properties that override the SymmetricDS default values. The second parameter to the constructor points to the web directory. The default location is web. In this example the web directory is located at conf/web_dir. The web.xml is expected to be found at conf/web_dir/WEB-INF/web.xml.

3. Setup

Once the SymmetricDS software is installed on a computer and an instance of it is running, the next step in setting up the synchronization scenario is to set up a SymmetricDS node within that running SymmetricDS instance. As a reminder, a SymmetricDS node is connected to a database or file system and is responsible for synchronizing that database’s data to other SymmetricDS nodes within the node network.

3.1. Node Type

When opening the SymmetricDS Pro web console, if there are no SymmetricDS nodes defined within the running SymmetricDS instance, the Node Setup Wizard will be displayed to guide you through the process of creating one. There are two types of SymmetricDS nodes:

  1. Master Node - The master node is typically the first node created when defining a synchronization scenario and serves as the central configuration point for the synchronization scenario. All SymmetricDS configuration is stored in the database this node is attached to. This node is also typically the registration server for the synchronization scenario, accepting registration requests from other nodes and providing them the SymmetricDS configuration they need.

  2. Node - All other nodes are simply designated nodes. They have all of the same responsibilities as a Master Node as far as being responsible for synchronizing a particular database or file system’s data, but have NO responsibilities for configuration or registration.

In addition to the two standard node types above, there is also an option to create a Demo installation. This Demo installation does the following:

  1. Creates two fully populated H2 example databases (a server and a client) to be synchronized

  2. Creates a SymmetricDS Master node within the running SymmetricDS instance attached to the server database

  3. Creates a second Node within the running SymmetricDS instance attached to the client database

  4. Configures SymmetricDS to synchronize the two sample databases

  5. Randomly executes inser