performance

  • Common Batches Just Got Better

    Common batch mode was enhanced in SymmetricDS 3.11 to allow any group of nodes to share batches. Let's look at what common batches are and why this is a big deal for database replication with more than a couple of nodes.

    Before Common

    Early versions of SymmetricDS had no concept of common mode for batches. The router service put changes into separate batches for each node that needed them. For example, if there were 4 nodes on the system and the router service found 3 changed rows of data, it would write a new batch for each node and place the changes inside. The screenshot below shows an example, with batches 58 through 61.

    Outgoing batch screen showing a non-common batch

    Common Batches

    Starting in version 3.0, routing was optimized to write the change once into a common batch for all the nodes. There is still a separate batch entry to track the status for each node, but the entries share the same batch number and the same batch data. The screenshot below shows that batch 63 is shared.

    Outgoing batch screen showing a common batch

    Sharing is Better

    Why is sharing batches better? Because common batches are more efficient to route, extract, and purge. We can see the difference when looking at the runtime model. Changes are captured as "Data" and assigned to one or more "Batches". The many-to-many relationship between the two is recorded in "Data Events". With common batches, the router service has less work to do when creating batches and the purge service has less work to do when cleaning up later. The illustration below shows the difference when routing 3 changes to 4 nodes.

    Common versus non-common batch
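
    To make the counts concrete, here is a minimal Python sketch of the 3 changes and 4 nodes example. It is an illustrative model only, not SymmetricDS code: the function names, the node IDs, and the tuple layout are assumptions made for this example. The batch numbers 58 and 63 are borrowed from the screenshots described above.

```python
def route_non_common(change_ids, node_ids, first_batch_id):
    """Before common mode: one batch per node, each with its own data events."""
    batches = {}      # batch_id -> node_id
    data_events = []  # (data_id, batch_id) pairs
    for offset, node_id in enumerate(node_ids):
        batch_id = first_batch_id + offset
        batches[batch_id] = node_id
        data_events.extend((data_id, batch_id) for data_id in change_ids)
    return batches, data_events


def route_common(change_ids, node_ids, batch_id):
    """Common mode: one batch number and one set of data events for all nodes."""
    outgoing = [(batch_id, node_id) for node_id in node_ids]  # status row per node
    data_events = [(data_id, batch_id) for data_id in change_ids]
    return outgoing, data_events


changes = [1, 2, 3]                    # hypothetical data IDs
nodes = ["001", "002", "003", "004"]   # hypothetical node IDs

batches, events = route_non_common(changes, nodes, first_batch_id=58)
print(len(batches), len(events))          # 4 12

outgoing, common_events = route_common(changes, nodes, batch_id=63)
print(len(outgoing), len(common_events))  # 4 3
```

    In this model the non-common case writes 12 data events for the same 3 changes, while the common case writes 3 and still keeps one status entry per node.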

    Limitations Lifted

    Common mode has been around since version 3.0, but it was limited to channels with only "default" routers and uni-directional replication. In other words, it worked for a channel when sending one-way to all nodes, but not for an arbitrary group of nodes. With the new SymmetricDS 3.11 release, any group of nodes is eligible for a common batch, so common batch mode is now always enabled! Routing that subsets data to groups of nodes by chain, region, district, or any other combination now benefits from common mode sharing of batches.
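
    One way to picture "any group of nodes" sharing a batch is to group changes by the exact set of nodes they are routed to. The sketch below does that in Python. It is a simplified illustration under that assumption, not the actual router implementation, and the store names are hypothetical.

```python
from collections import defaultdict


def group_into_common_batches(routed_changes):
    """Group data IDs by the set of target nodes they were routed to.

    routed_changes: iterable of (data_id, set_of_target_node_ids).
    Returns a dict mapping each distinct node group to its list of data IDs.
    """
    groups = defaultdict(list)
    for data_id, targets in routed_changes:
        # frozenset makes the node group hashable and order-independent
        groups[frozenset(targets)].append(data_id)
    return dict(groups)


routed = [
    (1, {"store-1", "store-2"}),  # hypothetical "east region" stores
    (2, {"store-1", "store-2"}),
    (3, {"store-3"}),             # hypothetical "west region" store
    (4, {"store-2", "store-1"}),
]

for node_group, data_ids in group_into_common_batches(routed).items():
    print(sorted(node_group), data_ids)
```

    Changes 1, 2, and 4 target the same two stores, so in this model they land in one shared batch instead of one batch per store.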


  • Load Balancing SymmetricDS Pro

    Most of our large clients, for whom SymmetricDS plays a mission-critical role in the organization, take advantage of SymmetricDS Pro's out-of-the-box ability to be load balanced. While SymmetricDS does a great job with hands-off recovery in cases of hardware or infrastructure failure, most clients want as little downtime as possible. They simply don't want a single point of failure in their synchronization scenario. The good news is that with SymmetricDS Pro, load balancing is simple and straightforward.

    The Scenario

    In many of these scenarios, SymmetricDS Pro is synchronizing a large server database with thousands of client databases. Most people are OK with the client databases being at risk of downtime due to client infrastructure, but the server is a different story. We see most people load balance the server by running two or more instances of SymmetricDS Pro on separate hardware, pointed at the same server database. Traffic is directed to one of the load-balanced servers by some type of load balancer (mostly hardware-based appliances).

    Configuration Background

    In order to understand the setup, it's helpful to understand the configuration that determines how SymmetricDS Pro nodes communicate with each other. In a typical synchronization scenario, one or more of the SymmetricDS Pro nodes are designated as the registration or root servers. These servers listen on a given port and URL for clients to request registration to the synchronization scenario. The port and URL they listen on are configured by setting the sync.url property on the server. This property is initially set during SymmetricDS Pro server setup, and can also be found by clicking "Configure", "Parameters" from the Pro Management Console, or by looking at the SymmetricDS properties file that gets written in the "engines" subdirectory of your installation.

    In our scenario from above, the three SymmetricDS Pro servers on the right are the registration servers.

    On the client side, the clients need to know the URL to connect to in order to attempt registration to the synchronization scenario. This is the URL from above on which the registration servers are listening. On the client side, this is called the registration URL. It is set up during the client install, and can also be found by clicking "Configure", "Parameters" from the Pro Management Console, or by looking at the SymmetricDS properties file for the client.

    Once the client attempts registration, the server sends the client its sync.url property, so the client knows where to reach the server for subsequent synchronization requests and work. The registration URL is only used during the registration process, not for ongoing synchronization. So, the client uses the registration URL to know where to go to register, and after that uses the sync.url sent from the server during registration for all ongoing synchronization work.
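
    The two-URL behavior can be sketched as a small Python model of the client. This is only an illustration of the flow described above: the class, its method names, and the example hostnames are hypothetical and are not part of SymmetricDS.

```python
class Client:
    """Toy model of which URL a client contacts before and after registration."""

    def __init__(self, registration_url):
        self.registration_url = registration_url
        self.sync_url = None  # learned from the server during registration

    def register(self, server_sync_url):
        # The server hands back its sync.url as part of registration.
        self.sync_url = server_sync_url

    def next_request_url(self):
        # Unregistered clients only know the registration URL.
        return self.sync_url if self.sync_url else self.registration_url


# Hypothetical URLs, chosen to differ so the switch is visible.
client = Client("http://register.example.com:31415/sync/server")
print(client.next_request_url())  # registration URL

client.register("http://sync.example.com:31415/sync/server")
print(client.next_request_url())  # sync URL from the server
```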

    Load Balancing Configuration

    Knowing that background, configuring the load-balanced setup from above is pretty simple. Here are the steps:

    • During Server setup, when specifying Communication Settings, instead of using the default sync URL, click the radio button that says "Define URL setting for load balancer or proxy." Then, type in the Hostname and Port for the load balancer itself. Behind the scenes, this sets the sync.url for the server to the hostname and port (URL) for the load balancer.
    • Set your load balancer to forward requests to any number of SymmetricDS Pro root servers, forwarding them to the server name and port on which SymmetricDS Pro is running on the individual servers.
    • During the client setup, when specifying the Registration URL, specify the URL from above that points to the load balancer. This will get written to the .properties file as the registration.url for the client.

    The Result

    The result of the configuration is as follows:

    • On the Servers, the Hostname and Port that you configured for the load balancer gets written to the server.properties file as the sync.url.
    • On the Clients, the registration URL that you specified gets written to the client .properties file as the registration.url.
    • When the Clients register, they point to the load balancer, which passes off the request to one of the three SymmetricDS Pro servers.
    • The server in turn sends its sync.url (which also points to the load balancer) back down to the client to tell it where to make future synchronization requests.
    • On subsequent synchronization requests to the server, the client looks at the sync.url that was given to it by the server (stored in the sym_node table on the client). This URL points to the load balancer, which forwards the request to one of the actual SymmetricDS Pro servers.
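
    The result above can be simulated in a few lines of Python. This sketch assumes a round-robin load balancer, which is only one possible policy, and every class name and URL in it is hypothetical. The point it illustrates is that all three servers advertise the load balancer's address as their sync.url, so clients never talk to an individual server directly.

```python
from itertools import cycle

LB_URL = "http://lb.example.com:31415/sync/server"  # hypothetical load balancer URL


class RootServer:
    """Toy root server whose sync.url points at the load balancer."""

    def __init__(self, name, sync_url):
        self.name = name
        self.sync_url = sync_url
        self.handled = 0

    def handle(self, request):
        self.handled += 1
        return {"handled_by": self.name, "sync_url": self.sync_url}


class LoadBalancer:
    """Round-robin forwarding (an assumed policy for this example)."""

    def __init__(self, servers):
        self._next = cycle(servers)

    def forward(self, request):
        return next(self._next).handle(request)


servers = [RootServer(f"server-{i}", LB_URL) for i in range(1, 4)]
lb = LoadBalancer(servers)

registration = lb.forward("register")      # client registers through the balancer
client_sync_url = registration["sync_url"]  # and is told to keep using the balancer

responses = [lb.forward("pull") for _ in range(5)]
print(client_sync_url)
print([r["handled_by"] for r in responses])
```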

    Next Up

    In the next couple of weeks, we'll add another couple of articles that discuss the following:

    • Now that I have SymmetricDS Pro load balanced, how do the servers distribute the work?
    • How do the SymmetricDS Pro servers stay out of each other's way, and know who's doing what?
    • We'll also show you how to use the open source Apache HTTP Server as a load balancer.

    Stay tuned for more!

  • Load Testing of Data Replication

    Load testing of database replication finds the upper limit of how well the system can perform, and it provides assurance that replication will make it through times of peak usage. Let's look at how to simulate production activity for SymmetricDS data replication in a lower environment so you can deploy with confidence.

    About the Load Test

    Our load test will orchestrate many simulated SymmetricDS clients syncing data to an actual SymmetricDS server. We'll start with a simple batch of data on the "heartbeat" channel that updates the sym_node_host table. The test framework allows additional batches of data on other channels to be added, so it can be customized to use actual data from an environment. The clients are simulated using The Grinder load testing framework.

    Build the Project

    Clone or download the SymmetricDS Load Test project from GitHub. It requires the Gradle build tool to be installed on your system.

    First, build the agent and install it to a directory. Each agent will run our tests in the background, using multiple threads and processes. Multiple agents can also be installed across different machines to scale out resources.

    $ gradle assemble
    $ unzip build/distributions/symmetric-loadtest.zip -d ~/
    $ mv ~/symmetric-loadtest ~/loadtest-agent
    

    Second, build the console and install it to a directory. You only need one console to control multiple agents.

    $ gradle assemble -Pconsole=true
    $ unzip build/distributions/symmetric-loadtest.zip -d ~/
    $ mv ~/symmetric-loadtest ~/loadtest-console
    

    Running

    Run the console, which opens a graphical application to control the agents and the load test.

    $ cd ~/loadtest-console/bin
    $ ./symmetric-loadtest
    

    Run each agent, which can be run as a background process. Repeat this step for each agent you installed.

    $ cd ~/loadtest-agent/bin
    $ ./symmetric-loadtest
    

    Each agent will connect to the console. Under the "Processes" tab of the console, you should see each agent listed there.

    Set Up the Tests

    In the console, go to the "Script" tab. Here are the files included:

    • loadtest.py - A Python script that contains a "pull" and "push" test
    • loadtest.properties - Configuration file that controls how the tests will behave
    • heartbeat.csv - Example batch of data to replicate for the "heartbeat" channel

    The Grinder console edit loadtest.properties

    Edit the "loadtest.properties" file. Either specify the number of runs (0 means run forever) or how long the test should run in milliseconds. This example runs each test 50 times or for 2 minutes, whichever comes first:

    grinder.runs=50
    grinder.duration=120000
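
    The "whichever comes first" rule can be written out as a small Python function. This models the behavior described here and is not The Grinder's own code; the function name and arguments are made up for the example.

```python
def should_stop(runs_completed, elapsed_ms, runs=50, duration_ms=120000):
    """Return True once either the run limit or the time limit is reached.

    runs=0 means there is no run limit, so only the duration applies.
    """
    run_limit_hit = runs > 0 and runs_completed >= runs
    time_limit_hit = elapsed_ms >= duration_ms
    return run_limit_hit or time_limit_hit


print(should_stop(50, 10_000))            # True: 50 runs finished early
print(should_stop(10, 120_000))           # True: 2 minutes elapsed
print(should_stop(10, 10_000))            # False: neither limit reached
print(should_stop(1000, 10_000, runs=0))  # False: no run limit, time remaining
```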
    

    Tell each agent how many threads to use and how often to run each test. Tests run simultaneously across all the threads. This example has each agent using 3 threads, with each thread running a test every second (1000 milliseconds).

    grinder.threads=3
    time.between.sync.ms=1000
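
    A rough upper bound on the load these settings generate can be estimated with a little arithmetic. The sketch below assumes each thread starts one test per interval and ignores the time each sync request takes, so real rates will be lower. The function name is hypothetical.

```python
def max_syncs_per_second(agents, threads_per_agent, time_between_sync_ms):
    """Upper bound on sync attempts per second across all agents."""
    if time_between_sync_ms <= 0:
        raise ValueError("time_between_sync_ms must be positive")
    return agents * threads_per_agent * 1000.0 / time_between_sync_ms


# Three agents with grinder.threads=3 each:
print(max_syncs_per_second(3, 3, 1000))   # 9.0 with a 1000 ms interval
print(max_syncs_per_second(3, 3, 10000))  # 0.9 with a 10000 ms interval
```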
    

    Configure unique node IDs for each agent to use. This example gives three node IDs to each of the three agents.

    locations.agent.id.0.process.id.0=5,6,7
    locations.agent.id.1.process.id.0=8,9,10
    locations.agent.id.2.process.id.0=11,12,13
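
    When there are many agents, these lines can be generated rather than typed. The following Python sketch reproduces the three lines above. It assumes one process (process.id.0) per agent and consecutive integer node IDs, and the helper name is hypothetical.

```python
def location_lines(first_node_id, agents, nodes_per_agent):
    """Build locations.agent.id.N.process.id.0 lines with consecutive node IDs."""
    lines = []
    next_id = first_node_id
    for agent in range(agents):
        ids = range(next_id, next_id + nodes_per_agent)
        lines.append(
            f"locations.agent.id.{agent}.process.id.0=" + ",".join(map(str, ids))
        )
        next_id += nodes_per_agent
    return lines


print("\n".join(location_lines(first_node_id=5, agents=3, nodes_per_agent=3)))
```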
    

    The SymmetricDS server needs these nodes to be configured in its database. Run SQL on its database to create each node and set its password. The "server.auth.token" property should match the node_password field of the sym_node_security table.

    insert into sym_node (node_id, node_group_id, external_id, sync_enabled, created_at_node_id)
    values ('5', 'client', '5', 1, 'server');
    insert into sym_node_security (node_id, node_password, registration_time, created_at_node_id)
    values ('5', 'test', current_timestamp, 'server');
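
    The SQL above covers node 5 only, and the same pair of statements is needed for every node ID given to the agents. As a convenience, this Python sketch prints the statements for nodes 5 through 13. It assumes integer node IDs, the "client" group, and the "test" password from the example, and the helper name is hypothetical.

```python
NODE_SQL = (
    "insert into sym_node (node_id, node_group_id, external_id, sync_enabled, created_at_node_id)\n"
    "values ('{id}', 'client', '{id}', 1, 'server');"
)
SECURITY_SQL = (
    "insert into sym_node_security (node_id, node_password, registration_time, created_at_node_id)\n"
    "values ('{id}', '{password}', current_timestamp, 'server');"
)


def node_setup_sql(node_ids, password="test"):
    """Return the two insert statements for each node ID."""
    if "'" in password:
        raise ValueError("password must not contain a single quote")
    statements = []
    for node_id in node_ids:
        node_id = int(node_id)  # accept integers only, so nothing unsafe is formatted in
        statements.append(NODE_SQL.format(id=node_id))
        statements.append(SECURITY_SQL.format(id=node_id, password=password))
    return statements


statements = node_setup_sql(range(5, 14))  # node IDs 5..13
print("\n".join(statements))
```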
    

    Configure the test to target a SymmetricDS server. It needs to know the node ID and sync URL of the target server that the clients will connect to.

    target.node.id=server
    server.url=http://localhost:31415
    server.path=/sync/server-central
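
    As a quick sanity check of these three settings, the Python sketch below parses them and joins server.url and server.path into a full sync URL. How loadtest.py actually combines the two values is an assumption here, and both helper functions are hypothetical.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def target_sync_url(props):
    """Join server.url and server.path with exactly one slash between them."""
    return props["server.url"].rstrip("/") + "/" + props["server.path"].lstrip("/")


EXAMPLE = """
target.node.id=server
server.url=http://localhost:31415
server.path=/sync/server-central
"""

props = parse_properties(EXAMPLE)
print(props["target.node.id"])  # server
print(target_sync_url(props))   # http://localhost:31415/sync/server-central
```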
    

    Perform the Load Test

    After configuring the test and saving the file, select Distribute -> Distribute Files from the menu to send the configuration out to all the agents. To perform the load testing, select Action -> Start Processes. Now switch to the "Results" tab to watch statistics.

    The Grinder console edit loadtest.properties

    As the database replication runs, the tests per second (TPS) and batches will increase. When a test runs, it may not successfully sync a batch of data. The SymmetricDS server can reject data synchronization requests with a "too busy" response when the number of concurrent requests exceeds its http.concurrent.workers.max parameter. The parameter can be adjusted to control an acceptable amount of load on the database.
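
    The "too busy" behavior can be pictured as a fixed pool of worker slots. The Python sketch below is a toy model of that idea using a semaphore. It is not how the SymmetricDS server is implemented, and the class and method names are hypothetical.

```python
import threading


class BusyServer:
    """Toy server that rejects work once all worker slots are in use."""

    def __init__(self, max_workers):
        self._slots = threading.BoundedSemaphore(max_workers)

    def try_start(self):
        """Return True if the request is accepted, False for 'too busy'."""
        return self._slots.acquire(blocking=False)

    def finish(self):
        """Free the slot held by a completed request."""
        self._slots.release()


server = BusyServer(max_workers=2)  # stands in for http.concurrent.workers.max=2

results = [server.try_start() for _ in range(3)]
print(results)  # [True, True, False]: the third request is "too busy"

server.finish()             # one request completes
retry = server.try_start()
print(retry)                # True: the retry now gets a slot
```

    In a load test, a rejected sync is simply retried on the next interval, which is why the results can show more tests than successfully synced batches.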

    Going Further

    We've simulated a load test for SymmetricDS database replication that included "heartbeat" batches of data. Now you can add more channels and batches to the test by configuring them in the "loadtest.properties" file. To add batch files from an actual environment, copy them from the SymmetricDS staging directory (under the "tmp" sub-directory) or save them from the Manage -> Outgoing Batches screen. You can also run your batch processes on the server's database to simulate changes that the clients will pull and acknowledge. Now you have a working load test harness to build a simulated production load.

  • SymmetricDS 2.2.0 Released

    SymmetricDS 2.2.0 is a minor release that brings 15 bug fixes, 8 improvements, and 5 new features.  Some highlights include:

    • The incoming data loader was changed to use a single connection to eliminate contention with a nested transaction and reduce the connection pool requirements. 
    • The purge job that cleans up the runtime tables is now three separate jobs that can be scheduled independently.
    • A new query plan is being used to route and batch data changes that is more efficient.

    This release also includes a new project organization and reduced library dependencies in preparation for small footprint clients.  We've created a new Mobile Development forum to discuss plans for an upcoming mobile implementation.

    For the full release notes, click through to read more.