Since the beginning of the SymmetricDS project, change data capture has been supported on Oracle and Oracle RAC. That’s because Oracle is a popular database management system, ranking #1 on the DB-Engines Ranking for years. Recently, I helped deploy SymmetricDS in a unique application environment based on Oracle RAC, where it was challenged to keep up with bulk, concurrent changes. By adjusting some new parameters specifically designed for RAC, we were able to meet performance requirements.
Nodes and Throughput
Let’s start with some background on SymmetricDS and Oracle RAC. A common use case for SymmetricDS is capturing changes from many remote databases and replicating them in real time to a central database. For example, replicating retail stores, water vessels, sports stadiums, oil rigs, kiosks, law offices, or medical devices to regional and central databases. This use case works well because it leverages proven web server technology at the central server to manage incoming requests and load data changes into the database. On the other hand, a use case with a low number of databases, but the same high throughput can be more challenging and need additional tuning.
How Change Data Capture Works
Change data capture with SymmetricDS works by using triggers to capture changes to a common log table called SYM_DATA. Each change gets a unique sequence ID and transaction ID to preserve order and atomicity. In order words, it guarantees that changes will be loaded to the target system in the same order and transaction. Each database platform has a feature to generate the unique ID for a change event. On Oracle, it uses a separate object in the database, aptly called a sequence, that hands out the next ID.
What is Oracle RAC?
Oracle Real Application Clusters (RAC) is a cluster of multiple interconnected servers that appear as if they are one database to the application and end user. The RAC nodes have a private, high-speed network that connects them so they can share information quickly and reliably. It is a “shared everything” architecture, meaning each node has access to all data using a network file system that is cluster aware. Some operations on RAC are coordinated between nodes. For example, if a query requires data that is cached on another node in the RAC, it is transferred over the interconnect instead of reading from disk.
Unique Retail Deployment
In the deployment I helped with, the retailer has remote locations that connect directly to an Oracle RAC database over a private WAN. There are actually several Oracle RAC databases, each one servicing a different channel partner’s remote locations. SymmetricDS is being used to consolidate all the databases into a single multi-channel database in real time to support cross-store returns and reporting. The challenge came during the end of day processing, with large groups of stores in each time zone reconciling and posting their sales summaries. Overhead from change capture slowed the process down enough that the application received a time out.
Oracle provides a nice performance gathering and reporting tool called Automatic Workload Repository (AWR). From the AWR report, the bottleneck showed up as the sequence for SYM_DATA. On Oracle, SymmetricDS configures the sequence to be ordered and cached in memory. On a standalone Oracle database, this is the fastest setting, but on Oracle RAC, it incurs cluster overhead to coordinate the next ordered value.
The solution for better SymmetricDS performance on Oracle RAC is using a NOORDER sequence. To achieve that, edit the engine file, set the parameter for oracle.sequence.noorder=true, and restart the service. Now, each RAC node can independently use its cache of sequences much quicker. SymmetricDS adjusts its routing and extract to order by time instead. It also uses the gv$_sequences view to follow the gaps created by each RAC node, so as not to expire them. (Gaps are a tricky topic to get into, but just know that a gap is when there are numbers missing in the sequence.) Not all Oracle DBAs like to grant access to gv$_sequences view, so another parameter oracle.sequence.noorder.nextvalue.db.urls can be set with a list of database URLs to each RAC node instead, where a heartbeat is recorded to keep gaps from expiring.
With the parameters set, we easily processed end of day for all locations! Now you know a couple of tuning tricks for change data capture on Oracle RAC. As always, let us know what we can do to help with your project.