Syncing Sales Data with High Throughput

Professional services company syncs sales data to a data warehouse with high throughput.

Headquarters
Menlo Park, California

Industry
Professional Services

Number of Nodes
2

This S&P 500 professional services company needed to integrate existing systems by synchronizing sales data into a data warehouse. To consolidate sales data from more than 400 locations worldwide, the business wanted a central database where business intelligence tools could analyze the raw data through data mining, analytics, and reporting.

With existing systems deployed in the cloud across machines in Amazon Web Services, communication needed to be secure, make efficient use of the network, and recover automatically from outages. The solution also needed to be easy to access and manage remotely. The company's current ETL (extract, transform, load) tool could move data between systems with bulk operations, but the team needed to respond quickly to structural changes and handle deletes on the source system, both of which required too much overhead and complexity to accomplish with ETL. Instead, they decided that data replication would be a simpler, easier-to-maintain solution.
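The difference between a bulk load and replication when rows are deleted can be illustrated with a minimal sketch. This is not SymmetricDS code or the customer's ETL job: the in-memory dictionaries, the `bulk_upsert` and `apply_changes` helpers, and the sample rows are all hypothetical stand-ins chosen for illustration.

```python
# Illustrative only: plain dicts stand in for the source and warehouse tables.

def bulk_upsert(source_rows, target):
    """ETL-style bulk load: copy every row currently in the source."""
    for key, row in source_rows.items():
        target[key] = dict(row)
    return target


def apply_changes(change_log, target):
    """Replication-style apply: replay captured changes in order."""
    for op, key, row in change_log:
        if op in ("insert", "update"):
            target[key] = dict(row)
        elif op == "delete":
            target.pop(key, None)  # deletes propagate explicitly
        else:
            raise ValueError(f"unknown operation: {op}")
    return target


# Both targets start in sync with the source: sales rows 1 and 2.
initial = {1: {"amount": 100}, 2: {"amount": 250}}
etl_target = {k: dict(v) for k, v in initial.items()}
repl_target = {k: dict(v) for k, v in initial.items()}

# The source then deletes row 2 and inserts row 3.
source_now = {1: {"amount": 100}, 3: {"amount": 75}}
change_log = [("delete", 2, None), ("insert", 3, {"amount": 75})]

bulk_upsert(source_now, etl_target)
apply_changes(change_log, repl_target)

print(sorted(etl_target))   # the deleted row 2 is still present
print(sorted(repl_target))  # matches the source
```

In this sketch the bulk load needs extra work, such as a full comparison or a truncate and reload, to notice the missing row, while the change log carries the delete as an ordinary event.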

The technical staff looked for a data replication solution with cross-platform support for multiple databases. They preferred open source solutions for their more modular, flexible, and interoperable design. Other qualities they wanted in data sync software were a lightweight architecture, easy configuration, efficiency, and good overall design. SymmetricDS Pro was selected as the database replication server to sync data from the PostgreSQL operational database to the Oracle data warehouse.

The technical team deployed SymmetricDS Pro to their AWS testing environment and configured it to replicate various aspects of the sales data. The functionality was a good fit for their use case, but as they began running stress tests, there were concerns about meeting production-sized data loads within the short time frame needed by their application.

JumpMind was engaged to review the situation. “They were bulk loading data in a short time frame, while disabling channels to control the flow, which was causing a unique challenge for the replication,” said Mark Michalek, Software Engineer at JumpMind. “We started a collaboration where the customer would send us snapshots of the system running a test, and we would send back parameter changes to optimize it.” Through the analysis, the development team also discovered an opportunity in the code for further optimization, and prepared a performance patch for the customer to install.

The teamwork paid off, and the customer met the data replication requirements of the project. “Whenever working with a new tool, one of the concerns is always the level of support available,” said Larry Rutledge, Senior Software Engineer for the customer’s technical team. “JumpMind has far exceeded my expectations in both responding to my questions and partnering with me to enhance the product based on my experience. I have been extremely impressed with JumpMind and all the engineers I’ve dealt with and look forward to a long term relationship.”