Integrating SQL Server CDC with Oracle: A Guide

Part 1

·

5 min read

Integrating SQL Server CDC with Oracle: A Guide

As you may know, Change Data Capture (CDC) is not a new feature for MS SQL Server. CDC is a technology used to track and capture changes made to data in a database.

In summary, CDC:

  1. Monitors and records changes in a database.

  2. Captures insert, update, and delete operations.

  3. Facilitates data synchronization and integration.

  4. Stored data in table format.

  5. Table level capture.

  6. Captured with Log Sequence Numbers (LSN).

  7. Enhances data warehousing and ETL workflow.

However, not many people know that MS SQL offers a CDC feature for Oracle called Attunity CDC for Oracle. This allows most workloads and setups to be hosted on MS SQL instead of Oracle. It also means users don't need to pay extra for the costly Oracle GoldenGate.

Please don't get me wrong, Oracle GoldenGate is a comprehensive software for real-time data integration and replication. It just requires more configuration and a license to operate.


Attunity CDC for Oracle

Attunity CDC for Oracle acts as a bridge connecting Oracle and SQL, instead of using a Linked server. It leverages the SQL Server 2012+ CDC feature to capture changed data from Oracle Logging Base. You can test the CDC feature between SQL and SQL; the operations are the similar as between SQL and Oracle.

The bridge includes a CDC Designer GUI tool to monitor and operate the CDC service and instance for Oracle. This GUI helps you automatically build the CDC database and tables.

Your main tasks involve using CDC Flow Components on SSIS to create an Initial Load for each table (CDC captures the changed data not the entire data). Then, create an Incremental Load to transfer the changed data to the destination tables.

CDC Structure

A Change Data Capture (CDC) table typically includes metadata columns that help track changes in the source table. Here is an example of what a CDC table might look like:

$operation$$seqval$$start_lsn$$end_lsn$$update_mask$Column1Column2Column3...
10x0000000000010x000000000001NULL0x0001Value1Value2Value3...
20x0000000000020x000000000002NULL0x0002Value4Value5Value6...
30x0000000000030x000000000003NULL0x0004Value7Value8Value9...

Explanation of Columns:

  • __$operation$: Indicates the type of operation (1 = delete, 2 = insert, 3 = update).

  • __$seqval$: A sequence value that helps order the changes.

  • __$start_lsn$: The Log Sequence Number (LSN) at the start of the change.

  • __$end_lsn$: The LSN at the end of the change (NULL if the change is still ongoing).

  • __$update_mask$: A bitmask indicating which columns were updated.

  • Column1, Column2, Column3, ...: The actual data columns from the source table.

This structure allows for efficient tracking and querying of changes made to the source table.

Note
Each table will have its own Change Data Table located under SQL System Tables. CDC Designer continuously reads LCN (SCN on Oracle). If there are any changes (Insert, Update, Delete), the values will be recorded in its own change table suffixed with cdc.xxxxx_CT. By default, the data will be wiped out every 3 days. Essentially, it will create two jobs called cdc.xxx_cleanup and cdc.xxx_capture. For some reason, the second job won’t be created automatically, so I have to create it manually.

Prerequisites for Installation

  • Turn Oracle Database to Archive-Log mode.

  • Install Oracle Client & CDC for Oracle by Attunity, both x86 & x64. They are located in SQL ISO location.

    • Tools\AttunityCDCOracle

      • x64 contains the 64-bit components

      • x86 contains the 32-bit components

  • Understand how CDC Flow Components works.

  • User Roles (will post in another article)

  • User accounts:

    • One for SQL Engine/Agent; one for CDC Windows Service (it will not be used for Oracle or SQL).

    • CDC service master password is required for each change data database. This key is used to encrypt the password of the Oracle log mining account.

    • One Oracle log mining user.

    • One Oracle DBA user for configuring settings.


Installation

Open SQL for Oracle CDC client

  • Enter SQL Server name

  • It will create a database named "MSXDBCDC" that manages the CDC service.

  • Account requires a SQL server system administrator(sysadmin).

  • Scripts can be found in my Github.com

Open Oracle CDC Service

  • Service name: any name is fine.

  • Service account: account for running Windows Oracle CDC service. Windows domain account is recommended.

  • Associated SQL Server Instance:

    • SQL authentication account is recommeded.

    • Account should be granted the public and dbcreator roles.

  • CDC master password: It is required for each change data database. This key is used to encrypt the password of the Oracle log mining account or if you move the CDC database.

Create Oracle CDC Instance

It will create a database (not the source or destination database) for each CDC instance (e.g., OraCDC).

Connect to Oracle source Database

  • Enter the connection string for the Oracle source database.

Select Oracle Tables

It requires an Oracle DBA role to execute a supplemental logging script on the Oracle side remotely. This can be done manually later if needed. It prepares mirror tables, and if it fails, it will roll back all actions.

Here is the code example for this step:

BEGIN
EXECUTE IMMEDIATE ('ALTER TABLE "SCHEMA"."OBJECTNAME" LOGGING');
DECLARE GroupExists NUMBER;
BEGIN
SELECT COUNT(*) INTO GroupExists FROM DBA_LOG_GROUPS WHERE LOG_GROUP_NAME = 'MS_OBJECTNAME319eabd7_00' AND OWNER = 'SCHEMA' AND TABLE_NAME = 'OBJECTNAME' ;
IF GroupExists = 1 THEN
EXECUTE IMMEDIATE ('ALTER TABLE "SCHEMA"."OBJECTNAME" DROP SUPPLEMENTAL LOG GROUP "MS_OBJECTNAME319eabd7_00"');
END IF;
EXECUTE IMMEDIATE ('ALTER TABLE "SCHEMA"."OBJECTNAME" ADD SUPPLEMENTAL LOG GROUP "MS_OBJECTNAME319eabd7_00" ("OBJECTNAME_PID","OBJECTNAME_EMPLOYEE_CODE","OBJECTNAME_SEQNO","OBJECTNAME_FROM_DATE","OBJECTNAME_TO_DATE") ALWAYS');
END;
END;
/

Now that you have finished setting up the SQL Instance and CDC Service, the next step is to operate a CDC instance.


Operating a CDC instance

The detailed status is set to STOP by default.

Start the CDC instance.

The status will change from INITIAL to IDLE to PROCESSING.