What is this "Table Comparison" and Where is it used?
The so-called "table comparison" is a functionality based on SAPuptool to verify if the number of rows been exported is the same as the number of rows in the database.
There is currently only one scenario released where it is officially used: during the Database Migration Option (DMO) used in the Software Update Manager (SUM).
In SUM versions since Mid of 2016 the Table Checker is no longer required with DMO as it performs what is called a Count* which integrates and enhances the Table Checker functionality.
More details can be found in the SAP KBA
2348804 - Count* replaces Table Checker for DMO use cases and how to use it for Audit Purposes.
How to switch it on?
This decision can be made in phase MIG2NDDB_INI (module PREP_INPUT).
=========== Database Migration Option ===========
Enter the target database type:
01) - No migration
02) * SAP HANA DATABASE
TYPE: SAP HANA
=========== Migration Parameters ===========
Select tables for table comparison checks.
01) - Do not compare table contents
02) * Operate on all tables
03) - Operate on selected tables only
TABLES: Operate on all tables
Relevant phasenames in SUM
First thing to know is, that the migration (DMO) is separated into „uptime-migration“ (repository tables) and „downtime-migration“ (the „big“ thing).
During „uptime-migration“ the new shadow repository (~ tables) is copied to the target database. In the „downtime-migration“ all the application tables will be converted to the target database.
Relevant phases are:
How does it work?
Besides the "real" content check, there is always a count(*) done after each table migration to ensure the same row number in source/target table. The migration won't continue unless counters match!
The "real" content check (called table comparison) is based on CRC's (cyclic redundancy checksums):
Initially it generates CRC checksums for blocks of e. g. 1000 rows.
(Table is traversed in primary key order to calculate reproducable checksums. The block size is increased automatically if table appears large to avoid large number of generated checksums.)
Explanations to the above picture:
these are the migration logs (1) for one single table (2), the left side contains the EXPORT, the right side the IMPORT
- it's composed of 3 colums: number of rows per block (3.1),checksum (3.2), last key in this block (3.3)
differences to drill down are shown in red
1st red line: EXPORT -> crc: 1000 0xa6f4640b 'BC325X30', IMPORT -> crc: 1001 0x340ed60e 'BC325X30'
-> there is not only a difference in the checksum, but also in the number rows in the block
2nd/3rd red line: there is a difference only in the checksums, means the row on the target table is different to the source table
4th red line: again number of rows and checksum is different
Repeatedly it „drills down“ into blocks where differences have been detected (non-matching row count or differing checksum).
(It decrease the blocksize repeatedly by factor of 50 until single rows are reached. There is also a limit of maximum numbers of single row differences to protect against systematic errors.)
Explanations to the above picture:
- This is the continuation of the first error: EXPORT -> crc: 1000 0xa6f4640b 'BC325X30', IMPORT -> crc: 1001 0x340ed60e 'BC325X30'
- (1) this is the last of the previous block, mentioned here just to know where to start
(2) the number of rows per block is drilled down to 20 to converge the different row
(3) a difference in rowcount and checksum has been found in a block of 20 rows.
There are 19 rows in export with key range "ABAPTREE" +1 to "ACL_ACTGRE", but 20 rows at import side. Therefor the CRC is different as well.
(4) new drill down to 1 rows per block, now each row is scanned separately
(5) a difference in rowcount and checksum has been found. There was a single row on the export side, but there are 2 rows in the same key area on the import side.
There is one row with key "ACC_INPH3_DPR" in export, but 2 rows in the same key range ("ABDOCUSUBJECTS" +1 to "ACC_INPH3_DPR") on the import side.
(6) a last drill down shows the row with the key which is identical on export/import and the row which is missing in export but was added during import.
A row with key "ABTREE" has been inserted in the target database table.
- To achieve a high parallelism there is a automatic distribution of CRC tasks e.g. by generating splitted "Where" conditions for large tables, of grouping into buckets, and/or grouping and distributing of CRC tasks into buckets.
Here is an example on how a "Where" condition is built (based on the Primary key)
A typical error message could look like this:
with following meanings of KEY:
D -> this key has been added to the target table
U -> this key has been updated/changed in the target table
I -> this key is missing in the target table
Syntax to compare an EXPORT and an IMPORT log e.g. using SDIFF command (on Unix):
example: sdiff -w200 migrate_ut_crc/MIGRATE_UT_CRC_CHK_00014_REPOSRC_EXP.CRC migrate_ut_crc/MIGRATE_UT_CRC_CHK_00014_REPOSRC_IMP.CRC | less -S
It is also possible to use an external editor like Notepad++, ...