
 

This page shows how to find, from the HANA traces, when system replication was registered and when a takeover was started.
To analyze system replication issues (for example, problems with registration or takeover), we first need to know the timestamp at which the issue occurred.

 

1. Register:

Look in the SYSTEMDB nameserver trace on the secondary system, nameserver_<hostname>.00000.001.trc (the nameserver .00000 trace records all the commands that are executed on HANA).

This trace contains an entry showing the command that registered the secondary system. Search for the keyword "hdbnsutil -sr_register" to locate the timestamp.

[18992]{-1}[-1/-1] 2018-05-15 06:32:26.651250 i Basis TraceStream.cpp(00608) : ==== Starting hdbnsutil, version 2.00.024.00.1520347815 (fa/hana2sp02), build linuxx86_64 cbbe109e04fad8d084436a2884d31f77c1e233d7 2018-03-06 16:28:38 ld4554 gcc (SAP release 20170307, based on SUSE gcc6-6.2.1+r239768-2.4) 6.2.1 20160826 [gcc-6-branch revision 239773]

[18992]{-1}[-1/-1] 2018-05-15 06:32:26.651291 i Basis ProcessorInfo.cpp(00798) : Using GDT segment limit to determine current CPU ID

[18992]{-1}[-1/-1] 2018-05-15 06:32:26.651293 i Basis Timer.cpp(00660) : Using RDTSC for HR timer

[18992]{-1}[-1/-1] 2018-05-15 06:32:26.661010 i Basis TopologyUtil.cpp(03742) : command: hdbnsutil -sr_register --name=SITEB --remoteHost=mo-1de3745d7 --remoteInstance=0 --replicationMode=sync --operationMode=logreplay --force_full_replica --sapcontrol=1

[18992]{-1}[-1/-1] 2018-05-15 06:32:26.661966 i Basis TopologyUtil.cpp(03768) : called by user 'hreadm' with UID: 1002 (parent process command line '/bin/sh /usr/sap/HRE/HDB00/HDBSettings.sh hdbnsutil -sr_register --name=SITEB --remoteHost=mo-1de3745d7 --remoteInstance=0 --replicationMode=sync --operationMode=logreplay --force_full_replica --sapcontrol=1 ' with PID: 18973) (parent process executable /bin/bash')

[18992]{-1}[-1/-1] 2018-05-15 06:32:26.711156 i sr_nameserver DRClient.cpp(00286) : Remove takeover file: /usr/sap/HRE/SYS/global/hdb/sr_takeover.sem

[18992]{-1}[-1/-1] 2018-05-15 06:32:26.715587 i sr_nameserver DRClientUtils.cpp(00458) : drReadAndSerializeSystemPKISSFSStoreAndKeyChecksum: SystemPKI-SSFS MD5 requested

[18992]{-1}[-1/-1] 2018-05-15 06:32:26.715746 i sr_nameserver DRClientUtils.cpp(00474) : drReadAndSerializeSystemPKISSFSStoreAndKeyChecksum: systemPKI-SSFS data size=2960,2960, MD5=3490537006-570804619-2819018709-158454443

[18992]{-1}[-1/-1] 2018-05-15 06:32:26.715796 i sr_nameserver DRClientUtils.cpp(00480) : drReadAndSerializeSystemPKISSFSStoreAndKeyChecksum: systemPKI-SSFS key size=187,187, MD5=1691635447-2072228501-1565580830-815000704

[18992]{-1}[-1/-1] 2018-05-15 06:32:26.980813 i Basis TopologyUtil.cpp(04025) : done.
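The keyword search described above can also be scripted. The following is a minimal sketch in Python, assuming the trace line layout seen in the excerpt above (thread ID, connection info, timestamp, level, component, source location, message). The function name `find_hdbnsutil_commands` and the regular expression are illustrative assumptions, not part of any SAP tool. The sample lines are copied from the excerpt.

```python
import re
from datetime import datetime

# Assumed layout, taken from the excerpt above:
# [thread]{conn}[x/y] YYYY-MM-DD HH:MM:SS.ffffff <level> <component> <file>(<line>) : command: hdbnsutil ...
COMMAND_LINE = re.compile(
    r"^\[\d+\]\{-?\d+\}\[-?\d+/-?\d+\] "
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{6}) "
    r"\w .*? : command: (?P<cmd>hdbnsutil .*)$"
)


def find_hdbnsutil_commands(lines, option):
    """Return (timestamp, command) pairs for hdbnsutil calls that contain `option`."""
    hits = []
    for line in lines:
        match = COMMAND_LINE.match(line.strip())
        if match and option in match.group("cmd"):
            timestamp = datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S.%f")
            hits.append((timestamp, match.group("cmd")))
    return hits


# Lines copied from the trace excerpt above.
SAMPLE_LINES = [
    "[18992]{-1}[-1/-1] 2018-05-15 06:32:26.651293 i Basis Timer.cpp(00660) : Using RDTSC for HR timer",
    "[18992]{-1}[-1/-1] 2018-05-15 06:32:26.661010 i Basis TopologyUtil.cpp(03742) : command: hdbnsutil -sr_register --name=SITEB --remoteHost=mo-1de3745d7 --remoteInstance=0 --replicationMode=sync --operationMode=logreplay --force_full_replica --sapcontrol=1",
    "[18992]{-1}[-1/-1] 2018-05-15 06:32:26.980813 i Basis TopologyUtil.cpp(04025) : done.",
]
```

To scan a real trace, pass the open file object instead of the sample list, for example `find_hdbnsutil_commands(open(path, errors="replace"), "-sr_register")`, where `path` points to the nameserver .00000 trace on the secondary. The same function works for the takeover command by passing "-sr_takeover".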

2. Takeover:

Look in the SYSTEMDB nameserver trace on the secondary system (for a multitenant database, the nameserver traces are found under SYSTEMDB):

In nameserver_<second_hostname>.00000.000.trc, we can find the takeover command and when it was triggered. Search for the keyword "hdbnsutil -sr_takeover" to locate the timestamp.

[13478]{-1}[-1/-1] 2018-12-03 06:52:58.729936 i Basis TopologyUtil.cpp(04425) : command: hdbnsutil -sr_takeover
[13478]{-1}[-1/-1] 2018-12-03 06:52:58.730332 i Basis TopologyUtil.cpp(04451) : called by user 'hkradm' with UID: 1000 (parent process command line '-sh ' with PID: 12553) (parent process executable /bin/bash')


In nameserver_<second_hostname>.31201.000.trc, we can see when the takeover finished:

[10048]{-1}[-1/-1] 2018-12-03 06:52:58.776940 i sr_nameserver DRClientUtils.cpp(00510) : Write phase '1' to takeover file: /usr/sap/HKR/SYS/global/hdb/sr_takeover.sem
[10048]{-1}[-1/-1] 2018-12-03 06:52:58.776976 i sr_nameserver DRRequestHandler.cpp(00413) : issuing a normal takeover
[10048]{-1}[-1/-1] 2018-12-03 06:52:58.777018 i sr_nameserver TREXNameServer.cpp(16333) : Load topology for online takeover
[10048]{-1}[-1/-1] 2018-12-03 06:52:58.777111 i PersistenceManag DisasterRecoverySecondaryImpl.cpp(01074) : Prepare takeover started (operationMode=logreplay) ..
[10048]{-1}[-1/-1] 2018-12-03 06:52:58.777126 i PersistenceManag DisasterRecoverySecondaryImpl.cpp(01086) : Takeover: invalidating connections

...

[13512]{-1}[-1/-1] 2018-12-03 06:53:30.891777 e sr_nameserver TREXNameServer.cpp(04278) : "/usr/sap/HKR/SYS/global/hdb/install/bin/hdbupdrep -s HKR --user_store_key=SRTAKEOVER -b" failed; exitState=0; exitCode=256
[13512]{-1}[-1/-1] 2018-12-03 06:53:30.891803 i sr_nameserver TREXNameServer.cpp(04254) : issueing "/usr/sap/HKR/SYS/global/hdb/install/bin/hdbupdrep -s HKR --user_store_key=SRTAKEOVER -b" (assertUpdate=0; try 5)
[13512]{-1}[-1/-1] 2018-12-03 06:53:36.900884 e sr_nameserver TREXNameServer.cpp(04278) : "/usr/sap/HKR/SYS/global/hdb/install/bin/hdbupdrep -s HKR --user_store_key=SRTAKEOVER -b" failed; exitState=0; exitCode=256
[13512]{-1}[-1/-1] 2018-12-03 06:53:36.900907 i sr_nameserver TREXNameServer.cpp(04287) : finished successfully "/usr/sap/HKR/SYS/global/hdb/install/bin/hdbupdrep -s HKR --user_store_key=SRTAKEOVER -b"
[13512]{-1}[-1/-1] 2018-12-03 06:53:36.907954 i sr_nameserver TREXNameServer.cpp(04407) : Takeover finished successfully
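Once both timestamps are located, the time between triggering the takeover and its completion can be computed. The following is a minimal sketch in Python, assuming the two marker strings that appear in the excerpts above ("command: hdbnsutil -sr_takeover" and "Takeover finished successfully"). The helper names `first_timestamp` and `takeover_duration` are hypothetical, and the sample lines are copied from the excerpts.

```python
import re
from datetime import datetime

# Timestamp format used in the trace excerpts above.
TIMESTAMP = re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{6}")


def first_timestamp(lines, marker):
    """Return the timestamp of the first line containing `marker`, or None."""
    for line in lines:
        if marker in line:
            match = TIMESTAMP.search(line)
            if match:
                return datetime.strptime(match.group(0), "%Y-%m-%d %H:%M:%S.%f")
    return None


def takeover_duration(command_trace_lines, nameserver_trace_lines):
    """Return the time from the takeover command to 'Takeover finished successfully'.

    Returns None if either marker is missing. Only the first match in each
    input is used, so a trace covering several takeovers needs to be cut
    down to the relevant time window first.
    """
    start = first_timestamp(command_trace_lines, "command: hdbnsutil -sr_takeover")
    end = first_timestamp(nameserver_trace_lines, "Takeover finished successfully")
    if start is None or end is None:
        return None
    return end - start


# Lines copied from the two trace excerpts above.
COMMAND_TRACE = [
    "[13478]{-1}[-1/-1] 2018-12-03 06:52:58.729936 i Basis TopologyUtil.cpp(04425) : command: hdbnsutil -sr_takeover",
]
NAMESERVER_TRACE = [
    "[10048]{-1}[-1/-1] 2018-12-03 06:52:58.776976 i sr_nameserver DRRequestHandler.cpp(00413) : issuing a normal takeover",
    "[13512]{-1}[-1/-1] 2018-12-03 06:53:36.907954 i sr_nameserver TREXNameServer.cpp(04407) : Takeover finished successfully",
]
```

For the excerpts on this page, `takeover_duration(COMMAND_TRACE, NAMESERVER_TRACE)` gives a duration of about 38.18 seconds (06:52:58.729936 to 06:53:36.907954).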

 
