Wednesday 8 July 2020

Log Volume Full on Secondary or Tertiary Site with SAP HANA System Replication

ERROR:

Alerts report that the log volume is almost full.
The secondary or tertiary database hangs.

Environment

SAP HANA Database 1.0
SAP HANA Database 2.0

REASON


The HANA log volume can fill up for various reasons. SAP Note 2083715 - Analyzing log volume full situations covers the possible causes in detail. On a secondary system, the most common ones are:

Savepoints getting stuck
Bad I/O on the secondary/tertiary site

Resolution

This KBA is not intended for analyzing a log volume full situation, but for preventing or resolving one, especially on secondary/tertiary systems.

1) First, check the log segment status by running the following command on each log volume directory:

HOST:/usr/sap/<SID>/SYS/global/hdb/custom/config> hdblogdiag seglist /hana/log/<SID>/mnt000XX/hdb0000X.0000X | grep -i Free | wc -l

This returns the number of log segments in status FREE.

Repeat the command for the other volumes to check their FREE counts as well.
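
Repeating the check by hand for every volume is error-prone, so the same pipeline can be wrapped in a small loop. The following is only a sketch: the function name count_free_segments is made up for this example, and it assumes hdblogdiag is on the PATH of the <sid>adm user and prints one line per segment, as the command in step 1 implies.

```shell
#!/bin/sh
# Hypothetical helper: for each log volume directory given as an argument,
# print the directory followed by the number of seglist lines containing "free".
count_free_segments() {
    for dir in "$@"; do
        # Same pipeline as in step 1; grep -c replaces "grep | wc -l",
        # and -i keeps the match case-insensitive.
        free=$(hdblogdiag seglist "$dir" | grep -ic free)
        printf '%s %s\n' "$dir" "$free"
    done
}
```

Call it with the log volume directories of all services, for example count_free_segments followed by each /hana/log/<SID>/mnt000XX/hdb0000X.0000X path. A volume whose count is much lower than the others is the first candidate for step 3.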



2) You can also run the "du" command to find out which service's log volume consumes the most disk space:

HOST:/hana/log/<SID>/mnt000XX> du -sh /hana/log/<SID>/mnt000XX/*

This prints the space used by each volume in human-readable units (for example GB), so you can determine which service consumes the most space by matching it to its volume.
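
With many volumes, it can help to sort the output so the largest consumer appears first. A minimal sketch, assuming a standard du and sort; the function name largest_log_volumes is hypothetical. It uses -k instead of -h because plain kilobyte numbers sort reliably with sort -n.

```shell
#!/bin/sh
# Hypothetical helper: list the entries below a log mount point,
# largest first, with sizes in kilobytes.
largest_log_volumes() {
    # du -sk prints "<kilobytes> <path>"; sort -rn orders by size, descending.
    du -sk "$1"/* | sort -rn
}
```

For example, largest_log_volumes /hana/log/<SID>/mnt000XX lists the hdb0000X.0000X directories of that mount point, with the one holding the most log data on the first line.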

3) If the hdbindexserver volume shows the highest usage, run the corresponding log release command, for example:

hdbcons 'log release'

* Note that it is not possible to connect with HDBSQL or run SQL commands on the secondary system, which is why the HDBCONS utility is used.

This should reclaim the log space. The operation takes some time and does not finish immediately.
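
Because the release does not finish immediately, one way to follow its progress is to poll the usage of the log file system. This sketch is not part of the SAP procedure: the function name log_usage_percent is hypothetical, and it assumes that the log area is a mounted file system that df can report on.

```shell
#!/bin/sh
# Hypothetical helper: print the use percentage (without the % sign)
# of the file system that contains the given path.
log_usage_percent() {
    # df -P guarantees one output line per file system;
    # field 5 of the second line is the capacity column, e.g. "87%".
    df -P "$1" | awk 'NR==2 { sub(/%/, "", $5); print $5 }'
}
```

Calling log_usage_percent /hana/log/<SID> every minute or so should show the value dropping while the log segments are released. If it stays flat, continue with step 4.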

4) If this does not reclaim the space, either follow SAP Note 1679938 - Log Volume is full or, in the worst case, disable system replication on both the secondary and the primary, force a full replica, and enable replication again as described below:

At the secondary site, stop the system if it is still running:

> sapcontrol -nr <instance_number> -function StopSystem

Perform a full replica from the secondary:

> hdbnsutil -sr_register --force_full_replica --remoteHost=<host> --remoteInstance=<Instance number> --replicationMode=sync --name=<secondary site name>

Then start the secondary system again:

> sapcontrol -nr <instance_number> -function StartSystem

These steps reinitialize replication from scratch and should only be used in the worst case, when the secondary is completely stalled.
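
Since these commands are disruptive, it can be useful to fill in the placeholders once and review the resulting lines before running anything. The sketch below only prints the three commands shown above and executes nothing. The function name print_reregister_steps and its argument order are assumptions made for this example.

```shell
#!/bin/sh
# Hypothetical dry-run helper: prints the stop, register and start commands
# from step 4 with the placeholders filled in. Nothing is executed.
# Arguments:
#   $1 instance number of the local (secondary) system
#   $2 remote host to replicate from (<host> in the command above)
#   $3 instance number of the remote system
#   $4 site name of the secondary
print_reregister_steps() {
    local_nr=$1 remote_host=$2 remote_nr=$3 site_name=$4
    printf 'sapcontrol -nr %s -function StopSystem\n' "$local_nr"
    printf 'hdbnsutil -sr_register --force_full_replica --remoteHost=%s --remoteInstance=%s --replicationMode=sync --name=%s\n' \
        "$remote_host" "$remote_nr" "$site_name"
    printf 'sapcontrol -nr %s -function StartSystem\n' "$local_nr"
}
```

Run it with your own values, check each printed line against the steps above, and then execute the lines one at a time as the <sid>adm user on the secondary.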

Once the primary and secondary are up and running normally, raise an incident with SAP for a root cause analysis of the log volume full situation.