I monitor HADR closely, and want to be paged if HADR goes down. Why? Because any subsequent failure would then be a serious problem. Once HADR has a problem, I suddenly have a single point of failure. That said, for some clients we have to tell our monitoring tools not to alert on HADR during online reorgs, because HADR tends to get into a ‘Congested’ state during online reorgs for certain databases.
Describing the Problem
Essentially what we’re seeing is network congestion. We can see it for one of two major reasons:
- A lot of log files are being generated, and the standby is having trouble keeping up
- An operation that takes a long time, such as an offline reorg, but generates relatively few log records
In the case of online reorgs, I have certainly seen a lot of logging occurring.
This is what the problem looks like when you get HADR status using the db2pd command:
```
HADR Information:
Role     State   SyncMode   HeartBeatsMissed   LogGapRunAvg (bytes)
Primary  Peer    Nearsync   0                  991669

ConnectStatus  ConnectTime                            Timeout
Congested      Wed Nov 25 20:31:26 2010 (1283970686)  120
```
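If you want your monitoring to catch this state, the check can be scripted. This is a minimal sketch: the db2pd call itself requires a live DB2 instance, so the check is wrapped in a function that takes captured output. The database name SAMPLE and the sample text below are assumptions for illustration, not from a live system.

```shell
#!/bin/sh
# Flag HADR congestion given the text of `db2pd -db SAMPLE -hadr`.
is_congested() {
    # $1: captured db2pd HADR output
    printf '%s\n' "$1" | grep -q 'Congested'
}

# Fed with output like the example shown above:
sample='ConnectStatus ConnectTime Timeout
Congested Wed Nov 25 20:31:26 2010 (1283970686) 120'

if is_congested "$sample"; then
    echo "HADR congested - page the DBA"
fi
```

In a real monitor you would capture `db2pd -db <dbname> -hadr` on a schedule and suppress the alert during your reorg window.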
On the primary, DB2 writes the active log file containing the start of the reorg off to disk, so transactions there are not impacted. On the standby, DB2 cannot do that: it needs to apply whatever is in that log buffer immediately, and if it cannot, it has to wait until the operation in question finishes before moving on.
To prevent the primary and standby from getting out of sync, DB2 uses this Congested state, and it may actually cause transactions to wait. If, like me, you run a number of reorgs in a window, you may see DB2 going in and out of this state as various tables are reorged. I’m also pretty sure that this doesn’t happen from the start of a reorg to the end of a reorg – there are internal phases or parts of the reorg that may cause this scenario – mostly because I’ve had 5-hour reorgs without hitting a congested state. The level of other activity on the database would probably also play a role.
What are the Effects of the Problem?
When in a ‘Congested’ state, DB2 can prevent transactions from completing on the Primary in SYNC, NEARSYNC, and even ASYNC mode. Only SUPERASYNC mode would be immune to this issue. I tend to run my “online” reorgs at my lowest point of volume anyway to prevent unexpected user impact, so I’ve only heard complaints from automated monitoring, and even those were short-lived (less than a couple of minutes). This can all make it pretty frustrating to troubleshoot – the symptom is intermittent outages during reorgs of varying duration. Note that just because the status is congested does not mean transactions are being blocked or slowed, but they can be.
Resolving in DB2 10.1 and Later
DB2 10.1 and later have a database configuration parameter – HADR_SPOOL_LIMIT – which you can use to specify how much log you would like to spool to disk on the standby server. It is specified in 4K pages, but there are two special settings which may be useful:
- -1 means unlimited – fill up all the disk space you have in your logging filesystem on your standby
- -2 (AUTOMATIC) means fill up configured log space (may only be available on DB2 10.5)
In DB2 10.1, the default is no log spooling, while in DB2 10.5, the default is AUTOMATIC or -2.
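Since the parameter is specified in 4K pages, it helps to do the conversion explicitly. Here is a sketch of sizing the spool for a 1 GB limit; the database name SAMPLE and the 1 GB target are assumptions, and the actual update (shown commented out) requires a DB2 instance.

```shell
#!/bin/sh
# HADR_SPOOL_LIMIT is given in 4 KB pages, so convert from GB first.
spool_gb=1
spool_pages=$(( spool_gb * 1024 * 1024 / 4 ))   # 4 KB pages in 1 GB
echo "$spool_pages pages"

# The actual change needs a DB2 instance (for reference only):
# db2 "update db cfg for SAMPLE using HADR_SPOOL_LIMIT $spool_pages"
# db2 "update db cfg for SAMPLE using HADR_SPOOL_LIMIT -1"    # unlimited
```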
Inevitably a reader at this point is wondering, “doesn’t that compromise my ability to failover?” No – allowing logs to spool to disk does not affect your ability to failover; it just makes the standby behave more like the primary on logging. What it does do is potentially increase the time that a failover would take. You would still get all the data, but you would then have to roll forward through any log files on disk before the database would become available on the standby.
Resolving before DB2 10.1
Before 10.1, there isn’t a full resolution – we can either accept the slowness caused by the congestion, OR we can increase the size of the log buffer on the standby. The standby’s log buffer size is controlled by the DB2 registry parameter DB2_HADR_BUF_SZ, which defaults to two times the size of the primary’s log buffer. The reason we don’t want to control this by actually changing LOGBUFSZ is that in a failover, we’d want a normal log buffer on the (new) primary. Its maximum is supposedly 4 GB, but from reading and searching around, I wouldn’t set it over 1 GB.
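As a sketch of sizing the work-around: DB2_HADR_BUF_SZ is specified, like LOGBUFSZ, in 4 KB pages, and the 1 GB ceiling here follows the advice above. The db2set and restart commands (commented out) need the DB2 CLI on the standby and are shown for reference only.

```shell
#!/bin/sh
# Convert a ~1 GB target standby log buffer into 4 KB pages.
target_mb=1024                         # stay at or under ~1 GB
buf_pages=$(( target_mb * 1024 / 4 ))
echo "$buf_pages"

# Applying it needs an instance restart on the standby:
# db2set DB2_HADR_BUF_SZ=$buf_pages
# db2stop && db2start
```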
I learned about this work-around only a couple of weeks ago, and am already using it on at least one client. Another nice thing – this parameter is available all the way back to DB2 8.2, so for those of us with some back-level databases it still works.
Here’s the drawback of the approach though – consider a failure scenario in NEARSYNC. If both database servers were to fail at the same moment, AND I couldn’t get the primary database server back, I would lose anything in that larger memory area. Granted, that’s a rare failure scenario, but certainly within the realm of the possible. Unlike the HADR_SPOOL_LIMIT resolution for DB2 10.1 and later, you are changing your recoverability, and you have to decide if the benefits are worth the risks.
A big thanks to Melanie Stopfer @mstopfer1 and Dale McInnis for covering this in their presentations at IBM IOD – this was my biggest immediate-impact technical take-away from that conference.