IBM has published a document with some additional best practices for configuring TSAMP, so I thought I would add an article to my TSA series covering these settings.
I think my TSAMP series is what I am most often recognized for, and it is the one I most often refer people to. Check out some of my past articles here: https://datageek.blog/?s=TSA
I asked IBM why neither db2haicu nor db2cktsa covers these settings. I’ve been told that db2cktsa is no longer being updated, so it is generally not useful. I did not get a good answer as to why db2haicu does not apply these settings automatically.
The technote covers the following settings, in greater detail than I have covered here. Please note that all of these must be done as root.
- Relax Heartbeat Sensitivity settings. Determine your cluster’s “CommGroup Name” by issuing the lscomg command. Then change the sensitivity setting with this command, passing the communication group name as the final argument:
chcomg -s 4 -p 4 [CommGroup_Name]
Apply the change to every communication group listed in the lscomg output.
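When there are several communication groups, a small helper can generate the commands for you. This is only a sketch: the function name print_chcomg_cmds is mine, and it assumes that lscomg prints one header line followed by one line per group with the group name in the first column, and that chcomg takes the group name as its last argument. It only prints the commands so you can review them before running anything as root.

```shell
# Print (do not run) one chcomg command per communication group.
# Reads lscomg-style output on stdin. Assumption: line 1 is a header and
# the first column of every later non-empty line is the group name.
print_chcomg_cmds() {
  awk 'NR > 1 && NF > 0 { printf "chcomg -s 4 -p 4 %s\n", $1 }'
}
```

On a cluster node this would be used as `lscomg | print_chcomg_cmds`, and the printed lines can then be pasted or piped to a shell once they look right.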
- Set CT_MANAGEMENT_SCOPE=2 for all users by adding the following to the instance owner’s .profile, .bash_profile, or .bashrc, and to that of any SYSADM users or others who may start or stop the instance or TSAMP:
export CT_MANAGEMENT_SCOPE=2
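If you are scripting this across several users, something like the following adds the export line only when it is not already there, so running it twice does not duplicate the entry. The function name ensure_scope_export is hypothetical, and which profile file applies depends on each user's shell.

```shell
# Append the CT_MANAGEMENT_SCOPE export to a profile file if it is missing.
# $1 = path to the profile file (for example a user's .profile)
ensure_scope_export() {
  profile="$1"
  line='export CT_MANAGEMENT_SCOPE=2'
  touch "$profile" || return 1
  # -x matches the whole line, -F treats the pattern as a fixed string
  grep -qxF "$line" "$profile" || printf '%s\n' "$line" >> "$profile"
}
```

For example, `ensure_scope_export ~/.profile` run as the instance owner.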
- Change CritRsrcProtMethod setting from 1 to 3 using this command:
chrsrc -c IBM.PeerNode CritRsrcProtMethod=3
- Create a netmon.cf file on each clustered server, using these commands:
touch /var/ct/cfg/netmon.cf
echo [ip_on_local_subnet] > /var/ct/cfg/netmon.cf
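Note that the echo with a single > overwrites whatever is already in the file. If you may want more than one address in netmon.cf, a variant like this appends an address only when it is not already listed. The function name add_netmon_ip is my own, and the optional second argument exists only so the helper can be tried against a scratch file first.

```shell
# Add one address to a netmon.cf-style file unless it is already listed.
# $1 = IP address to add
# $2 = target file (defaults to /var/ct/cfg/netmon.cf)
add_netmon_ip() {
  ip="$1"
  file="${2:-/var/ct/cfg/netmon.cf}"
  touch "$file" || return 1
  grep -qxF "$ip" "$file" || printf '%s\n' "$ip" >> "$file"
}
```

For example, `add_netmon_ip [ip_on_local_subnet]` as root on each clustered server.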
- Enable effective Syslog logging. Configure your /etc/syslog.conf to enable logging for the following facility and priorities into a single file which will catch all syslog messages regardless of source, using a line like this:
*.debug /var/log/syslog.out rotate time 1d files 14
This would rotate the file every day and keep the last 14 files (two weeks of historical data) and delete everything older. Check with your sysadmin to make sure this matches their strategy, and refer to this technote for more info: http://www-01.ibm.com/support/docview.wss?uid=swg21675952
- Keep an updated copy of getsadata on hand. You can find it here: http://www.ibm.com/support/docview.wss?rs=820&uid=swg21285496
Every server I build TSAMP on gets a copy of this, and I now also add it when doing health checks.
- Set your HADR_TIMEOUT and HADR_PEER_WINDOW. I’m actually a bit surprised that IBM had to state this one, but you get really strange failover issues if you don’t have these set properly for your network. If you have an iron-clad, rock-solid network, set these lower. If your network is a bit more flaky and prone to problems, set them higher. Absent tight RTO objectives, the defaults of 120 and 300 work just fine. Use these commands to set them:
db2 update db cfg for [dbname] using HADR_TIMEOUT NNN
db2 update db cfg for [dbname] using HADR_PEER_WINDOW NNN
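To confirm what is currently set on each server, you can pull the two values out of the database configuration. This is a sketch under an assumption about the output layout: it expects `db2 get db cfg` lines that contain the parameter name in parentheses and end with the value, such as "(HADR_TIMEOUT) = 120". The function name hadr_value is hypothetical.

```shell
# Print the value of one database configuration parameter.
# $1 = parameter name, e.g. HADR_TIMEOUT
# Reads `db2 get db cfg` output on stdin. Assumption: the matching line
# contains "(PARAM)" and its last whitespace-separated field is the value.
hadr_value() {
  awk -v p="($1)" 'index($0, p) { print $NF; exit }'
}
```

Usage would look like `db2 get db cfg for [dbname] | hadr_value HADR_PEER_WINDOW`, run on both the primary and the standby so the two can be compared.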
- Have a second network adapter on each server participating in heartbeating
- Ensure that the Cluster Manager (CLUSTER_MGR) parameter in the dbm config is set to TSA on both/all cluster nodes when automation is enabled. If this isn’t correct, the only way to change it is through db2haicu – a disable/enable should do it. This cannot be manually set any more.
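A quick way to check the CLUSTER_MGR value on each node is to filter the database manager configuration. As a sketch, and assuming the `db2 get dbm cfg` output has a line of the form "(CLUSTER_MGR) = TSA", this hypothetical helper prints whatever follows the equals sign, which is empty when no cluster manager is set.

```shell
# Print the CLUSTER_MGR value from `db2 get dbm cfg` output on stdin.
# Assumption: the line looks like "... (CLUSTER_MGR) = TSA".
# Prints an empty line when the parameter has no value.
cluster_mgr_value() {
  sed -n 's/.*(CLUSTER_MGR) = *//p'
}
```

For example, `db2 get dbm cfg | cluster_mgr_value` as the instance owner on every cluster node should print TSA when automation is enabled.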