I have been through a number of IP address or host name changes over the years, and thought I’d share the lessons learned. I’m specifically talking about changes to the IP address or hostname of a DB2 server, and the related changes within DB2. This post focuses on steps for Linux/UNIX systems.
IP Address Changes
If you have cataloged databases using host names, there may actually be very little to change when a database server’s IP address changes. I prefer to use a host name when cataloging remote databases. I require WebSphere Commerce and any application servers with full DB2 clients on them to use host names. If you’re using host names, then an IP address change has less impact. In fact, if you are not using TSA and you set up HADR (if any) with host names, you may not have to make any changes at all. You will, however, have to verify that your /etc/hosts file on the database server is also appropriately changed. An SA(System Administrator) may do this work, but it never hurts to check. See the “Hosts File” section below.
Before the change even occurs, first check to see if any loopback database catalog entries and any entries on DB2 clients that access this server use the IP address. If they don’t, then no work is needed in re-cataloging. If they do, then choose between altering them to use the hostname or coordinating with the IP address change to make the catalog entry changes at the same time. In either case, only change the node directory entry and not the database directory entry. Use the UNCATALOG NODE and CATALOG TCPIP NODE commands to re-catalog the nodes.
For HADR, check the values for HADR_LOCAL_HOST and/or HADR_REMOTE_HOST on both the primary and the standby database servers for any IP addresses. Hostnames in these variables mean that changes are not required if only the IP Address of a server is changing. Remember that making changes to these parameters on an earlier version than DB2 10.1 requires deactivation and activation of the database before the changes will actually take effect, and that means a database outage.
If you are using TSA/db2haicu to automate failover, it is important to understand if the IP addresses for the quoroum device or the Virtual IP are changing. If they are, update them through db2haicu. Be prepared to destroy your TSA domain and re-do the whole thing. Because of this, it is critical to have the inputs ready for doing that: https://datageek.blog/2012/04/09/using-tsadb2haicu-to-automate-failover-part-1-the-preparation/. Depending on what is changing, it may be possible to go in and make selective changes using db2haicu to add/remove a quoroum device or to add/remove a virtual IP address. In the worst case scenario, the network changes may require re-doing the setup from scratch.
Host Name Changes
This is actually pretty well documented by IBM: http://www-01.ibm.com/support/docview.wss?uid=swg21258834. IBM’s document also covers Windows systems, which I’m not covering in this post.
If the host name changes, change db2nodes.cfg. db2nodes.cfg is located in the instance owner’s home directory, in the sqllib subdirectory. Use a text editor to edit it with the updated host name.
The next thing to change is the value of the DB2SYSTEM DB2 registry variable. It is a variable that rarely requires changing, but if you do a db2set -all, one of the parameters that you will see set is called DB2SYSTEM. It looks like this:
$ db2set -all |grep g [g] DB2FCMCOMM=TCPIP4 [g] DB2SYSTEM=svq00db01z.domain.com [g] DB2INSTDEF=db2inst1 [g] DB2ADMINSERVER=dasusr1
Chances are that most DBAs have never even changed the [g] parameters. Root is required to change this. Either su to root or login as root, and then source the db2profile and use the db2set command like this:
# . /db2home/db2inst1/sqllib/db2profile # db2set -g DB2SYSTEM=svs00db01z.domain.com
Of course the path for the db2profile will vary. It is in the instance owner’s home directory under the sqllib sub directory.
Check the output of db2set again – it should be correct. Restart the instance(db2stop/db2start) for the change to take effect.
When using HADR, changes are required to the parameters for the host names – HADR_LOCAL_HOST and/or HADR_REMOTE_HOST on both the primary and the standby database servers. Remember that, unless using DB2 10.1 or later, DB2 requires deactivating and activating the database before the changes will actually take effect. That means a database outage.
For TSA/db2haicu, it is important to understand if the IP addresses for the quorum device or the Virtual IP are changing. If they are, update them, and be prepared to destroy your TSA domain and re-do the whole thing. Have all the inputs handy from https://datageek.blog/2012/04/09/using-tsadb2haicu-to-automate-failover-part-1-the-preparation/. If the database server names are changing, be prepared to re-do the setup from scratch. If you fully understand what TSA is doing under the covers, making the changes outside of db2haicu may be possible, but I don’t have instructions on that.
An incorrect hosts file can have some interesting effects, and of course it is most likely to be incorrect when something like the hostname or IP address changes and someone has to also change the hosts file. When I went to look at a colleague’s HADR standby after an IP address change recently, I got this error every time I tried to start DB2:
SQL6048N A communication error occurred during START or STOP DATABASE MANAGER processing.
This error was particularly perplexing given the fact that I was working with a single-node environment – it didn’t seem to me that any communication would be necessary. But I even got it when trying to list applications. In that case, I discovered that the /etc/hosts file had an entry for the local server, but it did not reference the full server name returned by the ‘hostname’ command. It did, however have the short name listed in db2nodes.cfg. I must have spent 10 minutes doing ‘hostname’, and looking at the db2nodes.cfg file before I figured this one out. It’s not just for TSA that the hosts file and the results of the ‘hostname’ command must be in sync with db2nodes.cfg. In this case, I edited /etc/hosts with the proper fully qualified host name, and the error then went away.
The same kind of confusion can occur after cloning a server at the OS level, when changing the hostname, or when changing the IP address.