Using TSA/db2haicu to Automate Failover Part 3: Testing, Ways Setup can go Wrong and What to do.

Part 3 in this series is a bit overdue. Parts 1 and 2 were back in April. This is a complicated topic. Please use any procedures here with extreme care, and keep in mind that if you have anything other than the standard two-server HADR-only TSA implementation, these procedures probably aren’t the best idea, as they could break other things. There will also be a Part 4 – dealing with problems after set-up.

I’m not saying I’m covering every possible failure scenario, but I’ve seen a number of different issues and wanted to share some strategies for dealing with them.

Testing automated failover

First of all, it is absolutely critical that you test your failover. Run as many tests as you can manage. I try to set up HADR, set up failover using TSA/db2haicu, and test all in the same week so that nothing gets missed.
The absolute minimum tests you should do are:

  • Manual takeover, verifying the database
  • Manual takeover, verifying the Commerce (or other) application
  • Hard failure with inability to start (renaming executable)

If at all possible, also do the following tests:

  • Power-off tests on each node
  • db2_kill test on each node (with caution)
  • Manual takeover by force on each node
  • Network failure test on each node
  • Failover under load (during load test)

See section 6 of this document for some really excellent details on testing: http://download.boulder.ibm.com/ibmdl/pub/software/dw/data/dm-0908hadrdb2haicu/HADR_db2haicu.pdf

If you just assume it will work, it probably will not. On at least three occasions, I’ve caught issues while testing failover.
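
For the manual takeover tests, one way to verify the database side is to compare the HADR role before and after the takeover. This is only a minimal sketch: it assumes a DB2 release where db2pd prints its HADR details as KEY = VALUE lines (such as HADR_ROLE = PRIMARY), which is not the layout of older releases. The function name hadr_field and the database name SAMPLE are hypothetical.

```shell
# Print the value of one field from `db2pd -db <dbname> -hadr` output on stdin.
# Assumes the "KEY = VALUE" layout, for example "HADR_ROLE = PRIMARY".
hadr_field() {
    awk -v key="$1" '$1 == key && $2 == "=" { print $3; exit }'
}

# Example use on the standby (SAMPLE is a placeholder database name):
#   db2pd -db SAMPLE -hadr | hadr_field HADR_ROLE    # role before the takeover
#   db2 takeover hadr on db SAMPLE
#   db2pd -db SAMPLE -hadr | hadr_field HADR_ROLE    # role after the takeover
#   db2pd -db SAMPLE -hadr | hadr_field HADR_STATE   # state after the takeover
```

Checking the role on both servers after each test makes it harder to miss a takeover that only half worked.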

Failure causes

Although I’ve caught three issues while testing failover, I’ve had at least twice that many during the setup process. The most common cause of failure that I’ve seen is missed steps during preparation. For nearly every problem I’ve seen, I’ve gone back and added to the preparation post (Part 1 of this series). The first few times I set up TSA with HADR, my preparation was mostly just gathering inputs. Then, one by one, as I saw failures, I added to the prep work. I’m still going to talk about what those missed prep work errors look like, because it’s easy to miss something. I always say that the best DBA is a detail-oriented control freak, and this is one area where that’s certainly true. For any failure prior to testing, go through the preparation post line by line with a fine-tooth comb on both servers and see if you missed anything.

preprpnode Failure

If you run the preprpnode preparation step and get a failure like this:

# preprpnode server1.domain.com server2.domain.com

-bash: preprpnode: command not found

This likely means that your SAM installation was not completed successfully. See https://datageek.blog/2012/04/09/using-tsadb2haicu-to-automate-failover-part-1-the-preparation/ – the section called “Software Installed” – for details on how to do that.
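
A quick way to confirm this before going further is to check whether the cluster commands can be found at all. The sketch below is an assumption on my part rather than part of the original procedure: missing_commands is a hypothetical helper, and the default list is just a few commands that ship with TSA/RSCT.

```shell
# Print each command from the argument list that is NOT found on PATH.
# With no arguments, check a few commands that ship with TSA/RSCT.
missing_commands() {
    [ "$#" -gt 0 ] || set -- preprpnode lssam lsrpdomain lsrpnode
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
    done
    return 0
}

# Example: run as root on each server; no output means everything was found.
#   missing_commands
```

If any name is printed, either the TSA/SAM component is not installed on that server or its commands are not on that user's PATH.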

Failure on Creating the Domain

What this looks like

> db2haicu
...
Create the domain now? [1]
1. Yes
2. No
1
Creating domain prod_db2ha in the cluster ...
Creating domain failed. Refer to db2diag.log and the DB2 Information Center for details.

I don’t have excerpts from the db2diag log at this point – if anyone does, please share.

Resolution

This usually means you didn’t run preprpnode or you didn’t run it properly. Remember that preprpnode must be run as root on both servers in this format:

# preprpnode server1.domain.com server2.domain.com

db2haicu Fails Near the End of the Setup for the Standby Server

What This Looks Like

> db2haicu
...
Retrieving high availability configuration parameter for instance db2inst1 ...
The cluster manager name configuration parameter (high availability configuration parameter) is not set. For more information, see the topic "cluster_mgr - Cluster manager name configuration parameter" in the DB2 Information Center. Do you want to set the high availability configuration parameter?
The following are valid settings for the high availability configuration parameter:
  1.TSA
  2.Vendor
Enter a value for the high availability configuration parameter: [1]
1
Setting a high availability configuration parameter for instance db2inst1 to TSA.
Adding DB2 database partition 0 to the cluster ...
There was an error with one of the issued cluster manager commands. Refer to db2diag.log and the DB2 Information Center for details.

Resolution

In the case where I most recently saw this particular failure, I had set up HADR with IP addresses. TSA/db2haicu does not seem to allow the use of bare IP addresses, so I had to go back and redo the HADR setup using host names. I believe I’ve also seen failures here due to incorrect formatting of the hosts file or incorrect entries in db2nodes.cfg (yes, even for single-node implementations). Basically, a failure at this point most frequently means that you missed some part of the preparation steps. See https://datageek.blog/2012/04/09/using-tsadb2haicu-to-automate-failover-part-1-the-preparation/.
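
To catch the IP address problem before running db2haicu, you can scan the database configuration for HADR host values that look like IPv4 addresses. This is a minimal sketch: hadr_ip_hosts is a hypothetical helper, SAMPLE is a placeholder database name, and the parsing assumes the usual "description (PARAMETER) = value" layout of the db cfg output.

```shell
# Read `db2 get db cfg for <dbname>` output on stdin and print any
# HADR_LOCAL_HOST / HADR_REMOTE_HOST value that looks like an IPv4 address.
hadr_ip_hosts() {
    awk -F' = ' '/\(HADR_(LOCAL|REMOTE)_HOST\)/ {
        v = $2
        gsub(/^[ \t]+|[ \t]+$/, "", v)            # trim surrounding whitespace
        if (v ~ /^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$/) print v
    }'
}

# Example, as the instance owner on each server:
#   db2 get db cfg for SAMPLE | hadr_ip_hosts
```

No output means both parameters hold names rather than addresses. It does not prove that the names resolve or match the hosts file.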

Failure on failover test while testing the Application

This one seems a bit dumb in retrospect, but I was working with someone I don’t normally work with, and I made some assumptions that I shouldn’t have. Essentially, when we tested the failover, we saw the database come up fine every time, but the application never seemed to re-establish connections. After a couple of hours of troubleshooting, we realized that the application’s ID did not exist on the standby server. Once it was created and the passwords synced, the problem immediately went away. This holds true for standard HADR even if you’re not using TSA: ensure that your user IDs and passwords are identical between your primary and your standby database servers.
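
A simple pre-test check for this is to confirm that each ID the application connects with exists on both servers. The sketch below assumes the IDs are ordinary operating-system users. The helper names and the example user names are hypothetical, and it cannot tell you whether the passwords match.

```shell
# Succeed if the given operating-system user exists on this server.
user_exists() {
    id -u "$1" >/dev/null 2>&1
}

# Print every user from the argument list that is missing on this server.
missing_users() {
    for u in "$@"; do
        user_exists "$u" || printf '%s\n' "$u"
    done
    return 0
}

# Example: run the same command on the primary and on the standby.
#   missing_users wcsuser db2inst1
```

Any name printed on one server but not the other is a candidate for exactly the connection failure described above.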

TSA Installation Issues

We normally install DB2 from base code and then apply the latest FixPack (well, as long as it has been out for a month or so). On Red Hat, we’ve seen issues where the version of TSA that comes with the base code doesn’t support the version of Red Hat we’re using, so when we install DB2, it gives an error message that the TSA/SAM component could not be installed. Luckily, the version of TSA that comes with FixPack 4 and later is supported on our version of Red Hat. But the FixPack does not automatically install it, of course. So for servers where we want to use TSA, we have to install the DB2 base code, install the FixPack, and then install the TSA/SAM component from the FixPack code using this procedure: https://datageek.blog/2012/04/09/using-tsadb2haicu-to-automate-failover-part-1-the-preparation/ – the section called “Software Installed”.

Other Failures

Ultimately, I know that I don’t fully understand at least half of the failures I’ve seen. I need to see what information I can find on pure TSA so that I really understand what to do and all of the states. I would love it if there were some education offered for this at the conference or even just in a webcast. So what I really have is a series of things that I try when a failure occurs. Some I’ve already mentioned above.

  1. Go through the prep work with a fine-tooth comb: https://datageek.blog/2012/04/09/using-tsadb2haicu-to-automate-failover-part-1-the-preparation/. This includes:
    • Double- and triple-check that you have picked either the server’s short name or the server’s long name and are using it consistently in each of:
      • /etc/hosts
      • HADR configuration parameters in db cfg
      • db2nodes.cfg (in $HOME/sqllib)
      • Results of the ‘hostname’ command
    • Double check that you successfully executed the preprpnode command on both hosts
    • Double check that you successfully executed the db2cptsa command on both hosts
  2. Start over. Delete your TSA configuration using the -delete option on db2haicu, then run db2haicu again from scratch:
    [db2inst1@403238-Prod-db2 ~]$ db2haicu -delete
    Welcome to the DB2 High Availability Instance Configuration Utility (db2haicu).
    
    You can find detailed diagnostic information in the DB2 server diagnostic log file called db2diag.log. Also, you can use the utility called db2pd to query the status of the cluster domains you create.
    
    For more information about configuring your clustered environment using db2haicu, see the topic called 'DB2 High Availability Instance Configuration Utility (db2haicu)' in the DB2 Information Center.
    
    db2haicu determined the current DB2 database manager instance is db2inst1. The cluster configuration that follows will apply to this instance.
    
    When you use db2haicu to configure your clustered environment, you create cluster domains. For more information, see the topic 'Creating a cluster domain with db2haicu' in the DB2 Information Center. db2haicu is searching the current machine for an existing active cluster domain ...
    db2haicu found a cluster domain called prod_db2ha on this machine. The cluster configuration that follows will apply to this domain.
    
    Deleting the domain prod_db2ha from the cluster ...
    Deleting the domain prod_db2ha from the cluster was successful.
    All cluster configurations have been completed successfully. db2haicu exiting ...
    
  3. Try uninstalling and re-installing the TSA/SAM component
    • Uninstalling looks like this:
      [root@server1]# cd /db2/linuxamd64/tsamp
      [root@server1]# ./uninstallSAM
      uninstallSAM: Uninstalling System Automation on platform: x86_64
      uninstallSAM: Package is not installed: sam.sappolicy
      uninstallSAM: Uninstalling
       sam.adapter-3.1.0.1-08261.i386
      uninstallSAM: Uninstalling
       sam.msg.de_DE-3.1.0.0-0.i386
       sam.msg.de_DE.ISO-8859-1-3.1.0.0-0.i386
       sam.msg.de_DE@euro-3.1.0.0-0.i386
       sam.msg.de_DE.UTF-8-3.1.0.0-0.i386
      uninstallSAM: Uninstalling
       sam.msg.es_ES-3.1.0.0-0.i386
       sam.msg.es_ES.ISO-8859-1-3.1.0.0-0.i386
       sam.msg.es_ES@euro-3.1.0.0-0.i386
       sam.msg.es_ES.UTF-8-3.1.0.0-0.i386
      uninstallSAM: Uninstalling
       sam.msg.fr_FR-3.1.0.0-0.i386
       sam.msg.fr_FR.ISO-8859-1-3.1.0.0-0.i386
       sam.msg.fr_FR@euro-3.1.0.0-0.i386
       sam.msg.fr_FR.UTF-8-3.1.0.0-0.i386
      uninstallSAM: Uninstalling
       sam.msg.it_IT-3.1.0.0-0.i386
       sam.msg.it_IT.ISO-8859-1-3.1.0.0-0.i386
       sam.msg.it_IT@euro-3.1.0.0-0.i386
       sam.msg.it_IT.UTF-8-3.1.0.0-0.i386
      uninstallSAM: Uninstalling
       sam.msg.ja_JP.eucJP-3.1.0.0-0.i386
       sam.msg.ja_JP.UTF-8-3.1.0.0-0.i386
      uninstallSAM: Uninstalling
       sam.msg.ko_KR.eucKR-3.1.0.0-0.i386
       sam.msg.ko_KR.UTF-8-3.1.0.0-0.i386
      uninstallSAM: Uninstalling
       sam.msg.pt_BR-3.1.0.0-0.i386
       sam.msg.pt_BR.UTF-8-3.1.0.0-0.i386
      uninstallSAM: Uninstalling
       sam.msg.zh_CN.GB2312-3.1.0.0-0.i386
       sam.msg.zh_CN.GB18030-3.1.0.0-0.i386
       sam.msg.zh_CN.GBK-3.1.0.0-0.i386
       sam.msg.zh_CN.UTF-8-3.1.0.0-0.i386
      uninstallSAM: Uninstalling
       sam.msg.zh_TW-3.1.0.0-0.i386
       sam.msg.zh_TW.Big5-3.1.0.0-0.i386
       sam.msg.zh_TW.eucTW-3.1.0.0-0.i386
       sam.msg.zh_TW.UTF-8-3.1.0.0-0.i386
      uninstallSAM: Uninstalling
       sam-3.1.0.1-08261.i386
      uninstallSAM: Uninstalling
       rsct.opt.storagerm-2.5.1.4-08249.i386
      uninstallSAM: Uninstalling
       rsct.64bit-2.5.1.4-08249.x86_64
      uninstallSAM: Uninstalling
       rsct.basic.msg.de_DE-2.5.1.2-0.i386
       rsct.basic.msg.de_DE.ISO-8859-1-2.5.1.2-0.i386
       rsct.basic.msg.de_DE@euro-2.5.1.2-0.i386
       rsct.basic.msg.de_DE.UTF-8-2.5.1.2-0.i386
      uninstallSAM: Uninstalling
       rsct.basic.msg.es_ES-2.5.1.2-0.i386
       rsct.basic.msg.es_ES.ISO-8859-1-2.5.1.2-0.i386
       rsct.basic.msg.es_ES@euro-2.5.1.2-0.i386
       rsct.basic.msg.es_ES.UTF-8-2.5.1.2-0.i386
      uninstallSAM: Uninstalling
       rsct.basic.msg.fr_FR-2.5.1.2-0.i386
       rsct.basic.msg.fr_FR.ISO-8859-1-2.5.1.2-0.i386
       rsct.basic.msg.fr_FR@euro-2.5.1.2-0.i386
       rsct.basic.msg.fr_FR.UTF-8-2.5.1.2-0.i386
      uninstallSAM: Uninstalling
       rsct.basic.msg.it_IT-2.5.1.2-0.i386
       rsct.basic.msg.it_IT.ISO-8859-1-2.5.1.2-0.i386
       rsct.basic.msg.it_IT@euro-2.5.1.2-0.i386
       rsct.basic.msg.it_IT.UTF-8-2.5.1.2-0.i386
      uninstallSAM: Uninstalling
       rsct.basic.msg.ja_JP.eucJP-2.5.1.2-0.i386
       rsct.basic.msg.ja_JP.UTF-8-2.5.1.2-0.i386
      uninstallSAM: Uninstalling
       rsct.basic.msg.ko_KR.eucKR-2.5.1.2-0.i386
       rsct.basic.msg.ko_KR.UTF-8-2.5.1.2-0.i386
      uninstallSAM: Uninstalling
       rsct.basic.msg.pt_BR-2.5.1.2-0.i386
       rsct.basic.msg.pt_BR.UTF-8-2.5.1.2-0.i386
      uninstallSAM: Uninstalling
       rsct.basic.msg.zh_CN.GB2312-2.5.1.2-0.i386
       rsct.basic.msg.zh_CN.GB18030-2.5.1.2-0.i386
       rsct.basic.msg.zh_CN.GBK-2.5.1.2-0.i386
       rsct.basic.msg.zh_CN.UTF-8-2.5.1.2-0.i386
      uninstallSAM: Uninstalling
       rsct.basic.msg.zh_TW-2.5.1.2-0.i386
       rsct.basic.msg.zh_TW.Big5-2.5.1.2-0.i386
       rsct.basic.msg.zh_TW.eucTW-2.5.1.2-0.i386
       rsct.basic.msg.zh_TW.UTF-8-2.5.1.2-0.i386
      uninstallSAM: Uninstalling
       rsct.basic-2.5.1.4-08249.i386
      uninstallSAM: Uninstalling
       rsct.core.msg.de_DE-2.5.1.2-0.i386
       rsct.core.msg.de_DE.ISO-8859-1-2.5.1.2-0.i386
       rsct.core.msg.de_DE@euro-2.5.1.2-0.i386
       rsct.core.msg.de_DE.UTF-8-2.5.1.2-0.i386
      uninstallSAM: Uninstalling
       rsct.core.msg.es_ES-2.5.1.2-0.i386
       rsct.core.msg.es_ES.ISO-8859-1-2.5.1.2-0.i386
       rsct.core.msg.es_ES@euro-2.5.1.2-0.i386
       rsct.core.msg.es_ES.UTF-8-2.5.1.2-0.i386
      uninstallSAM: Uninstalling
       rsct.core.msg.fr_FR-2.5.1.2-0.i386
       rsct.core.msg.fr_FR.ISO-8859-1-2.5.1.2-0.i386
       rsct.core.msg.fr_FR@euro-2.5.1.2-0.i386
       rsct.core.msg.fr_FR.UTF-8-2.5.1.2-0.i386
      uninstallSAM: Uninstalling
       rsct.core.msg.it_IT-2.5.1.2-0.i386
       rsct.core.msg.it_IT.ISO-8859-1-2.5.1.2-0.i386
       rsct.core.msg.it_IT@euro-2.5.1.2-0.i386
       rsct.core.msg.it_IT.UTF-8-2.5.1.2-0.i386
      uninstallSAM: Uninstalling
       rsct.core.msg.ja_JP.eucJP-2.5.1.2-0.i386
       rsct.core.msg.ja_JP.UTF-8-2.5.1.2-0.i386
      uninstallSAM: Uninstalling
       rsct.core.msg.ko_KR.eucKR-2.5.1.2-0.i386
       rsct.core.msg.ko_KR.UTF-8-2.5.1.2-0.i386
      uninstallSAM: Uninstalling
       rsct.core.msg.pt_BR-2.5.1.2-0.i386
       rsct.core.msg.pt_BR.UTF-8-2.5.1.2-0.i386
      uninstallSAM: Uninstalling
       rsct.core.msg.zh_CN.GB2312-2.5.1.2-0.i386
       rsct.core.msg.zh_CN.GB18030-2.5.1.2-0.i386
       rsct.core.msg.zh_CN.GBK-2.5.1.2-0.i386
       rsct.core.msg.zh_CN.UTF-8-2.5.1.2-0.i386
      uninstallSAM: Uninstalling
       rsct.core.msg.zh_TW-2.5.1.2-0.i386
       rsct.core.msg.zh_TW.Big5-2.5.1.2-0.i386
       rsct.core.msg.zh_TW.eucTW-2.5.1.2-0.i386
       rsct.core.msg.zh_TW.UTF-8-2.5.1.2-0.i386
      uninstallSAM: Uninstalling
       rsct.core-2.5.1.4-08249.i386
      uninstallSAM: Uninstalling
       rsct.core.utils.msg.de_DE-2.5.1.2-0.i386
       rsct.core.utils.msg.de_DE.ISO-8859-1-2.5.1.2-0.i386
       rsct.core.utils.msg.de_DE@euro-2.5.1.2-0.i386
       rsct.core.utils.msg.de_DE.UTF-8-2.5.1.2-0.i386
      uninstallSAM: Uninstalling
       rsct.core.utils.msg.es_ES-2.5.1.2-0.i386
       rsct.core.utils.msg.es_ES.ISO-8859-1-2.5.1.2-0.i386
       rsct.core.utils.msg.es_ES@euro-2.5.1.2-0.i386
       rsct.core.utils.msg.es_ES.UTF-8-2.5.1.2-0.i386
      uninstallSAM: Uninstalling
       rsct.core.utils.msg.fr_FR-2.5.1.2-0.i386
       rsct.core.utils.msg.fr_FR.ISO-8859-1-2.5.1.2-0.i386
       rsct.core.utils.msg.fr_FR@euro-2.5.1.2-0.i386
       rsct.core.utils.msg.fr_FR.UTF-8-2.5.1.2-0.i386
      uninstallSAM: Uninstalling
       rsct.core.utils.msg.it_IT-2.5.1.2-0.i386
       rsct.core.utils.msg.it_IT.ISO-8859-1-2.5.1.2-0.i386
       rsct.core.utils.msg.it_IT@euro-2.5.1.2-0.i386
       rsct.core.utils.msg.it_IT.UTF-8-2.5.1.2-0.i386
      uninstallSAM: Uninstalling
       rsct.core.utils.msg.ja_JP.eucJP-2.5.1.2-0.i386
       rsct.core.utils.msg.ja_JP.UTF-8-2.5.1.2-0.i386
      uninstallSAM: Uninstalling
       rsct.core.utils.msg.ko_KR.eucKR-2.5.1.2-0.i386
       rsct.core.utils.msg.ko_KR.UTF-8-2.5.1.2-0.i386
      uninstallSAM: Uninstalling
       rsct.core.utils.msg.pt_BR-2.5.1.2-0.i386
       rsct.core.utils.msg.pt_BR.UTF-8-2.5.1.2-0.i386
      uninstallSAM: Uninstalling
       rsct.core.utils.msg.zh_CN.GB2312-2.5.1.2-0.i386
       rsct.core.utils.msg.zh_CN.GB18030-2.5.1.2-0.i386
       rsct.core.utils.msg.zh_CN.GBK-2.5.1.2-0.i386
       rsct.core.utils.msg.zh_CN.UTF-8-2.5.1.2-0.i386
      uninstallSAM: Uninstalling
       rsct.core.utils.msg.zh_TW-2.5.1.2-0.i386
       rsct.core.utils.msg.zh_TW.Big5-2.5.1.2-0.i386
       rsct.core.utils.msg.zh_TW.eucTW-2.5.1.2-0.i386
       rsct.core.utils.msg.zh_TW.UTF-8-2.5.1.2-0.i386
      uninstallSAM: Uninstalling
       rsct.core.utils-2.5.1.4-08249.i386
      uninstallSAM: Uninstalling
       src.msg.de_DE-1.3.0.3-0.i386
       src.msg.de_DE.ISO-8859-1-1.3.0.3-0.i386
       src.msg.de_DE@euro-1.3.0.3-0.i386
       src.msg.de_DE.UTF-8-1.3.0.3-0.i386
      uninstallSAM: Uninstalling
       src.msg.es_ES-1.3.0.3-0.i386
       src.msg.es_ES.ISO-8859-1-1.3.0.3-0.i386
       src.msg.es_ES@euro-1.3.0.3-0.i386
       src.msg.es_ES.UTF-8-1.3.0.3-0.i386
      uninstallSAM: Uninstalling
       src.msg.fr_FR-1.3.0.3-0.i386
       src.msg.fr_FR.ISO-8859-1-1.3.0.3-0.i386
       src.msg.fr_FR@euro-1.3.0.3-0.i386
       src.msg.fr_FR.UTF-8-1.3.0.3-0.i386
      uninstallSAM: Uninstalling
       src.msg.it_IT-1.3.0.3-0.i386
       src.msg.it_IT.ISO-8859-1-1.3.0.3-0.i386
       src.msg.it_IT@euro-1.3.0.3-0.i386
       src.msg.it_IT.UTF-8-1.3.0.3-0.i386
      uninstallSAM: Uninstalling
       src.msg.ja_JP.eucJP-1.3.0.3-0.i386
       src.msg.ja_JP.UTF-8-1.3.0.3-0.i386
      uninstallSAM: Uninstalling
       src.msg.ko_KR.eucKR-1.3.0.3-0.i386
       src.msg.ko_KR.UTF-8-1.3.0.3-0.i386
      uninstallSAM: Uninstalling
       src.msg.pt_BR-1.3.0.3-0.i386
       src.msg.pt_BR.UTF-8-1.3.0.3-0.i386
      uninstallSAM: Uninstalling
       src.msg.zh_CN.GB2312-1.3.0.3-0.i386
       src.msg.zh_CN.GB18030-1.3.0.3-0.i386
       src.msg.zh_CN.GBK-1.3.0.3-0.i386
       src.msg.zh_CN.UTF-8-1.3.0.3-0.i386
      uninstallSAM: Uninstalling
       src.msg.zh_TW-1.3.0.3-0.i386
       src.msg.zh_TW.Big5-1.3.0.3-0.i386
       src.msg.zh_TW.eucTW-1.3.0.3-0.i386
       src.msg.zh_TW.UTF-8-1.3.0.3-0.i386
      uninstallSAM: Uninstalling
       src-1.3.0.4-08249.i386
    • For re-installing see: https://datageek.blog/2012/04/09/using-tsadb2haicu-to-automate-failover-part-1-the-preparation/ – the section called “Software Installed”
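
The name-consistency check from step 1 above can be partly scripted. This is a minimal sketch, not a complete validation: the two helper names are hypothetical, the file locations default to /etc/hosts and $HOME/sqllib/db2nodes.cfg, and it relies on the host name being the second field of each db2nodes.cfg line.

```shell
# Succeed if NAME appears as a host name or alias in a hosts file.
name_in_hosts() {    # usage: name_in_hosts NAME [hosts_file]
    awk -v n="$1" '
        { sub(/#.*/, "") }                        # drop comments
        { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
        END { exit !found }' "${2:-/etc/hosts}"
}

# Succeed if NAME is the host name (second field) on some line of db2nodes.cfg.
name_in_db2nodes() { # usage: name_in_db2nodes NAME [db2nodes_file]
    awk -v n="$1" '$2 == n { found = 1 } END { exit !found }' \
        "${2:-$HOME/sqllib/db2nodes.cfg}"
}

# Example, as the instance owner on each server:
#   name_in_hosts "$(hostname)"    || echo "hostname output not in /etc/hosts"
#   name_in_db2nodes "$(hostname)" || echo "hostname output not in db2nodes.cfg"
```

Both checks compare exact strings on purpose, so a short name in one place and a long name in another shows up as a failure. The HADR configuration parameters still need to be compared by eye or with a separate check.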

Now that I have my prep work figured out, I can get a clean setup on the first try about 50-75% of the time. The rest of the time, I still have some sort of issue that I have to troubleshoot and deal with on setup or testing. So don’t be discouraged – just work through the issues. I hope this post can provide you with a good toolbox of things to try. Please comment or contact me if you have additional issues that you have seen and solved so others can benefit from your pain.

Other Posts In This Series

This series consists of four posts:
Using TSA/db2haicu to automate failover – Part 1: The Preparation
Using TSA/db2haicu to automate failover – Part 2: How it looks if it goes smoothly
Using TSA/db2haicu to Automate Failover Part 3: Testing, Ways Setup can go Wrong and What to do.
Using TSA/db2haicu to automate failover Part 4: Dealing with Problems After Setup

Search this blog on “TSA” for other posts on TSA issues and tips.

Ember is always curious and thrives on change. She has built internationally recognized expertise in IBM Db2, and is now pivoting to focus on learning MySQL. Ember shares both posts about her core skill set and her journey learning MySQL. Ember lives in Denver and works from home.

14 comments

  1. In the failover testing by killing the instance and then setting the resource back, i am getting the below error.
    cope81:db2eq 180> sudo resetrsrc -s "Name ='db2_db2eq_cope81_0-rs' AND NodeNameList = {'cope81'}" IBM.Application
    [sudo] password for db2eq:
    2610-426 The specified resource or resource class is not currently available on node cope81.
    Tried setting up CT_MANAGEMENT_SCOPE, but still doesn’t work.

  2. I have an example of domain creation failure. It is a common error if the server is cloned:

    2017-09-05-15.29.52.874822+000 E6688133E546 LEVEL: Error
    PID : 4927 TID : 140238078834464 PROC : db2havend (db2ha)
    INSTANCE: i11 NODE : 000
    HOSTNAME: s1
    FUNCTION: DB2 UDB, high avail services, db2haCreateDomain, probe:18361
    DATA #1 : String, 248 bytes
    2632-044 The domain cannot be created due to the following errors that were detected while harvesting information from the target nodes:
    s1: 2632-068 This node has the same internal identifier as s0 and cannot be included in the domain definition.

    2017-09-05-15.29.52.875108+000 E6688680E430 LEVEL: Severe
    PID : 4846 TID : 140139498264352 PROC : db2haicu
    INSTANCE: i11 NODE : 000
    HOSTNAME: s1
    FUNCTION: DB2 UDB, high avail services, sqlhaCreateDomain2, probe:600
    MESSAGE : ECF=0x90000544=-1879046844=ECF_SQLHA_CREATE_CLUSTER_FAILED
    Create cluster failed
    DATA #1 : String, 40 bytes
    Error received from Vendor Function Call

  3. Hi Ember, tried to create TSA set, but got stuck while creating domain in db2haicu.. got below error:

    2632-044 The domain cannot be created due to the following errors that were detected while harvesting information from the target nodes:
    mars: 2610-652 The specified time limit has been exceeded.

    How can I resolve this error, pls help.

    1. Did you run the preprpnode first? Whenever I get an error, I always go back and go through the prep work all over again and start over.

        1. Then I would start over and go through all the prep work again and try again. Annoyingly, TSAMP is something where sometimes you do the same thing a second time and it actually works when it failed the first time.

  4. Hi Ember, did you try to setup TSA auto failover for a HADR cluster consisting of 2 nodes from separate datacenters, where each node has 2 public IPs, instead of 1 public and 1 private.

    Where Public IP #1 on en0 is from a public network which is exclusive to each datacenter.
    This IP is mapped to the hostname and is entered in the hostname record in /etc/hosts file.

    Whereas public IP#2 on en1 is on a common network between the datacenters, so the VIP for each hadr database need to be from this network, but the hostname is not mapped to this IP.
    what are the difficulties you see in the above setup to configure TSA and are there any ways around.

    1. I have not set it up this way. Every time I’ve set it up, the IPs of the servers are on the same subnet. TSAMP is for HA and not for DR failovers. HA and DR have different requirements, and we generally don’t try to satisfy both with only two servers in an HADR pair. It’s usually three or more servers, with HA between two of them using TSA and DR handled by a different standby that does not use TSA.

  5. I’m getting the error The IP address xx.xxx.xxx.xxx cannot be added to the cluster because the IP address cannot live on the network db2_public_network_0.

    please let us know solution
