I ran into this error message recently and thought I’d describe what the issue was.
The Scenario
I was building a new consolidated development database server for a client – moving away from 10 separate database servers to one (big) VM to run their development workloads. This was a Linux server, so I was setting INSTANCE_MEMORY to prevent any one instance from eating all the memory and to make it less likely that I would have a Db2 instance crash due to a Linux Out Of Memory condition killing the biggest memory consumer.
The Problem
I was restoring each database in using a redirected restore the place storage where I wanted it. The backups I was working from were all online backups. On the 8th database I restored, I got the following error on roll forward after the restore.
$ db2 "rollforward db SAMPLE to end of backup and stop overflow log path ('/db2/backups/SAMPLE')" SQL1643C The database manager failed to allocate shared memory because the database manager instance memory limit has been reached.
It took me a while to track down where the issue was. I tried reducing the size of the largest bufferpool (though I didn’t think that would help, since DB2 would start up with the hidden buffer pools if my buffer pools were too large). It should have gone with the hidden buffer pools if any of the memory areas were over-sized. What I did was to set INSTANCE_MEMORY to AUTOMATIC for this one instance to get the roll forward to complete so I could look at the database memory that Db2 was allocating.
After I restarted the instance, I could get the roll forward to complete and then get details on what Db2 was allocating:
$ db2pd -dbptnmem Database Member 0 -- Active -- Up 0 days 00:06:21 -- Date 2017-12-06-22.21.11.488636 Database Member Memory Controller Statistics Controller Automatic: Y Memory Limit: 59181724 KB Current usage: 6302464 KB HWM usage: 6302464 KB Cached memory: 166848 KB Individual Memory Consumers: Name Mem Used (KB) HWM Used (KB) Cached (KB) ======================================================== APPL-SAMPLE 160000 160000 160000 DBMS-b2cnadv1 113024 113024 6848 FMP_RESOURCES 22528 22528 0 PRIVATE 6528 6528 0 DB-SAMPLE 6000000 6000000 0 LCL-p6116 128 128 0 LCL-p5530 128 128 0 LCL-p5582 128 128 0 $ db2pd -d SAMPLE -mempools Database Member 0 -- Database SAMPLE -- Active -- Up 0 days 00:05:42 -- Date 2017-12-06-22.21.26.384984 Memory Pools: Address MemSet PoolName Id SecondId Overhead LogSz LogHWM PhySz PhyHWM CfgSize Bnd BlkCnt CfgParm 0x00007F9828009578 SAMPLE bph 16 2 0 180959744 180959744 181534720 181534720 181534720 No 3488 n/a 0x00007F98280037A8 SAMPLE bph 16 1 0 3478197376 3478197376 3480485888 3480485888 3480485888 No 24894 n/a 0x00007F9827FFD9D8 SAMPLE bph 16 4099 0 784768 784768 1835008 1835008 1835008 No 5 n/a 0x00007F9827FF7C08 SAMPLE bph 16 4098 0 522624 522624 1572864 1572864 1572864 No 3 n/a 0x00007F9827FF1E38 SAMPLE bph 16 4097 0 391552 391552 1441792 1441792 1441792 No 2 n/a 0x00007F9827FEC068 SAMPLE bph 16 4096 0 326016 326016 1376256 1376256 1376256 No 2 n/a 0x00007F9827FE6298 SAMPLE shsorth 18 0 0 0 0 0 0 66191360 No 0 SHEAPTHRES_SHR 0x00007F9827FE33B0 SAMPLE lockh 4 0 0 56110208 56110208 56164352 56164352 56229888 No 1 LOCKLIST 0x00007F9827FE04C8 SAMPLE dbh 2 0 167904 34377208 34377208 35454976 35454976 145096704 No 850 DBHEAP 0x00007F99963833B0 Appl apph 1 8 0 0 0 0 0 1048576 No 0 APPLHEAPSZ 0x00007F99963804C8 Appl appshrh 20 0 0 0 0 0 0 81920000 No 0 application shared
What immediately struck me in this output is that the database memory DB-SAMPLE
is an awfully round number, and on careful analysis way higher than even the high water marks add up to.
When I checked the setting for DATABASE_MEMORY, I saw:
$ db2 get db cfg for SAMPLE |grep DATABASE_MEMORY Size of database shared memory (4KB) (DATABASE_MEMORY) = 1500000
Indeed, someone had set DATABASE_MEMORY to a static value. I guess this was surprising to me because these databases had not had support from a DBA, and the first 8 similar databases did not have DATABASE_MEMORY set.
The Solution
The solution here was simple – I updated DATABASE_MEMORY to be automatic:
$ db2 update db cfg for WCBSTGQA using DATABASE_MEMORY AUTOMATIC DB20000I The UPDATE DATABASE CONFIGURATION command completed successfully.
I also changed INSTANCE_MEMORY back to my original limiting value.
Indeed a good approach: Fixing INSTANCE_MEMORY ‘s (making sure they fit in servers memory to prevent Linux OOM ) and not DATABASE_MEMORY ‘s.
Here we stopped setting DATABASE_MEMORY to whatever static value and set it to AUTOMATIC. Only INSTANCE_MEMORY is static (and well chosen).
Some years ago on Db2 v9.7.x , I had this scary INCIDENT , 1x instance / 1x database :
A FATAL db crash recovery, after a ‘simple’ server-power failure (luckily not on a production server), just because the DATABASE_MEMORY was fixed; the crash recovery somehow needed more memory ….
I could increase the INSTANCE_MEMORY (plenty real memory left on server), and recycle instance, but still the crash db recovery failed , complaining about “not enough database memory” or so … Bummer.