Edited 12/13/2017 to correct image and details about the database heap.
Db2 memory management is a complex topic that I’ve already written about from several perspectives, and have a few blog entries in mind to explore further. A presentation that Melanie Stopfer gave a while back really helped me understand STMM better. I thought I would share what I learned.
What STMM Does
The Self-Tuning Memory Manger is a process that wakes up from time to time and reallocates how much memory several critical memory areas have allocated. It reallocates memory in specific ways based on the recent usage on the Db2 system
What STMM Does Not Do
STMM does not predict future or regular workloads and proactively allocate memory. It does not prevent all possible out of memory conditions. It is a bit better for non-BLU systems, and has some restrictions on multi-partition systems. It is not a magic wand that will always do better than a DBA. STMM does not allow you to prioritize certain workloads over others.
Areas Being Tuned
STMM can only adjust specific memory areas, and only if they’re set to AUTOMATIC. The areas that STMM can impact (increase or decrease) in database memory (DATABASE_MEMORY) are in blue in this image:
Written out, those memory areas are:
- The shared sort heap (SHEAPTHRESH_SHR) – if set to AUTOMATIC AND if the database is not using BLU
- The package cache (PCKCACHESZ) – if set to AUTOMATIC, though I frequently set this to a static size
- The lock list (LOCKLIST) – if set to AUTOMATIC
- Any buffer pools with a size set to AUTOMATIC, though you’ll often need to update the size to keep an HADR standby up to similar sizes
Though the catalog cache and the utility heap are part of DATABASE_MEMORY, they cannot be tuned by STMM. The Database Heap is also not tuned by STMM, though it can be set to AUTOMATIC to be automatically increased as needed.
In addition to these, STMM can also allocate additional memory to DATABASE_MEMORY itself if DATABASE_MEMORY is set to AUTOMATIC and one of the following is true:
- INSTANCE_MEMORY is AUTOMATIC and there is memory available at the OS level
- INSTANCE_MEMORY is set to a fixed value and there is memory available in INSTANCE_MEMORY
- DATABASE_MEMORY for another database in the instance is set to AUTOMATIC and can be reduced
You can remove any memory area from STMM’s control by setting it to a static value.
The database overflow buffer is a memory area you cannot directly configure the size of. This is an additional area of memory that Db2 allocates on start up to allow space for overflows of memory areas and online increases of memory areas. Without this area, you wouldn’t be able to change heap or buffer pool sizes online. If you’ve ever had an online increase of a buffer pool fail, it is likely because there wasn’t enough memory in your database overflow buffer.
It is important to be aware that not every memory area you can set to
AUTOMATIC is tuned by STMM. See my blog entry on When AUTOMATIC is not STMM.
Process STMM Follows
While this diagram is a simplification, it’s a good general representation:
Each time the Self-Tuning Memory Manger wakes up, it figures out if any memory areas need more space. If an area does need more space, STMM will first see if memory is available at the OS-level (if INSTANCE_MEMORY is AUTOMATIC) or within INSTANCE_MEMORY (if INSTANCE_MEMORY is set to a static value). If memory is not available at that level, STMM does some complicated math to determine if the current workload would benefit from stealing memory from another heap and giving it to the heap that needs memory. If STMM’s analysis shows this will be beneficial, it will reallocate memory.
Each heap that STMM controls can only grow by 50% or shrink by 20% each cycle, reducing the risk of really mercurial changes.
One of the great advantages of STMM is that it can be watching a database 24/7 and adjusting these major memory heaps to meet the changing needs of varying workloads. Most DBAs cannot give constant 24/7 attention to each and every database that they support. This means that STMM can change allocations to meet daytime heavy transactional workload and nighttime reporting or loading workload, more or less. It cannot predict these workloads.
There are definitely several problems with STMM. The first is that it cannot tune to prioritize one kind of workload over another, so if you have a database serving mixed workloads, and you want it better tuned to handle certain types of work, STMM may not be for you.
STMM also makes some bad decisions sometimes – like any DBA might. STMM tends to make the package cache a bit large at times, especially if parameter markers are not appropriately used in the workload. This memory may be better allocated to buffer pools or sort. This is particularly true in a database with a very non-repetitive workload.
INSTANCE_MEMORY should never be set to AUTOMATIC on a Linux server with more than one Db2 instance. If it is, each instance will probably try to allocate 90% or so of the memory. When two or more instances do this, it often leads to an out of memory (OOM) condition. Linux will generally find the largest user of memory and kill the process to keep the server from crashing. The largest user of memory is often one of the Db2 instances. Thus not setting a static value for INSTANCE_MEMORY can lead to a crash on a Linux server with more than one Db2 instance. I haven’t personally seen this happen on AIX, but would tend to apply the same approach on other platforms.
STMM cannot tune sort memory on BLU databases. It also cannot tune the utility heap, which is often very large in BLU databases. This means that STMM is a bit less useful for BLU. You could still use it to tune the remaining memory areas, which could still be 50% of your memory.
In DPF environments, STMM runs on one partition, and the configurations are then replicated to the other partitions. For pureScale, STMM independently tunes the memory areas on each member.
If you are using STMM, don’t set INSTANCE_MEMORY to automatic on a system with multiple Db2 instances – you risk Db2 crashes if you do.
Consider whether you really want your database memory to be able to be dynamically changed to match whatever workload is running when STMM wakes up.
Finally, just because STMM is managing your buffer pools and other memory areas does not mean you can ignore key performance indicators(KPIs) like the buffer pool hit ratios. From what I’ve seen, declining buffer pool hit ratios are one area where a memory bottleneck shows up when using STMM. Your action on seeing lower buffer pool hit ratios when using STMM may be different, but action is still required.
Overall, I use STMM for most of the environments I support. It is really excellent for single-database, single-instance, single-server environments. I have one BLU database where I’ve disabled STMM entirely because of some issues we had with it.