DB2 Basics: What is a Reorg?

Posted by

This is the first in a series of blog entries on reorgs. I was talking with a friend recently, and he pointed out that I only have a few articles on reorgs, and they’re very specific to complicated parts of reorg. I have scripted my own reorgs and made a point of educating myself on reorg, so I’m starting with several basic articles and may add in some more advanced articles.

Right now I plan to start with this article – What is a Reorg? – and do additional entries on how to determine if a reorg is needed and the different options when running a reorg. This series will also heavily cover reorgchk, and I may add another article in on the SQL that can be used to calculate reorgchk formulas.

Why Reorgs are Important

Reorgs are one of the keys to maintaining performance on a DB2 LUW database. One of the items on my list for the most basic regular tasks that DBAs need to perform is to run reorgs on the tables where reorgchk says they are needed on a regular basis. My databases have appropriate reorgs done on them weekly, though monthly may be frequently enough.

Over time, the regular activity in a database can cause minor issues that reorg can solve. If reorgs are not done in a timely manner, those minor issues can cause performance-killing and space-eating issues.

General Facts About Reorgs

A reorg is done at the table level, but may also specify an index. Some kinds of reorgs affect only tables or only indexes. A reorgchk can be run either at the table or the database level to help determine which kinds of reorgs are needed for what tables. Reorgchk requires current runstats to correctly make these determinations. Some kinds of reorgs make a table completely offline, while others keep the table online for a portion of the reorg.

Purposes of Reorgs

Data Clustering

Clustering Index

A clustering index in DB2 means an index that is defined with the CLUSTER keyword. Clustering indexes means that DB2 attempts to maintain the table in the same order as the index. It is not guaranteed to be in the same order, though. If the data will not fit on the page that the clustering order would prefer, then DB2 will try nearby pages, but will not move data around to make it work. There is no clustering by default like there is for some other RDBMSes, like MS SQL Server. If you reorg a table with a clustering index, DB2 will reorder the table data to exactly match the order of the clustering index. If you add a clustering index, you will have to reorganize the table before the data will be reordered.

Clustering a Table Over an Index that is Not Defined as a Clustering Index

If you do not have a clustering index on a table, you can still use reorg to order the data in the order of any index, simply by specifying it in the reorg command. If you do this, DB2 makes no attempt to maintain the table data in the order of the index – it is only on reorg that this is done. If there is a clustering index on a table, you cannot reorg the data on a different index.

Grouping Empty Pages and Physical Continuity of Table Data

Over time, as rows are deleted from a table, the space that those rows took up is not freed. It may be reused by rows inserted into the tables, if they fit and APEND mode is not used for the table. But especially if a large amount of data has been deleted, the data on the table pages may actually be sparse. One of the main things that reorg does is to completely fill up the tables (respecting PCTFREE), and push all of the empty space to the end of the table. Depending on the reorg options you choose, it may also free that space up to be used by other tables in the tablespace. Other than freeing space on disk, the other advantage here is that data that may be frequently accessed together (by table scans or by prefetching) is physically located together on disk.

Compression Dictionary Generation

There are a number of times that compression dictionaries are generated for DB2 row level compression, and it may depend on your version. But there are situations where you may want to explicitly reset/recreate the dictionary, and there is an option on the reorg command to reset the compression dictionary

Reorg-requiring operations

There are some operations that require a table reorganization after they are done. These include:

  • Data type changes that increase the size of a varchar or vargraphic column
  • Data type changes that decrease the size of a varchar or vargraphic column
  • Altering a column to include NOT NULL
  • Altering a column to inline LOBS
  • Altering a column to compress the system default or turn off compression for the system default
  • Altering a table to enable value compression
  • Altering a table to drop a column
  • Changing the PCTFREE for a table
  • Altering a table to turn APEND mode off
  • Altering a table or index to turn compression on

Note that some of the above are “reorg recommended” operations, where you can do up to 3 before a reorg is required, and you are limited in some of the operations that can be done against the table until a reorg is completed. Others require immediate reorganization before any actions may be performed against the table. Other things in the list above simply won’t get the benefit of the change until the reorg is completed.

Index maintenance

In general, one of the purposes of reorgs may be to rebuild indexes. How, when, and even if this happens depends on the type of reorg you are running.

When rows are deleted from indexes, the RIDs are essentially marked as deleted, and not actually deleted. When they’re marked this way, they’re called pseudo deleted RIDs. There are two ways that these index entries are finally deleted. If MINPCTUSED is not set for the index (which is the default), then reorg can clean them up. When invoked for indexes, reorg can rebuild all of the indexes or it can go through a cleanup only mode to simply remove the pseudo-deleted rows and pages. This reorg process can solve problems with leaf page fragmentation, low-density indexes, indexes with too many levels, and also will respect PCTFREE – to ensure there is free space on index pages for additional inserts.

When reorging indexes for a table, you can only reorg all indexes for a table, and not an individual index. The exception is when you’re reorging non-partitioned indexes on a partitioned table.

Summary

This blog entry covered what reorgs do at a high level. Future entries about reorgs will cover reorgchk and how to determine what tables to reorg and the syntax and specifics on the various types of reorgs.

Ember is always curious and thrives on change. She has built internationally recognized expertise in IBM Db2, and is now pivoting to focus on learning MySQL. Ember shares both posts about her core skill set and her journey learning MySQL. Ember lives in Denver and work from home

19 comments

  1. TABLE A has 2 indexes.
    While doing offline reorg on TABLE A the 4th phase is “RECREATE ALL INDEXES”.
    Question : so “RECREATE ALL INDEXES” is same as “reorg indexes all for table A” ?
    Or do I need to to index reorg separately after the offline table reorg?

    1. Index reorgs are included in offline reorgs. You only need to explicitly do them after online reorgs.

  2. Hello,

    In case of a huge table that must need be reorganized, but is taking long time to complete, do you have a idea what to do or do another action enstead of perfrom the reorg?

    Thanks

    1. There are a couple of options. The table can be partitioned, and then the partitions reorganized independently. However, I believe that partition reorgs are only classic (offline) at this point. I think they’ve said that inplace partition reorgs are possible on DB2 11.1, which is coming out next week. You can also use admin_move_table (which has some restrictions on 9.7 and early fixpacks of 10.1) to move the table, thereby reorging it at the same time in a largely online manner. Also, classic(offline) reorgs that do not cluster on an index use the least time.

  3. Hi DB2 Experts/Friends,

    User deleted lot of records from a table but we’re unable to reclaim space. Need your help on this.

    We’ve table TABLE1 with 4 BLOB colums and table was created in only one 8K tablespace{TABLESPACE1].

    the table was created with below BLOB column datatypes:

    IMAGE SYSIBM BLOB 10485760 0 Yes

    IMAGE1 SYSIBM BLOB 10485760 0 Yes

    IMAGE2 SYSIBM BLOB 10485760 0 Yes

    IMAGE3 SYSIBM BLOB 10485760 0 Yes

    The table TABLE1 was created in TABLESPACE1 and we didn’t see any free pages to reclaim for this.

    TBSP_NAME TBSP_TYPE RECLAIMABLE_SPACE_ENABLED TBSP_FREE_PAGES(KB)

    ————————- ———- ————————- ————————————————–

    QCASNKA DMS 1 2944

    The below output from SYSPROC.ADMIN_GET_TAB_INFO for this table.
    All sizes are in KB:

    DATA_OBJECT_L_SIZE DATA_OBJECT_P_SIZE LOB_OBJECT_L_SIZE LOB_OBJECT_P_SIZE

    ——————– ——————– ——————– ——————–

    8451840 8451840 2191217704 2191217920 KB

    1 record(s) selected.

    Physically disk space allocated for LOB data= 2191217920 KB [2089 GB]

    Currently allocated used LOB data for 4 BLOB columns[IMAGE+IMAGE1+IMAGE2+IMAGE3] is = 930 GB

    We can reclaim around 1159 GB by performing the Reorg on LOB columns.

    Need your suggestion how to reclaim space this space.

    1) Do we need downtime for this table?
    2) Still we’ve 1 TB free available at data filesystem, Is that space good for performing the reorg on LOB column for this table ? or Do need additional space on data filesystem [ Around 2 TB] ?

    3) Please give syntax for reorg for LOB columns ? both offline and inplace
    4) Is there any option to reclaim space on LOB columns ? like admin_move_table ( share commad as well ) ?

    Thanks
    Lokesh

    1. The rules then are the same as the question – did I need to do a Reorg/Runstats on the database at the time of the backup or the time that the database was rolled forward to? The Restore does not change the need for reorg/runstats.

  4. What advice would you give for a really large table 400+m rows which takes so long to online reorg it’s longer than the allowed maintenance window (it’s more than 1 days)?
    Unfortunately there’s no opportunity for a outage as it’s a critical 24/7 system. 10.5 on linux.

    1. Inplace reorgs greatly reduce the down time for a table. Option #1 is to use inplace reorgs with the “notruncate” option, which should be fully online, and then do the index reorgs separately during your maintenance window, because they require an exclusive lock even when they are done “online”. Option #2 is to range partition your table. I think you can then reorganize each partition individually.

  5. Thank you so much. I will look into option 1 but it seems from my readings that option 2 you cannot do online reorgs
    In your experience is it unusual for a table of this size to take almost a day to inline reorg? It seems very wrong.

    1. I believe they added online reorgs for partitions in 11.1. It is not unusual for reorg to take longer than 6 hours for large tables. It depends on the size of the table, not just the number of rows. Are you reorging when a reorg is truly needed?

      1. Unfortunately we will not move to 11.1 for some time. These tables have large deletes and many inserts and show 4-5% overflows. The problem is we are forbid from running outside window so jobs are often killed.

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.