DB2 LUW – What is a Page?

The logical view of a database consists of the standard objects in any RDBMS – Tables, Indexes, etc. There are a number of layers of abstraction between this and the physical hardware level, both in the OS and within DB2.

Setting the Page Size

The smallest unit of I/O that DB2 can handle is a page. The default page size is 4,096 bytes (4K). The default can be set to one of the other possible values when the database is created, using the PAGESIZE clause of the CREATE DATABASE statement.
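
As a quick, hedged example (the database name here is just a placeholder), creating a database with an 8K default page size might look like this:

    CREATE DATABASE SAMPLEDB PAGESIZE 8 K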

The possible values for page size in a DB2 database, no matter where it is referenced, are:

  • 4K or 4,096 bytes
  • 8K or 8,192 bytes
  • 16K or 16,384 bytes
  • 32K or 32,768 bytes

Realistically, the page size is set for either a buffer pool or a tablespace. Setting it at the database level on database creation just changes the default from 4K to whatever you choose and changes the page size for all of the default tablespaces and the default bufferpool.

The page size for each bufferpool and tablespace is set at the time the buffer pool or tablespace is created, and cannot be changed after creation.
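
For example (the buffer pool and tablespace names here are purely illustrative), an 8K buffer pool and a matching 8K tablespace might be created like this – the tablespace must reference a buffer pool of the same page size:

    CREATE BUFFERPOOL BP8K SIZE AUTOMATIC PAGESIZE 8 K
    CREATE TABLESPACE TS_DATA_8K PAGESIZE 8 K MANAGED BY AUTOMATIC STORAGE BUFFERPOOL BP8K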

Tables can be moved to a tablespace of a different page size after creation using the ADMIN_MOVE_TABLE stored procedure, but that operation requires at the very least an exclusive lock, and may not support RI (referential integrity) – I hear RI support is added in 10.1 Fix Pack 2.
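
As a rough sketch (the schema, table, and tablespace names are made up – see the Info Center for the full parameter list), the single-call form of the procedure looks like this:

    CALL SYSPROC.ADMIN_MOVE_TABLE('DBUSER', 'ORDERS', 'TS_DATA_8K', 'TS_IDX_8K', 'TS_DATA_8K', '', '', '', '', '', 'MOVE')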

Choosing a Page Size

In my experience, it is rare to have a database created with a different default page size. Every database I currently support has the default page size of 4K, and also has at least one tablespace and one bufferpool with each of the other page sizes.

The most common time you think about page sizes is when you’re creating a table. When DB2 stores data in a table, a row CANNOT be larger than the page size minus some overhead. So if you have a row greater than 4,005 bytes in width, you simply cannot keep it in a tablespace with a page size of 4K. The row size does not include LOB data that you do not in-line, but it does include the maximum length of every VARCHAR field.
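
As a hypothetical illustration (the table and tablespace names are made up), this table declares a maximum row width well over 4,005 bytes, so it cannot live in a 4K tablespace but fits comfortably on an 8K page:

    CREATE TABLE DBUSER.WIDE_TABLE (
        ID          INTEGER NOT NULL,
        DESCRIPTION VARCHAR(3000),
        NOTES       VARCHAR(2000)
    ) IN TS_DATA_8K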

This is one area where DB2 differs from Oracle. To my knowledge, in Oracle, you can have a row that spans multiple pages. From what I hear, DB2 is planning to support that in a future release, but they’re planning to do it by plopping the rest of the row in a LOB – there was an audible groan when they announced that in a conference session I was in, due to the other problems in dealing with LOBs.

It is also important to think of the geometry of the table when choosing a page size. If you have a row size of 2010 bytes, that means that only one row will fit on every 4K page, and therefore nearly 50% of the space allocated to your table will be empty and wasted. A row size that large would do much better on an 8K or 16K or even 32K page. It is important to consider this for every table as you create it, and to revisit it with any column length alterations you make.

I have, on several different occasions, had a request from a developer to increase a column size, and have had to move the table to a different tablespace to accommodate the change because the increase pushed the row over the limit. Historically, moving a table between tablespaces has required the table to be offline – which can be problematic in a live production database. If you’re properly testing changes in at least one development environment, you will discover these kinds of issues before they get to production.

While page overhead is not exactly 100 bytes, it is usually not far from it, so to determine how many rows will fit on a page, you can usually use:

(pagesize-100)/maximum row length
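
Plugging the 2,010-byte row from the example above into this formula (and rounding down, since only whole rows fit on a page):

    (4096 - 100) / 2010 = 1 row per 4K page  (2,010 of 3,996 usable bytes – roughly half the page wasted)
    (8192 - 100) / 2010 = 4 rows per 8K page (8,040 of 8,092 usable bytes)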

Again, this does not count LOB data, but only LOB descriptors. LOB descriptors are stored on the data page. The remainder of the LOB is stored in another location, assuming you have not INLINED the LOBs. From the standpoint of a page, the main reason for using a LOB is to allow a large portion of unstructured data to be stored elsewhere – not directly on the data page. LOB descriptor sizes on the page depend on the size of the LOB and vary from 68-312 bytes.

Generally, smaller pages are preferable for OLTP and e-commerce databases because they allow you to handle less data at a time when you’re expecting smaller queries.

The total table size is another factor in choosing a page size. New tablespaces should generally be created as “LARGE” tablespaces. But “REGULAR” used to be our only option, and for REGULAR tablespaces with a 4K page size, the limit on table size in a DMS tablespace is just 64 GB (per partition). On more than one occasion I have dealt with hitting that limit, and it is not a pleasant experience. For LARGE tablespaces, the limit per partition for a 4K page size is 8 TB – much higher.
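
If you do hit that ceiling with an existing REGULAR DMS tablespace, the tablespace can be converted in place (the tablespace name here is illustrative); note that indexes on tables in the tablespace generally need to be reorganized or rebuilt afterward before the tables can actually grow past the old limits:

    ALTER TABLESPACE TS_DATA CONVERT TO LARGE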

Since pagesize is set at the tablespace level, you can also consider the appropriate page size for your indexes, assuming you’re putting them in a different tablespace than the data. You cannot select a tablespace for each individual index, only an index tablespace for the table as a whole, so you’ll want to consider possible indexing scenarios when choosing where your indexes go as well. The limit for index key (or row) size in DB2 9.7 is the pagesize divided by 4 – so for a 4K page size, it is 1,024 bytes. Back on DB2 8.2, the limit was 1,024 for all page sizes.
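
The index tablespace is specified when the table is created; a hedged sketch with made-up names:

    CREATE TABLE DBUSER.ORDERS (
        ORDER_ID BIGINT NOT NULL,
        SKU      VARCHAR(50)
    ) IN TS_DATA_4K INDEX IN TS_IDX_8K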

An interesting side note: if you’re not familiar with the “SQL and XML limits” page in any DB2 Info Center, I recommend looking it up and becoming familiar with it. That’s where I verified the index row limit and there are all kinds of interesting limits there.

Data Page Structure

Ok, this section is admittedly just a tad fuzzy. I had the darndest time getting information on this, even reaching out to some of my technical contacts at IBM. But I’m going to share what I do know in case it helps someone.

Each data page consists of several parts. First is the page header – a space of 91-100 bytes reserved for information about the page. The page header identifies the page number and some other mysterious information I cannot find much detail on.

Next comes the slot directory – which is variable in size, depending on how many rows are on the page. It lists the RIDs that are on the page and the offset on the data page where each record begins. If the offset is -1, that indicates a deleted record. In the structure of the rows themselves on the page, it appears that the records “start” at the bottom of the page. There may be open space between records due to any of the following:

  • Deleted records
  • Reduced size of VARCHAR values
  • Relocation of records due to increased size of the VARCHAR that will no longer allow the row to fit on the page

This open space cannot be used by additional records until after a reorg.

Finally, there may be contiguous open space on a page that is left over or simply not used, due either to deletes followed by reorgs or to the pages simply not being filled yet.

[Image: data page structure]

I found some references to each page also having a trailer, but they were not from sources I could consider credible, so there may or may not be a trailer on the page. Most of the information here comes from a page in the IBM DB2 Info Center. I would love to hear reader comments on this topic, or any references anyone may have with more detailed data.

Not every page in a table is a data page dedicated fully to user data. There is one extent of pages allocated for each table as the extent map for the table. Past a certain size, additional extent map extents may be required. There is also a Free Space Control Record every 500 pages that DB2 uses when looking for space to insert a new row or move a relocated row.

Index Page Structure

Indexes are logically organized in a b-tree structure. Interestingly, the RIDs that index pages use are table-space relative in DMS tablespace containers, but object relative in SMS tablespace containers – perhaps this is one of the reasons there has never been an easy way to convert from SMS to DMS or vice-versa.

I have not been able to find a good representation of exactly what leaf and non-leaf pages look like for indexes. We do have the standard representation of pages in a b-tree index, that is:
[Image: standard b-tree index structure – root, intermediate non-leaf, and leaf pages]

This shows us that we can expect to see index keys that delimit ranges on the root and intermediate non-leaf pages, and that on the leaf pages, we expect to see the index keys along with the RIDs that correspond to those index keys. There is still a page header that is between 91 and 100 bytes, but I don’t know if index leaf pages have the same slot directory that data pages do. Again, I welcome user comments and links on this topic.

Extent Size and Prefetch Size

The extent size and prefetch size are specified on a tablespace-by-tablespace basis (keywords EXTENTSIZE and PREFETCHSIZE on the CREATE TABLESPACE statement) or as database-wide defaults (the DFT_EXTENT_SZ and DFT_PREFETCH_SZ db cfg parameters). Extent sizes cannot be changed after tablespace creation. Prefetch sizes can be set to AUTOMATIC and be changed automatically by DB2, or can be altered manually using the ALTER TABLESPACE statement. Both are specified as a number of pages.
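
A brief sketch (the tablespace and buffer pool names are illustrative) of setting both at creation time and then adjusting the prefetch size later:

    CREATE TABLESPACE TS_TXN_8K PAGESIZE 8 K EXTENTSIZE 32 PREFETCHSIZE AUTOMATIC BUFFERPOOL BP8K
    ALTER TABLESPACE TS_TXN_8K PREFETCHSIZE 64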

The extent size is the number of pages that are allocated to an object in each tablespace container at one time. The prefetch size is the number of pages that are read into the bufferpool by the prefetchers for one read request. I’m not going to speak specifically to the tuning of these in this post.

Summary

By understanding pages, we come closer to understanding how DB2 handles some aspects of I/O. Minimally, a DBA needs to be able to pick the appropriate page size for a given table and its indexes.

Ember Crooks

Ember is always curious and thrives on change. She has built internationally recognized expertise in IBM Db2, spent a year working with high-volume MySQL, and is now learning Snowflake. Ember shares both posts about her core skill sets and her journey learning Snowflake.

Ember lives in Denver and works from home.

24 Comments

  1. Interesting article. So often, I have just used page sizes that were used in previous projects without giving them sufficient thought.

  2. Hello, just wanted to take a quick moment to tell you how much I appreciate your website. I have been a db2 dba for 20 years and I still find myself learning something new on your website and refer to it quite often 🙂 Thanks!

  3. Thanks for your informative post. It clarifies a lot of details about the physical storage in DB2.
    I just have a quick question/confusion:
    (Assuming I am using DB2 10.1 – with *regular* table spaces) I have read in the IBM documentation that there cannot be more than 255 rows on a page (for example here: http://www.ibm.com/developerworks/data/library/techarticle/0212wieser/ ). I am just trying to compare this with your formula in “Choosing a page size”, which seems to consider the whole space in a page (minus 100 bytes of overhead). I am just wondering if that formula is only correct for large tablespaces, or if there is something about regular tablespaces that I am missing here.

    • I would never use regular tablespaces anymore – only large ones.

      Large tablespaces have varying possible numbers of rows per page, depending on the page size.

      [Image: max rows per page]

      There is a maximum number of rows per page, and for particularly narrow tables, that has to be considered.

  4. Hi Ember! This article you wrote is still hot after 3 years. I have an additional question about performance when using larger tablespaces. Although I administer a DB2 10.5 OLTP database with small tables that fit into a 4K pagesize tablespace, when I transferred them to a 32K one prior to a compress and ran some performance tests, all DML operations performed a lot faster. It was not a problem with the old tablespace, because I had already recreated it keeping the 4K pagesize and only changing from regular to large in order to allow more rows – it had reached the regular maximum capacity. I have tested this approach in other databases, all of them OLTP, and observed a reduction in I/O and faster SQL responses. All tablespaces were created with the default EXTENTSIZE and PREFETCHSIZE, and all bufferpools are AUTOMATIC with STMM. I really cannot understand this, because it goes against all the DB2 documentation. Is there any reason to see this increase in performance? BTW, all tables run a lot of INSERT, UPDATE and DELETE, and the tests considered them without compression.

    • Without asking a lot of questions, I could not speculate as to why. There certainly could be reasons that things would work that way, but you’re right that generally, for OLTP, smaller page sizes are thought to perform better. EVERY single recommendation depends on so many factors that, when you can do testing like that, it is excellent to do so. It could be something as simple as a 32K page size matching up better with your storage stripe, strip, or LUN sizes, or any one of a dozen other things.

      • Thanks for the quick response. I take care of 4 different customers running the same ERP software with DB2 9.7 and DB2 10.5. They’re all different in hardware specs, and I performed a test with the same table in all environments, migrating it to a 32K large tablespace without compression. In one case, a very complex ETL SQL SELECT that takes more than 1 hour to execute finished in 3 minutes after the change (and I had migrated it to a new 4K tablespace before the tests). I have access to a test environment pretty close to production, and am trying different combinations of EXTENT and PREFETCH sizes to find out what’s going on.

        • I am not surprised that a complex ETL SQL would perform better. However, the number one performance priority would usually be the transactions, and I’m curious as to the performance difference, if any, for them.

          Queries dealing with larger amounts of data will generally benefit from a larger page size, particularly if the table is not very contiguous on disk.

  5. I am in the process of a database migration from DB2/VSE 7.5.0 to DB2 LUW 10.5. How can I optimize the selection of page size, extent size, and prefetch size for the migrated database?

    • If the database being moved is a Data Warehouse or Analytics environment, then 32K is generally the page size to go with. If it’s OLTP, you may want to consider a smaller page size (4K or maybe 8K) for the tables that are particularly likely to see singleton-row updates, inserts, and deletes. If you have the opportunity, it is best to test and compare configurations to see which is actually faster. Extent and prefetch size are going to be more dependent on your storage characteristics. If you can actually get details on your storage, such as the strip size for a RAID array, then matching your extent size to that is very useful. Prefetch size is often smaller for OLTP, and larger for DW/Analytics.

  6. All your posts are simple, informative and it leads to further investigation and research for me. Thanks a lot.

    In this post you have a statement “It is also important to think of the geometry of the table when choosing a page size. If you have a row size of 2010 bytes, that means that only one row will fit on every 4K page, and therefore nearly 50% of the space allocated to your table will be empty and wasted.”. I assume that you meant to say 2100 instead of 2010 as the row size. A 2100-byte row size makes the 4K page too small to accommodate 2 such records, and that is why around 50% of the storage is wasted. Is my understanding correct?

  7. Thank you Ember for your articles. Two questions: 1) Can a page hold records of more than 1 table? 2) Are pages allocated contiguous basis: Meaning pages 1,2,3 hold table A and pages 4,5 hold table B or can it be pages 1,3,5 hold table A and pages 2,4 hold table B?

    My databases have all tables in one tablespace and inserts into a large table are slowing down. Am exploring moving the large table into a different tablespace catered for OLTP but would like to understand pages to determine if that will help

    • 1) No, a page is associated with one and only one table.
      2) Pages are contiguous for a table only when they were LOADed or REORGed. With normal activities, tables become fragmented in chunks of extentsize. This fragmentation requires regular reorgs. It is often a good idea to split the few largest or most active tables into their own table spaces for the reason you note, and also to be able to isolate them in their own buffer pool. You can also consider enabling append mode, but this may lead to a lot of wasted space. Finally, for OLTP, data pruning is absolutely critical – an OLTP database should hold the minimum data needed. Often an ODS for storing older data is an excellent idea.

  8. Hi, Ember. Excellent article, as always.
    Hey, I seem to recall reading that, if you’re going to have a separate tablespace for indexes, the page size should match the page size for the data tablespace.
    But, now that I’m looking for it as a reference, I can’t find it 🙂
    Does this sound familiar to you? Or am I dreaming that up?

    • I’m not the definitive expert on that since I rarely split indexes and data. However, I would think it would depend. The problem with a smaller page size for indexes is that you could conceivably design an index that wouldn’t fit, and therefore couldn’t be created. Indexes could benefit from a smaller page size in most cases, though.

  9. Hi Ember, I feel happy to have come across this blog while searching for some information. This is one of the best informative blogs I have come across! I am looking for some guidance here:
    As per IBM, when adding new columns to an existing table, the table doesn’t need to be REORG’d for up to three columns added. In the case of huge tables with hundreds of millions of rows, when a new column with some default value is added, where is this new piece of data for each row stored on the pages? What is the best practice in this case?
