# Calculating Row Size for a Table

There are several scenarios with DB2 where we need to know what the “worst-case” row length is. That is, if all varying fields are completely full and any LOBs are maximally inlined, how much space could a single row actually take up? SYSCAT.TABLES gives us the average row size, which is useful for some use cases, but not all of them.

## Using a Stored Procedure

My top resouce for this is Serge Rileau’s Blog article with a great stored procedure definition. This stored procedure works great. The problem comes when I’m working in a client’s environment where they don’t want me to create any objects of any kind, much less stored procedures. To deal with this frequent consultant’s conundrum, I’ve had a goal for a while to work this into an SQL I can run without creating any objects. I sat down to do it several months ago, but didn’t have a solid couple of hours to dedicate to it, so came away feeling frustraed. However, I recently had time to sit down and work it out, and it wasn’t as complicated as I thought it would be.

## My SQL to Calculate Row Sizes

My goal is to get a query that will give me the table name and row size for every table in the database. Using a couple of common table expressions, I’ve developed the following SQL. It is very heavily based on Serge Rileau’s stored procedure, and I’ve successfully compared results on a couple of systems:

```with compr as (
select
TABSCHEMA
, TABNAME
, CASE WHEN compression in ('B', 'V') THEN 2 ELSE 0 END as compression_mod
FROM SYSCAT.TABLES
),
col as (
SELECT TABSCHEMA
, TABNAME
, COLNAME
,COALESCE(D.SOURCENAME, C.TYPENAME) AS TYPENAME
,COALESCE(D.LENGTH, C.LENGTH) AS LENGTH
,C.SCALE, C.NULLS, C.INLINE_LENGTH, D.METATYPE
,D.INLINE_LENGTH AS STRUCT_INLINE_LENGTH
, CASE WHEN C.inline_length <> 0
THEN C.inline_length
WHEN metatype = 'R' THEN D.inline_length
WHEN COALESCE(D.SOURCENAME, C.TYPENAME) IN ('CLOB', 'BLOB', 'DBCLOB')
THEN CASE WHEN COALESCE(D.LENGTH, C.LENGTH) <=       1024 THEN 68
WHEN COALESCE(D.LENGTH, C.LENGTH) <=       8192 THEN 92
WHEN COALESCE(D.LENGTH, C.LENGTH) <=      65536 THEN 116
WHEN COALESCE(D.LENGTH, C.LENGTH) <=     524000 THEN 140
WHEN COALESCE(D.LENGTH, C.LENGTH) <=    4190000 THEN 164
WHEN COALESCE(D.LENGTH, C.LENGTH) <=  134000000 THEN 196
WHEN COALESCE(D.LENGTH, C.LENGTH) <=  536000000 THEN 220
WHEN COALESCE(D.LENGTH, C.LENGTH) <= 1070000000 THEN 252
WHEN COALESCE(D.LENGTH, C.LENGTH) <= 1470000000 THEN 276
WHEN COALESCE(D.LENGTH, C.LENGTH) <= 2147483647 THEN 312
ELSE -2 END
WHEN COALESCE(D.SOURCENAME, C.TYPENAME) IN ('LONG VARCHAR', 'LONG VARGRAPHIC')
THEN 20
WHEN COALESCE(D.SOURCENAME, C.TYPENAME) = 'XML' THEN 80
ELSE 0 END as LOBLENGTH
FROM SYSCAT.COLUMNS AS C
LEFT OUTER JOIN SYSCAT.DATATYPES AS D
ON D.typeschema = C.typeschema
AND D.typename = C.typename
AND D.typemodulename IS NULL
AND C.typeschema <> 'SYSIBM  '
),
tot as (select
col.TABSCHEMA as tabschema
, col.TABNAME as tabname
,SUM(CASE TYPENAME
WHEN 'SMALLINT'        THEN length + compression_mod
WHEN 'INTEGER'         THEN length + compression_mod
WHEN 'BIGINT'          THEN length + compression_mod
WHEN 'REAL'            THEN length + compression_mod
WHEN 'DOUBLE'          THEN length + compression_mod
WHEN 'DECFLOAT'        THEN length + compression_mod
WHEN 'DECIMAL'         THEN TRUNC(length / 2) + 1 + compression_mod
WHEN 'CHARACTER'       THEN length + compression_mod
WHEN 'VARCHAR'         THEN length + 4 - compression_mod
WHEN 'GRAPHIC'         THEN length * 2 + compression_mod
WHEN 'VARGRAPHIC'      THEN length * 2 + 4 - compression_mod
WHEN 'LONG VARCHAR'    THEN 24 - compression_mod
WHEN 'LONG VARGRAPHIC' THEN 24 - compression_mod
WHEN 'CLOB'            THEN loblength + 4 - compression_mod
WHEN 'BLOB'            THEN loblength + 4 - compression_mod
WHEN 'DBCLOB'          THEN loblength + 4 - compression_mod
WHEN 'XML'             THEN loblength + 3 - compression_mod
WHEN 'DATE'            THEN length + compression_mod
WHEN 'TIME'            THEN length + compression_mod
WHEN 'TIMESTAMP'       THEN length + compression_mod
ELSE CASE WHEN metatype = 'R' THEN loblength + 4 - compression_mod
ELSE -3 END
END
+ CASE WHEN compression_mod = 0 AND NULLS = 'Y' THEN 1 ELSE 0 END) as row_size

FROM compr join col on compr.tabname=col.tabname
and compr.tabschema=col.tabschema
GROUP BY col.tabschema, col.tabname)
select
substr(tot.tabschema,1,24) as tabschema
, substr(tot.tabname,1,40) as tabname
-- comment the above two rows and uncomment the below two rows for full, non-truncated table names
--     tot.tabschema
--     , tot.tabname
, row_size + CASE WHEN compr.compression_mod <> 0 THEN 2 ELSE 0 END
-- uncomment the following two rows if you have created the stored procedure and want to compare results
--       , GetRowSize(tot.tabschema, tot.tabname) as sp_row_size
--       , GetRowSize(tot.tabschema, tot.tabname) - row_size + CASE WHEN compr.compression_mod <> 0 THEN 2 ELSE 0 END as diff
FROM tot join compr on tot.tabschema=compr.tabschema and tot.tabname=compr.tabname
;
```

Is this long and complicated? Yes, it is. The quick cheat I’ve run into others using is simply summing up length in syscat.columns for a table. That works as an approximation in some situations, but blows up when you hit XML, LOBS, and some other issues. I just can’t rely on a cheat like that on an ongoing basis.

The above is also less efficient if you were to run it for a single table than the stored procedure. Since my goal is a list of tables with their row sizes, this is not an issue for me. I would work more on optimization if I were running it for only a single table at a time and optimization mattered. Optimization for this will also matter more for environments with an excessively large number of tables, such as some SAP or PeopleSoft databases.

I’m also not sure how to work with this in an extended row size environment – I avoid extended row size for its performance implications. ##### Ember Crooks

Ember is always curious and thrives on change. She has built internationally recognized expertise in IBM Db2, spent a year working with high-volume MySQL, and is now learning Snowflake. Ember shares both posts about her core skill sets and her journey learning Snowflake.

Ember lives in Denver and work from home

Articles: 544

This site uses Akismet to reduce spam. Learn how your comment data is processed.