Edited on 7/18/2012 to reflect lesson learned from a commenter.
First, a disclaimer. This process will only work for tables in AST (automatic storage) tablespaces. It also only works for tables that are not MDC and do not have a clustering index, which covers most WebSphere Commerce 7 tables. This is intended as a simple instruction set - assuming you want to delete a bunch of data from a table and then release some space back to the filesystem.
Delete Your Data
The first step is obviously to delete the data. Depending on the amount of data and the tables involved, this can be slow and consume a large amount of transaction log space. In databases with a lot of RI (like WebSphere Commerce databases), some deletes take a very long time - deletes from MEMBER or USERS in a WebSphere Commerce database, for example. If it's an option to delete everything, consider using the TRUNCATE TABLE command.
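If truncating is an option, the statement is short. A sketch, using the STAGLOG table from the examples below - note that TRUNCATE TABLE requires DB2 9.7 or later and must be the first statement in its unit of work:

```sql
-- Deletes all rows without logging each one; space is reclaimed immediately.
-- Must be the first statement in the transaction (DB2 9.7+).
truncate table wscomusr.staglog immediate
```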
If I simply have to delete a large amount of data, I use a trick with MOD to split up the deletes. It probably slows them down overall, but if you have a unique or evenly distributed numeric column, it works well. First, I figure out how many rows I want to delete, then divide that number by the size of each delete - I usually go with 10,000, or even less if there's a lot of RI on the table. Say this comes up with 4 - then I execute delete commands that look like this:
delete from wscomusr.staglog where stgprocessed!=0 and stgfilter is null and stgstmp < current timestamp - 14 days and mod(stgrfnbr, 4) = 0;
delete from wscomusr.staglog where stgprocessed!=0 and stgfilter is null and stgstmp < current timestamp - 14 days and mod(stgrfnbr, 4) = 1;
delete from wscomusr.staglog where stgprocessed!=0 and stgfilter is null and stgstmp < current timestamp - 14 days and mod(stgrfnbr, 4) = 2;
delete from wscomusr.staglog where stgprocessed!=0 and stgfilter is null and stgstmp < current timestamp - 14 days and mod(stgrfnbr, 4) = 3;
Of course, this assumes you leave autocommit on. My data pruning script automatically generates the right number of statements. But you can also fairly easily use Excel and the concat function to generate the statements if you have a large number to generate. I'm currently running a loop with 14,670 groups on one of my servers.
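Generating the statements can itself be scripted. A minimal sketch of such a generator - the predicates and column names are the STAGLOG example from above, and NGROUPS is the divisor you computed (rows to delete divided by batch size):

```shell
# Hypothetical generator for the MOD-batched deletes described above.
NGROUPS=4
STMTS=""
i=0
while [ "$i" -lt "$NGROUPS" ]; do
  # Each statement deletes only the rows whose key falls in bucket $i.
  STMTS="${STMTS}delete from wscomusr.staglog where stgprocessed != 0 and stgfilter is null and stgstmp < current timestamp - 14 days and mod(stgrfnbr, $NGROUPS) = $i;
"
  i=$((i + 1))
done
printf '%s' "$STMTS"
```

Pipe the output to a file and run it with `db2 -tvf`, leaving autocommit on so each bucket commits separately.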
You can delete from a select like this:
delete from (select * from wscomusr.staglog where stgprocessed=1 and stgstmp < current timestamp - 14 days fetch first 1000 rows only);
delete from (select * from wscomusr.staglog where stgprocessed=1 and stgstmp < current timestamp - 14 days fetch first 1000 rows only);
My data pruning script automatically generates the right number of these, but as the helpful commenter notes below (thanks, Ian!), you can also simply loop until you get SQLSTATE 02000. This method still breaks up the work to keep your active transaction logs from filling, and it is far faster: a delete of about 7 million rows from the STAGLOG table that had been taking days completed in less than an hour - at least an order of magnitude faster than the old method I offered. I'm altering my data pruning scripts to use it.
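The loop itself is simple. A sketch of the control flow - the db2() function below is a stand-in stub (it pretends 3,500 rows qualify and that each call deletes 1,000) so the logic can run end to end; in real use you would call the actual db2 CLI, which prints SQL0100W (SQLSTATE 02000) when the delete-from-select finds no more rows:

```shell
# Stub standing in for the db2 CLI: simulates batched deletes of 1000 rows.
ROWS_LEFT=3500
db2() {
  if [ "$ROWS_LEFT" -le 0 ]; then
    echo "SQL0100W  No row was found for FETCH, UPDATE or DELETE."
  else
    ROWS_LEFT=$((ROWS_LEFT - 1000))
    [ "$ROWS_LEFT" -lt 0 ] && ROWS_LEFT=0
    echo "DB20000I  The SQL command completed successfully."
  fi
}

# Repeat the batched delete until DB2 reports no rows found (SQL0100W).
# Output goes to a temp file (not a pipe) so the stub's state persists.
TMPOUT="${TMPDIR:-/tmp}/db2out.$$"
BATCHES=0
while :; do
  db2 "delete from (select * from wscomusr.staglog where stgprocessed = 1 and stgstmp < current timestamp - 14 days fetch first 1000 rows only)" > "$TMPOUT"
  grep -q SQL0100W "$TMPOUT" && break
  BATCHES=$((BATCHES + 1))
done
rm -f "$TMPOUT"
echo "finished after $BATCHES batches"
```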
Reorg the Table
After your delete is complete, and assuming you're using AST tablespaces (DMS is similar), you will not see any change in the amount of space taken up in your filesystem. To see the difference, you must first reorg the table, either online or offline. The advantage of offline is that it takes less time; the disadvantage is that you need free space of up to 3 times the size of your table, which may or may not be reasonable in your situation. The advantage of online reorgs is that you don't need that free space. However, online reorgs can use a vast amount of transaction log space - with the table I'm currently working on, I have to compress my archived transaction logs every few minutes while an online reorg is running to avoid filling up my logging filesystems. The syntax for an online reorg looks like this:
> db2 "reorg table STAGLOG index SQL120615064145450 inplace allow write access"
DB20000I  The REORG command completed successfully.
DB21024I  This command is asynchronous and may not be effective immediately.
The online reorg returns to the command line immediately. You can monitor its progress by using db2pd:
db2pd -d wc037d01 -reorg | grep STAGLOG
0x00002B4B2D2B3EA8 3        41       n/a    n/a    n/a    STAGLOG  Online  1  3
0x00002B4B2D2B3EA8 STAGLOG  06/28/2012 00:07:14  06/28/2012 00:35:19  n/a  n/a  n/a  531704  531704  Done  0
In the second row of output, you're looking at the two numbers towards the end - in this case, 531704 and 531704. They represent the number of pages processed so far and the total number of pages to be processed. The output above is from after the reorg completed, so it says "Done". While the reorg is in progress, you'll see "Started", and the first number will be lower than the second. In the case of a delete, the reorg will actually complete before the first number reaches the second - in this particular reorg, the second number started above 600,000.
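If you want a rough percent-complete figure while watching a long reorg, you can compute it from those two page counters. A small sketch, using the example values above (531,704 pages processed out of the roughly 600,000 the reorg started with):

```shell
# Turn the two db2pd -reorg page counters into a percentage.
PAGES_DONE=531704
PAGES_TOTAL=600000
PCT=$(awk -v d="$PAGES_DONE" -v t="$PAGES_TOTAL" 'BEGIN { printf "%.1f", 100 * d / t }')
echo "reorg is ${PCT}% complete"
```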
Now, remember that an online reorg is not 100% online unless you also specify the "notruncate" keyword on the reorg. However, in this case, the whole point of the reorg is to truncate the unused space, so we must be able to tolerate the brief lock drain and exclusive lock at the end of the online reorg.
Also, remember that you should do a runstats both before AND after the reorg. My favorite runstats syntax is:
db2 runstats on table STAGLOG with distribution and detailed indexes all
Reduce the Tablespace
After the post-reorg runstats completes, you'll want to execute the command to reduce the tablespace size. Assuming you're using AST tablespaces, you'll want to do:
> db2 "alter tablespace tab8k reduce max"
DB20000I  The SQL command completed successfully.
Obviously, you have to know which tablespace a table is in for this to work. The TBSPACE column of SYSCAT.TABLES can help you find this, if you don't already know.
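A query along these lines will look it up - the schema and table names here are the STAGLOG example from above; substitute your own:

```sql
-- Find which tablespace holds a table's data
select tabname, tbspace
  from syscat.tables
 where tabschema = 'WSCOMUSR'
   and tabname = 'STAGLOG'
```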
Lather, Rinse, Repeat
In my case, I have a bunch of deletes to run against a single multi-million row table. Unfortunately, I can only run deletes between 11 am and about midnight, which limits me to about a million records a day. So every morning I kick off the deletes, and every evening I stop them and run a reorg and runstats. My other option would be to arrange an outage for the table (which is possible for this one), export only the data I want to keep, and then import/replace it back in. This particular situation is right on the cusp of where that's worth it, but since I can complete the deletes in a week, I'm going with the daily restarts of the process.
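For reference, the export/reload alternative is only two db2 CLI commands. A sketch, assuming an outage window - the file path and the keep-predicate are illustrative, and IMPORT ... REPLACE empties the table before reloading:

```
db2 "export to /tmp/staglog_keep.ixf of ixf select * from wscomusr.staglog where stgstmp >= current timestamp - 14 days"
db2 "import from /tmp/staglog_keep.ixf of ixf replace into wscomusr.staglog"
```

For very large tables, LOAD ... REPLACE is usually faster than IMPORT, at the cost of more recovery considerations.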
I wrote this little stored procedure for deleting with a commit every X rows. This avoids active log full conditions and lock escalations when millions of rows are being deleted.
CREATE PROCEDURE DELETE_TABLE_COMMIT_COUNT (
    IN TABSCHEMA VARCHAR(128)
    ,IN TABNAME VARCHAR(128)
    ,IN DELETE_PREDICATES VARCHAR(256)
    ,IN COMMIT_COUNT BIGINT )
    DYNAMIC RESULT SETS 1
    LANGUAGE SQL
BEGIN
    DECLARE AT_END INTEGER DEFAULT 0;
    DECLARE del_stmt VARCHAR(512);
    DECLARE CONTINUE HANDLER FOR NOT FOUND
        SET AT_END = 1;

    SET del_stmt = 'DELETE FROM ( SELECT 1 FROM "'||TABSCHEMA||'"."'||TABNAME||'" WHERE '||
        DELETE_PREDICATES||' FETCH FIRST '||CHAR(COMMIT_COUNT)||' ROWS ONLY)';
    PREPARE prepared_delete FROM del_stmt;

    delete_loop:
    LOOP
        EXECUTE prepared_delete;
        IF AT_END = 1 THEN
            LEAVE delete_loop;
        END IF;
        COMMIT;
    END LOOP delete_loop;
END
We are using a similar query to delete data from a DB2 table - every day we delete about half a million records. What would be the ideal commit count in such cases?
While we run this delete, there are 7-8 sessions running in parallel.
I think that you simply have to experiment to find the point that is ideal for you in terms of speed. I don’t think there is a solid answer to your question.
Nice post. Regarding the deletes, I advise you to look up Serge Rielau's SQL On Fire presentation - there's some good info there.
Regarding the runstats, why do you perform one _before_ the reorg?
I guess I'm in the habit of the before runstats. For normal regular maintenance the sequence would be runstats, reorgchk, reorg of only the tables needing it, runstats on the reorged tables, and then rbind - but if you take the reorgchk out of the mix (as in this scenario), then you might not need the before runstats either.
To save you some time from Frederik’s post, Serge suggested doing:
delete from (select * from YOUR_TABLE fetch first 10000 rows only)
Add predicates to the select statement and change the fetch first clause to taste. It’s pretty easy to wrap this into a loop that repeats until you get SQLSTATE 02000.
If anyone wants to know why I blog, this is one reason. It makes me question my own long-standing practices and find new and better ways. So much faster this way – thanks, Ian!
I have refactored DBClean. I use session.InStorageTables to store the primary keys of the selected 'expired' data, then spin through the session table of PKs with a prepared statement and commit each LUW. Nice and fast, with central logging and limited concurrency issues.
I have a similar situation where I have to do large deletes and want to avoid the downtime of an offline reorg. The tables are huge, so I'd like to avoid an online reorg as well, since it would take a long time.
Can I skip the reorg after doing the deletes and just do the runstats, which is online? Will it cause performance issues, or will access plans change if I do not run the reorg?
The space will not be freed up outside of the table until you do either an online reorg without notruncate or an offline reorg.