DBClean – Guest Users

Posted by

This is part of a series focusing on the details of each item that needs to be pruned in a Commerce database.

Guest Users

What is it?

Guest Users are created by Commerce in the database. Technically they are only needed for when you allow a customer to add items to their cart before logging in or registering, or for a few other actions. In reality, if the developers have not properly used/overridden the ‘isGeneric’ method, then you can end up with tens of thousands of guest users per day. The problem is that it takes a very, very, very long time to delete any user in a Commerce database, because for whatever (really stupid) reason, IBM chose to make just about every table have a foreign key to the member or users table. So if you delete one row from the users or member table, it’s really deleting from or looking for dependencies in 300 or more tables.

Potential for impact

The users and member tables are frequently two of the top 10 Commerce tables in a database in terms of size. The real impact here is that it is possible to create guest users faster than you can actually delete them. That’s right, even if you run DBClean 24/7, sometimes you still cannot delete them as fast as you add them. This is one area where it is critical to talk to your developers and get them to properly reduce the creation of guest users. On three different clients, I’ve had to take the database offline for up to 6 hours and perform a much riskier operation to get rid of the extra guest users. Even once you’ve done that, you may have to spend hours every night just deleting guest users – I had one client where it was 4-6 hours every night just to delete the guest users.

Since the users and members tables are so heavily used, anything you can do to make them smaller is a good thing for database performance.

Where this Applies

This applies to nearly every Commerce database. I believe I once saw a B2B Commerce database without guest users, but the dozens(hundreds?) of others I’ve worked with all had guest users and needed them to be pruned.

Table (other tables may be involved in cascading deletes)

MEMBER (cascades to USERS and hundreds of other tables)

SQL

note: always test for yourself, don’t take my word for it

note also that you should be cautious running the SQL during peak loads, and always use “with ur”

Number of rows to be deleted:

select count(*) from member where member_id in ( select users_id from users T1 where registertype='G' and (days(CURRENT TIMESTAMP) - days(prevlastsession)) >= <days_to_retain> And not Exists (select 1 from orders where orders.member_id=t1.users_id and status != 'Q') and (users_id > 0)) with ur

If your Commerce database does not take orders (rare, but I’ve seen it), then you can vastly improve performance by tweaking the sql in WSCOMUSR.CLEANCONF in a custom dbclean job to eliminate the subquery for orders.

Pruning method

You can use dbclean to prune this one. Object=user and Type=guest

If you wanted to prune it manually, you could, keeping in mind all the gotchas for big deletes – you’d probably want to break it up. You don’t want to run it as one statement, but it would be:

delete from member where member_id in ( select users_id from users T1 where registertype='G' and (days(CURRENT TIMESTAMP) - days(prevlastsession)) >= ? And not Exists (select 1 from orders where orders.member_id=t1.users_id and status != 'Q') and (users_id > 0))

This statement changes from version to version of Commerce, so be sure to look in the CLEANCONF table for the best statement.

Determining Retention

Guest users are obsolete pretty quickly. The actual time depends on how long before the cookies expire. Once a user’s cookie has expired, he cannot return to the guest user, and the guest user cannot be otherwise accessed by Commerce. Some sites actually allow guest users to complete orders – this pruning area will not delete users with orders. Orders in a commerce database include carts, so you may want to prune abandoned shopping carts before pruning guest users

What happens if you get behind?

By this, I mean what if you can’t delete the guest users as fast as they are created or you get to a point where you could run dbclean for days and still not delete all the users you need to. Sometimes you’re even deleting more than 50% of the tables. Guest users is the most likely area for this to happen. If you get to that point, then you have to essentially stop Commerce, export the data from the Users and Member tables that you want to keep and then LOAD that data back into the tables using the replace keyword. LOAD is required because IMPORT won’t let you do replace with the RI. After the LOAD, you have to set integrity (on approximately 300 tables – better have it scripted), preferably using exception tables, and then re-start Commerce. NEVER do this without testing thoroughly – prefferably with a full copy of the database on a development environment. This will give you timing information, but also help you learn important lessons (pay special attention to every row kicked out in the exception tables by the set integrity statements). I’ve done it three times, and had to tweak it each time as to what statements I was using because there were different pieces of data that got kicked out during the set integrity. Remember never to delete users with negative member_ids or users_ids as a starting place.

This method is risky. You’re fairly likely to accidentally delete something you needed – I’ve never had it just right on the first try, and I believe in one case went through 5 iterations before I had it right.

Gotchas

So there are some users that don’t have a value in prevlastsession or lastsession. Generally, you would want to give them a date there, BUT if you give them all the same date, then they’ll all come up with the same deletion date too, and you could end up with a long running dbclean that day. This is rarer with Guest Users than registered users.

I’m not sure this next thing was with Guest or Registered users, so I’ll mention it in both places. Sometimes users were created by a call center, and have never technically logged in, but should still not be deleted (they may have orders, addresses, etc). In this case, you do not want to assign them a prevlastsession or delete them.

So, good information? More detail needed in any areas? Let me know what you think.

Ember is always curious and thrives on change. She has built internationally recognized expertise in IBM Db2, and is now pivoting to focus on learning MySQL. Ember shares both posts about her core skill set and her journey learning MySQL. Ember lives in Denver and work from home

10 comments

  1. Hi Ember,
    What is the isGeneric method you referred to here. We don’t allow Guest users to add items to Shopping cart. They must first register. However I see ten of thousands of Guest users being created per day in our db. Do these rows need to be created. If not, how can we prevent.

    Thanks for your time.
    Regards,

  2. Hi Ember,

    Thanks for proivding this information and the link to the IBM support article, one again its all very useful information that i’ll be using over the coming weeks to cleanup a system thats been live for 2 years without sufficient housekeeping.

    Kind regards,

    Brian

  3. Hi Ember,

    I’m relatively new to this shop and one of the first things I looked into was the purge process. The USERS and MEMBER tables are growing fast and they both have over 400 million rows each. 80% of them are stale guest users. I tried different ways to clean this up, but due to the huge number of foreign key constraints, it’s extremely slow. I thought of creating a new table, load only the data that’s needed and rename the tables. But due to RI, that’s not possible. IBM suggested offline purge, but it’ll need an outage. Is there any other way to purge this data without taking an outage?

    Thanks !!

    1. The offline purge is really the only way to deal with bloated users/members. That usually consists of:

      1. Take the apps offline
      2. Export the data you wish to keep
      3. LOAD/REPLACE the data into the members table
      4. Set integrity on tables placed in check pending
      5. Start the apps again

      To have a realistic estimate of how long this will take requires a full copy of the database somewhere that you can test it. Testing is absolutely critical. Due to the FK relationships, lots of data in other tables will be deleted if you delete the wrong user/member.

      Often the set integrity step is the most time consuming, and you likely want to write a script that will detect tables in check pending state, and set integrity on those tables. Run this script multiple times until no tables remain. In some environments, there may be tables with circular integrity references, which requires that a single set integrity statement includes multiple table names. When you test it, write exceptions to tables so you can get an idea of what all is being deleted.

      There is a way that developers can write things that creates a guest user, and a way that it doesn’t. Often, developers use the wrong one, and create a ridiculous number of guest users as a result. Getting them to stop that can make a huge difference in the rate of guest user creation.

      It is a risky and a difficult process. And yes, the deletes are slow – I used to assume I could count on about 150 rows deleted per MINUTE because of all the cascading. You’ll likely see numbers better than that.

      I wish I had a magic wand for you on this, but I don’t.

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.