This post is specific to WebSphere Commerce. Stagingprop can fail for a dizzying array of reasons. I’m attempting to document some of them, since there’s no real reference that digs into the details.
What This Problem Looks Like
The error you get for this one looks like this(table name and values may vary):
(Wed Aug 08 14:44:20 CDT 2012) Consolidate all the unprocessed STAGLOG as a whole. (Wed Aug 08 14:44:20 CDT 2012) Retrieving unprocessed changes from STAGLOG ... (Wed Aug 08 14:44:25 CDT 2012) Consolidating retrieved changes ... (Wed Aug 08 14:44:27 CDT 2012) <Error> The sequence of the following operations in STAGLOG table is invalid: operation U followed by I. (Wed Aug 08 14:44:27 CDT 2012) 15372481: UPDATE table dmelementnvp - dmelement_id=25805, name=emsId, value=@new_emspot_id_1 <scope merchant> (Wed Aug 08 14:44:27 CDT 2012) 15372569: INSERT into table dmelementnvp - dmelement_id=25805, name=emsId, value=@new_emspot_id_1 <scope merchant>The command failed to complete the propagation job.
There’s no SQL error message here, because the failure happens during the consolidation phase. The error you’re looking for here is “The sequence of the following operations in STAGLOG table is invalid: operation U followed by I.”
Note that the operations listed can be different. The problem occurs when records in STAGLOG are instructing stagingprop to do one of the following:
- An update row for a given row (primary key) precedes an Insert row with no Delete in between
- A delete row for a given row(primary key) precedes an Update row with no Insert in between
Determining Root Cause
I can sure understand why stagingprop would not be able to handle this. During the consolidation phase, stagingprop eliminates duplicates. So if you go in and update the same row 12 times, it won’t try to propagate it 12 times, when all stagingprop cares about is the final state – so only one update is needed. It also can tell if you inserted and then deleted data, and can avoid doing the work altogether since the final state is the same as the starting state.
There are three main reasons this can occur:
Changed Primary Key
WebSphere Commerce does not support the changing of primary keys on stagignprop tables. This is very specifically stated in the Info Center (http://www.ibm.com/support/knowledgecenter/SSZLC2_7.0.0/com.ibm.commerce.admin.doc/refs/rsstagelimit.htm). Any changes you make through CMC and similar tools will handle this restriction for you – if you update part of a primary key, they will do a delete and insert behind the scenes. For this reason, WebSphere Commerce recommends that you not do manual database changes. We do manual database changes all the time, but we also run them through a dba and most often have dev/qa set up to stagingprop to test the prop of major manual changes. I’ve spent many hours working on problems with this kind of error.
Note also that we’re talking about not just the change of a primary key, but any part of a primary key. So if a Primary key has three columns and you change one of them, that will cause stagingprop to fail. Ever wonder why there are so many generated keys in WebSphere Commerce databases? This is one reason in my opinion. But not every primary key is a single generated column.
Records for Same Primary Key in Different Filters
This cause only happens if you’re using filtered stagingprops. If you don’t know what a filtered stagingprop is, then you’re probably not doing them. For filtered stagingprops, you set the stgfilter column in STAGLOG to a value (either through the staging triggers, manually, or some other process), and then use the -filter option when you execute stagingprop. It is possible for rows to have different filters – through a mistake in a manual update or a problem with triggers or code. This is especially possible if you’re updating STAGLOG manually to set a stgfilter.
If this is the issue, WebSphere Commerce is getting smarter on the reporting of the issue. It now actually gives you the filter that the conflict is in like so:
(Wed Aug 08 16:27:49 CDT 2012) <Error> The sequence of the following operations in STAGLOG table is invalid: operation I followed by U. (Wed Aug 08 16:27:49 CDT 2012) 15370728: INSERT into table catalog - catalog_id=11101 <scope merchant> <filter 1080151712> (Wed Aug 08 16:27:49 CDT 2012) 37553840: UPDATE table catalog - catalog_id=11101 <scope merchant>
In the case above, the stagingprop was actually run for filter “882012” – but notice the error on the second to last line notes the other filter that the conflicting row is in.
I’m not sure when this improved reporting was introduced – the example here is from WebSphere Commerce 7, I believe FixPack 4.
Incorrect Manual Updates to STAGLOG
Making manual changes to STAGLOG can be a valuable tool for dealing with stagingprop problems. But you have to really understand what you’re doing. If you mark the wrong row as processed, and not other related rows that correspond to the same primary key, you can get this error.
Resolution
The resolution here is dependent on the root cause.
Changed Primary Key
This is a painful one to get out of. I have literally spent hours resolving issues with this on multiple clients. You have to be very careful here. The best choice for resolution is a stagingcopy or backup/restore to get the two environments back in sync. You can attempt to sync individual tables via export/import, but RI can make that very difficult. You can also undo the changes in qa and mark all the records as processed.
Records for Same Primary Key in Different Filters
This is one of the easier ones to get out of. Usually it just involves changing the filter values manually in STAGLOG to match on the offending records. If you can identify them, then this isn’t usually too bad. As with all stagingprop problems, if things get too convoluted, you may end up doing a stagingcopy or backup/restore to get the two environments back in sync.
Incorrect Manual Updates to STAGLOG
Whenever I want stagingprop to pass over something, I mark it as processed in STAGLOG using a unique value greater than 1 (don’t use 1, -1 or -2, as Commerce may use these values) so I can go back and revisit if I have problems with the manual changes later on. If you’ve done this, it is easy to analyze on a row-by-row basis what action you need to take to resolve the issue. It can take a while to work through if there are many rows, and you may end up doing a stagingcopy or backup/restore to get the two environments back in sync.