So believe it or not, when I was a newbie dba doing the very basics on internal databases at my first job, I didn’t even know what some of the databases I supported did. I learned over time, but sometimes it was months before I knew or understood. That wouldn’t happen with my current level of experience, but that’s how it was. Now when I go into a new situation with existing databases, I investigate and ask questions so I know what the various databases are.
There are things you do differently depending on whether a database you support is OLTP or DSS/DW or somewhere on the spectrum between the two. There are also things you do differently for vended databases vs. custom databases. WebSphere Commerce databases are vended databases, and their primary load is OLTP/e-commerce. Sometimes they have a secondary role for reporting and DSS, but that’s not generally ideal.
So I thought I’d go over a few basics of an e-commerce DB2 database – what it is and what that means.
Highly available
If customer-facing e-commerce database is down, you’re loosing money. Plain and Simple. Even if it’s 2 am, there could be one or two orders depending on the site. If it’s a global site, even more so – there is no slow time for some sites.
So if they’re smart, a business will actually specify and engineer for 4 or 5 9’s of availability from the very beginning of planning. From what I’ve seen, whether it’s planned for or not, availability is expected to be nearly 100%. This means to avoid having the database as a single point of failure, you’ll need some kind of high availability solution. With DB2, this is usually HADR. I’d love to get my hands on PureScale, but haven’t had the opportunity yet. EEE/DPF is NOT a solution in this area. I imagine Data propagator could be used in this manner, but it’s more likely to be used to move data into another system for reporting purposes.
This also means you’ll need to be using archive logging and taking online backups, of course. Disaster recovery should not depend solely on HADR.
B2B e-commerce systems may have more flexibility – depending on the business and the geography covered. I’ve seen some where outages were not a problem as long as they were outside of Business hours in a particular time zone or small subset of time zones.
Singleton-row queries
The primary load for many e-commerce database are queries that return only one row or a small subset of rows that are frequently identified by a primary key or column with a high cardinality index. This is even more true for the tables where the performance is most critical – those involved in the checkout process.
There are, of course, queries that will return more rows – especially if a product catalog is stored in the database.
What this means is that when analyzing SQL, you should consider the role of the table when determining indexing. Just a few milliseconds may be worth indexing or tweaking SQL for, especially when the tiny query will be run millions of times an hour.
You’ll also want to carefully consider the timing and isolation levels of reporting-type queries against these tables – and whether to offload them entirely to the standby or to an ODS or DSS database.
Insert performance is critical
The performance of inserts into the tables involved in registration or checkout and other similar processes is also critical. This leads me to be very stingy with indexes on the tables that are involved in processes like checkout or registration. It has to help a very frequently executed query – less frequently executed queries are not worth indexing for because each additional index adds overhead on insert. For a query that runs once a week or once a month, you don’t want an index that allows index-only access for it while adding overhead to the 10,000 inserts you do every hour.
Concurrency is important
Hopefully you have an application that is well coded with concurrency and data integrity in mind. Concurrency is absolutely critical – the tiniest design changes or even indexing additions can make the difference between a site that is slower than molasses in January and one that’s clipping along at a good pace.
Ultimately you want to allow as many customers as possible to purchase the same product at the same time. Handling of sales or promotions is just as critical – if each person applying that 50% off coupon that was blast-emailed to 1.1 million faithful customers locks a row related to the promotion, you could have that alone locking up your database and your site.
This also means that current runstats are important(as if they weren’t for a dozen other reasons). If your runstats are not current, the optimizer may choose an access path that scans (and in some cases, LOCKS) whole tables.
Critical very high volume periods
E-commerce databases frequently have very high volume periods that are critical to the business. Black Friday or Cyber Monday are the most obvious, but a particular site may have certain holidays (Valentine’s day?) or certain merchant-specific sales that make or break their whole year. I have at least one client who does more business on Black Friday than every other day of the year added together, so if their database goes down on that day, it’s really bad news for the whole company.
It is critical for the DBA to be aware of when these periods are, and prepare for them (for Commerce, see Holiday Preparation). Preparation would include reviewing performance during regular load and looking very closely for problems in any of your standard areas. I review for problem SQL in addition to verifying all my automated maintenance is behaving as expected and doing a basic performance review.
Because of this, you also don’t want to be running your database server at 80% cpu utilization on a regular basis. If I check out CPU during my daily normal peaks, I’d like to see it at 35% or below. If I check it during particularly slow times, I’d like to see it below 10%. Similar reasoning goes for things like Buffer Pool hit ratios – the need to be awesome on a regular basis so they can go to just OK during a peak period. The hard numbers would obviously vary depending on your own implementation.
This also means that the most realistic load testing you can possibly afford is critical – Load Tests help find those critical problems that only show up under load. The DBA and the application experts should be involved in these, looking for bottlenecks at all levels. If possible, even test failovers under load so you know what to expect.