As I’ve been learning the details of MySQL, there have been a lot of similarities with Db2, and a number of differences. Some of those differences have been things I was expecting, while others have been a bit more shocking. If others are making the transition from Db2 to MySQL (using InnoDB) or vice versa, or are supporting both, maybe something here will help.
I don’t expect this to be an exhaustive list, but just the things that have most stood out to me in the first 5 months. I also don’t think one platform is inherently better than the other. Every RDBMS has strengths and weaknesses.
Some of the shocking differences here are in MySQL’s favor! Maybe that’s the biggest shocker of all for some of my db2-loving readers.
Everything here is my own opinion and not an absolute statement of fact. I’m likely to have some details wrong.
Optimizer and Statistics
I knew the optimizer in MySQL would not be as advanced as the Db2 optimizer I know and love so well. I guess I didn’t really think through all that meant.
In Db2 you spend a lot of blood, sweat, and tears making sure your statistics are up to date and of the right types. You have techniques for advanced statistics like column group statistics and statistical views, and statistical profiles to manage these needs. When I understood that MySQL(5.7, InnoDB) only really estimates cardinality based on a very small sample size, and that’s it, I about had a heart attack. I’m used to in Db2 starting with full stats (not sampled) using
WITH DISTRIBUTION AND DETAILED INDEXES ALL. Like many learning experiences lately, I stepped back to look at why this seemed shocking to me, and why it is the way it is.
It seems to me like the optimizer is only somewhat cost-based, at least in 5.7, which is what I’m working with. I hear there is a release of 8 that makes some changes in this area. So if the optimizer is more rule-based than cost-based, all the distribution and fancier statistics don’t matter much.
Explain in general gives less information in MySQL, but this might also be due to the optimizer being a bit simpler. The “filtered” column means something somewhat different in MySQL than the filter factor in Db2.
A lot of the decisions I’m used to explaining to people (“Why did it choose a table scan when there’s an index?”) still have the same answers (“Because there’s so little data a table scan is faster”). And the optimizer still does that RDBMS magic of letting end users specify what data they want without having to know all the procedures for finding it on disk. There are still optimizer hints in MySQL and while it’s still a bit of an ugly cheat to use them, it’s not nearly so looked down upon as it is with Db2.
Years ago, I read that the table structure in MS SQL server included a clustering index on the primary key. Over the years, I’ve come to believe that’s not usually where I want my clustering index, but it’s not the most awful choice. MySQL stores the actual table data on the leaf pages of the primary key. I mean, guaranteed clustering, which Db2 doesn’t offer outside of MDC, that’s good, I guess.
But it takes a while to get used to this and its implications. Where Db2 uses a hidden RID (consists of the page relative to the table space for DMS or the table for SMS, plus the slot number on that page) in indexes to point to a specific row, MySQL simply uses the primary key. There are disadvantages to this, particularly if the primary key is large, but on the other hand, there isn’t really such a thing as a reorg, and the situations in which the roughly equivalent table rebuild is needed are much fewer and further between.
Perhaps this is a bit more of a practice thing as opposed to what is actually possible, but in Db2, just about every database I supported could be easily restored to any nanosecond (well, certainly millisecond, Db2’s time resolution strangely relies on the OS and yet can specify a more granular unit) in the last two weeks. To accomplish this, we used physical bit-level backups (or snapshots for the right storage type), and transaction logs. Db2’s transaction logs are not really human readable, though there is an API that can extract some data. They contain both undo and redo information, and are the source for some replication types.
MySQL uses a combination of undo records in a table space, redo logs, and then if enabled, bin logs. The bin logs can be parsed to be human readable, but with a complicated cluster of a reader and multiple writers, log management is difficult and point in time recovery is not really a regularly used thing. Generally it is possible, but in our environment, we have so many replicas in so many regions that it’s really a last resort and not something done every day.
Replication and Related Tools
Replication in MySQL could be considered any of four different products in Db2 – two of which are separately licensed. The products are HADR, SQL replication, Q-replication, and CDC replication.
Db2 uses HADR for having HA or DR standbys that are constantly replaying transactions for the entire database (db2 database, roughly equivalent to a MySQL instance/server). Read-only on the standbys is mostly hated and unused. Rolling patching (fix packs) can be done, but upgrade requires a full outage of the writer and all standbys. Standbys need to be very similar in configuration to the primary.
The three replication methods in Db2 read from the transaction logs or capture data changed to tables. Reads can be done on a replica easily. Changes in the structure of tables must be managed fairly manually. Generally both patching and upgrades can be done in a rolling fashion. Directing traffic to a new primary requires OS or application-level work – there isn’t a simple tool for that. The tools for managing any of them really stink, and have for decades. Managing replication takes significant DBA effort.
SQL replication consists of capture and apply processes that catch every transaction for a table or schema (equivalent to a database in MySQL) and apply it on another database. This is the only free form of replication, at least from a Db2 source to a Db2 target (please verify any cost statements with IBM before relying on them). Q-replication uses MQ series to queue transactions from the logs, and is much faster, but costs money and requires some expertise in MQ. CDC replication allows many more transformations than the other types, and also allows the most non-homogenous RDBMSes, but also has a separate costly license.
In MySQL, we enable replication by enabling the binlog, and use tools like Orchestrator (an open-source GUI) to make understanding configuration and failover easy. Tools like ProxySQL or Vitess(vtgate) can be used to direct traffic to the writer or readers. We can scale up the volume we can handle by adding more readers. MySQL allows many more replicas than Db2’s HADR. All standbys can be used for reading.
In my experience in the Db2 world, you’re much likely to scale up. Scaling out is much more expensive and is usually done with something like pureScale for oltp – which is a shared-disk solution, or DPF for analytics – a shared-nothing architecture. Both of these require a very high level of expertise to architect and run properly and likely run on dedicated hardware and not in the cloud.
I knew from previous work on isolation levels that the names were different for the isolation levels. That takes some in-head translation when discussing them, particularly as they both have a level named “Repeatable Read”, but they mean different things!
I had also read a critique of Db2’s implementation of MVCC a while back, and at the time did not understand it, and it hit me like a ton of bricks when a colleague was describing MySQL’s isolation levels.
Db2 only has MVCC with the Cursor Stability isolation level (the first one up from uncommitted read), with the CURRENTLY_COMMITTED db configuration variable enabled.
In MySQL, with MVCC, you get a consistent view of the database at a repeatable read level – not just the tables/rows you’ve accessed, but the entire database, without locks. In Db2 the “repeatable” read is only on the specific table/rows you have locked. This is an interesting distinction, and like many other shocking things I’ve learned, I have to step back and ask when it really matters.
A Few Less Shocking Differences
The items described above were the ones that really blew my mind at one point or another, but there are a few other specific differences that did stand out that deserve honorable mentions.
Buffer Pool Configuration
There are a lot of choices you can make with buffer pools in Db2. Numbers, sizes, page sizes, extent sizes, number of IO servers, number of page cleaners, alternate page cleaning methodologies, which tablespaces they’re associated with – there is a lot of fancy configuration possible. MySQL with InnoDB seems to mostly allow you to specify the total size of buffer pools, the number of buffer pools (evenly split from the size), and some details on page cleaning and prefetching. How often did I make more than passing use of those configurations in Db2? Maybe once a year?
Table Space Configuration
When I started working with Db2 20 years ago, we didn’t always even have RAID arrays, much less SANs. We did a lot of configuration at the table space level to stripe data across disks. Actually I’ve also used that in a cloud environment to stripe data across slower disks. In Db2 we also use this to speed up backups. Backups can only parallelize down to the table space level, meaning if data is evenly distributed across tablespaces, backups are faster.
In Db2, we pick one collation sequence at the database layer (roughly equivalent to a MySQL instance/server), and it can never, ever, for any reason be changed without dropping and recreating the entire database. In MySQL, not only can different tables have different collation sequences, but different columns in the same table can have different collation sequences.
Db2 has a few index options that MySQL does not. In Db2 we can specify different orders for each column. So I can create an index on (col_A asc, col_B desc). MySQL 5.7 does not support this. In Db2, I can also do a covering index by including columns in a unique index that are not part of the unique constraint. This is not possible in MySQL 5.7 InnoDB. I haven’t yet seen a tool in MySQL equivalent to Db2’s index advisor, but that doesn’t mean that one doesn’t exist. I also haven’t yet seen a way to run explains simulating indexes that do not exist or ignoring indexes that do exist – which I can do quite painfully in Db2.
If you’ve worked with both MySQL and Db2, please share the differences you’ve run into in the comments below!
I think this can be handy for some readers but in SAP DBACockpit utility, you can really easy simulate non-existing indexes or ignore existing ones. Some DBA’s ignore it.
Can you share your friends critique of Db2’s implementation of MVCC?.
“Table Spaces” With modern Linux LVM tooling, tables spaces look like a thing no longer needed, but Oracle is reintroducing them, and allows mapping of tables onto tablespaces made up from a multitude of files. It seems superfluous.
At the operating system level, having to deal with single very large files is sometimes a problem, especially when you need to clone a filesystem at the file level, in parallel. Many tools work better if you have a lot of reasonably sized files (“35 one TB files”) instead of few outsized files (“one 35 TB file”).
Also, all filesystems but XFS lock writes at the on-memory inode of a file. So if you have one 35 TB file in ext4, you can have file writes to this thing only at a concurrency level of one. That is still a problem.
XFS allows parallel writes, if the files are opened with O_DIRECT. It guarantees atomicity of file writes by having locking ranges for active writes at the in-memory inode. MySQL uses that to its advantage in several places, and XFS is in my experience the recommended filesystem if you have throughput problems that come from concurrency issues at the filesystem level.
There are other ways to work around that, and constructing a table space from many reasonably sized files (anything larger than 1 GB, for a total of no more than a three digit number of files) is one way to do that — you get a lock per inode, so let’s have more inodes. Partitions defined at the SQL level and MySQL innodb_file_per_table would also work, for the same reasons.
“Isolation” In InnoDB, the writer moves each row changed from the table space to the undo log. This is necessary for in-place updates, and to have the ability to have ROLLBACK at the same time. There is no way around doing this, unless you want to end up like Postgres with VACUUM.
Isolation happens therefore in the reader, and can be changed per reader, and at will, by using SET TRANSACTION ISOLATION LEVEL — for each reader.
What that changes is just how much each reader dives into the undo log. At READ UNCOMMITTED, it does not. It will go to the ibd tablespace file and give you what is there. At READ COMMITTED, for rows that are locked it will instead go one level deep into the undo log and give you the first value it finds.
With REPEATABLE READ, the default, it will give you the newest version from that rows undo log history that is older than the transaction the READER is currently in. So when, as a READER you START TRANSACTION READ ONLY, you stop the world at your current txn#. You then only get rows that are younger (or equal) that that txn#, from any table. They come from the undo log.
That also means that running START TRANSACTION READ ONLY stops the purge thread, which removes old versions of rows from the undo log. If you do that for half an hour, bad things happen to system performance.