The Big Mistake IBM is Making with Db2 in Containers

Note: the statements in this entry are my opinions. I quote IBM once in this article, and that statement should be seen as the only pure fact here. I invite IBM and others to add their perspective in the comments below.

I see IBM making a mistake with how they’re handling Db2 in containers. I’ve pushed behind the scenes to try to get them to correct this, but my pushing is not having an effect at this point.

The Beauty of db2u

I first learned about db2u in the fall of 2019. I don’t know if it was publicly available yet at that point, but I was at an IBM Gold Consultant summit at the IBM Toronto lab. One part of that meeting was a “science fair” of different things developers were working on for Db2. I learned some things I still can’t share, some things I couldn’t share at the time but can share now, and some things about products that already existed and were public at the time. There were maybe 10 or 15 different tables in a room, all talking about different things. It really was one of my favorite parts of the summit. I remember about three of them really well now, 1.5 years later, and the clearest in my mind is db2u.

I fully believe db2u could be a huge strategic advantage for IBM. In a world where the traditional enterprise RDBMS vendors are struggling to remain relevant and part of the discussion for many new projects, db2u is a stand-out. db2u, conceptually, is a nearly cloud-native way of doing a relational database management system. While “cloud-native” is a buzzword, it’s one that many organizations are demanding these days. Let’s break it down a little.

Cloud-Native

Where I work, we score each database platform on several aspects to determine how “cloud-native” it is. We pull this process from the same process used to evaluate potential applications. We also throw in a few of our own evaluations at the end related to the platform we use. Let me go over these briefly.

Multiple Components

What is presented to the end-user as a single application is actually delivered as a set of co-operating services. For databases, the “end-user” is often actually the application. Part of the idea here is also that the different components can be independently scaled based on requirements and load. Traditional Db2 doesn’t do well on this item, as there’s a single large install and any components are highly interdependent instead of somewhat independent.

Loosely Coupled Microservices

Microservices composing the solution should be independently deployable and replaceable. Services should communicate with other services at runtime, dynamically. Again, traditional Db2 doesn’t do well on this item.

Elastic

Scale up or down independently in an automated fashion. Db2 allows online scale-up of storage, cores, and to a limited extent memory, but the full meaning here is often adding or removing nodes. Perhaps PureScale would meet this requirement (though if I recall correctly, scale-down may still require an outage), but I can’t actually run PureScale on most clouds.

Responsive to Business Changes

Can be updated and deployed frequently and independently with zero downtime. While traditional Db2 with HADR can allow patching with minimal downtime (just the time for a failover), I still cannot update a version level without a significant outage. This is even true for PureScale as I understand it. Also, Db2 patches themselves are often still released quite far apart.

Resilient

Run reliably, securely, and predictably in spite of transient issues in the cloud including network, capacity, and varying loads. Traditional Db2 does get points here. I have been repeatedly shocked over the years by the type of failures Db2 using HADR can survive and remain up. However, until November 2020, there was no supported solution for automating failover in the cloud. TSAMP largely did not work for this. I haven’t had the opportunity to upgrade the OS and Db2 since November to try out the new Pacemaker/Corosync solution for this.

Composable

Uniform and discoverable API – designed to be a part of other applications. I would argue that SQL gives any RDBMS some points here. No one runs a database for long without applications that are often designed to go with it. Nearly every language has interfaces for working with Db2, and while we can argue about the quality of particular drivers, Db2 has working drivers for the most popular programming languages.

Infrastructure Agnostic

Free to move as required – not connected to infrastructure constraints. This one drives me nuts because I would argue that very little in infrastructure or applications actually meets it. Nearly everything has a limited subset of platforms on which it works; the question is just how limited that list is.

Built on Open Standards

Extensively leverages open source components and community support. There is an interesting dichotomy here between the old-school requirement for vendor support and the ability to just fix things yourself that open source offers. I’ve seen struggles with other vendor-supplied software where technicians I work with knew exactly how to fix a specific problem they were encountering with enterprise software. Yet even after they provided the vendor with details and code snippets for the parts of the code they could see, it took months or years for the fixes to work their way into the software.

Available as-a-service

Db2 kind of checks this box because it’s available on the IBM cloud. But if you’re not willing to use the IBM cloud, you’re out of luck.

Available as-a-service on AWS

At one time, IBM announced they were offering Db2 as a service in AWS. However, there was no “click to buy”. When I reached out and tried to get a POC for an OLTP application, they were wholly unable to meet the very reasonable SLAs we were looking for.

Can run on AWS EC2

This is our last-ditch option at this point. If we can’t consume it as a service or run it on EC2, we won’t be using it. Db2 can at least run on AWS EC2, though until November 2020 there was no officially supported solution that worked for automating failover.
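To make the idea of a cloud-native score concrete, here is a minimal sketch of how a rubric like the one above could be tallied. The criteria names come from the headings above; everything else is an assumption for illustration: the 0 to 2 rating scale, the function name `cloud_native_score`, and the placeholder ratings. None of it is the actual scoring process or the actual scores used where I work.

```python
from typing import Dict

# Criteria names taken from the headings above.
CRITERIA = [
    "Multiple Components",
    "Loosely Coupled Microservices",
    "Elastic",
    "Responsive to Business Changes",
    "Resilient",
    "Composable",
    "Infrastructure Agnostic",
    "Built on Open Standards",
    "Available as-a-service",
    "Available as-a-service on AWS",
    "Can run on AWS EC2",
]

# Assumed scale (hypothetical): 0 = no, 1 = partially, 2 = yes.
MAX_RATING = 2


def cloud_native_score(ratings: Dict[str, int]) -> float:
    """Return the fraction of available points earned, from 0.0 to 1.0."""
    missing = [name for name in CRITERIA if name not in ratings]
    if missing:
        raise ValueError(f"missing ratings for: {missing}")
    for name in CRITERIA:
        if not 0 <= ratings[name] <= MAX_RATING:
            raise ValueError(f"rating for {name!r} must be 0..{MAX_RATING}")
    earned = sum(ratings[name] for name in CRITERIA)
    return earned / (MAX_RATING * len(CRITERIA))


# Placeholder ratings for illustration only, not real evaluation results.
example = {name: 0 for name in CRITERIA}
example["Resilient"] = 2
example["Composable"] = 2
example["Can run on AWS EC2"] = 2

print(f"{cloud_native_score(example):.2f}")
```

A flat, unweighted sum is the simplest choice. A real rubric might weight some criteria more heavily than others, for example treating "Can run on AWS EC2" as a hard requirement rather than just another row.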

As I understand it, db2u is the cloud-native answer to the gaps above. My understanding is that it is composed of multiple containers for multiple purposes, which can be independently managed, allowing it to be more cloud-native than other enterprise RDBMSes out there. The release cycle is also shorter than that of traditional Db2, allowing for frequent fixes and updates.

In addition to this, there’s a nifty new tool IBM is offering called Click-to-Containerize. While I haven’t had a chance to use it yet, it really does sound like an excellent way to move data into a container.

The Disconnect

db2u sounds amazing to me. All the stability and proven performance of Db2, but embracing new methodologies and taking advantage of all they have to offer. The problem I run into is that db2u is only available on Red Hat OpenShift. I don’t have enough experience with OpenShift to say whether it is good or bad. I can say that I know some brilliant engineers who evaluated it and chose Rancher, instead, for our use case. There are some tremendous advantages to having your database containers right next to your application containers. Having a single namespace that everything is in reduces communication overhead and makes spinning up new environments that consist of both application containers and database containers very easy. Even large organizations are likely to select one Kubernetes distribution and stick with it. Developing significant skills in more than one Kubernetes distribution is not very likely.

What this means for me is that I don’t get to use db2u, while at the same time needing to containerize all that I can. Consequently, I’m pressured not to use Db2 and to choose database management platforms that score higher on our cloud-native index whenever possible.

I see requiring OpenShift as a big mistake that IBM is making for Db2, and one that they’ve made before. When the cloud was a new thing, IBM refused to offer Db2 on non-IBM clouds, outside of a bring-your-own-license model. IBM has chosen not to offer Db2 on AWS RDS despite repeated pressure from influencers and customers, and this has severely crippled developers’ use of Db2 and exposure to it. IBM has chosen not to offer a reasonable managed Db2 option on other clouds, which is a direction companies like MongoDB have chosen to go.

They’re repeating this mistake with OpenShift. IBM supports Db2 on fewer operating systems than when I started as a DBA. In some respects the operating system has become a slightly less important layer when compared with the virtualization or containerization platform. This leads me to dismiss the excuse that IBM wants to support fewer platforms rather than more in order to reduce the complexity and duration of testing, and therefore build more, faster and better. Yes, less complexity is nice, but it also reduces potential market share.

Running Production Db2 in Containers

To compound this mistake, the only place where Db2 in containers is fully supported is on OpenShift. Let me say that again. If you run Db2 in containers and it’s not on OpenShift, IBM will generally not support your Db2 implementation. This is considered an unsupported platform. I suspected this was the case, so I reached out to some IBM contacts and the statement on this that I got was:

You are correct that IBM position is that Db2 in containers is fully supported and certified on RedHat OpenShift.

IBM Db2 Support will accept Support cases for Db2 running in custom containers or open source Kubernetes distributions only if error can be reproduced on Db2 running on Redhat OpenShift, VM or baremetal server running on supported OS and Virtualization infrastructure

So not only do I not get the cool stuff associated with db2u, but without OpenShift, there is no supported way to run Db2 in containers in production. I can’t even build my own container and run Db2 in it for production under the licensing terms. Technically this means that Docker and/or Kubernetes alone are unsupported platforms. I don’t see how IBM can only support Db2 on one type of Kubernetes implementation and claim that means they support Kubernetes.

Summary

Don’t get me wrong, I get the dilemma. I’ve discussed with Db2 developers the absolutely massive set of tests that IBM has to run for the vast array of OS and virtualization platforms that they do support for Db2. I get that they don’t want to pay to train their support staff to troubleshoot Db2 on “unsupported platforms” like a Docker container you build yourself. I have seen their lack of knowledge in a containerized world. The question is, if support isn’t there for where people want to run Db2, why not just go with another RDBMS? That’s the direction I see small and mid-sized companies going. Away from enterprise RDBMSes and towards the RDBMSes that will support where they want to be.

If this affects you and/or you agree, please go vote for the AHA Idea to show IBM it’s not just a couple of us.

Edited after publication to add link to the RFE/AHA Idea.

Lead Database Administrator
Ember is always curious and thrives on change. Working in IT provides a lot of that change, but after 18 years developing a top-level expertise on Db2 for mid-range servers and more than 7 years blogging about it, Ember is hungry for new challenges and looks to expand her skill set to the Data Engineering role for Data Science. With in-depth SQL and RDBMS knowledge, Ember shares both posts about her core skill set and her journey into Data Science. Ember lives in Denver and works from home.

10 comments

  1. 100% agreed many of the enterprises facing same issue to adapt db2 as cloud native database because of it’s only supported in Redhat Openshift.

  2. I also agree IBM Execs most of the time took and still take bad decisions when it comes to make Db2 stand out versus competition. So much good effort/money in developing Db2 features but no good job done by IBM marketing/Execs to profit from it.
    As a side note: Db2 is production-supported on Azure for SAP apps with Pacemaker. I support those Db2’s.

  3. Ember, I think you’ve hit it right on the head. As a company we’re migrating away from DB2 because there’s no cost effective way to run DB2 in AWS. As a result we’re moving to RDS Aurora Postgres.

  4. Rightly said, it was not the case before IBM acquired RedHat. I remember in 2017 I tried then IBM product called IBM Cloud Private for data (Now IBM Cloud Pak for data) was using vanilla Kubernetes, its only after acquisition of RedHat, they started migrating everything to OpenShift.

  5. A whole lot of change has happened from the time IBM Cloud Private disappeared and OpenShift showed up with Cloud Paks as a primary vehicle. However, one can still set up DB2’s community edition container via Docker Hub as a Statefulset on any flavor of K8s. If there’s a difference between the standard container and the db2u container, then I’m sure db2u Docker Hub image could also be set up the same way. This obviously does not answer the official support question, so you are correct. OpenShift is not bad of a K8s for production. What IBM has done with Cloud Pak for Data (CP4D) with DB2 Warehouse for production, is very interesting. Looking forward to your blog on CP4D.

    1. db2u and the community edition container are fundamentally fairly different. I absolutely run db2 in my own containers for non-production, but I don’t get the db2u cloud-native magic. I wish I had the CP4D experience to be able to write about it.

      1. Hi Ember, CP4D is also a black-box for me (and even some IBMers who are supposed to work with it) so what if you contact ‘A’ above and collaborate to create an article about CP4D since he seems to have better understanding than everyone else. Best regards

    1. Currently it is still an issue. At the IDUG North American Technical conference, IBMers stated they intended to make some changes in this area. I’m hoping there will be more official announcements on that at the IDUG EMEA Technical conference in December.
