Db2: On-Prem vs. Amazon RDS

This blog entry compliments a webinar on February 8th where I will be a panelist answering questions on Amazons RDS for Db2. Register for “Introducing Amazon RDS for Db2”now.


I was lucky enough to be invited to speak at IDUG Australia a few years ago. It was a wonderful experience and an item I could check off my bucket list. I remember talking to my wife about the trip when I retuned home and I mentioned the difference in everyday expressions were just enough to consistently trip me up through the first few days. A “long black” was coffee, “car parks” were parking garages, and don’t even get me started on the biscuit, cookie, and scone fiasco.

I had the same impression when working with Amazon’s RDS for Db2. It was Db2, no doubt, but sometimes I had to interface a little differently than I was used to. In some cases, I had to think about a common concept in a different perspective. Just like when I engaged in casual conversation in Australia, I sometimes tweaked my approach with RDS for Db2.

As I worked through an in depth eight-week beta test last year, I started to note down some of the oddities that I wasn’t expecting. I thought I would share that with my fellow database engineers out there.

So, without a further ado, let me introduce a few of those differences in no particular order.

RDS for Db2: Physical Design

You will discover that the overall physical design of a RDS for Db2 instance (virtual server) is very similar to a on-premises database server that utilizes a storage area network (SAN). Nothing should be surprising here. However, I want to caution you when choosing the storage type, IOPS, and storage throughput. Database administrators are painfully aware of the disk IOPS and its impact on the speed which data returns for a query. However, a database administrator may not be as aware of the throughput connection between disk and virtual server. It is entirely possible to have fantastic disk IOPS but slow down query performance as a bottleneck forms when data comes back from disk to virtual server. To remedy this, you may want to consider GP3 Elastic Block Storage (EBS) as your storage type. You have two types of storage types available, IO1 and GP3. Both are general purpose, but GP3 allows you provision performance independent of storage capacity (IO1).

RDS for Db2: Logical Design

The creation of an additional RDSADMIN database will be the biggest surprise for those not familiar with any Amazon RDS product. RDSADMIN is an administrative database used for the management and maintenance of your newly created database. This database houses Amazon specific stored procedures and functions that are normally executed by a superuser. Tasks like buffer pool, table space, and table creation as well as changes to database and database manager configuration happen against the RDSADMIN database.  Remembering when to use the RDSADMIN database is a little troublesome at first, but you quickly get the hang of when to utilize it.

Connectivity

One of the biggest adjustments for me was losing the ability to Secure Shell (SSH) to my target environment. Normally, anytime I query data I connect to a target database from my local via a tool like DBeaver or Toad. If I needed to run an administrative command of some sort (i.e. “force applications”, “restore database”, etc.) or use a tool (i.e. db2top, db2pd, etc.) I would login to the server itself. You lose that SSH ability with an RDS database.

The good news is that most of those commands are still possible with an explicit connection to the database or an attachment to the instance. If those are not an option, there is likely an Amazon stored procedure or function that you can use.

Resources:

Connection via 3rd Party Tools (i.e. DBeaver)

When an RDS for Db2 instance is created, the Db2 authentication method is set to SERVER_ENCRYPT. As a result, you will need to set two driver properties within your 3rd party tooling. In DBeaver, for example, you would add the following settings in the Driver Properties tab.

  • encryptionAlgorithm 2
  • securityMechanism 7

Resource:

Backup & Recovery

The approach to backup and recovery is going to be different than what you are used to. The traditional Db2 offline, online, and differential backups are no longer supported features. These are replaced by Amazon’s database snapshot held in an S3 bucket (object storage).

The good news is that recovery from a snapshot is very easy. The process is as simple as finding a snapshot backup of your database from the management console and selecting “restore”.

The only catch with restoring from snapshot is the ability to restore in place. The restore action spins up a whole new RDS instance from your backup. The process does not blow away data in the existing instance and restore data that was originally there. In some ways this is a good thing as I can continue to troubleshoot against the “broken” database while a new one is created. However, it can also be a hassle when you must point an application (or DNS entry) to a new target endpoint.

Resource:

High Availability and Disaster Recovery

Responsibility for high availability and disaster recovery shift from features like HADR and Corosync Pacemaker to Amazon’s Multiple Availability Zone (Multi-AZ).

Instead of opening ports for HADR and configuring database configuration parameters, a single radio button for Multi-AZ is selected upon database creation.  When this is selected, a second RDS for Db2 instance is built in a separate availability zone within the same region. Installation and configuration of cluster management like Corosync Pacemaker is no longer needed as Amazon will detect and manage failover of your database for you. To be honest, this is one of the biggest benefits to me. The logistics and management of backup, recovery, and high availability are much simpler.

High availability has become much easier, but the disaster recovery aspect is a little more muddied. High availability dictates the second instance to be geographically close, which is not a problem. Disaster Recovery, on the other hand, dictates the second instance be geographically distant. For example, a database in Austin, TX fails over to a disaster recovery site in Cleveland, OH thousands of miles away.

At the time this article is written, the ability to have a geographically dispersed Multi-AZ instance in another region is not available with RDS for Db2. This means, for now, your disaster recovery may be best handled by geographically separating your database and backup location.

With that said, other Amazon RDS databases do have the ability to create cross region read replicas. This leads me to believe that it is just a matter of time before Db2 receives the same feature.

Resources:

Reads on Standby

Additionally, there is no Amazon equivalent to Db2’s reads on standby feature with RDS for Db2 …. yet. Amazon does have Multi-AZ clusters with readable standbys for other RDS supported databases, it’s just not available for Db2 at this time. I can’t speak for any possible development roadmap regarding RDS for Db2, but if other supported databases have this feature, I would expect to see it for Db2 eventually.

Resource:

Encryption

Db2 native encryption is not a supported feature. That responsibility shifts to Amazon. One of the last details you provide Amazon is if you want to encrypt your RDS for Db2 database or not. If selected, an “additional layer of protection is put in place to secure data from unauthorized access  to the underlying storage”. In the end, this means your database instance, logs, backups, and snapshots are encrypted at rest.

Resource:

Database & Database Manager Configuration

I briefly mentioned this oddity when I previously explained the purpose of the RDSADMIN database. You can still view your database and database manager configuration settings as you normally would but adjusting them is another matter.

Database configuration parameters are adjusted via an Amazon stored procedure against the RDSADMIN database. The traditional UPDATE DB or UPDATE DBM CFG command is not used.

Database Manager Configuration (instance level) parameters are adjusted in a custom parameter group via the Amazon Management Console. These settings are applied to all RDS for Db2 instances that use that parameter group. This means if you adjust the default parameter group in error, those changes will apply to new RDS for Db2 instances. We will pretend I didn’t learn that the hard way during Beta testing.

Let me forewarn you, not all database and/or database manager configuration parameters are available to you. There are a ton of configuration parameters within Db2 and AWS had had to identify what parameters should or should not be accessible due to RDS design, best practices, and certification. Most parameters I would expect to see are available, most of them are adjustable, and more parameters will become visible to the end user over time. If you don’t see a configuration parameter, or  you want to adjust a configuration parameter that seems to be set to a fixed value, reach out to Amazon. I found AWS was very interested in hearing feedback here and were willing to help meet my needs.

Resources:

Storage Autoscaling

My current non-RDS databases have disk level monitoring and alerting to notify us of capacity issues over time.  We give ourselves a decent window of opportunity so a system administrator can log in and add disk capacity if needed.

With an RDS for Db2 database you have the option to include Storage Autoscaling. When a threshold or specific condition is met Amazon will automatically increase storage by predetermined amounts to a defined maximum. This was a welcome feature to me. It added a layer of additional protection for unwanted spikes of data or databases that organically grew bigger over time.

Resource:

But wait, there is more!

What I just discussed are just a few of the oddball changes and cool features that I discovered during my time as a Beta tester. We didn’t discuss topics like viewing the Db2 error log, IAM roles, Amazon monitoring features, or having the ability to create infrastructure as code via CloudFormation templates. This is for you to discover on your own. So, when you are ready, dive in and explore your own RDS for Db2 database. Start with Amazon’s latest documentation on Getting started with new Amazon RDS for Db2.

Michael Krafick
Michael Krafick

Michael Krafick is an aspiring Zamboni driver and well known twig-eater. During the day he is a Sr. Database Engineer for a Fortune 500 company. He is a frequent contributor to datageek.blog, an IBM champion, member of the DB2 Technical Advisory Board, and IDUG speaker Hall of Fame inductee. With over 18 years experience in DB2 for LUW, Mike works hard to educate and mentor others.

Articles: 34

4 Comments

  1. This all you mention simply describes another database product so whatever attachment customers could have for DB2, it can now be easily discarded and instead just pick anyone of the databases in AWS. Not sure why IBM decided to move/create DB2 cloud native if at the end the market for on-premise was already dying and now this ‘new’ cloud-native DB2 is one more of the rest…
    Don’t know about America but here in LA DB2 is dead, dead, dead.

    • I respect your right to have this opinion and share it. This is not a forum that filters out respectful anti-IBM comments. I would be interested to hear more if you have specific facts or reasons behind your final opinion there. Personally, I applaud IBM for moving more in a cloud direction with Db2, after we’ve been pushing them to do so for years, and I hope more great things come out of the IBM/Amazon partnership.

  2. this is gold-dust Mike. The RDSADMIN stuff alone would have probably taken me a couple of working days to stumble over.
    I look forward to any further insights you’d care to share
    Cheers
    Mark

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.