AWS S3 is a cloud object storage service that is easy to use. It is highly available and offers ELEVEN nine’s of durability.
A Few Backup and Recovery Best Practices
I have quite a few blog entries over the years on backup and recovery. A few things to remember for this article:
- Even with High Availability plans, a backup and recovery strategy is required
- Even with Disaster Recovery plans, a backup and recovery strategy is required
- A backup and recovery strategy doesn’t exist unless it is regularly tested
- Externalize backups, archived transaction logs, and other critical files quickly and often
A Brief Summary of My Experience with S3
One thing I noticed when I was reviewing other articles about backing up Db2 databases to S3 is that most were fairly tailored to the situation of particular individual experiences with it. This is no surprise, and is true of this article also. My experience comes from db2 databases running in custom-built EC2 instances. The EC2 instances generally have IAM roles that allow them to access S3 without explicit authentication. I’m also focusing on Backup/Restore here, and not some of the other interesting ways Db2 can interact with S3 like load copy files or external tables.
Basics of S3
S3 is an AWS service that provides cloud object storage. S3 consists of “buckets”, which have unique names across all of AWS. These buckets can be either public or private and either encrypted or unencrypted. Access to files on S3 is generally free from things within AWS, but is charged for from outside AWS. There is also a charge based on the amount of space used. Using S3 is extremely simple after the first few times. Interface options include the web interface within the AWS console and the S3 commands that are a part of the AWS cli.
S3 buckets can be private or public, unencrypted or encrypted a couple of different ways, and can also be versioned to deal with different versions of the same file (costs extra, and doesn’t make sense for Db2 backups). Db2 backup will only work with private S3 buckets.
Reasons to Backup to S3
One of the primary reasons that I want to back Db2 databases up to S3 is that I would like to externalize backups for the Db2 instances I run in EC2 instances. Yes, sending a backup to another EC2 instance is an option, as is copying backup images after they are complete from local disk to S3 or another location. I became a DBA in a world where I had to know the syntax to db2adutl, and I’ve evaluated the use of a number of vendor backup solutions for clients over the years. It is easiest and most foolproof if a backup immediately and automatically externalized when the backup image is created.
Some may ask why to even externalize a backup. It is true that many AWS services, like EC2, have rather high uptime records, even if the SLAs don’t always match those records. But it is always possible for a service to fail and to do so spectacularly. If I have my data in two different places, the chances of both of those places failing at a similar time is greatly reduced over the risk of one failure. The type of EBS (virtual disk) we use has a durability of 99.999%. S3 has a durability of 99.99999999%. While 5 9’s is pretty good, it seems to me that I wouldn’t want to put all my eggs (data) in that basket when the basket with 11 9’s is available to me.
The AWS SLA provides a (10%!)credit if EBS volumes (the most common kind of storage for EC2 instances) or EC2 instances themselves are not available 99.9% of the time. Even something that has a very low track record of actual data loss should not be allowed to be a single point of failure. One way of eliminating this is by using HADR to another instance in another availability zone or region. We do that for production. However, the overly-cautious DBA in me would also like to have backups externalized from the server quickly. S3 gives us this. We don’t know if S3 shares resources with EBS when in the same region, but getting a copy of the data off the server is one form of being ready for Disaster Recovery.
Please note all SLA, durability, and cost details should be verified with AWS. They may change over time.
The set of steps needed to take a backup directly to S3 includes:
- Set up the PKCS encrypted store
- Create or identify the S3 bucket and path
- Update the DBM configuration with the KEYSTORE_LOCATION and KEYSTORE_TYPE parameters
- Catalog the storage access alias
- Update the Db2 Registry parameter DB2_OBJECT_STORAGE_LOCAL_STAGING_PATH
- Stop and start the Db2 instance
- Test BACKUP and RESTORE to make sure they work
Set up the PKCS encrypted store
This is a fairly similar to setting up Native Encryption or SSL authentication. A local keystore is needed to store the key and secret key to access S3. The first step is to create a directory for that and create that keystore, with these fairly simple commands:
mkdir $HOME/.keystore gsk8capicmd_64 -keydb -create -db "<$HOME>/.keystore/db2-keystore.p12” -pw "replace_with_new_unique_pw" -type pkcs12 -stash
Any location works here –
$HOME/.keystore is just an example.
The password must be stored in a reliable and secure place like a separate encrypted pasword vault. The keystore should be regularly backed up. While it’s not the end of the world to lose this keystore in this use case, it’s a real pain in other use cases like SSL authentication or Native Encryption – to the point of potential data loss.
Update the DBM configuration with the KEYSTORE_LOCATION and KEYSTORE_TYPE parameters
This is just letting Db2 know where the keystore is:
db2 "update dbm cfg using keystore_location <$HOME>/.keystore/db2-keystore.p12 keystore_type pkcs12"
<$HOME>/.keystore with the value of the directory for the keystore.
Create or identify the S3 bucket and path
The S3 bucket must be a private one. It should probably also be an encrypted one. The number of files here may be larger than you expect, because Db2 splits the backup file into 5 GB chunks. This means that it probably makes sense, if managing backups for multiple databases, to have a nice folder structure to isolate backups for different databases.
Use either the AWS console or the AWS cli to create a bucket if one does not already exist. Bucket name must be globally unique (across all customers), so something like ‘backup’ is not going to work. There are naming standards to follow.
Catalog the storage access alias
Once the three steps above are complete, catalog the storage access alias. This is required for a backup to an S3 bucket. Here’s my preferred syntax for this:
db2 "catalog storage access alias my_bkup vendor S3 server s3.
.amazonaws.com user 'access-key' password 'secret-key' container 'example-bucket' object 'path/within/bucket'"
There are a number of things to detail here. First, my_bkup is a local name you choose for this location. I like to use something fairly generic so I can have a backup script just specify that generic alias, and then any customization on the path is done here within the catalog command and not in the backup script. This is a design choice that can go either way. If you eliminate
object 'path/within/bucket', the storage access alias will only point to the bucket. The backup command can be used to specify paths within the bucket instead.
Next, change the region in the url to use S3 in the same region as the EC2 instances or other Amazon resources in use. If you’re not using EC2 instances or other amazon resources, then specify whatever region makes the most sense for you. S3 manages cross-availability-zone for all files.
The user and password specified are the amazon access key and secret access key. These are easy to generate if you do not already have them. An IAM role for the server CANNOT be used by Db2. I’ve opened an AHA idea to encourage IBM to fix this. Please go vote for it!
container keyword, specify the name of the bucket. This is the last required parameter, and the command can stop there. It is also possible to include a path within the bucket as the base location. Use the
object keyword and specify the path relative to the bucket. The values above would equate to an S3 URI of
Update the Db2 Registry parameter DB2_OBJECT_STORAGE_LOCAL_STAGING_PATH
The Db2 registry parameter DB2_OBJECT_STORAGE_LOCAL_STAGING_PATH specifies the local staging path used by Db2. When taking a backup or other actions that generate content, the content is written first here and then transferred up to S3. At least for backups, the backup is automatically broken up into files that are 5 GB in size. This means for my 60 GB backup (size after compression), I get a list of files that looks like this:
This format should be familiar to anyone who has taken a backup to multiple locations.
The default location for DB2_OBJECT_STORAGE_LOCAL_STAGING_PATH is
Stop and start the Db2 instance
The change to DB2_OBJECT_STORAGE_LOCAL_STAGING_PATH requires an instance stop/start to take effect. If the default works for the server, then the stop/start is not needed.
Test BACKUP and RESTORE to make sure they work
Testing recovery strategies and changes to them is always a good idea, but this one is particularly important. The syntax for backup looks like this:
db2 backup db sample to DB2REMOTE://my_bkup compress without prompting
sample with the database name and
my_bkup with the name of the storage access alias cataloged above.
To place this backup in a subdirectory of what was cataloged, specify it with syntax like this:
db2 backup db sample to DB2REMOTE://my_bkup//subdirectory/path compress without prompting
Notice the double slash after the bucket name and before the subdirectory. Backups can only go to a subdirectory of what was cataloged in the storage access alias. Consider carefully where any granularity makes sense for each environment.
Potential Problems Backing up to S3
EC2 servers can have their own IAM role within AWS. This means that they can access S3 and other AWS resources without having to specify a key and secret key. However, Db2 forces us to enter a key and secret key. We cannot use that IAM role when backing Db2 up to S3. I've verified with IBM that this is working as designed and not some mistake I've made. This is somewhat frustrating when we have a secure access policy that works for authentication, and yet we still have to use another method here and define shared IDs which we wouldn't normally use.
One thing I've seen with vendor solutions over the years is that sometimes putting network in the middle of a backup makes it unreasonably slow. I don't have comprehensive and wide ranging speed tests, but I thought I'd share what I saw. Remember that I am going from EC2 instances to S3 in the same region. Cross-region might be slower and coming in from outside of AWS to S3 would almost certainly be slower.
For my imperfectly parallelized ~300 GB uncompressed db (compressed db backup about 60 GB in size) that took about 47 minutes to back up to "local disk"(EBS), backing up to S3 took 58 minutes. That's about a 20% increase in backup time. I also compared the timing to both backing up to disk and then transferring the image to S3, and it was only about 10% slower. The really good news was that for restore, my test showed 35 minutes when restoring from local disk and only 38 minutes when restoring from S3. I care more about the restore time than the backup time for this database.
I read the documentation a bit too diligently ahead of time, and noticed this tidbit:
Because of this, I had it stuck in my head that I would have to figure out how many 5 GB chunks to split my backup image into, and then specify the location that many times. Luckily, Db2 takes care of this piece. What it does is build 5 GB files locally, and then move those to S3 one by one. I observed this behavior when looking at what was happening on my local filesystem:
Then when I look at my S3 bucket, I can see the multiple files, named as I would expect them to be:
One of the disadvantages of this is that there is a database history entry for each one of those files. Also, restoring somewhere else by just copying the files over requires copying and managing all of those files.
Because of the file size issue, Db2 needs a local place to build these 5GB files. Details on changing this are above - it's just best to be aware that this exists and how to change it.
For us, backing up to S3 is a go. After this initial POC, I'll be moving forward with implementing this for our databases. I like the externalization and the general ease of use.
I wanted to share a few other articles I referenced on this while working through it.
- IBM blog from Henrik Loeser
- Triton blog from Damir Wilder
- IDUG blog from Philip Nelson - particularly thorough
- Data Henrik blog from Henrik Loeser