The IT industry is a change machine; it always has been. As I get older, I wonder whether it really is changing more, or if I’m just becoming set in my ways.
DevOps
The first time I heard the term “DevOps” many years ago, I thought of it as a ploy for developers to get access directly to production. At the time, there was a very strict line between developers and production. A few of the top-level developers might have read-only access, but they wouldn’t have write access of any kind. There were very separate groups tasked with production support.
I’ve certainly learned more since then. I think my favorite description of DevOps comes from an oft-repeated exchange about containerization, origins unknown:
Ops: Production doesn’t work after we deployed your code!
Dev: Well, it worked perfectly fine on my machine!
Manager: Ok, let’s just deploy your machine.
While containerization is not strictly necessary for DevOps (particularly at the small end), many DevOps implementations include it, because it makes many of the goals of DevOps more easily achievable at scale.
Defining DevOps
Every source I pull up has a slightly different definition of DevOps. The general idea is to apply an agile methodology to both infrastructure and code, so that solutions can go to market quickly and improve through much shorter iterations. It involves infrastructure staff and developers working very closely together to achieve these goals, and in the process it brings developers closer to production in many ways. I’m not going to claim enough expertise on DevOps to offer my own complete definition.
My Experience with DevOps
To someone whose career started from a traditional SDLC standpoint, the ideas of developers working in prod and shortened release cycles look problematic in their simplest forms, but there’s a kernel of brilliance in them at the same time.
I was on an early project at IBM intended to offer companies the computing power they needed, and only what they needed, with the flexibility to change hardware quickly and easily. It was called the On-Demand Services Center, or ODCS. This was around 2005 or so, before Amazon even got into the cloud business. Unfortunately, the technology wasn’t quite there yet to support the idea (or IBM wasn’t employing the bleeding edge at the time), and on at least one client, IBM lost money hand over fist.
In my limited experience, DevOps involves developers having a local environment that is sufficiently production-like to test changes as they’re making them. Their changes are then integrated into an “integration” environment using a source control tool such as GitHub. The integrated changes are then moved through whatever test environments you may have (QA, load testing, etc.), with automated testing at each step. Once a change has gone through testing, it should be deployable to production at any time, without an outage. In the holy grail of DevOps, changes are small and continuously rolled into production, even in the middle of peak traffic or usage. Even if something fails, rollback should be quick. One motto I hear a lot is “fail fast”. That primarily means catching failures in tests in lower environments, but at a deeper level it means trying something, and if it fails, learning from that and trying another way.
In reality, many companies still group smaller changes together into releases or deploys; those releases or deploys just happen far more often than they used to. But speed is a key feature of DevOps. If you have a change control process that takes weeks or months to move through, that’s probably not going to mesh well with DevOps. If change control is necessary, it should bless the DevOps and testing processes, not every individual change.
From the teams I’ve seen, speed is a key sticking point. Put a traditional team and a DevOps/agile team on the same project playing different roles, and you will inevitably see the DevOps team lamenting how the traditional team is slow and a roadblock, and the traditional team lamenting how many mistakes the DevOps team makes and how break-neck their speed is.
What This Means for the DBA
DBAs have traditionally been charged with data availability, data quality, and speed of access to data. This has necessarily put us into a gatekeeper role. Sometimes changes are introduced that conflict with those goals, and we have to say no or help people find a better way to do something. We’re therefore used to a slower, more traditional approach. We also tend to be a little on the control-freak side, having had developer changes wake us up in the middle of the night or ruin our nights and weekends many times over the years.
Because of this, we sometimes have a resistance to change. If I do it the old way, I’ll never have those life interruptions, or worse, have the company losing money because the site or app isn’t working.
We are used to running some of the biggest servers in the enterprise, and if we do it well, people see us only as that roadblock. If we do it poorly, the database gets blamed for every outage first, no matter how unlikely.
DevOps requires us to use all new tools, to expose our databases to different levels of access, and in some cases to fundamentally change the way we do things. This isn’t bad, but it does require an adjustment to still be able to apply the same standards of excellence to a new world.
DevOps is a learning curve for the traditional DBA. In most of my DBA roles, I’ve worked with, at a minimum, a system administration team and an application team. Sometimes those roles are split many times over and served by an array of teams. This has allowed me to focus on the database management system while learning a bit about the various operating systems and applications.
DevOps Tools for the DBA
One of the big things I’ve noticed about DevOps is that it really forces me to go wide rather than deep. For example, instead of just installing Db2 on a Linux server that someone else built, I’m now expected to build and maintain a Docker container that includes the database and can handle persistent storage properly. SSH to the server? Pssh, that’s to be avoided. Cron? Forget it, not happening.
This requires learning a new suite of tools. I’ll likely write more detailed entries on many of these, but wanted to provide an overview of some of them.
Docker/Kubernetes/Orchestrator
Welcome to DevOps. Your entire world is now defined through text files.
Docker Containers
Database servers (except for maybe the most important, largest production servers) are now Docker containers. No, there’s no good tool to build a server and then turn it into a Docker container; things logically don’t quite work that way. While you can treat containers a little like VMs, there are key differences.
There are two main files at this level: the Dockerfile and the entrypoint script (likely a shell script). Together, these two files pull a base image, make some modifications to it, and spin it up whenever a new database server is needed. It is possible to have a common container and then specialized ones with modifications for each stack. An internal repository for containers is also important.
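To make that concrete, here’s a minimal sketch of what such a Dockerfile might look like. The base image name, file paths, and environment variables are assumptions for illustration (loosely modeled on the publicly available Db2 community image), not a definitive layout.

```
# Start from a database base image in an internal or public registry (image name assumed)
FROM icr.io/db2_community/db2:latest

# Layer site-specific scripts and the entrypoint on top of the base image
COPY scripts/      /var/custom/scripts/
COPY entrypoint.sh /var/custom/entrypoint.sh
RUN chmod +x /var/custom/entrypoint.sh

# Defaults that can be overridden at deploy time rather than baked into the image
ENV DBNAME=SAMPLE \
    LICENSE=accept

# The entrypoint script creates or starts the database when the container starts
ENTRYPOINT ["/var/custom/entrypoint.sh"]
```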
Here’s something you’ll likely learn the hard way: don’t change the configuration of a running container. Ever. If you want to change the configuration, it must be done either through a change to your Dockerfile or entrypoint script or through a configuration management tool. If you don’t do this, the next time the container is deployed or re-deployed (likely weekly or more frequently), your configuration change will disappear. Don’t get lazy about this.
In addition to having functioning database containers, there should be a way to keep a container image with up-to-date data for developers to spin up. They no longer have local copies of environments, just a container to spin up. This is certainly worth doing right; with any more than two or three developers, this is an area that will really save time.
A word on persistence. Containers, in many of their incarnations over the years, have seemed a fabulous idea for applications where there are several to hundreds of identical servers that can be spun up or taken down without losing data. The problem of persistent data has kept databases out of this world for a long time, and to some extent rightly so. However, we can now create our databases on persistent storage, while the Docker container the RDBMS runs in stays ephemeral. This makes RDBMS fix packs and upgrades ridiculously easy and fast.
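Here is a minimal sketch of that separation using Docker named volumes; the volume name, mount point, port, and image name are placeholders for illustration, not a prescription.

```
# The data lives in a named volume that outlives any one container
docker volume create db2data

# The container itself is disposable; a fix pack is just a new image
# started against the same volume
docker run -d --name db2server \
  --privileged \
  -p 50000:50000 \
  -v db2data:/database \
  my-registry/db2-custom:11.5.8
```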
Kubernetes/Orchestrator
Any containers that are not developer locals may be spun up within a Kubernetes infrastructure using an orchestrator such as Rancher or OpenShift. This supplies functionality for making sure containers are working properly (and taking defined actions if they’re not), pools resources to be used by the containers, stores the passwords for application IDs in a secure way, and so on. Helm charts are generally used to define the resources needed for a container – yet another text file. There may be layers of Helm charts if a database is integrated with containers related to an application (likely).
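As an illustration only, the values a database chart exposes might look something like the snippet below; every key and value here is a hypothetical example rather than any particular chart’s actual schema.

```
# Hypothetical values.yaml for a database chart
image:
  repository: my-registry/db2-custom
  tag: "11.5.8"

resources:
  requests:
    cpu: "2"
    memory: 8Gi
  limits:
    cpu: "4"
    memory: 16Gi

persistence:
  enabled: true
  storageClass: fast-ssd
  size: 200Gi

# Instance credentials come from a Kubernetes Secret, not from the chart itself
existingSecret: db2-instance-credentials
```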
Configuration Management
It is possible to handle configuration management solely through the Dockerfile and entrypoint script. But especially if there is a mix of containerized and non-containerized databases to handle, a configuration management and automation tool can come in handy. There’s a fair amount of competition in this space – Puppet, Ansible, and Chef, for example. Use this tool to define and enforce configuration, from RDBMS configuration parameters to users and groups, or even making sure specific filesystems exist or specific software is installed.
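For instance, in Ansible (one of the tools mentioned above), enforcing a couple of those settings might look roughly like this; the group, user, profile path, and configuration parameter are examples made up for illustration.

```
# Hypothetical Ansible tasks enforcing database-server configuration
- name: Ensure the DBA group exists
  ansible.builtin.group:
    name: db2iadm1
    state: present

- name: Ensure the instance owner exists and belongs to that group
  ansible.builtin.user:
    name: db2inst1
    group: db2iadm1
    state: present

- name: Enforce a database manager configuration parameter
  ansible.builtin.shell: |
    . ~db2inst1/sqllib/db2profile
    db2 update dbm cfg using HEALTH_MON OFF
  become: true
  become_user: db2inst1
```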
Source/Version Control
There are a million text files to manage: Dockerfiles, entrypoint scripts, Helm charts, Puppet configuration, and so on. These should all be checked into something like GitHub to manage changes to them. Even if you’re a lone DBA and not working as part of a DBA team, you need to define a single source of truth.
Where it gets weird is in source control for the database itself. Maybe some RDBMSes offer this natively – Db2 does not. A tool such as Liquibase fits in this space. The primary goal is to manage changes to database objects by having all database-changing code run through the tool. This makes it so developers can apply and roll back database changes, potentially without DBA involvement. We have an approval process when the code for this is checked into GitHub, but there are also some ways of automating part of that approval process.
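For readers who haven’t seen it, a Liquibase changeset is itself just another text file that lives in source control. The structure below follows Liquibase’s documented YAML changelog format; the table, column, and author names are invented for illustration.

```
databaseChangeLog:
  - changeSet:
      id: create-app-audit-table
      author: dba-team
      changes:
        - createTable:
            tableName: APP_AUDIT
            columns:
              - column:
                  name: AUDIT_ID
                  type: BIGINT
                  constraints:
                    primaryKey: true
                    nullable: false
              - column:
                  name: CHANGED_AT
                  type: TIMESTAMP
      rollback:
        - dropTable:
            tableName: APP_AUDIT
```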
Job Scheduling
Are you a cron-ninja? Well, that’s not allowed. All jobs should be scheduled centrally; Jenkins is the tool we use for this. Even our database maintenance, like backups, now spins up a container, runs a script, and then that container goes away. We have Jenkins jobs whose only purpose is to create the other Jenkins jobs. When we add a new supported database environment, all we have to do to set up our standard five maintenance jobs is change a few lines in a text file and then execute a Jenkins job to rebuild the other Jenkins jobs.
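To give a feel for what that looks like, here is a minimal sketch of a scheduled maintenance job using Jenkins declarative pipeline syntax. The schedule, image name, and script path are placeholders I’ve invented, not our actual jobs.

```
// Hypothetical Jenkinsfile for a scheduled backup job
pipeline {
    agent any
    triggers {
        // Weekly, early Sunday morning ("H" lets Jenkins spread the start time)
        cron('H 2 * * 0')
    }
    stages {
        stage('Run backup in a throwaway container') {
            steps {
                sh '''
                  docker run --rm \
                    -v db2data:/database \
                    my-registry/db2-maint:latest \
                    /var/custom/scripts/backup.sh
                '''
            }
        }
    }
}
```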
Painful Learning Curve
DevOps has a painful learning curve for the DBA. I have spent a year or more learning how to do things a new way, with the benefits only slowly revealing themselves. I recently created a new maintenance job. I spent about 3 hours doing the in-database work of building a table, a stored procedure, and a script to use them. I then spent more than 30 hours learning how to use Liquibase to deploy that to multiple databases and learning how to use Jenkins to properly schedule it. I lived in Jenkins files for three or four solid days. The magic of the final deploy – clicking a button and it’s done – was fun, and it’ll be so much faster the next time I have to do it.
Scale
I really see the advantage of the DevOps approach, combined with flexible cloud hosting. It makes agile methodologies possible, and even exponential growth manageable. I do NOT see the advantage at the smaller end. Learning and combining all of these tools is complicated. For 4 databases, it is not worth it. For 400, it sure is. I am not sure where the line really is. It is also possible that one area of your organization is large enough to really see the benefits while another is not.
Future-Looking
IT is currently going in the direction of DevOps. From what I’ve seen in my career, many companies and methodologies seem to swing back and forth between centralized and decentralized. While I see the Agile (DevOps)/SDLC swing as similar, I’m not sure we’re going to see a swing back to SDLC in my lifetime, or ever. Those of us who embrace the new methodologies and make our place in them will continue to grow. Those who do not will sunset with the technologies they support. There’s nothing really wrong with either way – legacy methods of working will continue to have their place for decades to come.
Comments
I have spent a year or more just trying to learn the definition of DevOps, I’m sad to say. However, your explanation lit the light bulb above my head. Thanks for that!
PS I work with SQL Server, but have followed you since my mainframe Db2 days.
This may be my favorite comment ever. Thank you!
I like to joke that there are at least as many definitions of DevOps as there are practitioners of it, but it probably isn’t too far from the truth. I’ve settled on the definition “the application of many of the best practices of development (thorough testing, materializing changes as code, source control + code reviews, emphasizing quality, etc.) to infrastructure and the environment that runs our applications”. It really is an expansion — albeit highly specialized — of the domain of agile methodologies and development. As you mentioned, we’re distilling the desired state of an ever-expanding slice of the world that runs our sites into mere text files. It can be intimidating and, at least in principle, could have been something we did all along. But it just so happens that the idea or mindset of DevOps has attracted the right sorts of attention — acting as something of a bug light for tools and technologies. Each technology is powerful and interesting, but together, under the flag of DevOps, they’re really achieving incredible things. I applaud your continued efforts in this area and it is exciting to see a highly regarded DBA not only acknowledging, but sprinting full speed toward the light at the end of the tunnel. I’m thrilled that we get to change the (DBA) world together 🙂
Having had a foot in both worlds for a couple of years, this really helped me get a better understanding. My current shop is heading towards database agnosticism. They just need somewhere to persist data; it could be one of the various AWS RDS offerings, or serverless like DynamoDB. While your blog added a ton of context, I still don’t know how I will need to evolve from a DBA for on-prem or EC2-managed databases to RDS and serverless. Our DevOps team does the coding and pipelines and hence manages the files you speak of. Do you have any insight on how a DBA contributes when the emphasis is a database that is provided as a service? What skills do I need to acquire to be able to provide the DevOps team value?
Being familiar with RDS and how databases work within it is the first skill. Properly configuring CloudWatch to monitor and alert on databases, understanding what maintenance is or is not needed on RDS, and learning the different ways you may need to accomplish it are all important. You still need to know and practice what restores look like and how high availability works. Understand the SLA for the services being used – does it match what is needed out of a database?
There are a number of places to still add value. Can you still identify poor-performing SQL? Can you make suggestions to improve the performance of that SQL? Can you identify when an instance is oversized or undersized? I find myself spending more time on the logical DBA side for RDS or DBaaS databases. Tools like Liquibase (to manage SQL code and data models) or dbt can also be useful.
The problem with not having a DBA involved is that there are details missed. The only performance solution becomes to throw more hardware at a problem, and while that’s less expensive on a cloud than when we had to order million dollar servers, it still adds up over time.
The role of the DBA does often change, but that’s not necessarily a bad thing. Some of these things are fun, and you already have the platform of understanding to learn them. Hope something there helps.
Hi Ember, thank you so much for taking the effort to write this blog. I am a traditional MSSQL production DBA trying to switch to the Azure world slowly. To be honest, I was swamped by so many terms, and sometimes I feel that my brain cannot function. Is there any tutorial on how to map DBA DevOps tasks to my daily DBA routines? For example, taking backups and restores, configuring SQL jobs, and making changes at the object/schema/server levels. These are typical tasks I am doing currently. If I am going to switch to the Azure DevOps world, what steps should I perform and which tools should I use? Is there any Azure DevOps 101 I can take?
Thanks a lot,
Hui
I wish I knew the Azure world well enough to tell you. One thing in a DevOps world is going to be making sure that all maintenance really is done as code that is checked in somewhere. So whatever tool you have for running jobs (Jenkins? Something else?), have jobs for that maintenance that can be checked into a repo. I don’t know what the Azure options are in that space. I might need to write a translation post like that, at least for my areas of experience.