This year, I’m going to be working through the Advent of Code challenges in SQL and sharing my progress here on the blog. I don’t expect to complete each challenge and the blog entries on them on the same day they are released – I’m not competing on time, I’m just using this to share SQL techniques. This means my Advent of Code 2021 will surely run into 2022.
What is Advent of Code?
Do you like puzzles? Because I like puzzles and challenges, and Advent of Code is my idea of fun.
Advent of Code is an annual challenge that runs from December 1 through December 25. Each day the puzzle is released. Answers are usually numbers, so the judgement is based on how fast after puzzle release an answer is submitted. Any language or process can be used to arrive at the answer. Experienced programmers can compete and see if they can be one of the first 100 people to get the right answers and earn a place on the leader board. For mere mortals like me, the challenges are fun simply to see if I can complete them. I did many of last year’s in Python to hone my Python skills. This year, I’m choosing to do them in SQL. There is always a slightly corny story that weaves through the challenges, and some challenges are insanely easy while others are really hard. Some are not hard but are hard to do in an efficient way. You can even go back to past years and work through previous challenges.
Each problem consists of two parts. Both parts use the same data, but ask slightly different questions. Sometimes you can tweak your code for the first question to answer the second question, but this depends on how you coded the first question.
The input data for each day is different for different users, so you’re unlikely to be able to copy someone else’s answer without running it against your own input. The input data can consist of different formats, but is often numbers and strings. The “answer” is usually a number that you get by following the instructions against the full input data. You input the number on the AOC website and it will tell you whether you’re correct or not, and sometimes which direction you’re off in.
Each part usually gives you a very small set of example input data and the correct answer for that small set. This is great for testing your answer on a small set of data first, before applying it to your full input data set.
Each day, you can earn one star for each part for a total of 50 stars.
For me, often the challenges take an hour or two to work through to get to a correct, efficient, and well-documented solution, with a few taking longer than that. I’m not going for speed. This is also why I don’t plan to quite keep up with the release schedule for 2021. I have a job, and can’t spend a couple hours on a solution and a couple more on a blog entry about it every day.
From my experience last year, a lot of the challenges are about structuring data properly and navigating the structures, which is one of the strengths of SQL.
The reason I’m working in SQL this year is to show the power of SQL and give those who are learning SQL some pointers and ideas on using SQL. There are plenty of practice sites out there for SQL, but not many of them apply SQL to actual programming challenges, so I think this is an interesting take on it. I’d also like to flex my SQL muscles and look forward to some of the more challenging days.
Why IBM Db2?
Well, really Db2 because it is what I know. But more objectively, I can easily package a free Db2 Community Edition docker container in my repo and make it easy for anyone to spin up. Db2 DBAs might find this useful for learning how to use Db2 in containers. If that is too much, the same work and code can be run against a free Db2 on Cloud instance. Few enterprise RDBMSes have fully featured free options like this.
Who Might Be Interested to Follow Along?
I see two primary potential audiences who might be interested:
- Db2 DBAs who want experience using Db2 in Docker containers.
- Anyone wanting to work on their SQL skills. Surely some challenges will require more advanced techniques
I don’t claim that my solutions are perfect nor that they’re the most efficient possible. I’d love to see criticisms and discussion in the comments on these blog articles or in GitHub. I’m not sure I’ll accept PRs, but will be happy to share with readers others’ comments and suggestions if you share them with me.
How to use the Repository
My repository of this code is available on GitHub. The expected interaction with this code is:
1. Clone the repo on to your local machine
2. Spin up the Docker image using the commands in build.sh and run.sh and the readme. The docker container will automatically create any objects I’ve created up to the time you cloned the repo when it is started, and if you pull more and restart the container, they should be created again.
3. Explore my solutions with the provided Jupyter Notebook for each day or with other commands in the container itself.