Comparing Db2 and MySQL: Generating Values

I’m going to walk through the basic creation of auto-incrementing integer columns in MySQL and Db2 and point out some differences and similarities.

Functionality

Every relational database I have worked with has had a functionality for generating column values. The most basic generation is an incrementing integer. This is often used for the assignment of artificial keys. In general, there are plusses and minuses to using an artificial key. The Active Record ORM in Rails makes heavy use of this kind of functionality by default. Natural or composite primary keys may have performance advantages in some cases or cause problems in others.

I have seen problems caused by key collisions when keys are populated by an application instead of the database. Particularly when in an environment where many applications access the same database, it is nice to implement this functionality at the database layer.

Creating a table with an auto incrementing column

Db2

create table t4 (
    id bigint not null generated by default as identity
    , name varchar(45));
create unique index ix_t4_pk on t4 (id);
alter table t4 add primary key (id);

Note that there are other syntaxes I could choose for the identity column here, I’m not going to explore all of those in this article. I’m also using Db2 best practices for creating an index before assigning it as the primary key to avoid strange system names for the index.

MySQL

create table `t4` (
    `id` bigint not null auto_increment
    , `name` varchar(45)
    , primary key (`id`));

Here, no separate specification of the primary key is needed. I’ve used back-ticks around identifiers as a best practice to avoid conflicts with reserved words.

Inserting data

Let’s look at a series of basic operations and what it tells us about each rdbms

Db2

$ db2 "insert into t4 (name) values ('blue'),('green')"
DB20000I  The SQL command completed successfully.
$ db2 "insert into t4 values (DEFAULT, 'yellow')"
DB20000I  The SQL command completed successfully.
$ db2 "select * from t4"

ID                   NAME
-------------------- ---------------------------------------------
                   1 blue
                   2 green
                   3 yellow

  3 record(s) selected.

As expected, ID values were assigned here, in order. Syntax does not allow us in Db2 to specify NULL for a not-null column, so we must use the keyword DEFAULT if the generated column is specified.


$ db2 "insert into t4 values (-1, 'orange')"
DB20000I  The SQL command completed successfully.
$ db2 "insert into t4 values (5, 'red')"
DB20000I  The SQL command completed successfully.
$ db2 "select * from t4"

ID                   NAME
-------------------- ---------------------------------------------
                   1 blue
                   2 green
                   3 yellow
                  -1 orange
                   5 red

  5 record(s) selected.

Because we used generated by default in the definition of our identity column, we can add values out of sequence if we like, and the values are accepted.


$ db2 "insert into t4 (name) values ('purple'),('violet')"
DB21034E  The command was processed as an SQL statement because it was not a
valid Command Line Processor command.  During SQL processing it returned:
SQL0803N  One or more values in the INSERT statement, UPDATE statement, or
foreign key update caused by a DELETE statement are not valid because the
primary key, unique constraint or unique index identified by "1" constrains
table "DB2INST1.T4" from having duplicate values for the index key.
SQLSTATE=23505

What this error tells us is that we have to be careful using any values that are above the current value of our identity column. Unlike MySQL, db2 does not handle this or increase the value based on numbers we’ve manually inserted. This means when the identity naturally gets to the previously specified value of 5, a key conflict happens. We can run an alter table statement to jump to an unused range to get around this.

$ db2 "alter table t4 alter id restart with 6";
DB20000I  The SQL command completed successfully.
$ db2 "insert into t4 (name) values ('purple'),('violet')"
DB20000I  The SQL command completed successfully.
$ db2 "select * from t4"

ID                   NAME
-------------------- ---------------------------------------------
                   1 blue
                   2 green
                   3 yellow
                  -1 orange
                   5 red
                   6 purple
                   7 violet

  7 record(s) selected.

After restarting the identity column, we can then generate values that do not conflict with data already in the table.

MySQL

mysql> insert into t4 (name) values ('blue'),('green');
Query OK, 2 rows affected (0.09 sec)
Records: 2  Duplicates: 0  Warnings: 0

mysql> insert into t4 values (DEFAULT, 'yellow');
Query OK, 1 row affected (0.01 sec)

mysql> select * from t4;
+----+--------+
| id | name   |
+----+--------+
|  1 | blue   |
|  2 | green  |
|  3 | yellow |
+----+--------+
3 rows in set (0.01 sec)

This works very similarly to Db2, though MySQL allows the use of NULL instead of DEFAULT for unspecified auto increment values.


mysql> insert into t4 values (-1, 'orange');
Query OK, 1 row affected (0.00 sec)

mysql> insert into t4 values (5, 'red');
Query OK, 1 row affected (0.00 sec)

mysql> select * from t4;
+----+--------+
| id | name   |
+----+--------+
| -1 | orange |
|  1 | blue   |
|  2 | green  |
|  3 | yellow |
|  5 | red    |
+----+--------+
5 rows in set (0.00 sec)

Clearly, MySQL also allows the specification of values for the auto increment column.


mysql> insert into t4 (name) values ('purple'),('violet');
Query OK, 2 rows affected (0.00 sec)
Records: 2  Duplicates: 0  Warnings: 0

mysql> select * from t4;
+----+--------+
| id | name   |
+----+--------+
| -1 | orange |
|  1 | blue   |
|  2 | green  |
|  3 | yellow |
|  5 | red    |
|  6 | purple |
|  7 | violet |
+----+--------+
7 rows in set (0.00 sec)

Here we see a difference from Db2 again. When we inserted the value of 5, MySQL changed the value to be used next for us to avoid a primary key collision. The value of 4 was never used, so we have to be careful specifying very high values, as we could run out of key values.

We might try to recoup these leaked values, but MySQL will not let us shoot ourselves in the foot, while Db2 hands us the shotgun:

mysql> alter table `t4` auto_increment = 4;
Query OK, 0 rows affected (0.08 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> insert into `t4` (`name`) values ('grey'),('black');
Query OK, 2 rows affected (0.01 sec)
Records: 2  Duplicates: 0  Warnings: 0

mysql> select * from t4;
+----+--------+
| id | name   |
+----+--------+
| -1 | orange |
|  1 | blue   |
|  2 | green  |
|  3 | yellow |
|  5 | red    |
|  6 | purple |
|  7 | violet |
|  8 | grey   |
|  9 | black  |
+----+--------+
9 rows in set (0.00 sec)

mysql> delete from `t4` where `id` >= 4;
Query OK, 5 rows affected (0.01 sec)

mysql> select * from t4;
+----+--------+
| id | name   |
+----+--------+
| -1 | orange |
|  1 | blue   |
|  2 | green  |
|  3 | yellow |
+----+--------+
4 rows in set (0.00 sec)

mysql> alter table `t4` auto_increment = 4;
Query OK, 0 rows affected (0.01 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> insert into `t4` (`name`) values ('grey'),('black');
Query OK, 2 rows affected (0.01 sec)
Records: 2  Duplicates: 0  Warnings: 0

mysql> select * from `t4`;
+----+--------+
| id | name   |
+----+--------+
| -1 | orange |
|  1 | blue   |
|  2 | green  |
|  3 | yellow |
|  4 | grey   |
|  5 | black  |
+----+--------+
6 rows in set (0.00 sec)

We can only set the auto increment to a value higher than the highest value in the table.

Some differences and thoughts

MySQL also only allows one auto_increment column per table, while Db2 will allow multiple identity columns. Db2 also has the ability to define the identity column as generated always, which will prevent us entirely from ever specifying a value, forcing the use of values in order, though there are ways of getting around that with modifiers on the IMPORT and LOAD statements. MySQL also has a GENERATED ALWAYS functionality, but it cannot be used in conjunction with an auto_increment, and is for physical or virtual generated values based mostly on other columns in the table.

Ember Crooks
Ember Crooks

Ember is always curious and thrives on change. She has built internationally recognized expertise in IBM Db2, spent a year working with high-volume MySQL, and is now learning Snowflake. Ember shares both posts about her core skill sets and her journey learning Snowflake.

Ember lives in Denver and work from home

Articles: 556

One comment

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.