Non-blocking Database Migrations – DZone Database

Aleksei Kankov
Published: September 20, 2022

Database migrations are a common part of any web application. They are used to update the database schema to match the application’s code. In a traditional web application, the database migrations are run synchronously, which means that the application is blocked until the migration is complete. This is not ideal, as it means that the application is unavailable to users during the migration. Long past the days when stopping the service for maintenance was acceptable; we need to be able to run migrations without blocking the application.

It’s easy to perform database migrations in small databases or if you have no load. But what if you have a large database and a lot of users?

Initial Database Structure

Let’s imagine we have a simple table for storing customer data:

DROP TABLE IF EXISTS customer;
CREATE TABLE
    customer (
    id      INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    email   VARCHAR(256),
    balance FLOAT        NOT NULL DEFAULT 0,
    UNIQUE INDEX email_idx (email)
);
INSERT
INTO
    customer (email, balance)
VALUES
    ('alice@example.com', 42),
    ('bob@example.com', -42);

My main focus will be on the “balance” field.

It’s a float, which means that it can have a fractional part. We want to change the type of this field to an integer so that we can store the balance in cents. This is a common practice in financial applications.

But perhaps you already have an active application with millions of users. As a developer, your responsibility is to make sure the application is available to users during the migration. Also, you need to make sure the application will work correctly after the migration, and in case it’s not working properly, you have to roll back your changes without many customers noticing it.

So the plan is:

  1. Create a new field “balance_cents” with the integer type.
  2. Deploy the application with the new field. A new application version should be available for a limited amount of users.
  3. Make sure the application is working correctly with the new field.
  4. Deploy the new application version for all users.
  5. Drop the old field “balance”.

The tricky part is to make data consistent between the old and new fields. As these fields are related to each other, we need to make sure changes in the old field are reflected in the new field, and vice versa.

Let’s go step by step and see how we can implement this plan.

Create a New Field

Creating a new field “balance_cents” with the integer type is a simple task:

ALTER TABLE
    customer
    ADD COLUMN
        balance_cents INT NOT NULL DEFAULT 0;

Set Up Synchronization Between the Old and New Fields

At this point, the application does not know anything about the new field, and it is a perfect time to set up synchronization between the old and new fields. And to achieve this, we need to create triggers

First, let’s create a trigger on the insert.

DROP TRIGGER IF EXISTS insert_balances;
CREATE TRIGGER insert_balances
    BEFORE INSERT
    ON customer
    FOR EACH ROW IF new.balance <> 0 THEN -- insert from the old code
    SET
        new.balance_cents = CEIL(new.balance * 100);
ELSEIF new.balance_cents <> 0 THEN -- insert from the new code
    SET
        new.balance = new.balance_cents / 100;
END IF;
$$
DELIMITER ;

Let’s do some experiments to see how this trigger works.

Insert from the old code:

INTO
    customer (email, balance)
VALUES
    ('account_from_old_code@example.com', -1.23);
SELECT
    email,
    balance,
    customer.balance_cents
FROM
    customer
WHERE
    email="account_from_old_code@example.com";

+-----------------------------------+---------+---------------+
| email                             | balance | balance_cents |
+-----------------------------------+---------+---------------+
| account_from_old_code@example.com |   -1.23 |          -123 |
+-----------------------------------+---------+---------------+

Insert from the new code:

INSERT
INTO
    customer (email, customer.balance_cents)
VALUES
    ('account_from_new_code@example.com', 345);
SELECT
    email,
    balance,
    customer.balance_cents
FROM
    customer
WHERE
    email="account_from_new_code@example.com";

+-----------------------------------+---------+---------------+
| email                             | balance | balance_cents |
+-----------------------------------+---------+---------------+
| account_from_new_code@example.com |    3.45 |           345 |
+-----------------------------------+---------+---------------+

So the trigger works as expected. 

Now let’s create a trigger for an update.

DROP TRIGGER IF EXISTS update_balances;
DELIMITER $$
CREATE TRIGGER update_balances
    BEFORE UPDATE
    ON customer
    FOR EACH ROW IF new.balance <> old.balance THEN -- update from the old code
    SET
        new.balance_cents = CEIL(new.balance * 100);
ELSEIF new.balance_cents <> old.balance_cents THEN -- update from new code
    SET
        new.balance = new.balance_cents / 100;
END IF;
$$
DELIMITER ;

And now let’s test it.

Making updates from the old code:

UPDATE customer
SET
    balance = -1.45
WHERE
    email="account_from_new_code@example.com";
SELECT
    email,
    balance,
    balance_cents
FROM
    customer
WHERE
    email="account_from_new_code@example.com";

+-----------------------------------+---------+---------------+
| email                             | balance | balance_cents |
+-----------------------------------+---------+---------------+
| account_from_new_code@example.com |   -1.45 |          -145 |
+-----------------------------------+---------+---------------+

Making updates from the new code:

UPDATE customer
SET
    balance_cents = 567
WHERE
    email="account_from_new_code@example.com";
SELECT
    email,
    balance,
    balance_cents
FROM
    customer
WHERE
    email="account_from_new_code@example.com";

+-----------------------------------+---------+---------------+
| email                             | balance | balance_cents |
+-----------------------------------+---------+---------------+
| account_from_new_code@example.com |    5.67 |           567 |
+-----------------------------------+---------+---------------+

Our triggers work as expected. Now we need to fill the empty “balance_cents” field with data from the “balance” field.

Filling the Empty “balance_cents” Field

The simplest way to fill the empty “balance_cents” field is to use the UPDATE statement:

UPDATE customer
SET
    balance_cents = CEIL(balance * 100);

But that update query will put a lot of pressure on the database. And as our main goal is to avoid downtime, the update process should be performed in small batches.

It is possible to create a migration script inside the application, but as we are playing with SQL, let’s create a stored procedure.

DROP PROCEDURE IF EXISTS batch_update_balance_cents;
DELIMITER $$
CREATE PROCEDURE batch_update_balance_cents(
    start_id INT,
    end_id INT,
    batch_size INT)
BEGIN
    DECLARE batch_start INT DEFAULT start_id;
    DECLARE batch_end INT DEFAULT start_id + batch_size;
    IF end_id < start_id + batch_size THEN
        SET end_id = start_id + batch_size;
    END IF;
    WHILE batch_end <= end_id
        DO
            UPDATE customer
            SET
                balance_cents = CEIL(balance * 100)
            WHERE
                id BETWEEN batch_start AND batch_end;
            SET batch_start = batch_start + batch_size;
            SET batch_end = batch_end + batch_size;
        END WHILE;
END$$
DELIMITER ;
CALL batch_update_balance_cents(1, (SELECT
                                        MAX(id)
                                    FROM
                                        customer), 1000);

And now let’s check the result:

SELECT *
FROM
    customer;

+----+-----------------------------------+---------+---------------+
| id | email                             | balance | balance_cents |
+----+-----------------------------------+---------+---------------+
|  1 | alice@example.com                 |      42 |          4200 |
|  2 | bob@example.com                   |     -42 |         -4200 |
|  3 | account_from_old_code@example.com |   -1.23 |          -123 |
|  4 | account_from_new_code@example.com |    5.68 |           568 |
+----+-----------------------------------+---------+---------------+

All old entries were updated.

Drop Triggers and Stored Procedure

Migration is done. Everyone is happy about our latest changes. New application code is deployed to all customers.

It’s time to drop triggers and stored procedures.

DROP PROCEDURE IF EXISTS batch_update_balance_cents;
DROP TRIGGER IF EXISTS update_balances;
DROP TRIGGER IF EXISTS insert_balances;

Drop the Old Field

And now, no one uses the old field. It’s time to drop it.

ALTER TABLE customer
    DROP COLUMN balance;

Conclusion

In this article, we have shown how to migrate from one field to another without downtime.

We have used triggers and stored procedures to keep the data in sync.

I used a particular example with the balance field, but it’s possible to use the same approach for any other field or set of fields.

Source: dzone.com