Database Migration: What It Is and How to Do It

Written by: amanda
9 min read

Around the same time Git got popular, the trend for writing web-based applications using object-relational mapping (ORM) libraries also became well known. The key idea was this: since developers can make changes in code that are easy to roll back using Git, why can't developers do the same thing when it comes to schema changes? After all, any reasonable new feature involves code and schema changes. So popular frameworks like Rails and Django added ORM and database migration (also known as schema migration) as part of their offerings. But database migration as a concept is not restricted to popular web frameworks. There are even standalone database migration software libraries like Flyway and Liquibase. Can you imagine being able to roll back granular changes to the schema as you write your code? Too often, I find that articles about database migration don't discuss what it means to actively do one, as a developer. I want to correct that here. With that in mind, today I'll give you the big-picture view of what database migration involves and how to do it in an active development environment. I'll break this up into three sections, covering what happens during the migration process, the common tools used, and the pitfalls that can crop up during database migration. This should give a beginner developer the tools to quickly learn the key ideas behind data migration and also learn potential pitfalls to avoid when using database migration as part of his or her development toolbox.

What Happens During Database Migration?

Now that you know how database migrations came about, let me walk you through what they actually entail.

Granular Changes Are Generated as Individual Scripted Files

I mentioned how database migrations basically track granular changes to your database schema (and sometimes to your data as well). These granular changes are typically reflected as separate scripted files. That way, your granular schema changes are reflected as code that can be captured with any version control software. This is an example of a migration file in Rails.

class CreateProducts < ActiveRecord::Migration[5.0]
  def change
    create_table :products do |t|
      t.string :name
      t.text :description

      t.timestamps
    end
  end
end

Not only can you have migrations as standalone files, you can also have the current database schema as its own file. Typically, this is known as a schema file.

Database Migration Scripts Are Tool-Dependent

Recall the database migration script in Ruby from our earlier example. Here's another database migration file from the independent source control for the database software Liquibase.

<?xml version="1.0" encoding="UTF-8"?>

<databaseChangeLog
  xmlns="http://www.liquibase.org/xml/ns/dbchangelog"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://www.liquibase.org/xml/ns/dbchangelog
         http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-3.1.xsd">

    <changeSet id="1" author="bob">
        <createTable tableName="department">
            <column name="id" type="int">
                <constraints primaryKey="true" nullable="false"/>
            </column>
            <column name="name" type="varchar(50)">
                <constraints nullable="false"/>
            </column>
            <column name="active" type="boolean" defaultValueBoolean="true"/>
        </createTable>
    </changeSet>

</databaseChangeLog>

This time, the file is in XML format. In actual fact, Liquibase can produce migration changes (or changesets, as they like to call them) in multiple formats such as XML and JSON. So unless you want to handwrite your database migration in SQL format, there are no real standards across how the migration files are created. Of course, you can write your own custom database migration scripts in the SQL files. But why handwrite and accidentally introduce bugs when you can make use of tools to autogenerate the database migrations?

How Do You Perform a Database Migration?

Now you know what a database migration is. You also need to know (at least conceptually) how to perform a database migration. Unfortunately, there are no standards at all for database migration files, which means we can't get into too much detail on how to create them. In other words, how you perform a database migration depends heavily on the specific tool you use for the task. In this section, I'll cover the two most common ways to perform a database migration.

Option 1: Use a Framework/Language-Dependent Library

If you use a popular language (Ruby, PHP, Python, etc.) or framework (Rails, Django, etc.), then there are well-documented libraries for data migration within the universe of the framework/language chosen. The popular web frameworks will naturally come bundled with the database migration features. Depending on the setup, you can even swap out the native database migration for other libraries within the language chosen. Typically in this scenario, you generate the migration files using the command line. Occasionally, you may need to handwrite the custom code for some changes, such as data migration or even how to reverse the change itself. Other than that, the library chosen within the framework/language pretty much takes care of this for you.

Option 2: Use Independent Database-Migration-Focused Software

In some cases, you'll want to use software like Flyway or Liquibase that just acts as a source control for your database. By doing this, you avoid being locked into a particular framework or language. This does not mean you have zero lock-in. Take, for example, Flyway, which works well with various databases such as Oracle, MySQL, and MariaDB. If you don't use Flyway's Java-based migration (which locks you into Java and Flyway), you may end up using SQL-based migration (which locks you to your choice of database and Flyway). The good news is Flyway and Liquibase are both relatively easy to use. They, too, provide a command-line way to generate the migrations and allow custom coding to capture the database schema migrations.

Choosing the Best Option for You

Here's my opinion on choosing between either option. I tend to see most developers choose option 1. Usually, that's because developers already have a preferred language/framework, so cognitively speaking, it doesn't feel like they're learning (yet) another piece of software. And the lock-in (in the case of option 1, this would be to the framework and language) is entirely voluntary and desired. You can make the case for option 2 if your team hasn't fully committed to a particular language or framework. Whichever you go for, avoid changing your choice unnecessarily midway through development. Database migration is not an area where changing the tools yields you tremendous benefits. Most of the benefits come from simply having database migration set up in the first place.

The Dangers of Database Migration and How to Avoid Them

In the previous section, I covered the two major types of tools for database migration: framework/language-dependent and standalone software. Both types are easy to use. Now, I'll recommend three best practices when it comes to actually performing a database migration. These tips address the main danger of database migration: you want to avoid making irreversible changes.

Choose One Database Migration Tool and Stick to It

I mentioned this briefly earlier, but it's worth emphasizing: I want to warn against making unnecessary changes to the toolset simply because it's cool. Recall that database migration scripts are dependent on the tool you use to generate them. There's no proper standards for these scripts. If you really want to, you can even write your own in SQL statements. That introduces its own set of issues; for instance, your SQL scripts are likely able to be database dependent. MySQL queries may not work in PostgreSQL, and vice versa. So either you are locked into your database migration tool, or you're locked into your choice of database—or in some cases, even both. There's no obvious benefits to switching your toolset often, so my suggestion is to choose one database migration tool and stick to it. There are better things to do with your time than spin your wheels needlessly.

Delete Rows or Columns Only When You're Absolutely Certain You Need To

When it comes to migrations, for the most part, you can reverse them when you need to. Typical database migration tools can handle simple reversible changes. And even when they can't, you can easily add your own custom code to help with reversing. For instance, in Rails, you simply add the code under reversible.

class ChangeProductsPrice < ActiveRecord::Migration[5.0]
  def change
    reversible do |dir|
      change_table :products do |t|
        dir.up   { t.change :price, :string }
        dir.down { t.change :price, :integer }
      end
    end
  end
end

The hardest changes to reverse tends to involve deleting stuff: specifically, deleting columns or rows of data. If your database contains a lot of data, you may end up having to write a significant amount of custom code to try and reverse your changes. So if you're considering deleting data from your database, be extra careful. Those types of changes are hard to undo. Other hard-to-reverse changes include renaming columns and changing the datatype of a column that already contains data. One rule of thumb I like to go with is this: you should almost never delete columns until the next major (you do know semver, right?) release. This rule applies even if I have columns I want to stop using. Instead of deleting columns, I'll announce in the minor versions which columns will be deprecated going forward. When it comes time for a major release, I will then drop those columns entirely. That way, I reduce the chances of performing an accidental, unwanted change that will be an awful process to reverse by hand.

Implement Feature Flags

Another excellent strategy for database migration is implementing feature flags, especially when you have a large team and multiple people are trying to implement different features for the same codebase. According to Martin Fowler, feature flags are "a powerful technique, allowing teams to modify system behavior without changing code." In fact, the bigger your team and the more complex your codebase is, the more feature flags shine. Feature flags help mitigate risk when it comes to database migrations. Trying to untangle the changes when you need to roll back certain features can be a real nightmare. Feature flags work just as well whether you want to make small or big database schema changes. When using feature flags, I recommend you aim for more frequent and granular changes overall. And you can really see how feature flags help your development process when you perform a massive schema change. Please still aim to break this massive schema change into smaller, thin slices. You don't need to impress anyone with how much you can squeeze in one change. Better to add safeguards to your development practices by implementing all three recommendations I've talked about in this section.

Conclusion

Developers need all the tools they can get to make their lives easier. Database migration is one of those must-have tools for the developer's toolbox. Going forward, I foresee that employing database migrations will evolve from a development best practice to a development standard practice. People will look at you funny for not using database migration at all. Still, it's important to keep in mind how database migrations can backfire on you, particularly for hard-to-reverse schema changes. Be absolutely certain of those changes before you make them go live. If you're not already using database migration in your process, what are you waiting for? It's easy to start, and you can even add it in midway through your development cycle. You'll be thankful you did. Because when the day comes that you need to roll back your changes and the changes involve the database, you'll wish you had something to help you do that in mere seconds. Database migration is that something you'll wish you had.

Stay up to date

We'll never share your email address and you can opt out at any time, we promise.