Isolating and Evolving DB Migrations in Large Projects (Part 4)


How do you find yourself managing a complex database using Rails? and exactly what
sort of issues are you confronted with? Four years ago, when you started your new project you
decided to select Rails as a technology, you were impressed by the agility of the
framework. Back then, you had::

  • All of your model objects implemented as Rails model
    objects.
  • Every database change modeled as a Rails migration.

Four Years pass…

What was once a prototype and three developers is now a thirty
person team with multiple levels of management. Suddenly
you find yourself:

  • Providing APIs for several related applications in the
    company.
  • Running a mixture of mostly Java and Rails.
  • Supporting a massive, mission critical database with multiple
    client applications.
  • Having nightmares about deployment automation and change management
    (you miss the simpler days of the prototype).

From a code perspective, you are still using Rails migrations, but
there are now some complicating factors:

  • That simple-web project in the diagram, it still contains the
    Rails migrations, and it still contains a set of Rails model
    objects. When the team deploys a production build, someone unpacks
    the simple-web project on a production box and fires up “rake
    db:migrate”. It is a product of history, but it does seem odd that
    the database change scripts are associated with this single,
    somewhat minor part of the system
    .
  • The team decided to
    “bring in the heavy” and implement some interaction patterns between
    subsystems using JMS and JDBC using distributed transactions.
    There is a duplicate object model implemented in Shared API which
    contains Hibernate annotations
    .
  • There are a number of database tables that have nothing to do
    with the Rails application, but, because the database change scripts
    are stored in simple-web people are constantly having to interact
    with this project.
  • Your use of the database is considerably more complicated than
    the initial support offered by Rails, you:

    1. Quickly added in a
      GEM for Foreign Key support
    2. Crafted a custom solution to
      manage reference data
    3. Are storing a series of views, stored
      procedures, and MySQL events alongside your migrations
    4. Have
      started to pass in MySQL specific SQL options for partitions and
      storage engine (some of your tables are InnoDB, others are MyISAM).

In summary, it gets messy once you’ve got a real project. It
always does. Even though you started with a simple, agile
framework, the sort of complex problems you encounter in a real
software enterprise have a habit of taking the fun out of everything.

One Reaction: Separate DB Change Management from Code

One of the common problems out there is that your database
change scripts are hooked to a particular language, framework, or
technology.

If you do use Rails Migrations to manage your database changes,
consider creating a separate project and not conflating your real
application with your Rails migrations. To someone who uses Rails
this isn’t going to make any sense at all as it runs counter to one of
the first experiences someone has when they create a Rails
model object. If running “rails generate model Product
name:string” automatically creates a migration, how can this be
bad?

Initially, it isn’t. It is very convenient to have your migration
just pop into being alongside your model object. Fast
forward four years, add in a couple of different technologies and
hire a DBA, and you’ll understand why you want to try to create a
separate project for your db change scripts. At a certain point in
the progression of a project, the primary motivation is no longer
convenience, it is separation. Rails applications make a
simplifying assumption that they are “the application”, but you
should excise the migrations once you realize that the project is
going to start incorporating other components.

Don’t pre-optimize. If you are developing a new system, don’t
create a separate Rails project to solve a problem you don’t yet
have, but do consider writing Rails Migrations that have nothing to
do with your applicaiton logic. Don’t reference model objects or Ruby libraries from
migrations (there are many reasons not to do this beyond separation),
and make sure you have a strategy for spinning rails migrations into
an isolated project.

Another Reaction: Consider transitioning to SQL

When you use Rails Migrations to manage database changes,
what you are really doing is asking everyone who needs to make
changes to become familiar with Rails. In a larger project, what was
once viewed as an advantage can start to feel like a limitation.
Plain old SQL can be a liberating, lowest common denominator.

At year one, using Rails-specific (or Liquibase-specific)
syntax for creating new tables makes sense. At the start of a
project, the people making changes to the datbase are usually
programmers.

Once a project has been in production for a few years and gained a
certain level of importance, your department will probably hire a DBA
to defend the database. DBAs…. they don’t particular care for
DSLs that simplify SQL. They like
SQL, and they don’t care what some popular web framework thinks about
the database. (Everyone says this, and it is true) DBAs are a
strange breed, they have strong opinions, and that’s what you pay them
for.

If you’ve ever seen a good DBA work, you’ll know what I mean when I
say that these people tend to think in SQL, and what might seem like a
straightforward CREATE TABLE statement to a programmer can quickly
turn into a sea of database specific options defining cache sizes and
column options in the hand of a DBA. If you throw them some tool
like Liquibase or Rails Migrations, they might not make fun of you to
your face, right away, but give them a day or two.

Once your team is larger and more diverse in terms of skillset and
technology, you’ll want to standardize on plain old SQL as a “lingua
franca” for database changes. As the start of the project, database changes are developer “turf”, one the project has matured database changes become a shared “development/operations” responsibility, and SQL is a good middle group between the two different groups.