As Your Database Scales, Can Rails Migrations Keep Up?


Over the next two weeks, you’ll see a series of posts comparing Rails
Migrations to Liquibase. I’m writing this series because I want to
explore the advantages and disadvantages of using these tools to
support a large enterprise database. Although Enterprise Java
development is an
object of ridicule in the larger Rails community, many Java developers
adopted Ruby as a second language, and many have integrated Rails into
larger enterprise applications. It has been a few years since many of
us embraced Rails in an enterprise context, and the community has had
time to evolve projects such as Liquibase.

Case Study: Four Years of Migrations

Consider the case of a business that integrated Rails into a prototype
for a very large application in 2006. At the time, the application
was a prototype with tens of tables and two or three supporting
applications. While the majority of the system is in Java using
Spring as a services layer and Hibernate for persistence, the
user-facing web sites are all implemented in Rails. The application
is packaged up as a collection of WARs and deployed on JBoss. While
there is still a definite boundary between Ruby code and Java code, it
is very easy to bridge the gap and create nimble, lightweight web
applications which are able to “bring in the heavy guns” of JTA
transactions and JMS interactions when they need to.

Rails provides an elegant solution
for tracking database changes (Rails Migrations); it was the obvious choice to manage database
changes in 2006.

Four years later, the project still uses Rails migrations.
Migrations are applied to local workspaces during development
and to QA databases during the nightly QA release.
When the project is released to a staging or production network, the
release process also involves running Rails migrations against that
network. While Rails still works (and works well), there
are some issues that are starting to crop up, among them:

There’s no way to build from scratch…

While it was very easy to maintain the ideal of building the database
from scratch for two or three years, big, production databases often
don’t lend themselves to a clean set of linear changesets. DBAs
jump into the mix and start “tweaking” tables using statements not
captured in Rails migrations, and differences between data operations
in production and data operations in QA have introduced
“discontinuities” in data which can’t be captured as migrations. As a
result, new developers are required to start with a semi-recent
snapshot of production or QA, which isn’t ideal. The migrations can
probably be applied to a
snapshot from about a year ago, but try running them against an
empty database or a very old snapshot, and you won’t have a faithful
representation of either the production or QA database.

I know, I know, schema dump, right? No. Again, think about a
hypothetical database that far exceeds the assumptions of Rails (many stored procs,
triggers, views, MySQL events, tweaked table-specific storage engines,
etc.). Running schema dump on this creates a very simplistic view of a real database. Rails’ idea of a database (simple tables and columns, no FKs) is like sizing an iceberg based on what you can see above water.

Now this is clearly a social problem: people have applied changes
to the database directly in production instead of writing migrations.
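
To make that kind of drift visible, here is a minimal sketch of a schema comparison. It assumes you can already produce two simple descriptions of the schema (one from a database built purely from migrations, one from a production snapshot) as hashes of table name to column names. The helper name schema_drift and the sample tables are made up for illustration, and it only looks at tables and columns, not procs, triggers, or views.

```ruby
# Hypothetical helper: compare two schema descriptions of the form
# { "table_name" => ["column", ...] } and report where they differ.
def schema_drift(from_migrations, from_production)
  drift = {}
  (from_migrations.keys | from_production.keys).sort.each do |table|
    expected = from_migrations.fetch(table, [])
    actual   = from_production.fetch(table, [])
    missing  = expected - actual   # built by migrations, absent in production
    extra    = actual - expected   # present in production, unknown to migrations
    next if missing.empty? && extra.empty?
    drift[table] = { missing: missing, extra: extra }
  end
  drift
end

# Illustrative data only; these tables and columns are invented.
built_from_migrations = {
  "orders"    => %w[id customer_id total],
  "customers" => %w[id name]
}
production_snapshot = {
  "orders"      => %w[id customer_id total legacy_flag],
  "customers"   => %w[id name],
  "audit_trail" => %w[id changed_at]
}

schema_drift(built_from_migrations, production_snapshot).each do |table, diff|
  puts "#{table}: missing=#{diff[:missing].inspect} extra=#{diff[:extra].inspect}"
end
```

A report like this doesn’t fix the social problem, but it does give the team a concrete list of changes that never made it into a migration.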

Rails Migrations are Flexible… almost too Flexible…

You can reference ActiveRecord model objects in a Rails migration,
and, even though this is heavily discouraged, there’s no stopping a
motivated developer from doing this. Instead of figuring out some
SQL statements to perform a change to the database, it is too easy
just to fall back on ActiveRecord. Very often Rails migrations will
fail when run in the production environment vs. the development
environment because a developer decided to reference a model object.
The work-around for this is to include any classes referenced in an
(often gigantic) eval block.

Beyond that, developers often conflate DDL with data
manipulation in the same migration. Again, I don’t mean to sound
like a curmudgeon, but there’s little in the way of enforcement in a
Rails migration, and as the team size grows you start wondering if it
really makes sense for every developer to feel liberated to just whack
another column into another table using either SQL or Ruby depending on how they feel that day. Migrations are great, but they can also turn into a free-for-all of individual styles and preferences.
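
If you wanted some enforcement, one option is a small lint that runs over migration files before they are merged. The sketch below is only that, a sketch: the method name lint_migration and the pattern lists are my own assumptions, the matching is purely textual, and it will miss anything it doesn’t have a pattern for. It flags the two habits described above: referencing model classes, and mixing schema changes with data manipulation in one migration.

```ruby
# Hypothetical lint. The pattern lists are assumptions and are far from an
# exhaustive catalogue of what a migration can do.
DDL_PATTERN = /\b(create_table|drop_table|add_column|remove_column|rename_column|change_column|add_index|remove_index)\b/
DML_PATTERN = /\b(update_all|delete_all|destroy_all)\b|\b(INSERT\s+INTO|UPDATE\s+\w+\s+SET|DELETE\s+FROM)\b/i
MODEL_PATTERN = /\b([A-Z][A-Za-z0-9]*)\.(find|where|all|first|create|update_all|delete_all|destroy_all)\b/

# Returns an array of warning strings for the given migration source text.
def lint_migration(source)
  warnings = []
  # Ignore full-line comments so commented-out code does not trigger warnings.
  code = source.lines.reject { |line| line.strip.start_with?("#") }.join

  models = code.scan(MODEL_PATTERN).map(&:first).uniq
  warnings << "references model classes: #{models.join(', ')}" unless models.empty?

  if code.match?(DDL_PATTERN) && code.match?(DML_PATTERN)
    warnings << "mixes schema changes with data manipulation"
  end
  warnings
end

# Illustrative migration source; the class and table names are invented.
sample = <<~MIGRATION
  class AddStatusToOrders < ActiveRecord::Migration
    def self.up
      add_column :orders, :status, :string
      Order.update_all("status = 'open'")
    end
  end
MIGRATION

lint_migration(sample).each { |warning| puts "WARNING: #{warning}" }
```

A check like this won’t settle the SQL-versus-Ruby argument, but it at least turns a team convention into something a build can report on.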

DBAs despise tools like this…

They just do. DBA: “So how do you manage database changes?” You: “We use Rails
Migrations.” DBA: frowny face.

That answer is probably one of the last answers a DBA
wanted to hear because it screams out “our developers can just execute
DDL, no restrictions, no workflow”. As a developer, you probably
don’t want to hear this, but as soon as your database starts to be
worth more than a couple of million dollars, technology management is
going to want to hire someone to actively defend that
database against over-eager developers.

I’m a developer, but I’ve also done my fair share of operations
work. There’s nothing worse than a developer who feels liberated to
change database structure making an impossible change in a development
environment. Here’s an example: assume that you have a table with 30
million rows in production. In development, you might truncate that table
to save space. So you truncate the table, and then you realize that
you need to create a big date index on it. You write this
migration, you check it into source control… and then the next time
operations deploys to production they have to deal with a migration
that adds a massive index to a massive table. Developer: “What’s
the problem?” Sysadmin: “This migration has been running for four
hours, I don’t know how much is left, and I’m also running out of disk
space….”

What’s the solution to this? You could give the DBAs SQL (and I’m sure
there’s a way to coax SQL out of a series of Rails migrations), but
there’s still a disconnect between DBAs and developers.
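
One way to at least surface the 30-million-row scenario before operations runs into it is a pre-deploy check. This is a rough sketch under stated assumptions: the method name risky_index_changes, the threshold, and the sample row counts are all hypothetical, the row counts would have to come from production statistics, and the scan only recognizes add_index calls written with a literal table name.

```ruby
# Hypothetical pre-deploy check. The threshold is an arbitrary assumption.
LARGE_TABLE_ROWS = 1_000_000

# Returns the tables that a migration indexes and that are "large" according
# to the supplied production row counts ({ "table" => row_count }).
def risky_index_changes(source, production_row_counts, threshold = LARGE_TABLE_ROWS)
  tables = source.scan(/\badd_index\s*\(?\s*[:"'](\w+)/).flatten.uniq
  tables.select { |table| production_row_counts.fetch(table, 0) >= threshold }
end

# Illustrative input; the table names and counts are invented.
migration_source = <<~MIGRATION
  add_index :events, :created_at
  add_index :tags, :name
MIGRATION
row_counts = { "events" => 30_000_000, "tags" => 1_200 }

risky_index_changes(migration_source, row_counts).each do |table|
  puts "Review with a DBA before deploying: new index on large table #{table}"
end
```

It doesn’t close the gap between DBAs and developers, but it gives both sides a reason to talk before the release rather than four hours into it.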

I’m being crushed by Migrations…

You’ve been using Rails migrations for years and years, and your
project now has something like 500 migrations. There’s this big block
of migration files in db/migrate, and you are constantly wondering why
you are required to cart around all of this history.

Comparison: TBD

Instead of just addressing some of these issues with the existing tool, I
think it makes more sense to take a look around at alternatives.
Over the next two weeks I will be comparing Rails Migrations to
Liquibase. Stay tuned.

Note: Want to learn more about Liquibase? Contact Tim Berglund, or
check out his Practical Agile Database Development talk at the
nearest NFJS.