Book Review: Refactoring Databases

Refactoring Databases, $49.99
by Scott W. Ambler and Pramod J. Sadalage
Addison-Wesley, 2006
350 pages
Examples for Oracle, Hibernate, and Java
ISBN 0-321-29353-3
http://www.amazon.com/gp/product/0321293533

If you have read Martin Fowler's original Refactoring book, you have a pretty good idea of the format of this one. Ambler and Sadalage take the same principles of incremental redesign with testing that Fowler enunciated so well and apply them to databases instead of code, showing how a careful DBA can work with developers to evolve a database design in baby steps, so that it keeps working and gets a little better with each change. The authors recognize that there are practical problems involved here: the tool support for database refactoring is as primitive as that for code refactoring was when Fowler's book came out, and politics gets in the way when you ask DBAs to talk to developers. Even so, their work goes a reasonably long way to show what this field can look like in a mature and well-organized shop.

For each refactoring, the authors take a structured approach that crosses the DBA and development domains. They lay out the motivation for the refactoring, the potential tradeoffs, the actual mechanics of changing the schema, the mechanics of migrating data, and the necessary updates to any access programs. The three "mechanics" sections are likely to be the most valuable to experienced professionals. You may, for example, already know that you'd like to change a column's data type to conform to a corporate standard, but have you really thought through all the implications of this move? Ambler and Sadalage give you a solid checklist of things to consider, from identifying the impacts to rewriting the application code.
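To make that checklist concrete, here is a rough sketch of my own (not code from the book) of what a data type change might involve. Everything in it is an assumption for illustration: the customer table, the zip_code and postal_code columns, the trigger names, and the use of Python with SQLite, chosen only because it runs anywhere, whereas the book's examples target Oracle and Java. It adds the new column, migrates the existing data, and installs triggers so that applications still writing the old column keep the new one in sync.

```python
import sqlite3


def apply_standard_type(conn):
    """Hypothetical sketch: replace INTEGER zip_code with TEXT postal_code."""
    cur = conn.cursor()

    # 1. Schema change: add the new column next to the old one.
    cur.execute("ALTER TABLE customer ADD COLUMN postal_code TEXT")

    # 2. Data migration: copy existing values, restoring leading zeros.
    cur.execute(
        "UPDATE customer SET postal_code = printf('%05d', zip_code) "
        "WHERE zip_code IS NOT NULL"
    )

    # 3. Transition support: programs that have not been updated yet still
    #    write zip_code, so triggers keep postal_code in step with it.
    cur.execute("""
        CREATE TRIGGER sync_postal_code_insert AFTER INSERT ON customer
        WHEN NEW.postal_code IS NULL AND NEW.zip_code IS NOT NULL
        BEGIN
            UPDATE customer SET postal_code = printf('%05d', NEW.zip_code)
            WHERE id = NEW.id;
        END
    """)
    cur.execute("""
        CREATE TRIGGER sync_postal_code_update AFTER UPDATE OF zip_code ON customer
        WHEN NEW.zip_code IS NOT NULL
        BEGIN
            UPDATE customer SET postal_code = printf('%05d', NEW.zip_code)
            WHERE id = NEW.id;
        END
    """)
    conn.commit()


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, zip_code INTEGER)"
    )
    conn.execute("INSERT INTO customer (name, zip_code) VALUES ('Ada', 2134)")
    apply_standard_type(conn)
    # An application that has not been rewritten yet still uses zip_code.
    conn.execute("INSERT INTO customer (name, zip_code) VALUES ('Grace', 90210)")
    return conn.execute(
        "SELECT name, postal_code FROM customer ORDER BY id"
    ).fetchall()


if __name__ == "__main__":
    print(demo())
```

The old column is deliberately left in place here. Dropping it, and removing the triggers, would be a separate step taken only once every access program has been updated.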

The book contains several dozen refactorings, grouped into six chapters: structural refactorings (adding, dropping, and renaming database entities), data quality refactorings (mostly classic normalization moves), referential integrity refactorings, architectural refactorings (such as adding CRUD methods or migrating methods to or from application code), method refactorings (changing interfaces or changing the logic in database code), and transformations (adding new features to the schema). Other practitioners will no doubt invent more database refactorings, but it's a good starting catalog. Beginners will do well to read through the entire list, noting patterns for future use; more experienced practitioners will be able to skim the list and find things that apply to their current problems.

Some of the ideas here will be nothing new to experienced DBAs, of course, who will see them as nothing more than standard normalization. And viewed on a micro level, no single piece of advice here is likely to strike you as revolutionary. The importance of the book lies, I think, in its attempt to put some structure around the process of evolving and refining a database design as part of a suite of applications. The authors discuss issues like regression testing, test-first development, and deploying into production in the context of databases. While it can be difficult to find tool support for these ideas, depending on your software environment, they're an important part of moving your application development away from chaos and into a more controlled regime. So even if you don't end up taking a refactoring approach to database work, you may still find this an inspirational and useful book.

Mike Gunderloy is the lead developer for Larkware and author of numerous books and articles on programming topics.

Published April 10, 2006