The code base I work with is in a pretty bad state. Much better than it was a year ago, but that’s still faint praise. At times I can avoid thinking about it, especially when we’re focusing on some new piece of functionality that is mostly independent of the legacy code. Other times I have to wade through all this old muck to get something done. That really makes me want to clean up, go in with a big broom and sweep all the dirt away. Whenever we’re ahead of schedule I take a day or so to focus on refactoring.
A day is a very short time to spend cleaning a big code base, so I want to make sure I spend it well. But the code is so messy that it’s hard to decide where to begin. It’s like a house that hasn’t been cleaned for years: whichever part you start on, something else grabs your attention, because from another angle it looks even worse than the part you’re working on.
Do I start from the parts I need to touch most often?
From the most important parts?
From the dirtiest parts?
From the parts where I can get the most done in the shortest time?
From the riskiest, most complex parts?
Every time I face this decision I make a different choice, but in general I tend to pick the first of these strategies. The files I need to work with regularly get cleaner, but there are many places that I have never even looked at, and I’m sort of hoping that I never will, although I’m sure the day will come.
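To make the choice a little less arbitrary, the five questions above could be turned into a rough score per file. What follows is only a sketch under assumptions: the field names, the 1-to-5 ratings, the weights and the example files are all hypothetical, and the ratings would have to come from my own judgment rather than from any tool. The default weights lean towards how often a file gets touched, since that is the strategy I usually end up with.

```python
from dataclasses import dataclass


@dataclass
class FileStats:
    """Hand-entered ratings for one file (all fields are illustrative)."""
    path: str
    touches: int        # how many times I had to edit the file recently
    importance: int     # 1 (peripheral) to 5 (core)
    dirtiness: int      # 1 (clean) to 5 (awful)
    effort_days: float  # rough estimate of cleanup time
    risk: int           # 1 (simple) to 5 (risky and complex)


# Hypothetical weights, one per question in the list above.
DEFAULT_WEIGHTS = {
    "touches": 1.0,
    "importance": 0.5,
    "dirtiness": 0.5,
    "quick_win": 0.5,
    "risk": 0.5,
}


def score(stats: FileStats, weights: dict) -> float:
    """Weighted sum of the five criteria; a higher score means clean it sooner."""
    # Quick wins: the less effort a cleanup needs, the higher this term.
    # The floor of a quarter day keeps tiny estimates from dominating.
    quick_win = 1.0 / max(stats.effort_days, 0.25)
    return (
        weights["touches"] * stats.touches
        + weights["importance"] * stats.importance
        + weights["dirtiness"] * stats.dirtiness
        + weights["quick_win"] * quick_win
        + weights["risk"] * stats.risk
    )


def rank(files, weights=DEFAULT_WEIGHTS):
    """Return the files sorted from highest to lowest cleanup priority."""
    return sorted(files, key=lambda f: score(f, weights), reverse=True)


# Made-up example data, only to show how the ranking is used.
EXAMPLE_FILES = [
    FileStats("billing.py", touches=12, importance=5, dirtiness=3, effort_days=2.0, risk=4),
    FileStats("report.py", touches=2, importance=2, dirtiness=5, effort_days=0.5, risk=2),
    FileStats("legacy_import.py", touches=0, importance=3, dirtiness=5, effort_days=5.0, risk=5),
]

for stats in rank(EXAMPLE_FILES):
    print(f"{score(stats, DEFAULT_WEIGHTS):6.2f}  {stats.path}")
```

Changing the weights switches strategy: setting everything except `dirtiness` to zero, for example, ranks purely by how messy a file is. The numbers are no more objective than the ratings I feed in, but writing them down at least makes the trade-off visible before the refactoring day starts.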
Have you read Michael Feathers’ “Working Effectively with Legacy Code”? I believe he has some good thoughts on the subject.
Actually, it was reading that book that made me want to clean everything up! Michael Feathers has lots of good thoughts, but this was the one point where I thought his advice was less useful. His approach is to refactor to testable code whenever you need to make a change, but that’s a poor yardstick for prioritizing code for refactoring, in my opinion.
PS: Book review here, now.