Wednesday, August 26, 2009

Start Refactoring Unfamiliar Code by Starting Refactoring

I refactor my own code frequently. This is usually relatively easy because I know the code well; in fact, maintaining and reading my own code often provides the motivation to refactor it. If the code is covered by comprehensive unit and regression tests, the refactoring is even easier. Refactoring unfamiliar, complex code is much more difficult. In this blog post, I look at the most important step I have discovered in refactoring other developers' complex code.

The concept of code refactoring has probably existed as long as software development itself, but coining the term "refactoring" and formally developing the concept have added to its popularity. That popularity is evident in the popularity of books on the subject such as Refactoring: Improving the Design of Existing Code and Refactoring to Patterns. Refactoring is popular enough that it has even spun off related topics; the book Prefactoring is one example.

Martin Fowler defines refactoring in multiple sources, including an interview and the main page of Refactoring.com:

Refactoring is a disciplined technique for restructuring an existing body of code, altering its internal structure without changing its external behavior.
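
To make "without changing its external behavior" concrete, here is a minimal sketch in Python. The shipping-cost functions and the same_behavior helper are hypothetical names invented for this illustration; they do not come from Fowler's material. The idea is simply to compare the old and new versions over a set of sample inputs before trusting the restructured code.

```python
def legacy_shipping_cost(weight_kg, express):
    # Hypothetical "before" code: nested conditionals with duplicated arithmetic.
    cost = 0
    if express == True:
        if weight_kg > 10:
            cost = weight_kg * 3.0 + 5
        else:
            cost = weight_kg * 3.0
    else:
        if weight_kg > 10:
            cost = weight_kg * 2.0 + 5
        else:
            cost = weight_kg * 2.0
    return cost


def shipping_cost(weight_kg, express):
    # Hypothetical "after" code: same inputs and outputs, simpler internal structure.
    rate = 3.0 if express else 2.0
    surcharge = 5 if weight_kg > 10 else 0
    return weight_kg * rate + surcharge


def same_behavior(old, new, cases):
    """Return True if both callables agree on every argument tuple in cases."""
    return all(old(*args) == new(*args) for args in cases)


# Sample inputs on both sides of the 10 kg threshold, for both delivery modes.
sample_cases = [(w, e) for w in (0, 5, 10, 11, 25) for e in (True, False)]
behavior_preserved = same_behavior(legacy_shipping_cost, shipping_cost, sample_cases)
```

A check like this only covers the sampled inputs, so it complements rather than replaces proper unit and regression tests.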


When initially looking at large, complex code that I am not very familiar with, I often try to understand it all at once in an effort to refactor the entire thing quickly. If the code is unfamiliar enough, large enough, and/or complex enough, this rarely works well. I have found that I am much more successful refactoring large, complex, and unknown code bases when I start with small and easy refactorings. This may seem like common sense (in many ways it is), but in the rush to refactor, I still sometimes try to get ahead of myself.

When writing a book, paper, or article, an author often finds that one of the most difficult parts is just getting started. The same applies to beginning to refactor others' complex and unwieldy code. Sometimes it is best to dig in and start refactoring, even if the first refactorings are small and simple.

I have found that refactoring small, simple things has helped me refactor unfamiliar complex code for several reasons. First, even simple refactoring often makes the code easier to refactor further because a little complexity is removed at each stage. For example, as code becomes more modular, it is easier to see repeated pieces of code and modularize even further. Another example is that removing dead code often makes the remaining code easier to read. It's a very satisfying feeling to see difficult code become significantly more readable. A second reason that starting with small and simple pieces helps is that the act of refactoring gets me into the previously unfamiliar code so that it becomes increasingly familiar.
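
As a rough illustration of these first small steps, consider the following Python sketch. Everything in it (the order dictionaries, summarize_before, summarize_after, and format_line) is hypothetical and invented for this example. The "before" version contains a branch that can never run and the same formatting expression written twice; the "after" version removes the dead branch and extracts the repeated formatting into one helper.

```python
def summarize_before(orders):
    # Hypothetical unfamiliar code as first encountered.
    legacy_mode = False
    lines = []
    for o in orders:
        if legacy_mode:  # never True, so this branch is dead code
            lines.append("OLD:" + o["customer"])
        lines.append(o["customer"].strip().title() + " $" + "%.2f" % o["total"])
    grand = 0
    for o in orders:
        grand = grand + o["total"]
    # Same formatting logic as above, repeated for the total line.
    lines.append("Total".strip().title() + " $" + "%.2f" % grand)
    return lines


def format_line(label, amount):
    # Step 2: the repeated formatting, extracted into one small helper.
    return f"{label.strip().title()} ${amount:.2f}"


def summarize_after(orders):
    # Step 1 removed the dead legacy_mode branch; step 2 reuses format_line.
    lines = [format_line(o["customer"], o["total"]) for o in orders]
    lines.append(format_line("Total", sum(o["total"] for o in orders)))
    return lines


orders = [
    {"customer": " alice smith ", "total": 12.5},
    {"customer": "BOB", "total": 3},
]
unchanged = summarize_before(orders) == summarize_after(orders)
```

Neither step requires understanding the whole code base, yet once the dead branch is gone the duplicated formatting is much easier to spot.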

Finally, I think a third reason these initial, simple refactoring steps are beneficial is that they are made when preconceived notions may be less likely to affect one's thinking. Although refactoring is often performed because new or increased knowledge allows us to improve things, the opposite can also happen: one can be so familiar with what the code is supposed to do that it is less obvious that the code needs refactoring. A fresh set of eyes, which are freshest when first looking at and starting to refactor unfamiliar code, may lead to cleaner code that is more readable for anyone new to it. We can return to the analogy of writing prose to illustrate this. It is sometimes easier for another person to find issues with one's writing than for the author to find them, because the author knows what he or she means to write and implicitly fills in any gaps with that knowledge. This is the key principle behind code and writing reviews. In Extreme Perl: Chapter 14: Refactoring, the author points out that another set of eyes can be helpful in refactoring unknown complex code.


Conclusion

Although, by definition, refactoring does not change the functionality of the code, I have found refactoring to be extremely valuable in improving non-functional aspects of my code such as maintainability, readability, and sometimes even performance. I can achieve these same benefits by refactoring others' code, but that is often more difficult than refactoring my own code. However, I have found that perhaps the most important step when refactoring unfamiliar and complex code is to dive right in with simple, obvious refactoring.


Additional Reference: Once-A-Day Refactoring

Sean Chambers is currently posting daily about refactoring. If you're new to refactoring or would like to read in-depth coverage of various refactoring techniques, you'll likely find these Refactoring Day XX blog posts helpful.
