Refactoring Guide


Much of refactoring is devoted to correctly composing methods. In most cases, excessively long methods are the root of all evil. The vagaries of code inside these methods conceal the execution logic and make the method extremely hard to understand – and even harder to change. The refactoring techniques in this group streamline methods, remove code duplication, and pave the way for future improvements.

Bloaters


Bloaters are code, methods and classes that have increased to such gargantuan proportions that they are hard to work with. Usually these smells do not crop up right away; rather, they accumulate over time as the program evolves (and especially when nobody makes an effort to eradicate them).

Long Method: A method contains too many lines of code. Generally, any method longer than ten lines should make you start asking questions.

Large Class: A class contains many fields/methods/lines of code.

Primitive Obsession:

  • Use of primitives instead of small objects for simple tasks (such as currency, ranges, special strings for phone numbers, etc.)

  • Use of constants for coding information (such as a constant USER_ADMIN_ROLE = 1 for referring to users with administrator rights).

  • Use of string constants as field names for use in data arrays.

Long Parameter List: More than three or four parameters for a method.

Data Clumps: Sometimes different parts of the code contain identical groups of variables (such as parameters for connecting to a database). These clumps should be turned into their own classes.

Long Method

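Signs and Symptoms

A method contains too many lines of code. Generally, any method longer than ten lines should make you start asking questions.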

Reasons for the Problem

Like the Hotel California, something is always being added to a method but nothing is ever taken out. Since it is easier to write code than to read it, this "smell" remains unnoticed until the method turns into an ugly, oversized beast. Mentally, it is often harder to create a new method than to add to an existing one: "But it's just two lines, there's no use in creating a whole method just for that..." Which means that another line is added and then yet another, giving birth to a tangle of spaghetti code.
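The "it's just two lines" temptation is usually resisted by giving each step of a method its own small, named method (the Extract Method technique referred to elsewhere in this guide). A minimal sketch, assuming a hypothetical Receipt class; all names here are illustrative and not taken from the original text:

```java
// Hypothetical example: a receipt printer whose steps were extracted
// into small, named methods instead of growing inline.
class Receipt {
    static String print(String customer, double[] prices) {
        // The top-level method only names the steps; details live in helpers.
        return header(customer) + "\n" + "Total: " + total(prices);
    }

    // Extracted: builds the header line.
    private static String header(String customer) {
        return "Customer: " + customer;
    }

    // Extracted: sums the item prices.
    private static double total(double[] prices) {
        double sum = 0.0;
        for (double price : prices) {
            sum += price;
        }
        return sum;
    }
}

class ReceiptDemo {
    public static void main(String[] args) {
        System.out.println(Receipt.print("Ann", new double[] {10.0, 20.0}));
    }
}
```

New behavior then goes into a new helper rather than into the body of print, so the top-level method stays short.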

Treatment

Payoff

  • Among all types of object-oriented code, classes with short methods live longest. The longer a method or function is, the harder it becomes to understand and maintain.

Performance

Does an increase in the number of methods hurt performance, as many people claim? In almost all cases the impact is so negligible that it's not even worth worrying about. Plus, now that you have clear and understandable code, you are more likely to find truly effective methods for restructuring code and getting real performance gains if the need ever arises.

Large Class

Signs and Symptoms

A class contains many fields/methods/lines of code.

Reasons for the Problem

Classes usually start small. But over time, they get bloated as the program grows. As with long methods, programmers usually find it mentally less taxing to place a new feature in an existing class than to create a new class for the feature.

Treatment

  • Extract Class helps if part of the behavior of the large class can be spun off into a separate component.

  • Extract Subclass helps if part of the behavior of the large class can be implemented in different ways or is used in rare cases.

  • Extract Interface helps if it is necessary to have a list of the operations and behaviors that the client can use.

  • If a large class is responsible for the graphical interface, you may try to move some of its data and behavior to a separate domain object. In doing so, it may be necessary to store copies of some data in two places and keep the data consistent. Duplicate Observed Data offers a way to do this.
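As a rough illustration of the first treatment above (Extract Class), the sketch below assumes a hypothetical Person class whose phone-related fields and formatting logic have been spun off into a separate TelephoneNumber component. The class and method names are assumptions made for this example only.

```java
// Hypothetical example: phone-related data and behavior that used to live
// directly in Person now belong to their own class.
final class TelephoneNumber {
    private final String areaCode;
    private final String number;

    TelephoneNumber(String areaCode, String number) {
        this.areaCode = areaCode;
        this.number = number;
    }

    String formatted() {
        return "(" + areaCode + ") " + number;
    }
}

class Person {
    private final String name;
    private final TelephoneNumber officePhone; // responsibility delegated

    Person(String name, TelephoneNumber officePhone) {
        this.name = name;
        this.officePhone = officePhone;
    }

    String getName() {
        return name;
    }

    String getOfficePhone() {
        return officePhone.formatted();
    }
}

class ExtractClassDemo {
    public static void main(String[] args) {
        Person p = new Person("Ann", new TelephoneNumber("555", "0100"));
        System.out.println(p.getName() + ": " + p.getOfficePhone());
    }
}
```

Person keeps a single field for the phone instead of several, which is the kind of reduction in attributes the Payoff section describes.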

Payoff

  • Refactoring of these classes spares developers from needing to remember a large number of attributes for a class.

Primitive Obsession

Signs and Symptoms

  • Use of primitives instead of small objects for simple tasks (such as currency, ranges, special strings for phone numbers, etc.)

  • Use of constants for coding information (such as a constant USER_ADMIN_ROLE = 1 for referring to users with administrator rights).

  • Use of string constants as field names for use in data arrays.

Reasons for the Problem

Like most other smells, primitive obsessions are born in moments of weakness. "Just a field for storing some data!" the programmer said. Creating a primitive field is so much easier than making a whole new class, right? And so it was done. Then another field was needed and added in the same way. Lo and behold, the class became huge and unwieldy.

Primitives are often used to "simulate" types. So instead of a separate data type, you have a set of numbers or strings that form the list of allowable values for some entity. Easy-to-understand names are then given to these specific numbers and strings via constants, which is why they are spread wide and far.

Another example of poor primitive use is field simulation. The class contains a large array of diverse data and string constants (which are specified in the class) are used as array indices for getting this data.

Treatment

Payoff

  • Code becomes more flexible thanks to use of objects instead of primitives.

  • Better understandability and organization of code. Operations on particular data are in the same place, instead of being scattered. No more guessing about the reason for all these strange constants and why they are in an array.
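To make the "strange constants" point concrete, here is a hedged sketch of replacing an integer role code such as USER_ADMIN_ROLE = 1 with a dedicated type. The UserRole enum, its EDITOR and VIEWER values, and the User class are hypothetical names introduced only for illustration.

```java
// Before (sketch): roles "simulated" with integer constants.
//   static final int USER_ADMIN_ROLE = 1;
//   boolean canDeleteUsers(int role) { return role == USER_ADMIN_ROLE; }

// After: a dedicated type, so arbitrary integers can no longer be passed.
enum UserRole {
    ADMIN, EDITOR, VIEWER; // EDITOR and VIEWER are assumed for the example

    boolean canDeleteUsers() {
        return this == ADMIN;
    }
}

class User {
    private final String name;
    private final UserRole role;

    User(String name, UserRole role) {
        this.name = name;
        this.role = role;
    }

    String getName() {
        return name;
    }

    // The rule lives next to the data it depends on.
    boolean canDeleteUsers() {
        return role.canDeleteUsers();
    }
}

class PrimitiveObsessionDemo {
    public static void main(String[] args) {
        User root = new User("root", UserRole.ADMIN);
        System.out.println(root.getName() + " can delete users: " + root.canDeleteUsers());
    }
}
```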

Long Parameter List

Signs and Symptoms

More than three or four parameters for a method.

Reasons for the Problem

A long list of parameters might happen after several types of algorithms are merged in a single method. A long list may have been created to control which algorithm will be run and how.

Long parameter lists may also be the byproduct of efforts to make classes more independent of each other. For example, the code for creating specific objects needed in a method was moved from the method to the code for calling the method, but the created objects are passed to the method as parameters. Thus the original class no longer knows about the relationships between objects, and dependency has decreased. But if several of these objects are created, each of them will require its own parameter, which means a longer parameter list.

It is hard to understand such lists, which become contradictory and hard to use as they grow longer. Instead of a long list of parameters, a method can use the data of its own object. If the current object does not contain all necessary data, another object (which will get the necessary data) can be passed as a method parameter.

Treatment

  • Instead of passing a group of data received from another object as parameters, pass the object itself to the method, by using Preserve Whole Object.

  • If there are several unrelated data elements, sometimes you can merge them into a single parameter object via Introduce Parameter Object.
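The second treatment above (Introduce Parameter Object) might look roughly like the following. This is a sketch under assumptions: DateRange, Invoice and InvoiceReport are hypothetical classes, and the "before" signature in the comment is invented for contrast.

```java
import java.time.LocalDate;
import java.util.Arrays;
import java.util.List;

// Hypothetical parameter object: replaces separate start/end arguments.
final class DateRange {
    private final LocalDate start;
    private final LocalDate end;

    DateRange(LocalDate start, LocalDate end) {
        if (end.isBefore(start)) {
            throw new IllegalArgumentException("end must not precede start");
        }
        this.start = start;
        this.end = end;
    }

    // Inclusive on both ends.
    boolean includes(LocalDate date) {
        return !date.isBefore(start) && !date.isAfter(end);
    }
}

final class Invoice {
    private final LocalDate issuedOn;
    private final double amount;

    Invoice(LocalDate issuedOn, double amount) {
        this.issuedOn = issuedOn;
        this.amount = amount;
    }

    LocalDate getIssuedOn() { return issuedOn; }

    double getAmount() { return amount; }
}

class InvoiceReport {
    // Before (sketch): totalBetween(List<Invoice> invoices, LocalDate start, LocalDate end)
    static double totalWithin(List<Invoice> invoices, DateRange range) {
        double total = 0.0;
        for (Invoice invoice : invoices) {
            if (range.includes(invoice.getIssuedOn())) {
                total += invoice.getAmount();
            }
        }
        return total;
    }
}

class ParameterObjectDemo {
    public static void main(String[] args) {
        List<Invoice> invoices = Arrays.asList(
                new Invoice(LocalDate.of(2024, 1, 10), 100.0),
                new Invoice(LocalDate.of(2024, 6, 1), 70.0));
        DateRange firstQuarter = new DateRange(LocalDate.of(2024, 1, 1), LocalDate.of(2024, 3, 31));
        System.out.println(InvoiceReport.totalWithin(invoices, firstQuarter));
    }
}
```

The range check that callers would otherwise repeat now has one home, which is how this refactoring can reveal previously unnoticed duplication.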

Payoff

  • More readable, shorter code.

  • Refactoring may reveal previously unnoticed duplicate code.

When to Ignore

  • Do not get rid of parameters if doing so would cause unwanted dependency between classes.

Data Clumps

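Signs and Symptoms

Sometimes different parts of the code contain identical groups of variables (such as parameters for connecting to a database). These clumps should be turned into their own classes.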

Reasons for the Problem

Often these data groups are due to poor program structure or "copypasta programming". To check whether some data is a data clump, delete one of the data values and see whether the other values still make sense. If they do not, this is a good sign that this group of variables should be combined into an object.

Treatment

  • If repeating data comprises the fields of a class, use Extract Class to move the fields to their own class.

  • If the same data clumps are passed in the parameters of methods, use Introduce Parameter Object to set them off as a class.

  • If some of the data is passed to other methods, think about passing the entire data object to the method instead of just individual fields. Preserve Whole Object will help with this.

  • Look at the code used by these fields. It may be a good idea to move this code to a data class.
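The treatments above can be sketched with the database-connection example mentioned earlier. Everything in this block is an assumption for illustration: ConnectionSettings, ReportExporter and AuditLogger are hypothetical classes, and the address format is invented.

```java
// Hypothetical data class: the host/port/database trio that used to be
// repeated as three fields (or three parameters) in several places.
final class ConnectionSettings {
    private final String host;
    private final int port;
    private final String database;

    ConnectionSettings(String host, int port, String database) {
        this.host = host;
        this.port = port;
        this.database = database;
    }

    // Code that used these fields moved here together with them.
    String address() {
        return host + ":" + port + "/" + database;
    }
}

class ReportExporter {
    private final ConnectionSettings settings; // was: host, port, database

    ReportExporter(ConnectionSettings settings) {
        this.settings = settings;
    }

    String target() {
        return "exporting to " + settings.address();
    }
}

class AuditLogger {
    // The whole object is passed instead of three separate values.
    static String describe(ConnectionSettings settings) {
        return "audit log at " + settings.address();
    }
}

class DataClumpDemo {
    public static void main(String[] args) {
        ConnectionSettings settings = new ConnectionSettings("db.example.org", 5432, "sales");
        System.out.println(new ReportExporter(settings).target());
        System.out.println(AuditLogger.describe(settings));
    }
}
```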

Payoff

  • Improves understanding and organization of code. Operations on particular data are now gathered in a single place, instead of haphazardly throughout the code.

When to Ignore

  • Passing an entire object in the parameters of a method, instead of passing just its values (primitive types), may create an undesirable dependency between the two classes.

Object-Orientation Abusers



All these smells stem from incomplete or incorrect application of object-oriented programming principles.

Switch Statements: You have a complex switch operator or sequence of if statements.

Temporary Field: Temporary fields get their values (and thus are needed by objects) only under certain circumstances. Outside of these circumstances, they are empty.

Refused Bequest: If a subclass uses only some of the methods and properties inherited from its parents, the hierarchy is off-kilter. The unneeded methods may simply go unused or be redefined and throw exceptions.

Alternative Classes with Different Interfaces: Two classes perform identical functions but have different method names.

Switch Statements

Signs and Symptoms

You have a complex switch operator or sequence of if statements.

Reasons for the Problem

Relatively rare use of switch and case operators is one of the hallmarks of object-oriented code. Often code for a single switch can be scattered in different places in the program. When a new condition is added, you have to find all the switch code and modify it. As a rule of thumb, when you see switch you should think of polymorphism.
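The rule of thumb "when you see switch you should think of polymorphism" can be illustrated with a small sketch. The Shape hierarchy below is a hypothetical example, and the switch in the comment is an invented "before" state.

```java
import java.util.Arrays;
import java.util.List;

// Before (sketch): every caller switches on a type code.
//   switch (shapeType) {
//       case "circle":    area = Math.PI * r * r; break;
//       case "rectangle": area = w * h;           break;
//   }

// After: each variant carries its own behavior.
interface Shape {
    double area();
}

final class Circle implements Shape {
    private final double radius;

    Circle(double radius) {
        this.radius = radius;
    }

    @Override
    public double area() {
        return Math.PI * radius * radius;
    }
}

final class Rectangle implements Shape {
    private final double width;
    private final double height;

    Rectangle(double width, double height) {
        this.width = width;
        this.height = height;
    }

    @Override
    public double area() {
        return width * height;
    }
}

class AreaCalculator {
    // No switch: adding a new Shape requires no change here.
    static double totalArea(List<Shape> shapes) {
        double total = 0.0;
        for (Shape shape : shapes) {
            total += shape.area();
        }
        return total;
    }
}

class PolymorphismDemo {
    public static void main(String[] args) {
        List<Shape> shapes = Arrays.asList(new Rectangle(2, 3), new Circle(1));
        System.out.println(AreaCalculator.totalArea(shapes));
    }
}
```

A new condition becomes a new class implementing Shape, instead of a hunt for every switch scattered through the program.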

Treatment

Payoff

When to Ignore

  • When a switch operator performs simple actions, there is no reason to make code changes.

  • Often switch operators are used by factory design patterns (Factory Method and Abstract Factory) to select a created class.

Temporary Field

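Signs and Symptoms

Temporary fields get their values (and thus are needed by objects) only under certain circumstances. Outside of these circumstances, they are empty.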

Reasons for the Problem

Oftentimes, temporary fields are created for use in an algorithm that requires a large amount of inputs. So instead of creating a large number of parameters in the method, the programmer decides to create fields for this data in the class. These fields are used only in the algorithm and go unused the rest of the time. This kind of code is tough to understand. You expect to see data in object fields but for some reason they are almost always empty.

Treatment

  • Introduce Null Object and integrate it in place of the conditional code which was used to check the temporary field values for existence.
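A minimal sketch of Introduce Null Object, assuming a hypothetical Order whose discount field used to be empty most of the time and had to be null-checked before every use. The Discount interface and both implementations are illustrative names, not part of the original text.

```java
// Hypothetical example: the "usually empty" field is replaced by an object
// that is always present and does nothing in the default case.
interface Discount {
    double apply(double price);
}

final class PercentageDiscount implements Discount {
    private final int percent;

    PercentageDiscount(int percent) {
        this.percent = percent;
    }

    @Override
    public double apply(double price) {
        return price * (1 - percent / 100.0);
    }
}

// The null object: same interface, neutral behavior.
final class NoDiscount implements Discount {
    @Override
    public double apply(double price) {
        return price;
    }
}

class Order {
    private final double price;
    private final Discount discount; // never null after construction

    Order(double price, Discount discount) {
        this.price = price;
        this.discount = (discount == null) ? new NoDiscount() : discount;
    }

    // Before (sketch): if (discount != null) { ... } else { ... }
    double total() {
        return discount.apply(price);
    }
}

class NullObjectDemo {
    public static void main(String[] args) {
        System.out.println(new Order(100.0, null).total());
        System.out.println(new Order(100.0, new PercentageDiscount(25)).total());
    }
}
```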

Payoff

Refused Bequest

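Signs and Symptoms

If a subclass uses only some of the methods and properties inherited from its parents, the hierarchy is off-kilter. The unneeded methods may simply go unused or be redefined and throw exceptions.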

Reasons for the Problem

Someone was motivated to create inheritance between classes only by the desire to reuse the code in a superclass. But the superclass and subclass are completely different.

Treatment

  • If inheritance is appropriate, get rid of unneeded fields and methods in the subclass. Extract all fields and methods needed by the subclass from the parent class, put them in a new superclass, and set both classes to inherit from it (Extract Superclass).
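One way to picture Extract Superclass here, as a sketch under assumptions: suppose a hypothetical Bicycle once extended Car only to reuse its wheel count, and had to refuse refuel(). Moving the shared part into a new Vehicle superclass lets both classes inherit only what they need. All class names are invented for this example.

```java
// New superclass: holds only what both classes genuinely share.
abstract class Vehicle {
    private final int wheels;

    Vehicle(int wheels) {
        this.wheels = wheels;
    }

    int getWheels() {
        return wheels;
    }
}

// Car keeps the fuel-related members that Bicycle had been refusing.
final class Car extends Vehicle {
    private int fuelLitres;

    Car() {
        super(4);
    }

    void refuel(int litres) {
        fuelLitres += litres;
    }

    int getFuelLitres() {
        return fuelLitres;
    }
}

// Before (sketch): class Bicycle extends Car, with refuel() overridden to throw.
final class Bicycle extends Vehicle {
    Bicycle() {
        super(2);
    }
}

class ExtractSuperclassDemo {
    public static void main(String[] args) {
        Car car = new Car();
        car.refuel(30);
        System.out.println(car.getWheels() + " wheels, " + car.getFuelLitres() + " litres");
        System.out.println(new Bicycle().getWheels() + " wheels");
    }
}
```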

Payoff


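Alternative Classes with Different Interfaces

Signs and Symptoms

Two classes perform identical functions but have different method names.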

Reasons for the Problem

The programmer who created one of the classes probably didn't know that a functionally equivalent class already existed.

Treatment

Try to put the interface of classes in terms of a common denominator:

  • Rename Methods to make them identical in all alternative classes.

  • Move Method, Add Parameter and Parameterize Method to make the signature and implementation of methods the same.

  • If only part of the functionality of the classes is duplicated, try using Extract Superclass. In this case, the existing classes will become subclasses.

  • After you have determined which treatment method to use and implemented it, you may be able to delete one of the classes.
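The first treatment (Rename Method) can be sketched as follows. The notifier classes and their earlier method names are hypothetical, and the shared Notifier interface is an addition of this example, used so that the compiler checks the now-identical signatures.

```java
// Common denominator for both alternative classes (an assumption of this sketch).
interface Notifier {
    String send(String message);
}

final class EmailNotifier implements Notifier {
    // Previously named sendEmail(String); renamed to match the shared interface.
    @Override
    public String send(String message) {
        return "email: " + message;
    }
}

final class SmsNotifier implements Notifier {
    // Previously named dispatch(String); renamed the same way.
    @Override
    public String send(String message) {
        return "sms: " + message;
    }
}

class AlertService {
    private final Notifier notifier;

    AlertService(Notifier notifier) {
        this.notifier = notifier;
    }

    // Client code no longer cares which alternative class it talks to.
    String alert(String message) {
        return notifier.send(message);
    }
}

class AlternativeClassesDemo {
    public static void main(String[] args) {
        System.out.println(new AlertService(new EmailNotifier()).alert("disk full"));
        System.out.println(new AlertService(new SmsNotifier()).alert("disk full"));
    }
}
```

Once the signatures match, it becomes easier to judge whether the two classes really duplicate each other and whether one of them can be deleted.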

Payoff

  • You get rid of unnecessary duplicated code, making the resulting code less bulky.

When to Ignore

  • Sometimes merging classes is impossible or so difficult as to be pointless. One example is when the alternative classes are in different libraries that each have their own version of the class.

Change Preventers



These smells mean that if you need to change something in one place in your code, you have to make many changes in other places too. Program development becomes much more complicated and expensive as a result.

Divergent Change: You find yourself having to change many unrelated methods when you make changes to a class. For example, when adding a new product type you have to change the methods for finding, displaying, and ordering products.

Shotgun Surgery: Making any modifications requires that you make many small changes to many different classes.

Parallel Inheritance Hierarchies: Whenever you create a subclass for a class, you find yourself needing to create a subclass for another class.

Divergent Change


Divergent Change resembles Shotgun Surgery but is actually the opposite smell. Divergent Change is when many changes are made to a single class. Shotgun Surgery refers to when a single change is made to multiple classes simultaneously.

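Signs and Symptoms

You find yourself having to change many unrelated methods when you make changes to a class. For example, when adding a new product type you have to change the methods for finding, displaying, and ordering products.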

Reasons for the Problem

Often these divergent modifications are due to poor program structure or "copypasta programming".

Treatment

Payoff

  • Improves code organization.

  • Reduces code duplication.

  • Simplifies support.

Shotgun Surgery

Shotgun Surgery resembles Divergent Change but is actually the opposite smell. Divergent Change is when many changes are made to a single class. Shotgun Surgery refers to when a single change is made to multiple classes simultaneously.

Signs and Symptoms

Making any modifications requires that you make many small changes to many different classes.

Reasons for the Problem

A single responsibility has been split up among a large number of classes. This can happen after overzealous efforts to eliminate Divergent Change.

Treatment

  • If moving code to the same class leaves the original classes almost empty, try to get rid of these now-redundant classes via Inline Class.

Payoff

  • Better organization.

  • Less code duplication.

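Parallel Inheritance Hierarchies

Signs and Symptoms

Whenever you create a subclass for a class, you find yourself needing to create a subclass for another class.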

Reasons for the Problem

All was well as long as the hierarchy stayed small. But with new classes being added, making changes has become harder and harder.

Treatment

  • You may de-duplicate parallel class hierarchies in two steps. First, make instances of one hierarchy refer to instances of another hierarchy. Then, remove the hierarchy in the referred class, by using Move Method and Move Field.

Payoff

  • Reduces code duplication.

  • Can improve organization of code.

When to Ignore

  • Sometimes having parallel class hierarchies is just a way to avoid an even bigger mess in the program architecture. If you find that your attempts to de-duplicate hierarchies produce even uglier code, just step out, revert all of your changes and get used to that code.

Dispensables

A dispensable is something pointless and unneeded whose absence would make the code cleaner, more efficient and easier to understand.

Comments: A method is filled with explanatory comments.

Duplicate Code: Two code fragments look almost identical.

Lazy Class: Understanding and maintaining classes always costs time and money. So if a class doesn't do enough to earn your attention, it should be deleted.

Data Class: A data class refers to a class that contains only fields and crude methods for accessing them (getters and setters). These are simply containers for data used by other classes. These classes do not contain any additional functionality and cannot independently operate on the data that they own.

Dead Code: A variable, parameter, field, method or class is no longer used (usually because it is obsolete).

Speculative Generality: There is an unused class, method, field or parameter.

Comments

Signs and Symptoms

A method is filled with explanatory comments.

Reasons for the Problem

Comments are usually created with the best of intentions, when the author realizes that his or her code is not intuitive or obvious. In such cases, comments are like a deodorant masking the smell of fishy code that could be improved.

The best comment is a good name for a method or class. If you feel that a code fragment cannot be understood without comments, try to change the code structure in a way that makes comments unnecessary.

Treatment

  • If a comment is intended to explain a complex expression, the expression should be split into understandable subexpressions using Extract Variable.

  • If a comment explains a section of code, this section can be turned into a separate method via Extract Method. The name of the new method can be taken from the comment text itself, most likely.

  • If a method has already been extracted, but comments are still necessary to explain what the method does, give the method a self-explanatory name. Use Rename Method for this.

  • If you need to assert rules about a state that is necessary for the system to work, use Introduce Assertion.
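The first two treatments above (Extract Variable and Extract Method) can be sketched together. The ShippingPolicy class, its threshold and its rate are hypothetical values chosen for illustration, and the commented "before" code is invented.

```java
class ShippingPolicy {
    static final double FREE_SHIPPING_THRESHOLD = 100.0; // hypothetical value
    static final double STANDARD_RATE = 4.99;            // hypothetical value

    // Before (sketch):
    //   // premium customers with large orders ship for free
    //   if (premium && total > 100.0) { cost = 0.0; } else { cost = 4.99; }
    static double shippingCost(boolean premiumCustomer, double orderTotal) {
        return qualifiesForFreeShipping(premiumCustomer, orderTotal) ? 0.0 : STANDARD_RATE;
    }

    // Extract Method: the method name now says what the comment used to say.
    static boolean qualifiesForFreeShipping(boolean premiumCustomer, double orderTotal) {
        // Extract Variable: the sub-expression gets a self-explanatory name.
        boolean isLargeOrder = orderTotal > FREE_SHIPPING_THRESHOLD;
        return premiumCustomer && isLargeOrder;
    }
}

class CommentsDemo {
    public static void main(String[] args) {
        System.out.println(ShippingPolicy.shippingCost(true, 150.0));
        System.out.println(ShippingPolicy.shippingCost(false, 150.0));
    }
}
```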

Payoff

  • Code becomes more intuitive and obvious.

When to Ignore

Comments can sometimes be useful:

  • When explaining why something is being implemented in a particular way.

  • When explaining complex algorithms (when all other methods for simplifying the algorithm have been tried and come up short).

Duplicate Code


Signs and Symptoms

Two code fragments look almost identical.

Reasons for the Problem

Duplication usually occurs when multiple programmers are working on different parts of the same program at the same time. Since they are working on different tasks, they may be unaware their colleague has already written similar code that could be repurposed for their own needs.

There is also more subtle duplication, when specific parts of code look different but actually perform the same job. This kind of duplication can be hard to find and fix.

Sometimes duplication is purposeful. When rushing to meet deadlines and the existing code is "almost right" for the job, novice programmers may not be able to resist the temptation of copying and pasting the relevant code. And in some cases, the programmer is simply too lazy to de-clutter.

Treatment

  • If the same code is found in two subclasses of the same level:

  • If duplicate code is found in two different classes:

    • If the classes are not part of a hierarchy, use Extract Superclass in order to create a single superclass for these classes that maintains all the previous functionality.

    • If it is difficult or impossible to create a superclass, use Extract Class in one class and use the new component in the other.

  • If a large number of conditional expressions are present and perform the same code (differing only in their conditions), merge these operators into a single condition using Consolidate Conditional Expression and use Extract Method to place the condition in a separate method with an easy-to-understand name.

  • If the same code is performed in all branches of a conditional expression: place the identical code outside of the condition tree by using Consolidate Duplicate Conditional Fragments.
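The last treatment above (Consolidate Duplicate Conditional Fragments) might look like this in practice. PriceCalculator, the discount percentages and the shipping amount are all assumptions made for the sketch.

```java
class PriceCalculator {
    static final long SHIPPING_CENTS = 500; // hypothetical flat fee

    // Before (sketch): both branches ended with the same call.
    //   if (specialDeal) { total = price * 95 / 100; return addShipping(total); }
    //   else             { total = price * 98 / 100; return addShipping(total); }
    static long totalCents(long priceCents, boolean specialDeal) {
        long discounted;
        if (specialDeal) {
            discounted = priceCents * 95 / 100;
        } else {
            discounted = priceCents * 98 / 100;
        }
        // The step shared by every branch now sits outside the conditional.
        return addShipping(discounted);
    }

    private static long addShipping(long cents) {
        return cents + SHIPPING_CENTS;
    }
}

class DuplicateFragmentsDemo {
    public static void main(String[] args) {
        System.out.println(PriceCalculator.totalCents(1000, true));
        System.out.println(PriceCalculator.totalCents(1000, false));
    }
}
```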

Payoff

  • Merging duplicate code simplifies the structure of your code and makes it shorter.

  • Simplification + shortness = code that is easier to simplify and cheaper to support.

When to Ignore

  • In very rare cases, merging two identical fragments of code can make the code less intuitive and obvious.

Lazy Class

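Signs and Symptoms

Understanding and maintaining classes always costs time and money. So if a class doesn't do enough to earn your attention, it should be deleted.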

Reasons for the Problem

Perhaps a class was designed to be fully functional but after some of the refactoring it has become ridiculously small. Or perhaps it was designed to support future development work that never got done.

Treatment

Payoff

  • Reduced code size.

  • Easier maintenance.

When to Ignore

  • Sometimes a Lazy Class is created in order to delineate intentions for future development. In this case, try to maintain a balance between clarity and simplicity in your code.

Data Class

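Signs and Symptoms

A data class refers to a class that contains only fields and crude methods for accessing them (getters and setters). These are simply containers for data used by other classes. These classes do not contain any additional functionality and cannot independently operate on the data that they own.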

Reasons for the Problem

It's a normal thing when a newly created class contains only a few public fields (and maybe even a handful of getters/setters). But the true power of objects is that they can contain behavior types or operations on their data.

Treatment

  • If a class contains public fields, use Encapsulate Field to hide them from direct access and require that access be performed via getters and setters only.

  • Use Encapsulate Collection for data stored in collections (such as arrays).

  • After the class has been filled with well thought-out methods, you may want to get rid of old methods for data access that give overly broad access to the class data. For this, Remove Setting Method and Hide Method may be helpful.
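A hedged sketch of the treatments above, assuming a hypothetical Student class that used to expose a public, freely mutable list of courses. The field is encapsulated, the collection is handed out read-only, and the enrollment rule moves into the class itself.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class Student {
    private final String name;                               // was: public String name
    private final List<String> courses = new ArrayList<>();  // was: public List<String> courses

    Student(String name) {
        this.name = name;
    }

    String getName() {
        return name;
    }

    // Encapsulate Collection: callers get a read-only view.
    List<String> getCourses() {
        return Collections.unmodifiableList(courses);
    }

    // Behavior that client code used to perform on the raw list.
    void enroll(String course) {
        if (!courses.contains(course)) {
            courses.add(course);
        }
    }

    boolean isEnrolledIn(String course) {
        return courses.contains(course);
    }
}

class DataClassDemo {
    public static void main(String[] args) {
        Student student = new Student("Ann");
        student.enroll("Math");
        student.enroll("Math"); // duplicate is ignored by the class itself
        System.out.println(student.getName() + ": " + student.getCourses());
    }
}
```

With no setter for the list, the duplicate check cannot be bypassed by client code.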

Payoff

  • Improves understanding and organization of code. Operations on particular data are now gathered in a single place, instead of haphazardly throughout the code.

  • Helps you to spot duplication of client code.

Dead Code



Signs and Symptoms

A variable, parameter, field, method or class is no longer used (usually because it is obsolete).

Reasons for the Problem

When requirements for the software changed or corrections were made, nobody had time to clean up the old code.

Such code could also be found in complex conditionals, when one of the branches becomes unreachable (due to error or other circumstances).

Treatment

The quickest way to find dead code is to use a good IDE.

  • Delete unused code and unneeded files.

Payoff

  • Reduced code size.

  • Simpler support.

Speculative Generality

Signs and Symptoms

There is an unused class, method, field or parameter.

Reasons for the Problem

Sometimes code is created "just in case" to support anticipated future features that never get implemented. As a result, code becomes hard to understand and support.

Treatment

  • Unnecessary delegation of functionality to another class can be eliminated via Inline Class.

  • Unused methods? Use Inline Method to get rid of them.

  • Methods with unused parameters should be given a look with the help of Remove Parameter.

  • Unused fields can be simply deleted.

Payoff

  • Slimmer code.

  • Easier support.

When to Ignore

  • If you are working on a framework, it is eminently reasonable to create functionality not used in the framework itself, as long as the functionality is needed by the framework's users.

  • Before deleting elements, make sure that they are not used in unit tests. This happens if tests need a way to get certain internal information from a class or perform special testing-related actions.
