Refactoring Mindset

Posted on January 14, 2025 • 20 minutes • 4101 words

Table of contents

Brief overview of TaskList kata
Refactoring mindset
Summary

Continuous code improvement is an iterative process. This article focuses on cultivating a “refactoring mindset” – a deliberate and proactive approach to consistently improve your code.

We’ll explore key principles like building safety net, making small, frequent changes, recognizing code smells, learning to defer refactoring tasks, and minimizing bias to maintain a sustainable pace of improvement.

This article takes the TaskList kata, makes minimal modifications to the original kata, and explains “refactoring mindset” while refactoring the code. You can find the code that was refactored here .

Brief overview of TaskList kata

The TaskList kata is a CLI application designed to manage projects and their tasks. It supports functionalities such as adding projects, adding tasks to projects, marking tasks as done or undone, and listing all tasks across all projects. It provides Task and TaskList as its key abstractions. and supports the following commands:

Command	Description
add project foundation	Adds the project named foundation
add task foundation Task1	Adds the task Task1 to the foundation project
check 1	Marks the task with id 1 as done
uncheck 1	Marks the task with id 1 as not done
show	Lists all tasks across all projects

In this article, we won’t be refactoring the entire codebase but will focus on cultivating the mindset behind refactoring. If you’re curious, you can explore the refactored code here . With that, let’s dive into “refactoring mindset.”

Refactoring mindset

A refactoring mindset encourages developers to embrace small and incremental changes. Cultivating this mindset involves recognizing code smells, identifying and leveraging similar patterns within the codebase, learning to defer some refactoring tasks when necessary. It also emphasizes the importance of maintaining a TODO list to track areas for improvement and building a strong safety net, such as robust tests, to ensure changes don’t break code. Furthermore, it’s important to avoid “coding in the brain” by bringing thoughts and ideas in the actual code, and to be mindful of personal biases that may influence coding choices.

Build safety net

Begin refactoring by building a safety net. Once the safety net is in place, make incremental changes to the code and use the safety net to gather feedback. Tests, preferably unit tests, serve as the safety net, ensuring that refactoring remains low-risk. If the component you want to refactor, such as a method, is covered by tests, you can confidently proceed with changes.

But what if no tests exist for the method you want to refactor? Start by adding characterization tests . A characterization test by definition documents the current behavior of the system the exact same way it is running on the production environment.

This assumes the code being refactored is currently running in production. Characterization tests ensure the current behavior is captured accurately before refactoring. Read more about characterization tests here .

Here’s an example of a characterization test for the TaskList's execute method:

@Test
public void executeWithAdditionOfAProjectContainingOneTask() throws Exception {
    StringWriter writer = new StringWriter();
    TaskList taskList = new TaskList(writer);
    taskList.execute("add project caizin");
    taskList.execute("add task caizin Task1");
    taskList.execute("show");

    //We don't know the system at this stage, so let's expect a blank string.
    String expected = " "; 
    assertEquals(expected, writer.toString());
}

Because we do not understand the system (or the execute method) (at least as of now), we can expect the Writer to return a blank String. Let’s run the test and see it fail. When it does, we have found out what the code actually does under the given condition.

org.junit.ComparisonFailure
Expected: " "
Actual: caizin
        [ ] 1: Task1

Now that we know the behavior of the code, we can go ahead and change the test. Remember, at this point, our view of tests is different: they don’t have any moral authority; they just sit there documenting what the system really does.

@Test
public void executeWithAdditionOfAProjectContainingOneTask() throws Exception {
    StringWriter writer = new StringWriter();
    TaskList taskList = new TaskList(writer);
    taskList.execute("add project caizin");
    taskList.execute("add task caizin Task1");
    taskList.execute("show");

    String expected = "caizin\n" + "[ ] 1: Task1" + "\n";
    assertEquals(expected, writer.toString());
}

It is usually difficult to determine the number of characterization tests that you should add before starting to refactor. Honestly, we can go on adding infinite number of such tests. A simple way to determine this is:

Look at the code that is being characterized.
The code itself can give ideas about what it does, and if we have questions, tests are an ideal way of asking them.
At that point, write tests that cover good enough portion of the code that is being characterized.

For the TaskList kata, I started with a couple of characterization tests. There are available here .

Now that we have characterization tests, we can start refactoring.

Make small changes, small commits

Starting small is essential when refactoring. When you are starting to refactor, identify the smallest change that you can make. I think it is important to be patient, and not worry too much about the final state. Each incremental change contributes to improving the code step by step.

Take the TaskList example:

private final Map<String, List<Task>> tasks = new LinkedHashMap<>();

The variable tasks is a bad name (or a misleading name). It actually represents Projects, where every Project has a name and a collection of Tasks. At this stage, the Project abstraction is buried within the Map. The smallest refactoring would be to rename tasks to projects without worrying too much about the domain abstraction Project (or Projects). After renaming, run all the tests to ensure nothing breaks and commit the change.

This small, low-risk refactoring instantly adds clarity to the code. For example, the following method:

private void addProject(String name) {
    tasks.put(name, new ArrayList<>());
}

becomes

private void addProject(String name) {
    projects.put(name, new ArrayList<>());
}

With this change, the method name addProject becomes more intuitive and aligns seamlessly with the projects collection, allowing the domain concept of Projects to emerge more clearly.

Recognize code smells

There are certain smells which are more common than the others. For example, the “bad name” smell is more common than “long method,” and “long method” is more common than “refused bequest.” Being familiar with these common smells makes refactoring significantly easier.

A few common smells would be: Long Method , Long Parameter List , Primitive Obsession , Switch Statements , Duplicate Code , Comments , Data Class and Feature Envy .

Let’s revisit our TaskList example:

private void show() throws IOException {
    for (Map.Entry<String, List<com.codurance.training.tasks.Task>> project : projects.entrySet()) {
        writer.write(project.getKey());
        writer.write("\n");
        for (com.codurance.training.tasks.Task task : project.getValue()) {
            writer.write(String.format("[%c] %d: %s%n", 
                       (task.isDone() ? 'x' : ' '), task.getId(), task.getDescription())
            );
        }
    }
}

The method show lists all tasks across all projects. Let’s see if we can identify some smells in this method.

While the method isn’t long in terms of the number of lines, it could certainly be broken down further. A method like this could be considered a long method because, even though the number of lines is small, it might be hiding other potential smells that are not clearly visible. A method that can be further decomposed is often referred to as a long method .

The show method directly accesses the private fields of the Task class, which violates encapsulation and leads to inappropriate intimacy between classes. This is problematic because a class should have the freedom to change the internal structure (/data-type) of its fields without affecting other parts of the code. In other words, a class should expose behaviors that operate on its internal data rather than allowing external access to its internal state. This aligns with the Tell, Don’t Ask principle: tell the object what to do, don’t ask for its internal state.

Additionally, if we find ourselves working with a data class (a class that contains data but lacks meaningful behavior), the least we can do is restrict the visibility of its getters and setters to minimize access to its fields.

Let’s start small again and extract String.format into a separate method.

private void show() throws IOException {
    for (Map.Entry<String, List<com.codurance.training.tasks.Task>> project : projects.entrySet()) {
        writer.write(project.getKey());
        writer.write("\n");
        for (com.codurance.training.tasks.Task task : project.getValue()) {
            writer.write(format(task));
        }
    }
}

private static String format(Task task) {
    return String.format("[%c] %d: %s%n", 
                         (task.isDone() ? 'x' : ' '), task.getId(), task.getDescription()
           );
}

The format method is currently static, which means it doesn’t belong to the class it’s defined in, since it doesn’t use any of its fields. However, it makes use of the fields of the Task class, so it should be moved to the Task class itself. To do this, we can simply remove the static keyword and then, move it to the Task class.

The Task and TaskList are shown below:

String format() {
    return String.format("[%c] %d: %s%n", (isDone() ? 'x' : ' '), getId(), getDescription());
}

private void show() throws IOException {
    for (Map.Entry<String, List<com.codurance.training.tasks.Task>> project : projects.entrySet()) {
        writer.write(project.getKey());
        writer.write("\n");
        for (com.codurance.training.tasks.Task task : project.getValue()) {
            writer.write(task.format());
        }
    }
}

Please note: I chose not to increase the visibility of the format method. My general approach is to avoid escalating the visibility of classes or methods unless absolutely necessary.

Run all the tests to ensure nothing breaks, add tests for the new format method inside Task and commit the change. Since the format method is now part of the Task class, the getDescription method is no longer needed. Remove it, run the tests again, and commit this change as well.

This refactoring brings a positive side effect: the Task class, which was primarily a data class, is now beginning to contain meaningful behavior. As we continue, we’ll likely be able to eliminate other getter methods, further encapsulating the behavior within the class itself.

Avoid getting sidetracked

It’s important to stay focused on the task at hand during refactoring. However, the world is full of distractions :), making it easy to get sidetracked. In our TaskList example, it is tempting to consider introducing abstractions like Project, reworking the switch case, or even addressing components like IdGenerator.

The previous refactoring focused on improving the show method. This refactoring started to improve the Task class by introducing the format method. Thus, it’s important to stay focused and choose one of the following directions:

Continue improving the show method, or
Explore if the Task class can be further improved, potentially removing the getId method

I’ve decided to stick with improving the show method. The git commit history also shows the path that I took.

We can extract the inner for-loop com.codurance.training.tasks.Task task : project.getValue() in a separate method. The extracted format method is shown below:

private void show() throws IOException {
    for (Map.Entry<String, List<com.codurance.training.tasks.Task>> project : projects.entrySet()) {
        writer.write(project.getKey());
        writer.write("\n");
        List<Task> tasks = project.getValue();
        format(tasks);
    }
}

private void format(List<Task> tasks) throws IOException {
    for (Task task : tasks) {
        writer.write(task.format());
    }
}

The format method works on a List<Task> and uses the writer field from the TaskList class. If an instance of java.io.Writer were passed as a parameter to the format method, the method would no longer depend on the TaskList instance, making it static.

private static void format(List<Task> tasks, Writer writer) throws IOException {
    for (Task task : tasks) {
        writer.write(task.format());
    }
}

This method should be moved to List<Task> because it works on the collection of Task. This smell can be called Incomplete library class or Primitive obsession .

In languages like Kotlin that support extension functions, we could start by defining an extension function on List<Task>. This would allow format() to be written as fun List<Task>.format(): String. We would continue using extensions until the need arises for a higher-level abstraction like Tasks.

Now we have a reason to introduce Tasks abstraction with the format method. Moving the format method into this new class is straightforward, however, we need to ensure that the change is small.

Look at the format method and notice the for-loop over tasks variable. With the new abstraction that we create, we should be able to use the same loop. To do this, our Tasks class should extend from ArrayList<Task>.

The Tasks class is defined as follows:

public class Tasks extends ArrayList<Task> {
    void format(Writer writer) throws IOException {
        for (Task task : this) {
            writer.write(task.format());
        }
    }
}

And the show method:

private void show() throws IOException {
    for (Map.Entry<String, Tasks> project : projects.entrySet()) {
        writer.write(project.getKey());
        writer.write("\n");
        Tasks tasks = project.getValue();
        tasks.format(writer);
    }
}

With this change, we had to replace the List<Task> with Tasks. To give an example,

private final Map<String, List<Task>> projects = new LinkedHashMap<>();

became

private final Map<String, Tasks> projects = new LinkedHashMap<>();

Run the tests and commit the change. Looking at the format method in the Tasks class, we can see that it formats the tasks and also writes the output to an instance of java.io.Writer. The format method should only be responsible for generating and returning a formatted String, while the act of writing should be handled separately. This also aligns with the vocabulary of the format method that we introduced in Recognize Code Smells section.

The refactored code is shown below:

public class Tasks extends ArrayList<Task> {
    String format() {
        return this.stream().map(Task::format).collect(Collectors.joining());
    }
}

The show method:

private void show() throws IOException {
    for (Map.Entry<String, Tasks> project : projects.entrySet()) {
        writer.write(project.getKey());
        writer.write("\n");
        Tasks tasks = project.getValue();
        writer.write(tasks.format());
    }
}

Run the tests, ensure nothing breaks, add tests for the format method in Tasks and commit the change.

Identify similar patterns

It is important to look for similar concepts in the code while refactoring. With the introduction of the Tasks abstraction we can check if there are other places where this abstraction can be applied. The simplest way would be to look for the same pattern: for-loop over a collection of Task. This leads to a loose guideline: a for-loop over a collection of type T, may become a static method which depends only on the collection. This gives us an opportunity to encapsulate the collection .

Let’s see if there is another method where Tasks can be used. Consider the setDone method, it contains the same pattern, for-loop over a collection of Task.

private void setDone(String idString, boolean done) {
    int id = Integer.parseInt(idString);
    for (Map.Entry<String, Tasks> project : projects.entrySet()) {
        for (com.codurance.training.tasks.Task task : project.getValue()) {
            if (task.getId() == id) {
                task.setDone(done);
                return;
            }
        }
    }
    out.printf("Could not find a task with an ID of %d.", id);
    out.println();
}

We can extract the inner for-loop in a separate method toggleTaskWithId.

private void setDone(String idString, boolean done) {
    int id = Integer.parseInt(idString);
    for (Map.Entry<String, Tasks> project : projects.entrySet()) {
        Tasks tasks = project.getValue();
        if (toggleTaskWithId(done, tasks, id)) return;
    }
    out.printf("Could not find a task with an ID of %d.", id);
    out.println();
}

private static boolean toggleTaskWithId(boolean done, Tasks tasks, int id) {
    for (Task task : tasks) {
        if (task.getId() == id) {
            task.setDone(done);
            return true;
        }
    }
    return false;
}

The method toggleTaskWithId becomes static, and we can move it to the Tasks class.

public class Tasks extends ArrayList<Task> {
    boolean toggleTaskWithId(boolean done, int id) {
        for (Task task : this) {
            if (task.getId() == id) {
                task.setDone(done);
                return true;
            }
        }
        return false;
    }
    //not showing the format method
}

The method toggleTaskWithId method takes a boolean parameter. This might tempt us to refactor further by creating two separate methods: one to mark a task as done and another to mark it as undone. However, it’s important to stay focused on the current refactoring goal and avoid getting sidetracked.

Instead, this is a good opportunity to add a TODO note, either directly in the code or in a separate file, as a reminder for a future improvement. Run all the tests to ensure the changes didn’t break any functionality, add specific tests for the toggleTaskWithId method and prepare the commit.

Let’s take a look at the show method again.

private void show() throws IOException {
    for (Map.Entry<String, Tasks> project : projects.entrySet()) {
        writer.write(project.getKey());
        writer.write("\n");
        Tasks tasks = project.getValue();
        writer.write(tasks.format());
    }
}

We are seeing the same pattern again, a for-loop over a collection. We can extract the for-loop into a separate method.

private void show() throws IOException {
    format(projects, writer);
}

private static void format(Map<String, Tasks> projects, Writer writer) throws IOException {
    for (Map.Entry<String, Tasks> project : projects.entrySet()) {
        writer.write(project.getKey());
        writer.write("\n");
        Tasks tasks = project.getValue();
        writer.write(tasks.format());
    }
}

The static method format works on a collection of type Map<String, Tasks> referred to as projects. Thus, the method can be moved to Map or we can encapsulate the Map. To proceed, we will now introduce the abstraction Projects and to keep the refactoring small, we will extend it from LinkedHashMap<String, Tasks>.

The Projects class is shown below:

public class Projects extends LinkedHashMap<String, Tasks> {
    String format() {
        StringBuilder formatted = new StringBuilder();
        for (Map.Entry<String, Tasks> project : this.entrySet()) {
            formatted.append(project.getKey());
            formatted.append("\n");
            formatted.append(project.getValue().format());
        }
        return formatted.toString();
    }
}

The show method and the instantiation of projects variable look like the following:

private final Projects projects = new Projects();

//all it does now is: write the formatted project string to a writer.
private void show() throws IOException {
    this.writer.write(this.projects.format());
}

Run all the tests to ensure the changes didn’t break any functionality, add specific tests for the format method in Projects and prepare the commit. It’s important to note that the Projects abstraction wasn’t introduced as an arbitrary design decision, it was driven by a clear need to encapsulate the collection. A similar refactoring can be applied to the setDone method because it has a for-loop over projects.entrySet().

Learn to defer

It is crucial to not get carried away during refactoring. While it may be tempting to introduce a Project abstraction alongside Projects, doing so can lead to compile time errors. For instance, the updated Projects class might look like this:

public class Projects extends LinkedHashMap<String, Project> {
    ...
}

class Project {
    final String name;
    final Tasks tasks;
    ...
}

The moment we do this, all the other methods start giving compile-time error. For example, consider the following method addTask. The line Tasks projectTasks = projects.get(projectName); will no longer compile because projects.get(..) now returns a Project instead of Tasks.

private void addTask(String projectName, String description) {
    Tasks projectTasks = projects.get(projectName);
    if (projectTasks == null) {
        throw new IllegalArgumentException("Unknown project: " + projectName);
    }
    projectTasks.add(new com.codurance.training.tasks.Task(nextId(), description, false));
}

We can defer this refactoring, add a TODO, and revisit it later when the impact of changing Projects from being a LinkedHashMap<String, Tasks> to LinkedHashMap<String, Project> is localized, meaning: only the methods of the Projects class would be impacted. That would also be the right time to evaluate whether Projects should extend LinkedHashMap or instead use composition. I decided to introduce Project only after I was sure that the change would only impact Projects class.

Maintain a TODO list

It is crucial to maintain a TODO list while refactoring. Preparing a good TODO list has a lot of advantages:

Improved Focus: Concentrating on individual tasks allows maintaining a steady pace and minimizing the feeling of being overwhelmed.
Reduced Anxiety: A well-defined TODO list minimizes anxiety by providing a clear path.
Enhanced Visibility: A detailed TODO list gives a clear view of all pending tasks, ensuring nothing gets overlooked.

I prefer adding refactoring TODOs directly in my code, where I document everything from ideas to potential distractions. These TODO items are removed as they are completed. An example is available in this git commit . At the end of the story, I make it a point to review and ensure no TODO is left unresolved.

Avoid coding in brain

It’s important to avoid “coding in brain,” as it can lead to overthinking and complicating the task at hand. Visualizing code is a natural tendency, but letting that visualization drive the implementation can cause unnecessary complexity. For example, while refactoring TaskList, I encountered the switch statement inside the execute method:

public void execute(String commandLine) throws Exception {
    String[] commandRest = commandLine.split(" ", 2);
    String command = commandRest[0];
    switch (command) {
        case "show":
            show();
            break;
        case "add":
            add(commandRest[1]);
            break;
        case "check":
            check(commandRest[1]);
            break;
        case "uncheck":
            uncheck(commandRest[1]);
            break;
        default:
            throw new IllegalArgumentException("Unknown command: " + command);
    }
}

This is where my imagination took over. Let me summarize my thought process, while I was imagining the code:

I started imagining creating multiple Commands, each implementing a Command interface.
I imagined needing an abstraction (like a factory) to instantiate the correct command based on the command line. This factory would require either a switch or a Map to handle the mapping between the user provided command and the actual instance.
Additionally, I imagined another abstraction to parse the command line, which would need to recognize “add_project” and “add_task” commands, resulting in yet another switch (or if-else). As a result, adding a new command would trigger shotgun surgery , where changes are required in multiple places.

I admit this was too much. This imagination not only increased anxiety, but also disrupted my progress. Ultimately, I decided to take a pause, focus on incremental improvements, and not worry too much about the final code or the potential new smells that might arise.

Avoid bias

Recognizing bias is crucial because it can lead to unnecessary abstractions and overcomplicate the design. By identifying and addressing bias early, one can focus on creating solutions that are simpler, and better aligned with the problem at hand.

Let’s revisit the execute method from Avoid coding in brain . At first glance, it might appear that polymorphism is a natural solution. This approach would involve creating separate command classes such as AddProjectCommand, AddTaskCommand, and ShowCommand, each implementing the execute method that takes Arguments as a parameter.

At first glance, this approach might seem appealing. However, before proceeding, it’s essential to carefully evaluate the following:

Ensure that polymorphic behavior is not being forced unnecessarily
Avoid introducing abstractions like Arguments unless they provide clear value
Decide whether a new instance of each command should be created for every execution or if a single instance should be reused

I prefer adopting a bottom-up approach for implementing polymorphism. To refactor the switch statement, I introduced the ShowCommand, wrote tests for the newly created command, and then integrated it into the switch statement. I repeated the same process for other commands. This gave me enough clarity into the arguments required by the execute method in each command, any duplication across commands, and the potential impact of adding a new command.

public class ShowCommand {
    private final Writer writer;
    private final Projects projects;
    
    ShowCommand(Writer writer, Projects projects) {
        this.writer = writer;
        this.projects = projects;
    }
    void execute() throws IOException {
        this.writer.write(this.projects.format());
    }
}

public void execute(String commandLine) throws Exception {
    String[] commandRest = commandLine.split(" ", 2);
    String command = commandRest[0];
    
    switch (command) {
        case "show":
            new ShowCommand(writer, projects).execute();
            break;
        //not showing the other cases
    }
}

I now find the bottom-up approach for introducing polymorphism highly intuitive, as it avoids forcing polymorphism and offers ample opportunities to evaluate whether the implementation aligns well with polymorphic behavior. The git history from the commit 85cfb2e highlights the step-by-step process followed to introduce polymorphic commands.

This approach has been helpful in evolving the signature of the execute method. For example, here is the execute method of the ShowCommand during early commits:

void execute(int taskId) throws IOException {
    if (projects.toggleTaskWithId(taskId, true)) return;
    writer.write(String.format("Could not find a task with an ID of %d\n", taskId));
}

and the execute method of AddProjectCommand.

void execute(String[] arguments) throws IOException {
    String projectName = arguments[0];
    projects.addProject(projectName);
}

Eventually, I changed the signature of the execute method to accept List<String>. The change is available here . I did not introduce Arguments until it was needed. This git commit shows subtle code duplication and incomplete library class , which resulted in the creation of the Arguments abstraction.

That’s it. I finished my refactoring and the code is available here .

Summary

The article explores the principles, practices, and mental approach required to effectively refactor code. It emphasizes the importance of incremental, focused improvements while avoiding over-engineering or unnecessary abstractions. Key takeaways include:

Build safety net: Ensure that tests, preferably unit tests, are in place before beginning the refactoring process.
Maintain focus: Break down tasks into smaller, manageable chunks. Use tools like TODO lists to capture ideas and distractions for later review.
Avoid coding in brain: Resist the urge to visualize code. Start with small, testable changes instead of trying to anticipate every possible abstraction.
Identify patterns: Look for repetitive patterns, such as loops over collections, getters (or leak of internal state) as potential candidates for refactoring.
Avoid unnecessary abstractions: Be mindful of biases that can lead to over-complicated designs.
Evaluate polymorphism incrementally: Introduce polymorphism step-by-step, I prefer the bottom-up approach to introduce polymorphism.
Document refactoring: Use commit histories and TODOs to keep track of the refactoring journey.