🧹 You Should Refactor More Than You Think

It often happens that small improvements or quick wins are ignored for a very long time, taking up space in the backlog or mental energy. You know, those TODOs that are left for years in the codebase even though fixing it would probably take an hour. Next to the small and ‘unimportant’ changes, coding standards and habits change over time and the more code you write, the more experienced you get. These different aspects will have an impact on your coding style.

All of this results in a codebase that is never ‘perfect’ and it is only a matter of time before some parts become outdated.

This means that refactoring code will always be a continuous effort, there is no such thing as finished code. This maintenance cost is often overlooked: there’s a high focus on the cost of making something, but the implicit cost linked to every single line of code that is already in the codebase is forgotten.

Every codebase has those legacy parts that nobody dares to touch in case improvements would break something or pieces of code that are not used anymore, but aren’t thrown away either. Both of them have an ever-increasing cost linked to them.

An interesting opinion on this can be found in this paper on Software Engineering at Google. One of the topics discussed is:

Most software at Google gets rewritten every few years.

To some, this probably sounds crazy, but it’s an incredibly interesting approach that probably more companies should take.

I must say that I am guilty of not refactoring enough as well sometimes, so today’s post is going to focus purely on refactoring and quick wins. Not that it’s that big of an issue in my codebase, but this post was triggered by some refactoring that I finally took care of. It’s not the most interesting topic, but it’s an important part of Software Engineering and definitely shouldn’t be overlooked.

The topics discussed in this post will be:

Modernising code
Moving code
Naming
Lazy code

Modernising code

The previous post about ⭐ Procedural Content Generation - Stars, described a function used to determine the colour of a star from its temperature. My original implementation wasn’t written with newer conventions in mind. Even from this small piece of code, it’s possible to find the following improvements:

Using auto
Not using unsigned to avoid negative values (the C++ Core Guidelines explicitly advice against this)
Using C++-style casts
Using trailing return types

The original implementation:

#pragma once
 
#include <glm/glm.hpp>
 
namespace color {
    (...)
 
    /**
     * @param[temperature] Kelvin
     */
    glm::vec3 starColor(unsigned int temperature);
 
    float calculateTemperatureIndicator(unsigned int temperature);
}

#include "color.hpp"
 
#include <algorithm>
#include <math.h>
 
#include "util.hpp"
 
namespace color {
    (...)
 
    glm::vec3 starColor(unsigned int temperature) {
        // Based on
        // https://stackoverflow.com/questions/21977786/star-b-v-color-index-to-apparent-rgb-color
 
        float rawTemperatureIndicator = calculateTemperatureIndicator(temperature);
        float temperatureIndicator    = std::clamp(rawTemperatureIndicator, -0.4f, 2.0f);
 
        glm::vec3 result = glm::vec3();
        double t;
 
        (...)
 
        return result;
    }
 
    float calculateTemperatureIndicator(unsigned int temperature) {
        float floatTemperature = (float)temperature;
 
        return (-2.1344 * floatTemperature + 8464 +
                sqrt(0.98724096 * floatTemperature * floatTemperature + 71639296)) /
               (1.6928 * floatTemperature);
    }
 
} // namespace color

The new implementation:

#pragma once
 
#include <glm/glm.hpp>
 
namespace Engine {
    class Color final {
      public:
        (...)
 
        /**
         * @param[temperature] Kelvin
         */
        static auto starColor(int temperature) -> glm::vec3;
 
      private:
        static auto calculateTemperatureIndicator(int temperature) -> float;
    };
}

#include "color.hpp"
 
#include <algorithm>
#include <cassert>
#include <math.h>
 
#include "util.hpp"
 
namespace Engine {
    (...)
 
    auto Color::starColor(int temperature) -> glm::vec3 {
        // Based on
        // https://stackoverflow.com/questions/21977786/star-b-v-color-index-to-apparent-rgb-color
        assert(temperature > 0);
 
        auto rawTemperatureIndicator = calculateTemperatureIndicator(temperature);
        auto temperatureIndicator    = std::clamp(rawTemperatureIndicator, -0.4f, 2.0f);
 
        glm::vec3 result;
        double t;
        
        (...)
 
        return result;
    }
 
    auto Color::calculateTemperatureIndicator(int temperature) -> float {
        assert(temperature > 0);
 
        auto floatTemperature = static_cast<float>(temperature);
 
        return (-2.1344 * floatTemperature + 8464 +
                sqrt(0.98724096 * floatTemperature * floatTemperature + 71639296)) /
               (1.6928 * floatTemperature);
    }
}

From time to time, it’s required to update any code you come across throughout the codebase. The best way to detect where updates are required is by using static code analysis tools. It’s not manageable to check every file every once in a while to see if an update is required.

Moving code

I try to adhere to two rules when deciding where to put a piece of code:

It should not be accessible by other code that doesn’t need it
It should be as close as possible to the place it is used (unless it is used in multiple different locations)

This might sound very obvious, but for some reason, it is often very easy to find examples where these rules aren’t followed. Two examples from the top of my head:

A function in a class was public, and all its users have been refactored, but during the refactoring, it was not updated to a private function
Implement concepts in a library “because it might be useful to others”

The two changes in my codebase related to these examples are: moving a class to an inner class and moving a function closer to the data it uses.

Inner classes

The Vertex struct is used in Models to represent the vertices in the model. Vertex was publicly accessible, which means other classes can use it as well. This is very useful if there are multiple users of the same concept, but in this case, Model is the only user and its implementation is dependent on the exact layout of Vertex for sending the data to the graphics using OpenGL. Once the Vertex struct is a private inner struct, the Model class can be certain that the implementation won’t break because someone makes a change to Vertex for whatever reason. You still can break something of course, but with a much lower chance.

#pragma once
 
(...)
 
struct Vertex {
    glm::vec3 position;
    glm::vec3 normal;
    glm::vec3 color;
};
 
namespace Engine::Components {
    class Model : public Component {
      public:
        (...)
 
      protected:
        (...)
        std::vector<Vertex> m_vertices;
 
      private:
        (...)
    };
}

#pragma once
 
(...)
 
namespace Engine::Components {
    class Model : public Component {
      public:
        (...)
 
      protected:
        (...)
        struct Vertex {
            glm::vec3 position;
            glm::vec3 normal;
            glm::vec3 color;
        };
      
        std::vector<Vertex> m_vertices;
 
      private:
        (...)
    };
}

Moving functions to the data

One of the main points of classes is to group data and functions together to form an entity. Sometimes this is done incorrectly or some other refactoring made it easier to do that elegantly. In this example, it’s the latter.

As an example, I have some GUI code to manipulate the transform of an object in a window that is global to the scene. Of course, this means the window needs access to the members. Because it can update the members, it even needs a non-const pointer, which is something you don’t want to expose normally through a getter. In the original code, this was done by making the Model Scene a friend of the Transform, but that is not a good way of solving it.

Now that a lot of code is being refactored to components, this is suddenly much easier, because each component can manage its GUI.

How to not do it:

#include "modelscene.hpp"
 
#include "engine/components/gui/component_gui.hpp"
#include "engine/components/gui/star_configuration_gui.hpp"
#include "engine/components/gui_component.hpp"
#include "engine/components/script.hpp"
#include "engine/components/scripts/demo_rotation.hpp"
#include "engine/objects/background.hpp"
#include "engine/objects/entity.hpp"
#include "engine/objects/planet.hpp"
#include "engine/objects/star.hpp"
 
#include "imgui.h"
 
#include <limits>
 
namespace ModelViewer {
    (...)
 
    auto ModelScene::renderGui() -> void {
        (...)
 
        unsigned short i = 0;
        for (auto& object : m_objects) {
            object->visit([this, &i](Engine::Entity& element) {
                ImGui::Indent(element.getDepth() * 8.0f);
                if (ImGui::Selectable(element.getName().c_str(), i == m_selectedObject)) {
                    if (i != m_selectedObject) {
                        m_selectedObjectPointer = &element;
                    }
                    m_selectedObject = i;
                }
                ImGui::Unindent(element.getDepth() * 8.0f);
 
                // Increment in lambda to increment per object (outside of lambda is per root
                // object)
                i++;
            });
        }
 
        if (m_selectedObjectPointer) {
            auto& transform = m_selectedObjectPointer->getRequired<Engine::Components::Transform>();
 
            ImGui::Separator();
            ImGui::Text("Details of selected Model");
            ImGui::Text("Relative Position");
            ImGui::DragFloat3("##positionDrags", &transform.m_position[0], 0.05f);
            ImGui::Text("Rotation");
            ImGui::DragFloat3("##rotationDrags", &transform.m_rotationYXZ[0], 0.05f);
            ImGui::Text("Scale");
            ImGui::DragFloat3("##scaleDrags",
                              &transform.m_scale[0],
                              0.05f,
                              0.0f,                               // min
                              std::numeric_limits<float>::max()); // max (required, otherwise min is
                                                                  // ignored)
        }
 
        ImGui::End();
    }
 
    auto ModelScene::createModel(const char* model) -> void {
        m_objects.clear();
        m_objects.push_back(Engine::Background::createDefault(m_shaderRegistry));
 
        auto object = std::unique_ptr<Engine::Entity>{nullptr};
 
        (...)
 
        if (object) {
            m_selectedObjectPointer = object.get();
            m_objects.push_back(std::move(object));
        }
    }
 
    auto ModelScene::resetSelectedObject() -> void {
        m_selectedObject        = std::numeric_limits<decltype(m_selectedObject)>::max();
        m_selectedObjectPointer = nullptr;
    }
}

Especially tracking the selected object in m_selectedObjectPointer was annoying in my opinion.

This code has been moved here, which is much cleaner of course.

#pragma once
 
#include "engine/components/component.hpp"
 
#include <glm/glm.hpp>
 
namespace ModelViewer {
    class ModelScene;
}
 
class ShaderRegistry;
 
namespace Engine::Components {
    class Transform : public Engine::Components::Component {
      public:
        (...)
 
        auto renderConfiguration() -> void override;
 
      private:
        (...)
    };
}

#include "engine/components/transform.hpp"
 
#include "engine/objects/entity.hpp"
#include "engine/shaders/shaderregistry.hpp"
 
#include "imgui.h"
 
#define GLM_ENABLE_EXPERIMENTAL
#include <glm/gtx/euler_angles.hpp>
#include <glm/gtx/transform.hpp>
 
namespace Engine::Components {
    (...)
 
    auto Transform::renderConfiguration() -> void {
        ImGui::Text("Transform");
 
        ImGui::Separator();
        ImGui::Text("Details of selected Model");
        ImGui::Text("Relative Position");
        ImGui::DragFloat3("##positionDrags", &m_position[0], 0.05f);
        ImGui::Text("Rotation");
        ImGui::DragFloat3("##rotationDrags", &m_rotationYXZ[0], 0.05f);
        ImGui::Text("Scale");
        ImGui::DragFloat3("##scaleDrags",
                          &m_scale[0],
                          0.05f,
                          0.0f,                               // min
                          std::numeric_limits<float>::max()); // max (required, otherwise min is
                                                              // ignored)
    }
 
    (...)
}

Naming

Naming things is difficult, but it’s an extremely important aspect of programming. A well-chosen name can convey a lot of information and avoid confusion, but it’s important to be concise. Writing long descriptive names is pointless. I start looking at the length of the name to distinguish entities instead of the names themselves. All the useful information is lost.

The example in my codebase is renaming the class Sun to Star. The Sun is the particular star in our solar system, but I’m not working on a simulation of our solar system. Star is simply more accurate in this case.

This is a trivial example, but the reason why I still think it’s important is because other related classes existed that had to be renamed as well (e.g., SunGui, SunShader,…). The risk is that bad names are reused in related concepts, making it more difficult to correct the error once the bad name has been in the codebase for too long.

Lazy code

AKA, code that was written because you didn’t want to implement it the right way. There was this really stupid example of trying to experiment with adding vectors to objects to debug their orientation. Instead of finding a decent solution for adding them, they were simply added to each sphere in the constructor. Instead, now the debug vector component is only added when creating a model in a debug application.

Of course, instead of refactoring these types of bad code, it should have been done the right way from the start. But you can be certain that inevitably things like this will end up in the codebase anyway.

Conclusion

These are some examples of small changes, some of which have been on my backlog for weeks, and in the end, it took almost no time to fix. The mental energy wasted on having these topics on the back of my mind was way too high for the time I spent in the end.

The advice is simple, don’t wait too long with refactoring the code. The effort is often low, but the longer you wait, the higher it will become.

TurtleStoffel