The Nature of Complexity
6th August 2023
In software engineering and related disciplines, one often has to deal with code or infrastructure that is considered "a mess" or "spaghetti code". But I have always been unsatisfied with the implication that the person who created it was just an idiot or a "shit engineer". I'm sure in most cases the person writing the code or setting up the infrastructure had the best intentions and probably even thought they were following best practice at the time.
It seems to me that the core problem is just complexity. Somehow they have created something complex which is intractable or difficult to reason about.
But I think it's worth diving into what complexity actually is. The Cambridge Dictionary defines it as "involving a lot of different but related parts". I think this isn't bad, but perhaps a bit vague.
I would propose this definition: "composed of multiple parts which can neither be fully separated nor fully integrated together". Let's explore what that means.
What is not complex
Let's get out of the world of software for a minute. Instead of a sequence of instructions for a computer to follow, we write a sequence of instructions for people to build something. We will explore three scenarios and determine if they are complex and why.
Scenario 1: there are 5 tasks to be completed. These can be done in any order or all at the same time.
Scenario 2: there are 5 tasks to be completed. These must be done in the following order: a, b, c, d, e
Scenario 3: there are 5 tasks to be completed. Task "a" can be done at the same time as tasks "b" or "c" but must be finished before "e". Task "b" cannot be done at the same time as "c". Task "e" cannot be done on the same day as other tasks and must be done the day after task "b". Task "d" must be started before "b" but cannot be completed until after task "e".
Can you tell which one is complex?
I actually think that none of them are complex. Scenario 1 is not complex because all the tasks can be treated individually. In other words you can separate them all out.
Scenario 2 is not complex because it is an easily followed sequence. You can easily plan what workers and materials you need on each day as you work, and easily predict how long it will take by just adding together the time each task takes.
Scenario 3 superficially appears to be complex, as the relationships between the tasks do not appear to be straightforward. However, it is actually just a logic puzzle which one can figure out and convert into a linear plan, just like scenario 2.
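As a rough sketch of how scenario 3 collapses into a linear plan, the ordering rules can be written as a dependency graph and sorted topologically. The encoding below is an assumption made for illustration: task "d" is split into hypothetical "d_start" and "d_finish" steps because it spans other tasks, and the day-based rules (task "e" alone on the day after "b") are left to a later scheduling step. The rule that "b" and "c" cannot overlap is satisfied automatically, since a linear plan does one step at a time.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each key maps to the set of steps that must be finished before it starts.
# "d_start" and "d_finish" are illustrative names, not part of the scenario text.
must_come_after = {
    "a": set(),
    "c": set(),
    "d_start": set(),
    "b": {"d_start"},              # "d" must be started before "b"
    "e": {"a", "b"},               # "a" finishes before "e"; "e" follows "b"
    "d_finish": {"d_start", "e"},  # "d" cannot be completed until after "e"
}

# static_order() yields one valid linear ordering, or raises CycleError
# if the rules contradict each other.
plan = list(TopologicalSorter(must_come_after).static_order())
print(plan)
```

Any ordering this produces is a flat list of steps like scenario 2, which is the sense in which scenario 3 is a puzzle rather than something complex.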
What is complex?
I hope we have established that, if our set of tasks can be converted into a linear list, then it is not complex. This is because we can integrate them together into a cohesive plan.
Complexity requires our components to somehow resist being integrated together. What analogy can we find for this?
I think the best real-world analogy is multiple separate companies working together on a project. For example, if one were building a bridge in a city, lots of different private and government organisations would need to work together to make it happen. Each of these organisations is effectively inscrutable to the others, as one cannot see the internal processes of another organisation. Furthermore, each has its own independent motives and priorities.
Also, each organisation is trying to achieve completely different goals. For example:
- The city planner is interested in optimising traffic flow
- The architect wants to design the most beautiful bridge of their career
- The parties paying for it want it to be cheap to build
- Those using the waterway want to ensure boats can still traverse during and after construction
I won't strain the analogy further with my limited understanding of large construction projects. But imagine trying to predict how long it will take or how much it will cost with any certainty. Imagine trying to make a plan and adhere to it.
In fact, in this case it's probably impossible to make a plan and adhere to it. Rather, one has to continuously re-evaluate one's plans as one works.
Conclusion
I think the sort of complexity in our hypothetical bridge-building project is exactly the kind of complexity one deals with all the time in software engineering.
This is because software components contain so much information that they are effectively inscrutable: it takes an unreasonable amount of time to study them. When I say "software components", that could be anything from a microservice, to a code library, to even a class in OO programming.
This means a software engineer needs to be good at working with components one doesn't fully understand.
Take-aways
Given my proposition "a software engineer needs to be good at working with components one doesn't fully understand", how can that affect software design?
I think it means we need to design software components and documentation in such a manner that:
- The engineer does not need to understand a component's inner workings to use it or to connect other components to it.
- The information an engineer does need to know to use it should be well documented.
I think you'll find that whenever people are complaining about some awful code, it doesn't follow those simple principles.
And note that a "modular design" is pointless if one cannot interact with a module without understanding the detail of how it works.
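To make the two design principles concrete, here is a minimal sketch in Python. All names in it (KeyValueStore, InMemoryStore, remember_greeting) are hypothetical and chosen only for illustration. The documented interface states everything a caller needs to know, and the caller is written against that interface alone.

```python
from typing import Optional, Protocol


class KeyValueStore(Protocol):
    """Hypothetical component contract: all a caller needs to know.

    - get(key) returns the stored value, or None if the key is absent.
    - put(key, value) overwrites any existing value for that key.
    """

    def get(self, key: str) -> Optional[str]: ...

    def put(self, key: str, value: str) -> None: ...


class InMemoryStore:
    """One possible implementation; callers never rely on the dict inside."""

    def __init__(self) -> None:
        self._data: dict[str, str] = {}

    def get(self, key: str) -> Optional[str]:
        return self._data.get(key)

    def put(self, key: str, value: str) -> None:
        self._data[key] = value


def remember_greeting(store: KeyValueStore, name: str) -> str:
    """Uses only the documented contract, so any conforming store works."""
    greeting = store.get(name)
    if greeting is None:
        greeting = f"Hello, {name}!"
        store.put(name, greeting)
    return greeting


store = InMemoryStore()
print(remember_greeting(store, "Ada"))
```

The in-memory class could be swapped for one backed by a file or a network service without touching remember_greeting, because nothing in the caller depends on how the store works internally.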