Price of code change

Have you ever wondered what software maintenance costs, and how these costs are split across the separate development phases? In this section we try to answer these questions with formal methods, deriving several models that describe the different software development phases. Most of the presented models are linear; in some cases this is easy to justify, while in others we rely on empirical reasoning.
Let's get into it! First of all, an important point: software maintenance in our context consists of two types of activities:
- Implementation of new features
- Code improvement (bug fixes, refactoring, etc.)
In both cases an initial version of the software exists. Let's look at the software as something that is continuously changing (and here I do mean continuously). We'll treat the time $T$ elapsed between two changes of a source code line as an exponentially distributed random variable, i.e.:
$$P(T \le t) = 1 - e^{-\lambda t},$$
where $\lambda$ is the parameter according to which the lines of the system are altered. It may depend on many things, like:
- Size of the system
- How frequently changes are required (are there many feature requests?)
- How many developers are working on it
- What kind of processes exist
- What kind of tools are used during the development, etc.
The good thing is that $\lambda$ can be estimated for any project from the change logs of the configuration management system (CMS). Don't have one yet? Damn, you should start using one instead of spending your time reading this kind of advanced stuff. For the rest of us, who do have a CMS, estimating $\lambda$ is really easy: just compute the average time between modifications of your system, and don't forget to take the overall number of lines of code into account as well! If you haven't already noticed, we made some assumptions in the background:
- The modifications of the lines of code happen independently of each other (which is usually not the case)
- The exponential distribution models the changes in the code (it is, however, a reasonable assumption)
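The estimation recipe above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a definitive implementation: the function name `estimate_change_rate` and its inputs are hypothetical, every timestamp is assumed to stand for one modified line, and all lines are assumed to change independently at the same rate, so that changes in the whole system arrive at rate $N\lambda$ and the mean gap between them is $1/(N\lambda)$.

```python
from datetime import datetime
from statistics import mean


def estimate_change_rate(change_times, n_lines):
    """Estimate the per-line change rate lambda, in changes per line per day.

    Assumptions (not from the original post): each timestamp is one line
    modification, and all n_lines lines change independently at the same
    rate, so the system as a whole changes at rate n_lines * lambda.
    """
    if len(change_times) < 2:
        raise ValueError("need at least two change timestamps")
    if n_lines <= 0:
        raise ValueError("n_lines must be positive")

    ordered = sorted(change_times)
    # Gaps between consecutive modifications, converted to days.
    gaps = [
        (later - earlier).total_seconds() / 86400.0
        for earlier, later in zip(ordered, ordered[1:])
    ]
    mean_gap = mean(gaps)
    if mean_gap == 0:
        raise ValueError("all changes share one timestamp; rate is undefined")

    # Mean gap of the whole system is 1 / (n_lines * lambda).
    return 1.0 / (n_lines * mean_gap)


# Hypothetical history: one line modification every two days, 100 lines total.
history = [datetime(2024, 1, day) for day in (1, 3, 5, 7)]
rate = estimate_change_rate(history, n_lines=100)
```

In practice the timestamps would come from the CMS log, and a commit touching several lines would contribute one timestamp per modified line.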
I think we can live with the imprecision of the first assumption. Anyone who cannot would need to analyze co-changes of lines in the system, make assumptions about which lines change together, and so on. But it is definitely not worth it, at least not at this point. By adopting this model, we implicitly treat the lines of code as subjects of a so-called Poisson process, i.e. each line of code undergoes a number of changes during a period of time, and this number is a Poisson-distributed random variable. We could determine the change rate parameter for each source code line separately (by counting how frequently each line has changed in the past), but for the sake of simplicity we'll take a global parameter $\lambda$, which determines the change rate of every line globally.
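To make the Poisson assumption concrete, here is the standard relationship it implies. The symbol $K_t$ is notation introduced here for illustration (it does not appear in the original text) and stands for the number of changes a single line undergoes in a period of length $t$:

$$P(K_t = k) = \frac{(\lambda t)^k}{k!}\, e^{-\lambda t}, \qquad \mathbb{E}[K_t] = \lambda t.$$

With $N$ lines changing independently at the same rate, the expected number of changes in the whole system over that period is $N \lambda t$, that is, $\lambda N$ line changes per unit of time.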
OK, so the code is continuously changing. So what? Next, let's find out how much this continuous change costs us. To that end, let $\mathcal{C}_L$ denote the cost of altering one line of the source code (adding new lines or deleting existing ones also counts). This constant also depends on many things, like:
- Price of resources (developers in this case)
- Processes used in development
- General attitude of the team, etc.
This information is also available in the CMS! Another important point is that this constant is not really a constant, as it may vary over time. Continuously updating its value could improve the model, but for the sake of simplicity we will not denote this dependency on time here. Besides these external factors, there are also some internal attributes that influence the cost of changing one line. Perhaps the most important one is the maintainability of the code. Obviously, the less maintainable the code is, the harder (and therefore more expensive) it is to change, and the converse also holds. And now comes the question that is always asked: how do we measure maintainability? The answer is simple: we don't have to, and we don't want to. The reason is what we have just stated: maintainability affects the cost of code change, and that is all we need. So, by our assumption, maintainability is just a number assigned to the code base that correlates (inversely) with the cost of code change. Precise definitions do exist, such as the Maintainability Index, but no one has ever proved that it strictly correlates with the actual maintainability of a system.
At last, we can state our assumption about the cost of development: at any time $t$, asymptotically $\lambda N(t)$ lines of code are changed (where $N(t)$ is the number of lines of code at time $t$, and $\lambda$ is the change rate of one individual line), and each change costs $\mathcal{C}_L / \mathcal{M}(t)$, where $\mathcal{M}(t)$ is the maintainability of the system. The overall cost is as follows:
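$$\frac{d\,\mathcal{C}_1(t)}{dt} = \frac{\mathcal{C}_L \, N(t) \, \lambda}{\mathcal{M}(t)}$$
The right-hand side expresses the rate at which cost accrues at any time $t$. We get the price of code change for a particular time interval $[t_0, t_1]$ by the following formula: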
$$\mathcal{C}_1[t_0,t_1]=\int_{t_0}^{t_1}\frac{\mathcal{C}_L \, N(t) \, \lambda}{\mathcal{M}(t)}\,dt$$
Next time we'll continue from this point and see why the "Maintenance costs" statement holds.
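The interval cost formula can be evaluated numerically once $N(t)$ and $\mathcal{M}(t)$ are known. The following Python sketch uses the composite trapezoidal rule; the function name `change_cost`, its parameters, and all numbers in the example are hypothetical and chosen purely for illustration.

```python
def change_cost(cost_per_line, change_rate, n_lines, maintainability,
                t0, t1, steps=1000):
    """Approximate C_1[t0, t1] = integral of C_L * N(t) * lambda / M(t) dt.

    n_lines and maintainability are callables of time t (assumed inputs);
    the integral is approximated with the composite trapezoidal rule.
    """
    if t1 < t0:
        raise ValueError("t1 must not be smaller than t0")
    if steps < 1:
        raise ValueError("steps must be at least 1")

    def rate(t):
        # Cost accrued per unit time: lambda * N(t) changed lines,
        # each costing C_L / M(t).
        return cost_per_line * n_lines(t) * change_rate / maintainability(t)

    h = (t1 - t0) / steps
    total = 0.5 * (rate(t0) + rate(t1))
    for i in range(1, steps):
        total += rate(t0 + i * h)
    return total * h


# Hypothetical project: constant size and maintainability over 30 days.
constant_cost = change_cost(
    cost_per_line=10.0,
    change_rate=0.005,
    n_lines=lambda t: 1000.0,
    maintainability=lambda t: 2.0,
    t0=0.0,
    t1=30.0,
)

# Hypothetical project that grows by 10 lines per day, maintainability 1.
growing_cost = change_cost(
    cost_per_line=10.0,
    change_rate=0.005,
    n_lines=lambda t: 1000.0 + 10.0 * t,
    maintainability=lambda t: 1.0,
    t0=0.0,
    t1=30.0,
)
```

Passing $N$ and $\mathcal{M}$ as functions of time keeps the sketch close to the formula; with constant inputs the result reduces to the rate multiplied by the length of the interval.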




