Agile practices have been adopted worldwide. Many companies are proud of being agile. Each of them implements it in a different way, which is good because there is no single way to do it, and it needs to be adapted to each scenario.
From my point of view, too much focus is set on some aspects of Agile while a key one is forgotten. It is nothing new, but if this aspect is not considered, releasing new versions of your software can be a never-ending story. This is what I will tackle in this article.
But before digging into that, I would like to do a small recap about the evolution of software development, starting with Waterfall and the birth of Agile. You may want to skip it if you already know about it.
A Long Time Ago, There Was Waterfall
Many developers who started their careers in the last 10 years have probably only worked in Agile environments. However, Agile was not always there. Waterfall was a common practice when I started my university education back in 2003.
In Waterfall, the applications were often planned as a whole (or at least in big chunks that were difficult to manage). It could take many months before the customer saw some working software.
The development was planned as a project, with a beginning and, theoretically, an end (the end of the company, sometimes). The project was divided into sequential phases, typically requirements, design, implementation, testing, and deployment.
Each stage was often carried out by different people. The customer was involved only at the beginning, when signing the contract, and at the end. If the project finished successfully, the application was deployed to production and often handed over to a different team, which would be in charge of keeping the lights on.
As many of you know, those practices caused many problems. Some of them were:
- Collecting all the requirements, designing the system upfront, and planning the tasks was almost impossible and would take many weeks.
- Handovers at every stage would lead to miscommunication.
- It would take a lot of time to deliver something. Many times, much more than expected. And you know: while there is no software, there is no money.
- It was also difficult to end within budget and scope.
- There was no feedback loop between stages. Many times, there was feedback only at the end. If the feedback was negative, the whole investment failed.
Agile Has Come to Save Us
When I started working, I heard about a new concept: Agile. For a while, I didn’t know what it was about. At that time, I was focused on improving my development skills and didn’t care that much about software development methodologies.
Agile appeared to fix the many problems that Waterfall presented, and it became rapidly adopted. But many took it as a dogma, thinking that adopting some rituals would solve all of their problems. What I mostly heard was a bunch of buzzwords such as: “Scrum,” “eXtreme Programming (XP),” “Sprint,” “Retro,” “Grooming,” and some others (which later I found were more related to Lean) such as “Kanban,” “Work In Progress (WIP).”
Most of those words didn’t tell me a lot on their own. After some time, I became interested in understanding why a lot of companies were embracing these practices, so I started digging into the topic.
In my initial research, I could understand the meaning of some of those words, and I saw a lot of discussions: Should the team do Scrum, Kanban, XP, or other? How long should the team spend per sprint on each of the rituals? What should we talk about during standup? And so on.
The information I found was very superficial, and I felt I was missing something. Maybe it was my research skills; I would never be Indiana Jones. Looking back, I think that it was not only that.
When I read the Agile Manifesto, one of the things I read was: “Individuals and interactions over processes and tools.”
During my research, I had the impression that too much focus was set on processes and rituals. This was the first sign that either Google’s algorithm or some parts of the software industry were focusing on the less important topics.
It is also worth clarifying that some read these principles as “the second part is not important.” However, as I’ve heard many times, the principles are meant to be read as: “even though the second part is important, the first one is more.”
Focusing on the main points of Agile, one of the main differences with Waterfall was the organization of the development in increments: an application was to be divided into small, prioritized chunks. Each chunk was developed within a sprint (usually one or two weeks). After each sprint, a functional version of the application should be available for the customers. The customers (or, in some cases, business representatives) were actively involved during the process, so feedback was available early on. You can see how these ideas tackled some of the problems that Waterfall presented: teams delivered fast and, thanks to earlier feedback, failed fast.
To continue with the Manifesto, one of the principles says: “Welcome changing requirements, even late in development.” In the beginning, I couldn’t understand how that could be possible. I was not sure if that was a good idea. However, after thinking about it and after having different experiences in Agile environments, I consider it one of the most important parts. It is important to consider that changes can involve new features and new concepts for the domain, but also changes in existing behaviour, changes in the market, etc.
So not only was it important to move fast in Agile (deliver early the value that the customers need, fail fast with early feedback), but it was also important to react quickly to changes in the environment. These are not the only important aspects, but I want to focus on these two in the rest of the article.
So let’s continue in this evolution of software development practices, mentioning some of the improvements that contributed mainly to moving fast.
Move Fast: The Evolution of Software Coding and Shipping Practices
The tools at the time didn’t make it easy to adopt Agile practices, such as frequent deployments. Some practices and technologies appeared in the last 25 years to help teams ship software faster. Some of them appeared as a consequence of Agile, and some of them appeared independently.
I will just briefly mention some of them and their benefits without going into details. I consider that most are broadly known, but if one is new for you, there are many resources online to find out more:
- Test automation gave developers confidence by preventing lots of bugs and reducing the need for debugging.
- Version control helped developers track changes back to when they were added, revert them easily, solve conflicts in a much easier way, and it was the base for many other improvements.
- Feature branches or working in parallel with branches. Parallelism is always good, right? Well, this also presented in many cases a new problem: When doing a large number of changes or keeping branches open for a long time, developers would need to spend a lot of time integrating those changes. This practice doesn’t work well with some of the other practices mentioned next.
- Feature flags. An option if teams need to work on several initiatives in parallel for a long time but want to avoid the conflicts that often appear with long-living branches.
- CI/CD (Continuous Integration and Continuous Delivery/Deployment). Integrating the changes, running the tests, and deploying often removes the big barrier of delivering to production.
- Code reviews/Pair programming. Years ago, everybody was developing on their own, even critical code. Reviewing code helped us not only to code better but also to share knowledge, learn, and reduce the possibility of missing something important. I would argue that pair programming is not always needed, but it can provide advantages over big pull requests on long-lived branches, since the feedback is provided while coding, which avoids having to rewrite code.
- You build it, you run it. The development team also became responsible for maintaining the application. This makes sense because the team that develops the application can understand it better than anybody else, and it brings several benefits:
  - Less time to identify issues.
  - Fewer mistakes made when fixing issues.
  - In addition, dealing with issues in a production environment brings the most valuable lessons. These can benefit further developments.
  - Also, a lot of time is saved because no handover meetings are needed anymore.
  - If the team does on-call work, the quality of the software can benefit further, since everybody will do their best to not get paged at an inappropriate time.
- Changes in team structure: having small, independent, cross-functional product teams to break the IT silo.
  - Removing handovers between development phases speeds up delivery and avoids miscommunications.
  - Keeping teams closer to the domain experts will have a positive effect on the quality of the solutions and boost focus by reducing context switches, among others.
  - It also promotes a modularization of systems (Conway’s law).
  - Be aware that, by cross-functional, I don’t mean full-stack individuals, which I consider very difficult and even counter-productive. The idea of “T-shaped developer” might be more interesting.
- DevOps, containers, service meshes, distributed architectures, and the cloud. The infrastructure becomes self-service (the team can deploy without waiting for server allocation). It also becomes immutable, easy to replace, increases security, and much more.
- Automatic dependency updates.
- Better monitoring and incident management, such as SLI/SLO, DORA metrics, and others, to close the feedback loop and make more informed decisions.
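As an illustration of the feature flag idea from the list above, here is a minimal sketch in Python. The flag name `new_checkout`, the `FLAGS` dictionary, and both checkout functions are hypothetical and exist only for this example; real projects usually read flags from configuration or a dedicated flag service.

```python
# Hypothetical in-memory flag store; real systems usually load this from
# configuration or a dedicated flag service.
FLAGS = {"new_checkout": False}


def is_enabled(flag: str) -> bool:
    """Return True only if the flag exists and is switched on."""
    return FLAGS.get(flag, False)


def legacy_checkout(total: float) -> str:
    return f"legacy checkout: {total:.2f}"


def new_checkout(total: float) -> str:
    # Unfinished work can live on the main branch while the flag is off.
    return f"new checkout: {total:.2f}"


def checkout(total: float) -> str:
    """Route to the new code path only when the flag is enabled."""
    if is_enabled("new_checkout"):
        return new_checkout(total)
    return legacy_checkout(total)
```

The point of the sketch is that both code paths are integrated continuously, so there is no long-living branch to merge later; switching the behaviour is a configuration change instead of a big integration.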
Apart from those practices, new tools, IDE improvements, frameworks, libraries, and language evolution allowed us to speed up the development of software.
If your company embraces most of the practices mentioned above, it probably means that software can be delivered much faster than before. And the changes are not only faster but also safer, since many tasks got automated or delegated.
Some could think that software delivery is as fast as lightning, right?
In my experience, it still can take sooooooooo long to deliver VALUE, even when those practices are followed. And here are some cases:
In one case, it was the first time I touched the code. Thankfully, I was working with people who knew a lot about the application. The work went like this:
- It took us three days to understand the code that had to be adapted.
- Implementing the main changes with the corresponding tests took us two weeks.
- Finally, we spent around 16 weeks doing bug hunting throughout the code and implementing the fixes.
As you can see, a lot of time was spent on the third point. Of those 16 weeks, around 70% was hunting the bugs, around 20% was understanding the logic we found and thinking about how to adapt it to the new behaviour, 3% was spent writing automated tests, and 1% implementing the changes themselves.
Some reasons that explain why it took us so long:
- The service was doing way too many things.
- The code was quite tightly coupled: a change in one place broke many things in different parts of the code.
- At the same time, the code was not very cohesive. The logic that managed the same concern was spread around.
In another case, we spent three days understanding what we needed to change. It took us one day to implement the changes and seven days to make the tests pass again.
In this case, there was a large number of hacks in the code to fix previous bugs. Some of them were not even totally fixed. It was clear that some developers weren’t familiar enough with the technologies. Luckily, the problems were not critical and the customers didn’t notice. The bad part is that they popped up when applying the new changes, making testing almost impossible, so we had to fix them before finishing our task.
Another problem was that the stacks were not up to date, so many solutions that we wanted to implement wouldn’t work.
In these cases, you can see that a lot of time was wasted on debugging and on understanding the code that needed to be adapted.
Apart from those, there were build lanes that would run forever (lots of code in the same application, slow and/or flaky tests, etc.).
So maybe you see where I’m heading. Otherwise, continue reading.
And the Forgotten Part Is…
In the previous sections, you saw that many practices around software development were adopted to make it faster: Agile, CI/CD, DevOps, team structure, etc. However, keeping existing code working while changing it can be a nightmare. A huge amount of time is still wasted in understanding code and chasing bugs around, even though a bunch of practices that would help have existed for ages. In the meantime, there is a lot of noise around Scrum, Kanban, independent teams, etc., and, as I mentioned above, the Agile Manifesto doesn’t focus that much on specific processes or rituals; it just gives some general ideas. So this leads me to say that:
Code (and test) quality (especially flexibility) is one of the most important parts of Agile that is being ignored or forgotten.
Code quality is not an exclusive problem of Agile, but it is especially important here, even more than in traditional projects. And there is the misconception that, in Agile, there is no need to plan anything or invest effort in software architecture and design, but it is actually the opposite. The code is expanded and adapted very often, so it is critical to make systems easy to change or they will make future development very slow. So, as for developing in Agile, the next iteration of the sentence could be:
System flexibility is one of the most important parts of Agile that is being ignored or forgotten.
And this is nothing new. It is mentioned in the Agile Manifesto as we saw in a previous quote, but there is more:
“Continuous attention to technical excellence and good design enhances agility.”
“Agile processes promote sustainable development. The sponsors, developers, and users should be able to maintain a constant pace indefinitely.”
How can a team keep a constant pace if the code is messy and difficult to understand and change?
And keep in mind that quality doesn’t mean complexity. Simplicity is also important:
“Simplicity–the art of maximizing the amount of work not done–is essential.”
Architecture is normally associated with rigidity and planning ahead because we often think of buildings, which are difficult to change. From that point of view, it would make no sense to invest time in architecture or design in an Agile context because things change often.
But in the software world, we can choose architectures that leave options open and make changes easier. And we don’t need to think far ahead. In fact, Agile promotes thinking in the short term. It is well known that predictions are not easy and in the end, we pay the price. We don’t know what the customer will need even in two days.
For people who don’t do Agile, there is another quote:
“High performers understand that they don’t have to trade speed for stability or vice versa, because by building quality in, they get both.”
Consequences of Slow Development
Development gets slower as we add more code, and this is not only a problem for developers. Every stakeholder describes it in their own terms, and it means something slightly different to each of them:
- For lovers of UX experiments: Experiments can take weeks instead of days.
- For product lovers: Reacting to market/meeting customer needs in time… will it happen?
- For managers: Missing goals, work accumulating.
- For DORA metrics lovers: Deployment Frequency (DF) decreases, Mean Lead Time for Changes (MLT) increases, Change Failure Rate (CFR) increases.
- For the company: Even if we have great ideas, if they cannot be converted easily into software, the company will get stuck. Code will be a liability instead of an asset and in the meanwhile, competitors may take advantage.
- For developers: Having to deal with bad code can lead to burnout, or a laptop flying through the window.
So it is clear that development will get slower, but thankfully, you know that if you invest in flexible code, you can save some time.
However, it is not easy to convince some people of this. There is still a problem: the effects of rigid code are not seen immediately. They only become visible in the long term. Also, investing in flexible code might seem slower in the short term. So who would want to invest in something that doesn’t have an immediate effect? We know that people think mostly in the short term. We need to give more visibility to how the quality of code will affect future development.
Connecting the Effect of Rigid Systems With the Business
The effect of rigid code on how long it takes to develop something is not easy to spot. Thus, the effect on the business is also not obvious. Without a connection between both, it will be difficult to convince people of the importance of having flexible systems. It is a very complex topic, but the gap needs to be filled.
The DORA metrics fill that gap, at least partially. However, code is not homogeneous, code rots slowly, and the problems might be detected too late.
The TCP (Technology Capability Plan), introduced by Glenn Engstrand in his talk Managing Tech Debt in a Microservice Architecture, could be a starting point.
I don’t know an ideal solution for this. I dream of having a score that gets updated after every commit, which could be used as a multiplier for the estimation of how long a future task will take to be completed.
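To make that dream a bit more concrete, here is a toy sketch in Python. Everything in it is an assumption made up for illustration: the `flexibility_score` between 0 and 1, the linear formula, and the `max_penalty` parameter are not an established metric, and how such a score would be computed from the code is exactly the open question.

```python
def adjusted_estimate(base_days: float, flexibility_score: float,
                      max_penalty: float = 3.0) -> float:
    """Scale a task estimate by a hypothetical code flexibility score.

    flexibility_score: 1.0 means perfectly flexible code (no penalty),
    0.0 means fully rigid code (estimate multiplied by max_penalty).
    The linear interpolation is an arbitrary choice for illustration.
    """
    if not 0.0 <= flexibility_score <= 1.0:
        raise ValueError("flexibility_score must be between 0 and 1")
    multiplier = 1.0 + (max_penalty - 1.0) * (1.0 - flexibility_score)
    return base_days * multiplier
```

With these made-up numbers, a task estimated at 4 days on code with a score of 0.5 would be planned as 8 days, which makes the cost of rigidity visible in the same unit the business already uses.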
However, I would like to finish with some simple ideas about how to improve the flexibility of code.
Ideas for Improving the Flexibility of Code
There are many books and resources online about the quality of code, so I won’t go into many details:
- Code needs to be easy to understand. By reading the method name, it should be clear what it does. Understanding one method should not take more than, for example, 10 seconds.
- Code needs to be modular. Small pieces, loosely coupled, encapsulated. This applies at every level: system, library, package, class, and method.
- Code needs to be cohesive. Each piece needs to do only one thing and all the code that does that one thing needs to be close together.
- It is especially important to separate business logic from framework constructs like controllers, listeners, filters, etc.
- Code needs to be simple (don’t over-engineer, do not force design patterns in).
- Develop for current needs. When you think: “What if we need X functionality?”, apply YAGNI (You Aren’t Gonna Need It).
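The point about separating business logic from framework constructs can be sketched in Python as follows. The `Order` type, the 10% premium discount rule, and the `handle_order_request` function are hypothetical names invented for this example; in a real application the adapter would be a controller or listener provided by your framework.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Order:
    subtotal: float
    is_premium_customer: bool


def calculate_total(order: Order) -> float:
    """Business rule (hypothetical): premium customers get 10% off.

    Pure function: no HTTP, no database, no framework types.
    """
    discount = 0.10 if order.is_premium_customer else 0.0
    return round(order.subtotal * (1 - discount), 2)


def handle_order_request(payload: dict) -> dict:
    """Thin controller-like adapter: it only translates input and output."""
    order = Order(
        subtotal=float(payload["subtotal"]),
        is_premium_customer=bool(payload.get("premium", False)),
    )
    return {"total": calculate_total(order)}
```

Because the rule lives in `calculate_total`, it can be read, tested, and changed without touching the request handling, and the framework can be upgraded or replaced without touching the rule.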
For tests, keep in mind that they grant safety but they resist changes and can eat up your time more than the production code itself, so:
- They need to bring enough value. Doing TDD (Test Driven Development) is a practice that helps you achieve it.
- If they test implementation details, they will be very difficult to change.
- I’ve seen a lot of unit tests that verify pieces of code that provide internal results. Those pieces of code are the result of how a developer organizes the code. If the logic needs to be organized in a different way and moved around, many tests break. When this happens, tests don’t protect you anymore and are almost useless. It also gives the team extra work. So it is important to think about what is actually a unit. I’ve been thinking a lot about this after having to reorganize the code several times and I started to think that the concept of a unit may not just be methods or classes, but maybe it needs to be thought of in terms of features. This might sound weird, and similar to an integration test, but it is not. To my understanding, the book Unit Testing: Principles, Practices, and Patterns confirmed that I’m not crazy, or at least I’m not the only one.
- Favour simpler and faster tests: unit over integration. Here I mean integration at the application level: integration tests should take care of things that cannot be verified in unit tests, such as integration with frameworks.
- E2E tests are not needed by default for every feature. They can be important for features that are life-critical, PII-critical, or that could make the company lose a lot of money, or similar.
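To illustrate the difference between testing behaviour and testing implementation details, here is a small Python sketch. The `ShoppingCart` class and its methods are hypothetical and exist only for this example. The test goes through the public API only, so the private helper can be renamed, inlined, or moved without breaking it.

```python
class ShoppingCart:
    """Hypothetical class used only to illustrate the testing idea."""

    def __init__(self) -> None:
        self._items = []  # list of (name, unit_price, quantity) tuples

    def add(self, name: str, unit_price: float, quantity: int = 1) -> None:
        self._items.append((name, unit_price, quantity))

    def total(self) -> float:
        return round(sum(self._line_total(p, q) for _, p, q in self._items), 2)

    @staticmethod
    def _line_total(unit_price: float, quantity: int) -> float:
        # Internal detail: free to be renamed, inlined, or moved elsewhere.
        return unit_price * quantity


def test_cart_total_includes_all_items() -> None:
    """Behaviour-level test: it only uses the public API (add, total)."""
    cart = ShoppingCart()
    cart.add("book", 12.50, quantity=2)
    cart.add("pen", 1.25)
    assert cart.total() == 26.25


test_cart_total_includes_all_items()
```

A test that called `_line_total` directly would break as soon as the code was reorganized, even though the feature still worked, which is the kind of extra work described above.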
Keeping the system flexible is critical if you need to introduce changes often, as is the case in Agile environments. Otherwise, after some time, it will be harder and harder to incorporate changes. Other practices, such as the ones mentioned in the article, will help you compensate for that for a while, but in the end, their effects will be less significant. Many times, practices that help you keep the code flexible are very easy to apply.
There is more evidence every day that supports the idea that quality helps keep a constant pace in development. Also, new ideas appear that could connect the effect of rigid code with the business.