April 26, 2008
» Apply Elegant Architecture to Your Dev Team: Part II

Previous: "Apply Elegant Architecture to Your Dev Team: Part I"

When you are creating new software, what do you think about? Don’t you think about how it will scale, with high availability, to meet your needs as they grow? Even if you don’t achieve that on the first try, you are still thinking about it and striving to achieve it. You know that to do it, you will need to make the right technology and architectural choices. Over time, you may need to change some of those choices to keep pace with competitors. Even if your current needs are modest, software architecture has evolved technologies and patterns that allow software to scale from a single user to hundreds or even thousands of users on multiple platforms at multiple locations with 99.999% uptime.

If you consider your development organization in the same way, how would you apply the same thinking? What technologies are you using? What is the architecture of your organization? Will it scale from its current size to double its size? Will it scale seamlessly to include new teams in new locations? What will happen if you acquire a company? When creating software, you want to design it so that it is flexible and adapts to new circumstances. The same should be true for your development organization.

Another way to look at it is to ask how a particular process would fare if each resource were available only at seemingly random times for random periods of time. This exactly describes the world of Open Source Software (OSS). At any given time you have no idea who will be contributing to what or how valuable the contribution will be. You don’t know where the contribution will come from and in many cases you don’t even know who the contributor actually is. This is an extreme example of a development situation. Even though your situation is probably not as extreme as this, by using techniques from the OSS world you will be better positioned to handle unexpected events when they inevitably occur.

Problems are an inevitable and regular part of life. Examples include illness, job change, human error, flight delays, system failure, unanticipated market changes, and natural disasters. The ability to cope with these problems is one measure of the robustness of the organization.

Part of robustness is that as things scale up or down, the impact is minimal. Tools and processes should be selected and designed to work well together whether they are used by 1 person or 10,000. At all times, it should feel like each individual is part of a team no bigger than 12 people. If practices are not scalable from 1 to 10,000, then people can develop bad habits that resist scaling. If you develop habits that exist in a scalable framework, then it is more likely that scaling can and will happen as and when needed.

Next: How Agile Solves Problems

» Advanced Multi-Stage Continuous Integration

In parts 1 through 3 of this series, I described how Multi-Stage Continuous Integration can be used at a team and multiple-team level. In this post, I will describe how Multi-Stage CI can also encompass other development stages.

The Development Hierarchy
So far, we’ve split the mainline up into a hierarchy of branches. Each developer has their own workspace and is part of a team, each team has its own branch, and each team branch is based on and merges back to an integration branch, which may be the mainline. Sometimes work is organized at a feature level instead of at a team level. For simplicity, I’ll just refer to teams. Each part of the hierarchy corresponds to a stage of development. This hierarchy enables multiple parallel development pipelines where changes move from a low level of maturity to a higher level of maturity and the work in the parallel pipelines is continuously integrated.

We’ve covered three stages of development: individual developer coding, team integration, and full project integration. But there are many more stages in a typical project lifecycle, even when you are doing Agile development. Traditionally, the stages of the lifecycle occur in several different places including project plans, the SCM system, and the issue tracking system. This adds development stages such as: code reviewed, system tested, UAT passed, demoed, etc. Each of these stages can be added as a branch to your branch hierarchy.

When using Multi-Stage CI, unlike the typical use of branches, changes do not linger on a branch any longer than required to pass through the stage that branch is associated with. The difference between any given branch and its parent in the hierarchy cycles rapidly between almost nothing and nothing. The goal is to push changes through the hierarchy as rapidly as possible.


When working within a development hierarchy, the impact of any given change is limited in scope. If there are 4 members on a team, then only 3 other people are affected when a change is merged into the team branch. Because changes are made on the team branch only when the developer has things working in their own version, changes at the team level are made less frequently and have a lower probability of destabilizing the build. Once the team branch is considered stable, changes are then merged into an integration branch which consolidates the work for multiple teams. Again, changes are made less frequently and have a lower probability of destabilizing the build. Thus, the graph at the integration level shows fewer changes and the changes are smaller. As you proceed towards the root of the hierarchy, changes are less frequent and less likely to destabilize the build until finally you reach the root where you have code that is always shippable.

Getting Started
Getting to Multi-Stage CI takes time, but is well worth the investment. The first step is to implement Continuous Integration somewhere in your project. It really doesn’t matter where. For a recommendation on an excellent book on Continuous Integration, see the bibliography. The next step is to implement team or feature based CI. Once you have that working, consider automating the process. For instance, you can set things up such that once CI passes for a stage, it automatically merges the changes to the next level in the hierarchy. This keeps changes moving quickly and paves the way for easily adding additional development stages.
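
As a rough illustration of the automation described above, here is a minimal Python sketch of a change being promoted through a hierarchy of stages. Everything in it is an assumption made for the example: the stage names, the `promote_change` function, and the idea of modeling each stage's CI run as a simple function that returns True or False. A real setup would call your SCM and build tools instead.

```python
def promote_change(change, stages):
    """Move a change up the hierarchy for as long as each stage's CI passes.

    stages: list of (stage_name, ci_check) pairs, ordered from the lowest
            level of the hierarchy to the root. ci_check is a hypothetical
            stand-in for a real CI run; it takes the change and returns
            True when the build and tests succeed.
    Returns the name of the stage where the change ends up.
    """
    position = 0
    # The root has no "next level", so stop one short of the end.
    while position < len(stages) - 1:
        _, ci_check = stages[position]
        if not ci_check(change):
            break  # the change stays here until somebody fixes it
        position += 1  # CI passed: merge to the next level automatically
    return stages[position][0]


# Hypothetical three-level hierarchy: team branch, integration branch, mainline.
stages = [
    ("team", lambda change: change["builds"]),
    ("integration", lambda change: change["tests_pass"]),
    ("mainline", lambda change: True),
]

print(promote_change({"builds": True, "tests_pass": True}, stages))
print(promote_change({"builds": True, "tests_pass": False}, stages))
print(promote_change({"builds": False, "tests_pass": True}, stages))
```

Adding another development stage (say, "code reviewed") is then just a matter of adding another entry to the list, which is the point of automating the promotion step.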

I’ve seen Multi-Stage Continuous Integration successfully implemented in many shops and every time the developers say something like: “I never realized how many problems were a result of doing mainline development until they disappeared.”

Next: It is Better to Find Customer Reported Problems As Soon As Possible

March 27, 2008
» Apply Elegant Architecture to Your Dev Team: Part I

There is another way to think of software development which allows you to leverage your skills as a developer and apply them in a new way. The process of developing software is, in effect, an algorithm implemented with various technologies. You can think of your team, the technologies you use, and your development methodology as a piece of software.

What do you do with software? You improve it. You add new functionality, you increase its performance and usability and you remove bugs. Also, software itself is a document: source code. It is a description of how to perform a set of tasks. You improve the software by changing the source code. Let’s call the combination of your team, tools, and techniques: “the process” and the source code for the process “the process document.”

Not only can you think of your development process as software, you can think of your whole development organization and the people in it as a combination of hardware and software with various communication links. I don’t recommend that you take this advice literally and think this way on a regular basis, or treat people as interchangeable cogs in the machine! However, by thinking about it this way you can leverage your technical design skills to think about how to organize and optimize your development organization and development process. You can leverage well-known design patterns. For instance, consider the communication aspect.

In the world of information transfer there is bandwidth, latency, and connectivity. The best environment for communication is high bandwidth with low latency that is always connected. The worst environment for communication is low bandwidth with high latency that is infrequently connected.

This gives a well-known context for discussing human interaction. On one end of the spectrum is self-communication. If you are responsible for two interdependent modules, then when you make a change to one, you instantly know you need to make a change to the other. You won’t misunderstand yourself or need to have a back and forth conversation to really understand. You just know. On the other end of the spectrum you have two people from different cultures revising a document via e-mail who live exactly half way around the world from each other. In between you have pair-programming, collocation, people working together but sitting on different floors of the same building, folks with a great deal of physical separation in the same time zone, a different time zone, etc.

Then there are different forms of information interchange such as video link, phone, e-mail, IM, wiki, document, etc. This way of thinking can guide your decisions about where to seat people, the value of having high bandwidth links, etc.

Next: Apply Elegant Architecture to Your Dev Team: Part II

March 26, 2008
» Is Your Dev Team Having Performance Problems? Try Niagra!

Have you ever thought “why do the companies I work at have so many problems with developing software?” Well, I can tell you from interacting with literally thousands of software development organizations that most development organizations have problems with reliably and predictably producing high quality software. What I hear over and over again is “if only we could hire the right people” or “if only people could be more disciplined.” The other thing I hear about over and over again is the various tweaks to traditional development that people believe will fix things, except that “other people just don't get it.”

Well, there are only so many people out there to hire so unless we find some magical place to hire more of “the right people” from, we are going to have to figure out how to use the people we have. And I hate to break it to you but you just aren’t going to get more discipline out of people. Whatever discipline that exists today, that’s all you are going to get. So what’s left? Changes to the way we do things.

Let’s say there was some combination of techniques which would mean that most development groups could use traditional development to reliably and predictably produce high quality results. Let’s call it the “Niagra” method. Furthermore, you (or whoever wants to make this claim) get to define the Niagra method. It works every time. Just use Niagra and performance improves overnight (when used as directed).

If the Niagra method existed, it would have become widespread by now. It would have become the #1 prescribed solution to project performance problems. If it existed and produced the claimed results, people would start referring to it, and it would spread rapidly. That’s how C++, Java, and HTML became mainstream. People rapidly adopted them because they provided benefits that anybody could realize. In fact, there are many proposed Niagra methods, but none of them have become mainstream because they only work under special conditions.

This raises an obvious point. If traditional development is so bad and is such a big failure, then why has the software industry done so incredibly well? The software industry is an incredible success, that’s absolutely true. It is incredible how much of our daily lives runs on software and how much software has positively influenced our quality of life. The benefits of software, when it does finally ship and does finally have a high enough level of quality, are worth waiting for. Otherwise, of course, there would no longer be a software industry.

There have been incremental improvements to the software development process which have produced incremental gains and have become widespread. Examples are: the use of software development tools such as source control and issue tracking, breaking down large tasks into small tasks, nightly builds, one-step build process, moving from procedural programming to object-oriented programming and many others. The important point here is that each of these improvements has become widely adopted and made things somewhat better, but still didn’t address the root cause. The process is still unpredictable and unreliable.

This is akin to the big-O view of algorithms. If you have an algorithm which takes 100*n^2 operations, then getting that down to 50*n^2 is a good thing, but changing the algorithm to (for instance) take 200*n operations is much better. To date, changes to the software process have been of the first variety, shrinking the constant.
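
To put numbers on that comparison, here is a small Python sketch using the three cost formulas from the paragraph above. The function names are made up for the example.

```python
def quadratic_cost(n, constant):
    """Operations for an algorithm that costs constant * n^2."""
    return constant * n * n


def linear_cost(n, constant):
    """Operations for an algorithm that costs constant * n."""
    return constant * n


def compare(n):
    """Cost of each of the three variants discussed above for input size n."""
    return {
        "100*n^2": quadratic_cost(n, 100),
        "50*n^2": quadratic_cost(n, 50),
        "200*n": linear_cost(n, 200),
    }


for n in (1, 4, 10, 1000):
    print(n, compare(n))
```

Halving the constant always saves exactly half. The linear version, despite its larger constant, breaks even with 50*n^2 at n = 4 and is 250 times cheaper at n = 1000 (200,000 operations versus 50,000,000).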

Perhaps Agile development is the answer. But isn’t Agile “yet another fad” which will blow over any day now? After all, we have had CMM, ISO 9001, TQM, Critical Chain, and many other “next big thing” ideas in the past. What makes Agile any different? Previous ideas have all been centered around the idea that the framework underlying traditional development is fine and that it is the implementation, discipline, people, or management of the people that is the problem. No, it is the underlying framework that is the root problem.

Certainly, people issues are important. Getting a team gelled and working well together with the right combination of skills is essential. But the success rate of these teams isn’t exactly stellar either! Yes, it is better, but they have similar problems with predictability and quality. A root problem remains. I do not claim that solving this problem means that suddenly teams that have insufficient skills and hate each other will start pumping out high quality software on a regular basis. But I do claim that Agile development has the ability to dramatically increase the potential of any team. And to truly succeed you have to do both: create a good team, and use Agile development.

Let's take a deeper look at the fundamental problems with traditional development.

Next: Traditional Development is a Chinese Finger Trap

March 25, 2008
» Is Your Software Development Organization Mainstream?

Have you ever wondered how your organization compares to other organizations when it comes to the process of developing software? Do you feel like there is an area that really needs improvement, but everybody else says "that's the way everybody does it?" While looking at what is involved in mainstream Agile adoption, I realized that it would be useful to create a list of the current mainstream software development practices for comparison purposes and I thought other folks might find it useful as well.

I define a practice as being mainstream if it is a practice that most people would agree that most people do. It may not be that they do that exact practice, they may do something equivalent or further along the spectrum of what constitutes a good practice, but they at least go to the level described.

The word "practice" is the key word here. This is about what people can be observed as actually practicing in their day to day real work. Things which are described by policy or documented as the correct way of doing something but avoided and worked around don't qualify. A mainstream practice is something that is in common usage that people do because they believe the value is worth the effort and not doing it is sort of like saying “electricity and hot and cold running water aren’t for me, I much prefer candles and fetching water from the well.”

While each of the following individual practices is considered a mainstream practice, that does not mean that it is mainstream to be doing all of the following practices. That is, in some of the areas that these practices cover, any particular development organization may be operating at a lower level than these practices, but for any particular practice, most development organizations operate at or above the level of these practices.

Overall

Basic flow – the common development activities are: talk to customers and/or do market analysis, document market needs via requirements, create a design, implement the design, write tests, execute the tests, do find/fix until ready to ship, ship.

Preparation

Requirements documented using Microsoft Word or Excel – while there are actually quite a few off-the-shelf requirements tools available, most people do not use them. The most common method for documenting requirements is via Word and Excel.

Recording of defects in Bugzilla – there are very few organizations which aren’t using at least Bugzilla to track bugs and Bugzilla is definitely the most popular choice. And recording of defects is pretty much as far as it goes. Enhancements, RFEs, and requirements are tracked separately.

Basic initial project planning – this is the simple act of picking a bunch of work to be done, estimating it, dividing it up among the folks that are available to do the work and determining a target ship date. Some teams use MS-Excel, some teams use MS-Project. This does not mean sophisticated project planning. The extent of re-planning is a simple “are all tasks done yet” with the occasional feature creep and last-minute axing of important functionality.

Basic design – prior to coding features or defect fixes that will take more than two days to do or are highly impactful, a design document is created, circulated, discussed, and updated.

Development

All source code is stored in a source control system – this is not to be confused with “all work is done in a source control system.” It is actually surprisingly common to see the source control system used for archiving of work done rather than as a tool for facilitating the software development process.

Source control and issue tracking integration via a hand-entered bug number in the check-in comment, without enforcement – while there are certainly more sophisticated integrations, most people at least type in the bug number when checking in changes so that they can later find the changes associated with that bug.

Defects have a defined workflow that is defined in and enforced by the bug tracking system – even if it is as simple as Open, Assigned, Closed.

Mainline development – all developers on a project check-in to the mainline (aka the trunk).

Refactoring – while the term “refactoring” has come to mean just about any change to the software, the idea that code should be periodically cleaned up and simplified is pretty much universal at this point.

Nightly build – the entire product is built every night. This does not necessarily mean that tests are also run automatically or that there is an automatic report of the results, just that there is a nightly build.
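
Two of the Development practices above, the hand-entered bug number in the check-in comment and the simple Open, Assigned, Closed workflow, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the "bug 1234" comment convention, the sample comments, the `Defect` class, and the particular transitions allowed between states are all invented for the example and are not taken from Bugzilla or any specific tool.

```python
import re

# Assumed convention: comments mention defects as "bug 1234" or "Bug #1234".
BUG_REFERENCE = re.compile(r"\bbug\s*#?(\d+)", re.IGNORECASE)

# Assumed workflow: which states each state may move to.
ALLOWED_TRANSITIONS = {
    "Open": {"Assigned"},
    "Assigned": {"Open", "Closed"},
    "Closed": {"Open"},  # reopening a defect
}


def bug_numbers(comment):
    """Return the bug numbers mentioned in a check-in comment."""
    return [int(number) for number in BUG_REFERENCE.findall(comment)]


class Defect:
    """A defect whose state changes are checked against the workflow."""

    def __init__(self, number):
        self.number = number
        self.state = "Open"

    def move_to(self, new_state):
        if new_state not in ALLOWED_TRANSITIONS[self.state]:
            raise ValueError(f"cannot move from {self.state} to {new_state}")
        self.state = new_state


print(bug_numbers("Fix null check in parser (bug 4821)"))
print(bug_numbers("Bug #17 and bug 42: tidy up logging"))
print(bug_numbers("Refactor build script"))

defect = Defect(4821)
defect.move_to("Assigned")
defect.move_to("Closed")
print(defect.number, defect.state)
```

Note that nothing here forces a developer to type the bug number, which matches the "without enforcement" part of the practice. Only the workflow is enforced.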

Quality

Mostly manual testing – this one is the hardest to understand. The process of testing software is one of the most backwards parts of software development.

Unit tests – while the overall testing of software is still mostly manual, unit testing has caught on rapidly in a very short span of time.

Using defect find/fix rate trend to determine release candidacy – this is not actually a particularly good practice, but it is what most people do.

Separation of developer, test, and production environments – you might think this is so obvious it isn’t even worth mentioning, but this principle is violated enough that it is worth mentioning that it is actually a mainstream practice. I mention it to emphasize that if you aren’t doing this, you really need to.

Releasing

Basing official builds on well-defined configurations from source control – the official process for producing production builds includes the step of creating an official configuration in the SCM system and using that configuration to do the build.

Releasing only official builds – all builds that go into production are produced using the official process.

Major and minor releases – creating a major release every 6 to 12 months and minor and/or patch releases on a quarterly basis.

What do you think? Are these right on target or way off base? Do you think there are any missing areas? Do you think the mainstream practice level is higher or lower for some of these? Let me know what you think!

Next: "There is no Bug. It is not the Bug That Bends, it is Only Yourself."

January 27, 2008
» Scaling Agile and Stand-up Meetings

In the "ask away" section Matt poses the following question:

We're doing Scrum on a 10-12 person team (depending on what the DBAs are working on at the moment) and one of the more interesting problems I've been talking about with people is how to scale the process. I have a friend on a 200+ person project and they do Scrum-of-Scrums every day so one person from each "lower level" scrum team goes to another scrum meeting and so on up to the meeting with all the main project heads. This sometimes ends up taking 2 hours out of people's days if they have to go to a bunch of the scrum-of-scrums. What are your feelings on scaling Agile processes up to larger teams? Do you sacrifice somebody to be the "meeting person"? To me communication has always been the most important part of any Agile process, how do you keep that in large teams?

Scaling Agile
At a high level, it seems like your question is: “how do you scale Scrum?” In my experience, scaling is mostly independent of methodology. There are some Agile practices which can limit scalability, but luckily those practices are not required to get the primary benefits of Agile development. The practices which limit scaling are: relying on collocation and using 3x5 cards to the exclusion of other methods for storing and distributing project information. Other than that, what did the 200+ person project do to scale prior to Scrum? Whatever it was, there is no reason that I can see to prevent them from scaling using those methods. Speaking specifically about Scrum, the Scrum-of-Scrums is the method recommended by Ken Schwaber for scaling project information flow.

Another area which is often a problem with scaling in general is the frequent integration required by the short iterations. For that you may consider Multi-Stage Continuous Integration.

Stand Up Meetings
As for the stand-up meetings, those are mandatory in Scrum. I don’t happen to agree with that requirement, but let’s hold that thought for a moment.

First, let’s take a look at the expense side of the Scrum meetings. People have to get to the meeting. You have to wait until everybody is there. And then you have to get back to your computer. Scrum discusses how to minimize this time, but practically speaking, there is more overhead than just the ideal 15-minute meeting. If you are at a larger company, somebody has to book the room and let people know where it is. Let’s call the cost of the meeting 20 minutes per person. If you have 12 people in a Scrum, that’s 4 person-hours per day. That’s the equivalent of half of a person. Those Scrum meetings had really better be worth it!
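
The arithmetic is easy to redo for your own team. Here is a small Python sketch of it; the function names are invented for the example, and the 8-hour workday behind "half of a person" is my assumption.

```python
def daily_meeting_cost_hours(people, minutes_per_person):
    """Total person-hours spent on one daily meeting."""
    return people * minutes_per_person / 60


def as_fraction_of_a_person(person_hours, workday_hours=8):
    """Express person-hours as a share of one person's day (assumes an 8-hour day)."""
    return person_hours / workday_hours


cost = daily_meeting_cost_hours(people=12, minutes_per_person=20)
print(cost)                           # 4.0 person-hours per day
print(as_fraction_of_a_person(cost))  # 0.5, i.e. half of a person
```

Plug in the numbers for a Scrum-of-Scrums attendee who sits through several of these a day and the cost adds up quickly.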

If they are worth it, then there is actually no problem. And if somebody going to Scrum-of-Scrums has a personal dislike of meetings, even when they provide value, then perhaps somebody else should be doing the Scrum-of-Scrums.

Now let’s take a hard look at the stand-up meeting itself. One of the basic ideas of Agile (and Lean) is continual self improvement. If the value of the meeting exceeds the cost, then there’s no problem with the meetings, especially if they are eliminating other meetings. If the Scrum meeting is the only remaining status meeting, that seems like a good thing. However, continuous improvement means we’re never satisfied. Now that you are down to just the one meeting, you should still ask the question: “is it providing more value than the cost? Is there a better way?”

What is the purpose of the Scrum? To quickly find out if people are getting their assigned work done and if not why not. Isn’t it more efficient to do that via e-mail, IM, an issue tracking system, or other means? Someone might say “but the socialization aspect is worthwhile.” Ok, so why not separate the two? By all means let's have some social activities, but that's not a good reason for having a status meeting.

Or perhaps the Scrum is needed because otherwise folks wouldn’t complete their work, or people wouldn’t speak up when they run into an impediment. In that case the Scrum isn’t a solution at all. For instance, perhaps somebody isn’t completing their work because they don’t like it, but the constant peer pressure of the standup meeting is goading them into completing their work anyway. So then the real problem is lack of job satisfaction or low morale or something along those lines. Until you fix that problem, the standup is just acting as a band-aid for an underlying management problem. If you find that having standup meetings from time to time helps to expose management problems, then by all means have the meeting(s), find the problem(s), but don't have the meetings just for the sake of having them!

Short Iterations
One more thought on this topic: the real measure of project status and health is having an increment of shippable value at the end of every iteration. A standup will only expose problems that are on people’s minds, but the forcing function of the increment of shippable value is where you will get the true picture of how things are going. A one month iteration interval is good, but if you can get it down to 2 weeks or even 1 week, that will do far more to expose real problems than a standup will.

Speaking of short iterations, IMHO that’s really the number one thing which makes Agile scale better than traditional development. It is much easier to manage and coordinate large multi-team projects when the scope of everything is constantly focused on one month iterations (sprints). Of course, it may seem more difficult at times, but that is often due to the fact that problems are much more apparent and can’t “hide out” like they do in long iterations.

In summary, experiment and see what works for your particular situation, but always be on the lookout for opportunities for process improvement. Don’t do things by rote; demand that all activities provide real value.

Have a question on Agile or Software Development? Please visit the "ask away" section.

January 17, 2008
» Multi-Stage Continuous Integration Part III

In Part I and Part II I discussed the problems associated with Continuous Integration when combined with mainline development. In this part I’ll discuss a solution: Multi-Stage Continuous Integration.

Have Some Self Integrity

As an individual developer, you follow two simple best practices which you probably don’t even think about any more.

First, while you make frequent changes to your workspace, which means its stability goes up and down rapidly, you only check in your changes when you feel like you’ve done enough testing of those changes that you've sufficiently reduced the risk of impacting the stability of the mainline. Second, you only update your workspace when you are at a point that you don’t mind taking whatever the impact is from other people’s changes. These two simple practices act as buffers which protect other people from the chaos of your workspace while you prepare to check in and protect you from problems that other people may have introduced during that same period. They also produce a variant version of the software for each developer, and all of those variants must be integrated together. The longer you put off this integration, the more painful it will eventually be. Thus the need for Continuous Integration.

When you as an individual are working on a change, you are often changing several files and a change in one file often requires a corresponding change in another file. While it may seem like a bit of a trivial case, you can think of this process as self-integration. The reason that you work on those files on your own instead of having several people work on them is that the tightly coupled nature of the changes requires a single person.

It's All for One and One for All
The next level of coupling is at the team level. There are many reasons why a set of changes may be tightly coupled; for instance, there may be a large feature that can be worked on by more than one person. As a team works on a feature, each individual needs to integrate their changes with the changes made by the other people on their team. For the same reasons that an individual works in temporary isolation, it makes sense for teams to work in temporary isolation. When a team is in the process of integrating the work of its team members, it does not need to be disrupted by the changes from other teams and conversely, it would be better for the team not to disrupt other teams until they have integrated their own work. But just as is the case with the individual, there should be continuous integration both at the team level and between the team and the mainline.

Multi-Stage Continuous Integration
So, how can we take advantage of the fact that some changes are at an individual level and others are at a team level while still practicing Continuous Integration? By implementing Multi-Stage Continuous Integration of course! For now, I’ll just cover the basics, but in a future post I’ll go into more detail and cover advanced topics as well.

For Multi-Stage CI, each team gets its own branch. I know, you cringe at the thought of per-team branching and merging, but that’s probably because you are thinking of branches that contain long-lived changes. We’re not going to do that here.

There are two phases that the team goes through, and the idea is to go through each of them as rapidly as is practical. The first phase is the same as before. Each developer works on their own task. As they make changes, CI is done against that team’s branch. If it succeeds, great. If it does not succeed, then that developer (possibly with help from her teammates) fixes the branch. When there is a problem, only that team is affected, not the whole development effort.

On a frequent basis, the team will decide to go to the second phase: integration with the mainline. In this phase, the team does the same thing that an individual would do in the case of mainline development. The team’s branch must have all changes from the mainline merged in (the equivalent of a workspace update), there must be a successful build and all tests must pass. Keep in mind that integrating with the mainline will be easier than usual because only pre-integrated features will be in it, not features in process. Then, the team’s changes are merged into the mainline which will trigger a mainline CI. If that passes, then the team goes back to the first phase where individual developers work on their own tasks. Otherwise, the team works on getting the mainline working again, just as though they were an individual working on mainline.
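
The second phase can be sketched as a toy model in Python. This is not a real SCM integration: branches are modeled as plain sets of change names, and `build_and_test` is a hypothetical stand-in for a CI run. The function name and the status strings are likewise invented for the example.

```python
def integrate_with_mainline(team_branch, mainline, build_and_test):
    """One attempt at the second phase, using sets as toy branches.

    build_and_test: hypothetical stand-in for CI; takes a set of changes
                    and returns True when the build and all tests pass.
    Returns (team_branch, mainline, status) after the attempt.
    """
    # Merge the mainline into the team branch (the "workspace update").
    merged_team = team_branch | mainline

    # The merged team branch must build and pass all tests first.
    if not build_and_test(merged_team):
        # Only the team is affected; the mainline is left untouched.
        return merged_team, mainline, "fix team branch"

    # Merge the team's changes into the mainline, which triggers mainline CI.
    new_mainline = mainline | merged_team
    if not build_and_test(new_mainline):
        return merged_team, new_mainline, "fix mainline"

    return merged_team, new_mainline, "integrated"


def no_broken_changes(changes):
    """Toy CI check: passes unless a change named 'broken-change' is present."""
    return "broken-change" not in changes


team, mainline, status = integrate_with_mainline(
    {"feature-a", "feature-b"}, {"release-1"}, no_broken_changes
)
print(status, sorted(mainline))
```

In this toy version the mainline CI can never fail after the team CI has passed, because nothing else changes the mainline in between. With several teams merging in parallel that is no longer guaranteed, which is why the mainline CI step still matters.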


Multi-Stage CI allows for a high degree of integration to occur in parallel while vastly reducing the scope of integration problems. It takes individual best practices that we take for granted and applies them to the team level.

With Agile, you don’t have the luxury of doing big-bang integration at the end of your development cycle, but when you are scaling Agile beyond a single small collocated team, mainline CI doesn’t mix well with short iterations. Multi-Stage CI solves both problems.

Per-team CI is one form of Multi-Stage CI. In the next part of this series, I’ll introduce some more opportunities for Multi-Stage CI.

Next: Advanced Multi-Stage Continuous Integration

» Reducing the Risk of Producing a Hotfix

Let's say there is a problem reported in the field which requires a hotfix. Well, if you are doing traditional software development, you can't exactly ask the customer to wait until you finish your current release. And if you were to put out a fix for them using your main development process, that would probably take too long as well. So, you have a hotfix development process. By definition, this is a development process which is not used for regular development and, one would hope, is not used very often.

As a result, you have one process for regular development and one process for hotfixes because your regular development process takes just too darn long. That means that when you most need things to go smoothly, you are going to use the process which is the least practiced and probably also something that only a handful of folks know how to do or are allowed to do. What's wrong with this picture?

The solution is to get the path from deciding to make a change to being able to deliver that same change as short as reasonably possible. If the "overhead" from start to finish for a hotfix is 4 hours, then any development task should take you no more than the overhead plus however long it takes you to write the code and the associated tests. All development tasks should follow this same streamlined process, not by cutting out truly necessary steps, but by ruthlessly removing steps that add no real value and automating as many of the remaining steps as possible.

Once you have a sufficiently small overhead, then you really only need one development process which you use for both regular development and hotfixes! Since it is the same process, everybody is practiced in your hotfix process and everybody has experience with doing it. This will reduce risk and increase confidence in the results.

In practice, there will probably be at least two differences. It is unlikely that anyone needing a hotfix will want to take the risk associated with taking any changes other than the hotfix, no matter how ready for release you say those other changes are. So, you will need to develop the fix against the release that the customer is using. You will probably also deliver the fix using a slightly different path than the usual. However, the closer you get to just these two differences, the better off you will be.

There is a shortcut you can take on the way to getting to a single process. You can refactor your regular process such that you act like every development task is a hotfix, and then do the rest of the process in a second stage. For instance, if you normally do a subset of your full test cycle for hotfixes, then always do that subset first during regular development.

For some software projects, getting the full test cycle down to a short time frame will be impossible or impractical. In this case, refactoring to have the first stage be the same as the hotfix process is still recommended and keeps you focused on keeping the second stage as small as possible.
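Here is a rough sketch of that two-stage refactoring in Python. It assumes, purely for illustration, that tests are plain callables returning True on success; the names `run_stage`, `run_pipeline`, `hotfix_subset`, and `remaining` are made up for this example. Every task runs the hotfix subset first, and only regular development continues on to the rest of the test cycle.

```python
from typing import Callable, Dict, List

# Hypothetical test representation: a callable that returns True on success.
Test = Callable[[], bool]


def run_stage(tests: Dict[str, Test]) -> List[str]:
    """Run every test in a stage and return the names of the failures."""
    return [name for name, test in tests.items() if not test()]


def run_pipeline(hotfix_subset: Dict[str, Test],
                 remaining: Dict[str, Test],
                 is_hotfix: bool) -> Dict[str, List[str]]:
    """Stage one always runs; stage two runs only for regular work."""
    results = {"stage_one": run_stage(hotfix_subset)}
    # Skip stage two for a hotfix, or when stage one has already failed.
    if not is_hotfix and not results["stage_one"]:
        results["stage_two"] = run_stage(remaining)
    return results
```

Because the first stage is identical in both cases, every regular development task doubles as practice for the hotfix path.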

January 5, 2008
» Multi-Stage Continuous Integration Part I

I’m a big fan of continuous integration. I’ve used it and recommended it for many years. But it has a dark side as well. When it is combined with the practice of mainline development, especially for a large development effort, it can turn into “Continuous Noise.” In this case, you get notified every 10 minutes or so (depending on how much building and testing is going on) that the build and/or test suite is still failing.


In the diagram, the stability of the mainline is graphed over time. In this example, the graph starts when the product is released and the mainline is 100% stable. It builds without errors and all tests pass. As new development starts, developers make changes that break the build or cause tests to fail. Because they are all working against the mainline, it is likely that just as one developer fixes the problems they caused, another developer is checking in more changes that will create one or more new problems.

As the stability of mainline goes up and down, everybody that is working against mainline is affected. If the mainline doesn't build or there are tests that don't pass, then everybody has to wait until the problem is fixed.

It could be argued that the problem in this case is that the developer that broke the build should have updated their workspace, done a full build, and then run the full test suite after that. Basically, a full integration build and test. In practice this doesn’t work very well.

For any meaningful project, a full integration build and test will take time. Even if you somehow get that time down to 10 minutes, a lot can happen in 10 minutes, such as other people checking in their changes and invalidating your results (whether you realize it or not). It is more likely that this full cycle will take on the order of an hour or more. While you can certainly do something else during that hour, it is inconvenient to wait and necessitates task switching which is a well-known productivity killer.

Even though developer integration build and test has its problems, many shops have implemented it. While this practice is not ideal, it is an idea that is headed in the right direction.

Next: Multi-Stage CI Part II

» Multi-Stage Continuous Integration Part II

In Part I of this topic, I said that while Continuous Integration (CI) is a great practice, it can turn into “Continuous Noise” when combined with mainline development. Also, when the mainline is broken, everybody is affected until it is stabilized. I mentioned the idea of developer integration builds to solve this problem and pointed out that this can lead to task-switching and still doesn’t remove the problem of mainline instability.

In typical mainline development, in order for you to integrate your work with anybody else’s work, they first need to make their changes available by merging their changes with the mainline. Now, everybody is forced to integrate these changes. If the process of integrating your changes with another person’s or team’s changes requires multiple go-rounds, then it is much worse. Now, everybody else must continually integrate with two groups of changes as you and the other team go back and forth merging your changes into the mainline until they work together.

This creates a huge opportunity for duplication of effort. If you are working on module A which interacts with B and C, and B doesn’t currently work with C, then you are blocked until B and C are integrated. You might decide to take a look at it yourself, not knowing that somebody else is already working on it. Sure, you might be aware of who is responsible for B and C, but perhaps they are in a meeting right now and you have no good way of finding out. In any case, you are blocked. It may be that you knew ahead of time that there was a problem with the mainline. In that case, you just don’t take updates and concentrate instead on your own work. In effect, your ability to easily take changes from the mainline is “down” until the mainline is stable again.

A typical coping mechanism in this case is to serialize integration and ask everybody who is not involved in the current integration effort to hold off on merging in their changes until the current integration work has finished. This turns parallel development into serial development and kills productivity.

Next: Multi-Stage CI Part III

December 29, 2007
» The Role of Defect Management in Agile Development

There are some who recommend against using a defect tracking system. Instead, it is recommended that when a bug is found, it is fixed immediately. While that is certainly one way of preventing an ever-growing inventory of defects, the tracking of an inventory of defects is one of the smallest benefits of a defect tracking system. Overall, a defect tracking system serves as a facilitator. It simplifies the collection of defect reports from all sources. It isn’t just the developers responsible for fixing the defects that find problems. Customers, developers working on dependent systems, and testers also find defects. Even if you have a policy of fixing defects as soon as they are found, it isn’t always logistically possible to do so. For instance, if you are currently working on fixing a defect and in the process of doing so you find another one, you don’t want to lose track of it. Thus, a defect tracking system coordinates the collection of defect reports in a standard way and collects them in a well-known location, ensuring that important information about the functioning of your system is not lost.

A defect tracking system also manages the progress of work through the development life cycle from reporting, to triaging, to assignment, to test development, to completion, to test, to integration, to delivery. It simplifies the answering of customer questions such as “what is fixed in this release” and “what release will the fix appear in.” A defect tracking system also allows for the collection of metrics, which aids in the spotting of trends. I have heard from multiple sources that metrics collected from a defect tracking system are worthless because developers will just game the system. That may be true in an unhealthy environment where the metrics are tied to compensation. However, in an environment where developers are actively participating in the improvement of the process, they will want this information in order to help find and fix problems, including the root cause of individual problems.
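To make the life cycle concrete, here is a minimal sketch in Python. The stage names follow the list above; everything else (the `Stage` enum, the `advance` and `fixed_in_release` helpers, and the dictionary layout of a defect record) is an assumption made for this example and not the data model of any real defect tracking system.

```python
from enum import IntEnum
from typing import Dict, List


class Stage(IntEnum):
    """Life cycle stages, in the order a defect passes through them."""
    REPORTED = 1
    TRIAGED = 2
    ASSIGNED = 3
    TEST_DEVELOPED = 4
    COMPLETED = 5
    TESTED = 6
    INTEGRATED = 7
    DELIVERED = 8


def advance(stage: Stage) -> Stage:
    """Move a defect to the next stage; a delivered defect stays delivered."""
    if stage is Stage.DELIVERED:
        return stage
    return Stage(stage + 1)


def fixed_in_release(defects: List[Dict], release: str) -> List[str]:
    """Answer "what is fixed in this release" from hypothetical records.

    Each record is assumed to have "id", "stage", and "release" keys.
    """
    return sorted(d["id"] for d in defects
                  if d["release"] == release and d["stage"] is Stage.DELIVERED)
```

With records kept in one well-known place, a question like “what is fixed in this release” becomes a simple query instead of an archaeology project.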

November 11, 2007
» Podcast: Open Source Technique for Agile Development

Check out this podcast to hear about how "meritocracy" allows developers to contribute outside of their traditional areas, to build trust in their capabilities, and to allow them to naturally gravitate to the areas where they are the most effective.

» Podcast: Mainline Chaos

Here's a podcast which covers these topics:

  • More on the practice of a "development hierarchy"
  • Checkpoints
  • Gatekeepers
  • Atomic Transactions