Some of you may be aware of Kent Beck's Four Rules of Simple Code that state simple code:
(I've seen some boil this down into some of the same rules for writing clear prose: correct, consistent, clear, and concise.)
Lately I've been noticing some parallels between the above and rules for what I would call "simple codelines", and I think there may be a similar way of expressing them...
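Simple codelines:
- Correctly build, run (and pass) all the tests
- Contain no duplicate work/products
- Transparently contain all the changes we needed to make (and none of the ones we didn't)
- Minimize the number and length of sub-branches and unsynchronized work/changes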
To elaborate further...
Correctly build, run (and pass) all the tests
This is of course the most obvious and basic of necessities for any codeline. If the codeline (or the "build") is broken, then integration is basically blocked, and starting new work/changes for the codeline is hindered.
Contains no duplicate work/products
The same work and work-products should be done OnceAndOnlyOnce! Sometimes effort is spent more than once to introduce the same change/functionality. This is sometimes because of miscoordination, or simply a failure to realize that two different developers' work required each of them to do some of the same things (and perhaps should have been accomplished in smaller chunks).
Other times, rather than modify or refactor a common file, some will simply copy-and-paste the contents of one or more files (or directories/folders) because they don't want to have to worry about reconciling what would otherwise be merges of concurrent changes to the common files.
This is akin to neglecting to refactor at the "physical" level (of files and folders) as opposed to the "logical" level of classes and methods. It adds more complexity and (over time) inconsistency to the set of artifacts and versions that make up the codeline, and also eventually adds to the time it takes to merge, build, and test any integrated changes.
If content is being added to the codeline, we want that content to have to be added only once, without any duplicate or redundant human effort.
Transparently contains all the changes we needed to make (and none of the ones we didn't)
The above is sometimes the cause of much undesirable additional effort that is imposed for the sake of attaining traceability and ensuring process compliance/enforcement. Here, I mean to focus on the ends rather than the means, and I say transparency rather than traceability for that very reason.
If people are working in a task-based and test-driven manner, it should be simple to report what changes have been made since a previous commit and that only intended tasks were worked-on and integrated.
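As a rough illustration of that kind of report, here is a minimal Python sketch. It assumes a hypothetical convention in which every commit message mentions its backlog task as `TASK-<number>`; the function name, the ID format, and the sample data are all made up for the example and do not come from any particular tool.

```python
import re
from collections import defaultdict

# Hypothetical convention: each commit message names its backlog task as "TASK-<number>".
TASK_ID = re.compile(r"\bTASK-\d+\b")


def report_changes(commits, intended_tasks):
    """Group commits by task ID and flag anything outside the intended set.

    commits: list of (commit_id, message) pairs made since the previous commit/baseline.
    intended_tasks: set of task IDs that were planned for integration.
    """
    by_task = defaultdict(list)
    untracked = []  # commits that mention no task at all
    for commit_id, message in commits:
        ids = TASK_ID.findall(message)
        if not ids:
            untracked.append(commit_id)
        for task in ids:
            by_task[task].append(commit_id)
    return {
        "by_task": dict(by_task),
        "untracked": untracked,
        # tasks that were integrated but never intended
        "unintended": sorted(set(by_task) - set(intended_tasks)),
        # tasks that were intended but have no commits yet
        "missing": sorted(set(intended_tasks) - set(by_task)),
    }


commits = [
    ("c1", "TASK-12: add login test and form"),
    ("c2", "TASK-12: handle empty password"),
    ("c3", "tidy whitespace"),
    ("c4", "TASK-99: experimental cache"),
]
report = report_changes(commits, intended_tasks={"TASK-12", "TASK-15"})
```

With this sample data, `c3` shows up as untracked and `TASK-99` as unintended, which is the kind of transparency the paragraph above is after, without adding any process beyond a task ID in the commit message.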
If a codeline is truly simple, then it should be very simple and easy to reveal all the changes that went into it without adding a lot of overhead and constraints to development. It should be easy to tell which changes/tasks have been integrated and what functionality and tests they correspond to. One very simple and basic means of tying checkins (or "commits") to backlog-tasks and their tests can be found here; others are mentioned in this article.
Minimizes the number and length of sub-branches and unsynchronized work/changes
Branching can be a boon when used properly and sparingly. It can also add a heck of a lot of complexity and redundancy for maintaining two or more evolving variants of the project. The additional effort to track and merge and build many of the same fixes and enhancements in multiple configurations can be staggering.
Sometimes such branches are useful or even necessary (and can help with what Lean calls nested synchronization and harmonic cadences). But they should be as few and as short-lived as possible, preferably living no longer than the time it takes to complete a fine-grained task or to integrate several fine-grained tasks.
Even when there are no sub-codelines of a branch, there can still be un-integrated (unsynchronized) work-in-progress in the form of long-lived or large-grained tasks with changes that have not yet been checked-in or synced-up with the codeline. Keeping tasks short-lived and fine-grained (e.g., on the order of minutes & hours instead of hours & days) helps ensure the codeline is continuously integrated and synchronized with all the work that is taking place.
Another (possibly less obvious) form of unsynchronized work is when there is a discrepancy between the latest version of code checked-in to the codeline, and the latest version of code that constitutes the "last good build." Developers' lives are "simpler" when the latest version of the codeline (the "tip") is the version they need to use to base new work off of, and to update their existing workspace (a.k.a. "sandbox").
When the latest "good" version of the codeline is not the same as (is less recent than) the latest version, it can be less obvious to developers which version to use, and it becomes less likely that they select it correctly. Some use "floating tags" or "floating labels" for this purpose where they "move" the LAST_GOOD_BUILD tag from its previous set of versions to the current set of versions for a newly passed/promoted build. Sometimes the developers always use this "tag" and never use the "tip" (except when they have to merge their changes to the codeline of course).
Even with floating tags however, it is still simpler and more desirable when the last good version IS the latest version. Even if the latest version is known to be "broken", the lag between "latest" and "last good" version of a codeline can be a source of waste and complexity in the effort required to build, verify and promote a version to be "good" (and can introduce more complexity when having to merge to "latest" if your work has only been synchronized with "last good").
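To make the "lag" between "latest" and "last good" concrete, here is a minimal Python sketch. It models a codeline as nothing more than a list of pass/fail build results (oldest first); the function name and the sample history are hypothetical, and a real setup would read this information from the build system or from wherever the LAST_GOOD_BUILD tag lives.

```python
def last_good_lag(build_results):
    """Return (tip_index, last_good_index, lag) for a codeline.

    build_results: list of booleans, oldest version first; True means that
    version built and passed all its tests.
    last_good_index and lag are None if no version has ever passed.
    """
    if not build_results:
        raise ValueError("codeline has no versions")
    tip = len(build_results) - 1
    last_good = None
    # Walk back from the tip until we find a version that passed.
    for index in range(tip, -1, -1):
        if build_results[index]:
            last_good = index
            break
    lag = None if last_good is None else tip - last_good
    return tip, last_good, lag


# Six versions on the codeline; the two most recent ones are not "good".
history = [True, True, False, True, False, False]
tip, last_good, lag = last_good_lag(history)
```

A lag of zero is the simple case described above: the tip IS the last good build, so there is only one version for developers to think about.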
Plus, this lag-time often leads many a development shop to separate merging (and integration & test) responsibilities between development and so-called integrators/build-meisters, where the best that developers can attempt is to sync-up their work with the "last good build" and then "submit" that work to a manually initiated build rather than being directly responsible for ensuring the task is "done done" by being fully integrated and passing all its tests.
Such separation often leads to territorial disputes between roles and build/merge responsibilities. This in turn often leads to adversarial (rather than cooperative and collaborative) relationships and isolated, compartmentalized (rather than shared) knowledge for the execution and success of those responsibilities.
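So there we have it! Four rules of simple codelines.
Simple Codelines should:
- Correctly build, run (and pass) all the tests
- Contain no duplicate work/products
- Transparently contain all the changes we needed to make (and none of the ones we didn't)
- Minimize the number and length of sub-branches and unsynchronized work/changes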
Sometimes there are legitimate reasons why some of the rules need to be bent, and there are important SCM patterns to know about in order to do it successfully. But any time you do that, it makes your codeline less simple. So you want those scenarios to be few and far between, and to keep striving for the goal of simplicity. (Other SCM patterns, such as Mainline, can help you refactor your codelines/branches to be more simple.)
Some of the discussion with my co-authors on our May 2008 CM Journal article on Agile Release Management spurred some additional thoughts by me that I hope to refine and work into a subsequent article later this year.
Release Management is about so much more than just the code/codeline (and it being "shippable") it's not even funny. Some other articles to reference and mention some key points from are:
Kevin Lee has written some GREAT stuff on Release Management that relates to Agile. The best is from the first and last chapters of his book on "The Java™ Developer's Guide to Accelerating and Automating the Build Process" but bits and pieces of it can also be found at:
ANY discussion about Release Management also needs to acknowledge that there is no single "the codeline", not just because I may have different codelines (Development-Line plus Release-Line) working toward the same product-release, but ESPECIALLY because no matter how Agile you are, the reality is that you will typically need to support MULTIPLE releases at the same time (at the very least the latest released version and the current under development version, but often even Agile projects need to support more than one release in the field).
So, when dealing with multiple release-lines, and any "active development lines" for each of those, and the overall mainline, we really should say something overall about how to manage this "big picture" of all these codelines across multiple releases and iterations:
- What is the relationship between development line, release-line and release-prep codeline?
- How do the above three "lines" relate to "mainline"?
- What is the relationship between the different release-lines for the different supported releases?
- What is the overall relationship between the mainline and the release-lines (and if the mainline is also a release-line, which release is it?)
- The overall Mainline model
- The different types of codelines ("line" patterns), and what kinds of builds take place on each of them
- The relationships of those to the mainline
- When+Why to branch (and from which "types" of codelines)
- When+Why to merge across codelines (as a general rule)
These are where Laura's rules for "the flow of change" apply. And her concept of "change flow" is very much applicable to the Lean/Agile concept of "flow of value". The Tofu scale and "change flow" rules/protocol have to do with order+flow of codeline policies across the entire branching structure when it comes to making decisions about stability -vs- speed. One codeline's policy might make a certain tradeoff, but it is the presence of multiple codelines and how they work together, and how their policies define the overall flow of change across codelines, that forms the "putting it all together" advice that is key to release management across multiple releases+codelines.
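One way to picture how firmness and direction of flow interact is the small Python sketch below. It assumes one common reading of Wingerd's change-flow protocol ("merge down" from firmer to softer codelines, "copy up" from softer to firmer once the softer line is stable). The firmness ranks, codeline names, and return strings are invented for illustration and are not taken from her text or from any tool.

```python
# Hypothetical ranks on the "Tofu scale": a higher number means a firmer
# (more stable, more conservative) codeline policy.
FIRMNESS = {"development": 1, "mainline": 2, "release": 3}


def change_flow(source, target, source_is_stable=False):
    """Suggest how change should move from the source codeline to the target."""
    src, dst = FIRMNESS[source], FIRMNESS[target]
    if src == dst:
        return "no flow: same firmness"
    if src > dst:
        # Stabilizing changes (e.g., release fixes) can flow to softer lines at any time.
        return "merge down"
    # Softer to firmer: only at a point of completion, so the firmer line is
    # never destabilized by work that is still in progress.
    return "copy up" if source_is_stable else "wait: stabilize source first"


fix_flow = change_flow("release", "mainline")
early_flow = change_flow("development", "mainline")
done_flow = change_flow("development", "mainline", source_is_stable=True)
```

The point of the sketch is that no single codeline's policy decides the outcome; the answer depends on the relative firmness of the two codelines involved, which is the "putting it all together" aspect of managing change flow across codelines.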
In some ways you could make an overall analogy between the Laws of Thermodynamics and the realities of codeline management. Software and codelines tend, over time, to grow more complex and, if unchecked, "entropy" (instability) quickly becomes the most dominating force to contend with in their maintenance. See
The "entropy" (instability) doesn't just happen within a codeline. It can actually get far more hideous when it happens across codelines via indiscriminate branching from, or merging to, other codelines. This is what happens when you don't respect the principles and rules of "change flow" (from Wingerd) which ultimately stem from the rules of smooth and steady (value-stream) flow from Lean.
The Laws of Thermodynamics are about energy, entropy, and enthalpy. In the case of release management and codelines ...
- energy relates to effort & productivity
- entropy relates to stability/quality versus complexity
- enthalpy relates to "order" (i.e., in the sense of structure and architecture as Christopher Alexander uses the term "order"). It is the "inverse" of entropy.
We could call them "Laws of Codeline Dynamics" :-)
Energy misspent degrades flow, creates waste, and hurts productivity/velocity. In traditional development, we often see "fixed scope" with resources and schedule having to vary in order to meet the "scope" constraint. In Agile development we deliberately "flip" that triangle upside down (see the picture in the article here under the title "The Biggest Change: Scope versus Schedule - Schedule Wins"). So we are fixing "resources" and "schedule" and allowing scope to vary.
This might be one way of viewing the law of conservation of energy. If we fix resources and time (and insist on "sustainable pace" or "40hr work week") then we're basically putting in the same amount of effort over that time-box, but the key difference is how much of that effort results in "giving off energy" in the form of waste ("heat" or "friction") versus how much of that energy directly adds value. Both "Value" and "Enthalpy" degrade or depreciate over time, and adding more energy (effort) doesn't necessarily mean value is increased.
To make sure that energy goes toward adding value (and minimizing waste) we need to focus on the flow of value, and hence the flow of change/efforts to create value (the latter is one reasonable definition of a "codeline" or a "workstream"). To ensure a smooth, steady, and regular/frequent flow, there are certain rules we need to impose to regulate stability within and across codelines and better manage all those releases.
Zeroth Law of Thermodynamics (from Wikipedia)
- If two thermodynamic systems are each in thermal equilibrium with a third, then they are in thermal equilibrium with each other.
Translation to codelines ... this law of "thermal equilibrium" is a law of "codeline equilibrium" of sorts. Does this mean that if two codelines are "in equilibrium" with a third codeline, then they are "in sync" with each other? Here "in sync" doesn't mean they have the same frequency; it means there is some synchronization pattern regarding their relative stability and velocity. (In Lean terms, this would refer to "nested synchronization" and "harmonic cadence".) This might imply the "mainline" rule/pattern or one of Wingerd's rules of change-flow.
First Law of Thermodynamics
- In any process, the total energy of the universe remains the same.
This is the statement of conservation of energy for a thermodynamic system. It refers to the two ways that a closed system transfers energy to and from its surroundings - by the process of heating (or cooling) and the process of mechanical work.
This relates to effort & changes expended resulting in the creation of value and/or the creation of waste. We have activities that add value (which we hope is development), activities that preserve value (which is what much of SCM attempts to do, given that it doesn't directly create the changes, but tries to ensure that changes happen and are built/integrated with minimal loss of energy/productivity/quality), and then we have activities (or portions of activities) that create waste (and increase entropy rather than preserving or increasing enthalpy/order).
Second Law of Thermodynamics
- In any isolated system that is not in equilibrium, entropy will increase over time.
So this is the law of increasing instability/complexity/disorder. The "key" to preventing this from happening is achieving and then maintaining/preserving "equilibrium". How do we achieve such equilibrium? We do it with the "release enabler" patterns for codeline management, which help ensure "nested synchronization" and "harmonic cadence" in addition to achieving a balance or equilibrium between stability and velocity (to smooth out flow).
Third Law of Thermodynamics
- As temperature approaches absolute zero, the entropy of a system approaches a constant minimum.
In our case, "temperature" could be regarded as a measure of "energy" or "activity". As the energy/activity of a codeline approaches zero (such as a release in the field that you've been supporting and would LOVE to be able to retire sometime real soon), its instability approaches a constant minimum.
This is perhaps another more polite way of saying something we already said in our article on "The Unchangeable Rules of Software Change", namely that "absolute stability" means dead (as in, "no activity"), and should serve as a reminder that our goal is not the prevention of change in order to achieve some ideal "absolute stability", for such an absolute would mean the project is not just "done" but "dead".
On the other hand, it also speaks to us as a guideline for when it is safe to retire old codelines, and when to change their policy in accordance with their "energy level".
My Agile SCM co-authors Rob Cowham and Steve Berczuk and I have written an article for the May CM Journal on An Agile Approach to Release Management.
We're relatively pleased with the article, and we all collaborated quite well.
Nice little guide on InfoQ.com about Distributed Version Control - that's twice in two months that the "agile" section of InfoQ.com has had a decent article on the subject!
A colleague of mine had a question for me about Distributed Version Control Systems (or DVCS). There are a growing number of such systems these days: Mercurial, Bazaar, git, svk, BitKeeper, Gnu Arch, darcs, Monotone, Codeville, Arx, just to name a few. I referred them to a good essay by David Wheeler that talks about the fundamental differences between distributed vs centralized VCS (among other things).
I also Googled on the topic and came across some interesting links:
Anyone else have any links they recommend on the topic? (please, no spam/marketing)
The November 2007 issue of the CM Journal was devoted to the theme of "What Best Practice is Best?" Joe Farah's article on The Top 10 SCM Best Practices was quite possibly the best article on SCM best practices that I've come across to date!
Joe actually first lists his top ten "runners up" followed by his top ten. They are as follows (read the article if you see any terms that are unfamiliar to you):
1. Use of Change Packages
2. Stream-based Branching Strategy - do not overload branching
3. Status flow for all records with Clear In Box Assignments
4. Data record Owner and Assignee
5. Continuous integration with automated nightly builds from the CM repository
6. Dumb numbering
7. Main branch per release vs Main Trunk
8. Enforce change traceability to Features/Problem Reports
9. Automate administration to remove human error
10. Tailor your user interface closely to your process
11. Org chart integrated with CM tool
12. Change control of requirements
13. Continuous Automation
14. Warm-standby disaster recovery
15. Use Live data CRB/CIB meetings
16. A Problem is not a problem until it's in the CM repository
17. Use tags and separate variant code into separate files
18a. Separate Problems/Issues/Defects from Activities/Features/Tasks
18b. Separate customer requests from Engineering problems/features
19. Change promotion vs Promotion Branches
20. Separate products for shared code
Granted, some of these may not be particularly Agile (nor were they meant to be specific to Agile development or Agile CM) but it's still a pretty darn good list in my opinion!
Several years back there was an interesting open-source Eclipse project named "Stellation" at www.eclipse.org/stellation. It was to be an advanced/modern version control tool with lightweight branching and support for fine-grained checkout at the logical/semantic level.
Then about 2-3 years ago it just up and disappeared. I tried doing several websearches for it, but to no avail. Then on the revctrl mailing list I saw someone inquire about it, and I chimed in too wondering where it had gone.
Karl Fogel (of CVS and Subversion fame) replied with exactly what I was looking for ...
You can get the whole project archived as a tarball here:
http://archive.eclipse.org/technology/archives/stellation-project.tar.gz
It used to live at http://www.eclipse.org/stellation/. Those pages have been pulled, with no redirect left behind (a bit annoying when an open source project does that!). But a search on eclipse.org pulled up this thread:
http://dev.eclipse.org/newslists/news.eclipse.foundation/msg00766.html
...which pointed to...
http://www.eclipse.org/technology/archived.php
...which pointed to the above tarball. Summary from the archive page:
"Stellation is a software configuration management system designed to be an extensible platform for building systems based on the integration of advanced or experimental SCM techniques with the Eclipse development environment. The Stellation project will be using this system to integrate support for fine-grained software artifacts into the Eclipse environment, with particular focus on dynamic program organization, and inter-program coordination.
The Stellation website, newsgroup, mailing list, source code, and latest download are available in a compressed tar archive (110Mb) [http://archive.eclipse.org/technology/archives/stellation-project.tar.gz]"
So now you know! (and so do I).
Eric Raymond, famed OpenSource co-founder and Unix guru, author of The Cathedral and the Bazaar and The Art of Unix Programming, is currently working on a draft of a work entitled "Understanding Version Control".
It is an interesting read, and covers some of the more recent systems like Bazaar and Mercurial. For those who wish to give feedback, there is a link you can follow for a reviewers' mailing list.
The April issue of the CM Journal is out, and there is a FANTASTIC article in it by Austin Hastings about his Longacre Deployment Management strategy for dealing with database CM. It's long, but well worth the read for the insight into a new way of thinking about and doing CM of a database.
The April CM Basics issue has a companion/predecessor article, a Case Study: Enterprise and Database CM, that describes the initial problem, motivation and challenges that the LDM approach needed to solve. The LDM article goes into the gory technical details of the solution.
My paper in this month's issue of The CM Journal is about Lean-based Metrics for Agile CM Environments.
Some readers will recognize some of the content from earlier blog-postings of mine on Codeline Flow, Availability and Throughput, Nested Synchronization and Harmonic Cadences and Feedback, Flow and Friction, but there is also a lot more content there too!
This month we take an "Agile" slant on metrics for CM, including the CM process itself. Agility is supposed to be people-centric and value-driven. So any metrics related to agility should, at least in theory, provide some indication of the performance & effectiveness of the value-delivery system, and how well it supports the people collaborating to produce that value. We borrow heavily from the concepts of Lean Production (and a little from the Theory of Constraints, a.k.a. TOC). Let's see where it takes us ....
The folks over at SmartBear software have written a nice little book entitled The Best Kept Secrets of Code Reviews. It's free if you go over to their webpage and ask for it (you have to fill out a registration form, and it takes a few weeks to arrive, but they haven't spammed me at all since I registered with them a few months ago).
This is a pretty good book and it is VERY pragmatic! It is applicable to Agile development too! [You don't have to do Pair-Programming to be Agile! Pairing is part of XP, which is one particular agile method -- several other agile methods do not require it.]
SmartBear also has a pretty neat suite of tools that look to me like they would be REALLY USEFUL for an organization trying to streamline some of its otherwise heavyweight processes for peer-reviews and related quality metrics:
And "No!" they did not ask me to blog or say anything nice about them or their products! I'm simply coming from the perspective of someone in a large organization who has witnessed a lot of homegrown and heavyweight processes and tools for these kinds of things, and who doesn't see too many commercial tools addressing the peer-review aspect of development and trying to make it lighter-weight and better-integrated with version-control and the rest of SCM.
They have some other nice resources too:
Looks like a lot of "good stuff" to me!!!
Just received two new books about version-control tools:
The Essential CVS, 2e book is one of the better CVS books available these days. I think I like it better than the classic one by Fogel, but not quite as much as the Pragmatic Programmers "Practical Version Control with CVS" (still - it's pretty close).
[See the sample online chapter containing the CVS quickstart guide]
[See the online sample chapter on "The Business of Outsourcing"]
To be honest though, I really don't feel like CVS is very desirable among free Version-control tool offerings when we have the likes of Subversion, Monotone, Arch, and others that support the more recent paradigms and higher-levels of abstractions for working with project-wide streams (branches) and more.
The VSTS book is rather interesting. The "Global Outsourcing" parts of the title, and some of the corresponding content, would likely "turn off" a lot of folks. It even has a brief section about Agile development (to which, you'd think, "global outsourcing" would be anathema).
Mickey Gousset published a review of the book back in October, and it's worth a read. I mostly agree with the comments he makes. I think the book is pretty good, but there is another one coming soon that I expect I'll like a whole lot better, as well as several VSTS books available from Amazon.com.
Still, if you need to do a lot of distributed development across geographically dispersed sites, and want to use VSTS not just for its versioning capabilities, but also the tracking and coordination capabilities, this is probably the book to get.
From Pete Behrens' Agile Executive Blog, the results of the Agile Tooling Survey they conducted in October are now available online at http://trailridgeconsulting.com/surveys.html:
With over 500 survey responses from 39 countries, we feel this survey provides an excellent benchmark for where the agile movement is at today and how we are using project management tooling to assist our agile processes.
This report builds a corporate profile of companies that are following agile processes today and then uses that profile to analyze how they are using project management tooling to support various aspects of their agile processes.
It's rather interesting to see what sorts of tools are being used for version-control, defect/issue/enhancement-tracking (DIET), and project planning & tracking, particularly when some high-profile Agilists would have us believe that (other than version control) Agile should "eschew" such tools.
I don't think the problem is the tools. I think the problem is most of them were/are made and used in a non-agile fashion that didn't have the agile way of working in mind. Now that there are some tools out there which do, it seems they are helpful after all :-)
In my last blog-entry I wondered if the interface segregation principle (ISP) translated into something about baselines/configuration, or codelines, or workspaces, or build-management. Then I asked if it might possibly relate to all of them.
Here's a somewhat scary thought (or "cool" depending on your perspective), what if the majority of Robert Martin's (Uncle Bob's) Principles of OOD each have a sensible, but different "translation" for each of the architectural views in my 4+2 Views Model of SCM/ALM Solution Architecture? (See the figure below for a quick visual refresher.)
Thus far, the SCM principles I've "mapped" from the object-oriented domain revolve around baselines and configurations, tho I did have one foray into codeline packaging. What if each "view" defined a handful of object-types that we want to minimize and manage dependencies for? And what if those principles manifested themselves differently in each of the different SCM/ALM subdomains of:
What might the principles translate into in each of those views, and how would the interplay between those principles give rise to the patterns already captured today regarding recurring best-practices for the use of baselines, codelines, workspaces, repositories, sites, change requests & tasks, etc.?
The current issue of Communications of the ACM is focused on Software Product-Lines for software engineering. It has a number of interesting articles on software product-lines and product-families for large-scale reuse.
It even has a few articles related to CM of product-lines, particularly change-management and variability-management:
Jaejoon Lee, Dirk Muthig
Krzysztof Czarnecki, Michal Antkiewicz, Chang Hwan Peter Kim
Andreas Helferich, Klaus Schmid, Georg Herzwurm
Kannan Mohan, Balasubramaniam Ramesh
The November issue of The Rational Edge has three articles that are closely related to my ideas about applying what we know about software & enterprise "architecture" to the domain of SCM/ALM solutions (and another article about an SCM tool vendor "eating their own dogfood"):
[Also see some interesting articles on Requirements Engineering in the December 2006 issue of Crosstalk]
Some of you may recall my 4+2 Model Views of SCM Solution Architecture. I've since updated the picture a bit (and have now updated it over there too); click on the small image below to see a much larger one:
Anyway - reading through the above articles (particularly the one on model "dimensions" and the one on UML+RUP+Zachman together) gave me some more thoughts about my 4+2 views model, such as:
Anyways, these questions made me want to go look-up some of these other models and views. I found some pretty good online articles for some of them (I'm sure I missed a few as well). Here they are:
I welcome feedback/comments on any of my thoughts above!
My interests in CM, architecture, and agility all overlap in my day-to-day work. I think the convergence is in developing what I call an "SCM Solution Architecture". Some might regard it as the SCM "component" of an overall Enterprise Architecture that includes SCM. I believe many of the principles, patterns, and practices of system architecture and software architecture apply to such an SCM solution architecture.
If we take the 4+1 views approach of Rational's Unified Process (RUP), which defines the critical architectural stakeholder "views" as: logical, physical (implementation), processing (processing & parallelism), deployment, and use-cases/scenarios, and if we enhance those with one more view, that of the organization itself, then we arrive at a Zachman-like set of RUP-compatible views for an SCM solution architecture that I call a "4+2" Model View of SCM Solution Architecture.
The "4+2" Model View of SCM Solution Architecture contains 6 different views, that I characterize as follows:
- 1. Project {Logical Change/Request Mgmt} -- e.g., change-requests, change-tasks, other CM "domain" abstractions and their inter-relationships, etc.
- 2. Environment {Solution Deployment Environment} -- e.g., repositories, servers/networks, workspaces, application integration
- 3. Product {Physical/Implementation/Build} -- e.g., repository structure and organization, build-management scheme and structure/organization
- 4. Evolution {Change-Flow and Parallelism} -- e.g., tasks, codelines, branching and merging/propagation, labeling/tagging
- +1. Process {Contextual Scenarios/Use-cases} -- e.g., workflow, work processes, procedures and practices
- +2. Organization {Social/People Structures} -- e.g., organizational structure for CCBs, work-groups, sites, and their interactions, and corresponding organizational metrics/reports for accounting and tracing back to the value-chain. (Mario Moreira's "SCM Implementation" book has a great chapter or two on the importance of this and some best-practices for it)
The fact that many of the views closely align with RUP suggests that UML might be a very suitable diagramming notation for modeling such an architecture. And I think that many of the current best-practices of enterprise architecture, agility, object-oriented design principles, and service-oriented architectures apply to the creation of an agile CM environment that represents such a solution architecture.






