Your Community is NOT Your Tools.
(Disclaimer: I’m one of the ‘old guard’ open source guys. I co-founded the Subversion project back in 2000 and am a proud member of the ASF. These opinions are my own.)
A very popular blog post has been going around lately called Apache Considered Harmful, which criticizes the Apache Software Foundation (ASF) for being impossible to work with. On the surface, it looks a bit like a culture war between older and younger generations of open source hackers: the older generation is portrayed as stodgy and skeptical of distributed version control systems, making the ASF inhospitable to a younger generation used to the fast-and-freewheeling world of git and GitHub.
One of the ASF’s leaders, Jim Jagielski, then wrote a blog response which seems to say, “We’re not irrelevant; we just have high integrity. We care about long-term health of open source projects, not passing fads or hip popularity contests.”
But I think Jim is truly missing the main complaint.
Backing up a bit: what is the mission of the ASF? Why does it exist? My understanding is simple:
- to be a legal umbrella of protection
- to foster long-term, healthy open-source communities
The first goal is achieved by putting all of a project’s code under the Apache license, and getting all code contributors to grant nonexclusive IP rights to the ASF. This guarantees that the ASF “owns” the code, and thus can legally defend it.
The second goal is about encouraging and preserving healthy culture. The ASF has a famous saying: “community over code”. In other words, the ASF doesn’t accept donations of code (or code thrown over walls), it only accepts communities that happen to work on a common codebase. The community is the main asset, not the source code.
The ASF has a great set of cultural norms that it pushes on its communities via political means and lightweight processes. For example, the ASF requires that each community have a set of stewards (“committers”), which they call a “project management committee”; that communities use consensus-based discussions to resolve disputes; that they use a standardized voting system to resolve questions when discussion fails; that certain standards of humility and respect are used between members of a project, and so on. These cultural traditions are fantastic, and are the reason the ASF provides true long-term sustainability to open source projects. It’s the reason I pushed so hard to get the Subversion project into the ASF.
Let’s go back to the original “Apache Considered Harmful” post again. Yes, the blog post rambled a bit about the ASF becoming “irrelevant”, but I think that’s just random grumbling around the actual issue at stake: the ASF’s insistence on forcing their hosting infrastructure onto projects. We have repeated examples of mature open source communities trying to join the ASF, which already use git as their version control system — and the ASF is insisting that they convert to Subversion and store their code in the ASF’s One Big Subversion Repository.
I fear what’s happening here is that the ASF elders have tragically confused “be part of our community” with “you must use our infrastructure”. There is no reason for these things to be entangled.
The ASF has teams of people dedicated to running servers for Subversion, SSH, QA testing, email lists, and so on. Ten years ago, infrastructure hosting was a Hard Thing. Getting to use the ASF’s hosting services was considered an attractive perk. These days, project hosting is utterly commoditized: we have SourceForge, Google Code, GitHub, and other sites. In a matter of minutes, any two people can conjure up a hosted source repository, bugtracker, wiki, etc. So is it really a surprise that newer communities, ready to join the ASF, already have functional (and possibly superior) tools and infrastructure?
So why oh why does the ASF demand everyone use their Subversion service? They don’t force every project to use the same bugtracker; I wonder if source code is different because it’s the “special” asset being protected. Perhaps the ASF elders think it has to all be in one place in order for it to be protectable and controlled? A simple solution here is to require that at least one canonical copy of source code be stored on ASF servers. If that means doing an “hg pull” or “git pull” via cron job every hour, so be it. Who cares where the real coding is happening, or in how many repositories? Irrelevant. As long as a community has blessed a central repository as Official, and the ASF is keeping a synced copy of that somewhere, we should be all set. The ASF’s job is to shepherd communities, not force everyone to use the same software tools.
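To make the hourly-sync idea concrete, here is a minimal sketch of what such a job might look like for a git project. It is only an illustration, not anything the ASF actually runs: the function names are made up, and it swaps the plain “git pull” mentioned above for a bare mirror (`git clone --mirror` once, then `git remote update --prune`), on the assumption that the synced copy is a backup nobody commits to directly.

```python
import os
import subprocess


def mirror_commands(upstream_url, mirror_dir):
    """Return the git command (as an argv list) for one sync pass.

    Both arguments are supplied by the caller; nothing here is
    specific to any real ASF server or project.
    """
    if os.path.isdir(mirror_dir):
        # The mirror already exists: fetch every ref from upstream and
        # prune refs that were deleted there.
        return ["git", "--git-dir=" + mirror_dir, "remote", "update", "--prune"]
    # First run: create a bare mirror that tracks all branches and tags.
    return ["git", "clone", "--mirror", upstream_url, mirror_dir]


def sync_mirror(upstream_url, mirror_dir):
    """Run one sync pass; raises CalledProcessError if git fails."""
    subprocess.run(mirror_commands(upstream_url, mirror_dir), check=True)
```

A small wrapper script calling `sync_mirror()` with the community’s Official repository URL and a local path could then be run from cron with a schedule of `0 * * * *` (once an hour). The equivalent for Mercurial would be the same shape with `hg clone` and `hg pull`.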
Ironically, years ago I too was suspicious of distributed version control, and wrote an article about how it tended to discourage ASF-style project cohesion. But in this case, we have examples of communities that are already cohesive and high-functioning, despite using git. They don’t need ASF’s tools; they just need a nice place to park their community. If they ain’t broke, stay out of their development processes.
(Note the ASF isn’t alone in this insanity. Others have told me that FSF projects are forced to use the Savannah collaborative platform, whether they want to or not. Crazy! Repeat after me, folks: your community is not your tools.)
“we have examples of communities that are already cohesive and high-functioning, despite using git.”
I suspect it may also be BECAUSE of git. I’ve found that my projects, while not as large as those run by ASF, have become more cohesive and collaborative, sometimes by serendipity, since moving to git/GitHub from svn/Google Code.
Not at all an argument for tools. Just adding voice to your thesis that community != tools.
Yeah, I’m channelling the ASF-elder bias that somehow “using git” is the equivalent of community chaos. 🙂
+1. Just +1 all over that. What Ben said.
Sorry, I wish my comment had more to contribute, but Ben already said exactly what I was thinking. I guess maybe I’d add that if the ASF is worried about communities being locked in to proprietary services (such as GitHub), then something for the Infrastructure team to work on would be writing periodic pullers that use those services’ APIs — which they all have now, AFAIK — to pull a backup over to the ASF. Then if the service ever goes insane (unlikely, but it’s happened, e.g. the CDDB debacle) the ASF will have all the assets at hand and can solve the mildly harder problem of setting up similar infrastructure for the community (either at the ASF or at some other hosting service).
So for example, with GitHub that would be: the “master” repository and all the forks one can find, all the wiki data, and all the issues in the bug tracker. There are APIs for all of this: in the case of the git repository, git itself provides the APIs, and for the rest GitHub publishes its own APIs. Similarly for Google Code (I’ve used [1] the bug tracker APIs there and they’re fine).
Third-party hosting platforms, whether proprietary (GitHub) or not (Gitorious), are now the norm for open source projects. The ASF will either adjust to this reality or become obsolete.
-Karl
[1] http://code.google.com/p/projport
> “Let’s go back to the original “Apache Considered Harmful” post again. Yes, the blog post rambled a bit about the ASF becoming “irrelevant”, but I think that’s just random grumbling around the actual issue at stake: the ASF’s insistence on forcing their hosting infrastructure onto projects.”
The point of my article was not to lament about Apache’s policy on tools, which I admit that I tend to do, but to point out the “debate” over that policy as the most recent example of the ASF’s distance from the new open source culture we see emerging.
> “They don’t force every project to use the same bugtracker”
This is news to me. I didn’t go after JIRA because, frankly, it’s too easy, but I was unaware that projects were free to pick their own bug tracker. Which projects do not use JIRA, and what do they use?
I use, and love, git/github, but I think the core issue is allowing pull requests to be merged from people without signed CLAs. That “chain of custody” cannot be guaranteed via merges from github; if github is a read-only copy then what’s the point?
I couldn’t possibly function without the awesomeness of github pull requests and issues “magic”, so I hope they figure this out. It seems as if github integrating CLA support (even if it’s just a checkbox that I, as a project admin can set on a per-user basis) would help solve this issue.
Issue Trackers are the same at the ASF. Roughly it is a case of ‘must be on our infra’. The difference is that Bugzilla still runs, largely for the older projects who don’t want to move to JIRA.
Regarding the claim that the GNU project (which you incorrectly refer to as FSF projects) forces everyone to use Savannah: it is recommended and encouraged that GNU projects host their code on Savannah, but it is far from forced. Even the choice of version control system is very much up to each project, though the recommendation is GNU Bazaar.
Having everyone use the same tools has many benefits: for one, everyone knows where everything is, and it is easy to help each other.
Cheers, Alfred
Glad to hear that the GNU project doesn’t force folks into Savannah; encouragement is different from forcing. 😉
FYI: subversion.apache.org still uses the same tigris.org issue-tracker that it’s been using for 11 years.
I think some of the git-resistance comes from different cultural assumptions around the cost of merging. I know from an svn mindset, the “mash the branches together and call it merged” mindset of the git-world can seem a little fast and loose. Likewise, I suspect the older “merging is to be approached with fear and reverence” culture may seem hidebound when you’re used to the flagrant reticulation, devil take the hindmost style that the git world embraces.
— MarkusQ