This blog is dedicated to the in-depth review, analysis and discussion of technologies related to the search and discovery of information. This blog represents my views only and does not reflect those of my employer, IBM.


Friday, October 13, 2006

Powered by Open Source (ApacheCon 2006)

When most people think of open source, they think of software that can be downloaded and used for free. But open source is more than that. It is challenging the traditional methodologies of software development and distribution, and it is having a profound impact on the software market as a whole. To learn more about this growing movement, I attended the Apache conference held in Austin, Texas. Apache is one of the earliest pioneers of open source methodologies, which all started with the development of its Apache web server.

The Apache web server originated in 1995 from the HTTP daemon written by Rob McCool. Rob made his HTTPD source code freely available on the web, which allowed it to grow quickly in popularity with savvy webmasters. Many of these webmasters started to develop their own extensions and bug fixes to the original code, and they eventually coalesced into a small group for the purpose of coordinating their changes in the form of "patches". This small group formed the basis of the original Apache group. Note that “Apache” does not refer to the patches applied but rather to the philosophy of the Apache Indian tribes, who banded together for a common cause. The members of the Apache group communicated with each other through email and set up a common machine for sharing code and distributing builds. Less than a year after the group was formed, the Apache server passed Rob McCool’s HTTPD as the #1 server on the Internet and, according to the Netcraft survey, retains that position today with a share of over seventy percent.

In 1999 the Apache group formed the Apache Software Foundation (ASF), a non-profit organization designed to preserve its guiding open source principles and provide a legal framework for its products (e.g., licenses). Today the ASF hosts over 30 top-level projects, including Lucene (a Java search engine library), in which I am particularly interested. There are roughly 40 subprojects and a little over 20 projects in incubation. Incubation allows new projects to take shape in their early stages as they become acclimated to the open source philosophy and grow in acceptance.

So what is the open source philosophy? Simply put, it follows the same process demonstrated in the successful development of the Apache web server. It all starts with a software artifact that serves a need. The software and its source code are made available on the web. Anyone can download and use the software for free according to the terms of the Apache license. Over time the community of users grows (depending on the software's utility, of course). Users become expert users. These are users who are passionate about the software, some to the extent that they want to influence its evolution. That is when they become contributors.

There are many kinds of contributors. Some recommend new features, changes to documentation, or test case scenarios, and some identify bugs: anything that can positively affect the overall product. What is different, though, is that contributors are also encouraged to provide the solution, be it the code for the new feature, the documentation, or the actual fix for the bug. Contrast this with conventional product development, where the person who identifies a bug expects someone else to fix it. The proposed contributions are reviewed by the Project Management Committee (PMC) and, if accepted, merged into the official code base. Builds are frequent (sometimes daily or weekly) so that contributors can see the effects of their work and feel a sense of reward.

There are many advantages to this kind of software development. Innovation is high because the contributors are passionate believers in the product, so much so that they are willing to volunteer their own time to improve it. Also, since users are the primary source of feature requests, each feature has already been validated in the marketplace (market-driven development). And for highly active open source projects, the rate of software development increases because the work draws on a large number of participating developers from all around the world. This number could easily surpass the limited resources of a typical software company. It is also important to note that no one company owns the product or controls its direction. The essence of the product is derived from the community as a whole.
