Yesterday, I decided it was time to test out some ideas about storing content in CouchDB. I just wanted to get some preliminary numbers on performance, but I also wanted to see how the thing would scale after loading 10 GB worth of data. So, I went about this by…
- Downloading couchdbx, a one-click distribution of CouchDB for Mac OS X – Download couchdbx, unpack the archive, copy the app to Applications, and run it. CouchDB loads as an application and runs on the default port. Within 2 minutes, you’ve got a running instance of CouchDB.
- Searching for a good Java library… – Right, so even though CouchDB is really all about HTTP and JSON, I still need to find an easy way to call CouchDB from a Java program. I settled on couchdb4j: it has a simple API, and it does everything I need it to do.
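To make the "HTTP and JSON" point concrete, here is roughly what fetching a document from CouchDB looks like in plain Java with no client library. This is a minimal sketch, not couchdb4j's API: the class name `CouchHttpSketch`, the `content` database, and the `article 1` document id are made up for illustration, and it assumes CouchDB is listening on its default port, 5984, and that you are on Java 10 or later.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
final class CouchHttpSketch {
    // Assumed address: CouchDB's default port on the local machine.
    static final String BASE = "http://localhost:5984";

    // CouchDB addresses a document as {base}/{database}/{documentId}.
    static URI documentUri(String base, String db, String docId) {
        return URI.create(base + "/" + encode(db) + "/" + encode(docId));
    }

    // URLEncoder does form encoding, so turn "+" back into "%20" for path segments.
    static String encode(String segment) {
        return URLEncoder.encode(segment, StandardCharsets.UTF_8).replace("+", "%20");
    }

    // Issues a GET and returns the response body, which CouchDB sends as JSON.
    static String fetch(URI uri) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) uri.toURL().openConnection();
        conn.setRequestMethod("GET");
        conn.setRequestProperty("Accept", "application/json");
        try (InputStream in = conn.getInputStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) {
        // "content" and "article 1" are placeholder names.
        URI uri = documentUri(BASE, "content", "article 1");
        System.out.println("GET " + uri);
        try {
            System.out.println(fetch(uri));
        } catch (IOException e) {
            // Reached when CouchDB is not running or the document does not exist.
            System.out.println("Request failed: " + e.getMessage());
        }
    }
}
```

It works, but you end up hand-rolling URL building, error handling, and JSON parsing for every call, which is why a small library is worth having.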
At this point, I notice that couchdb4j has a Git repository at http://github.com/mbreese/couchdb4j. I clone the repository to my local system and run “mvn clean install”… and the tests fail. After some investigation, I realize that the tests are failing because I happen to be running the latest release of CouchDB, 0.9.1, and couchdb4j only works with 0.8.
Point One: I didn’t have to go fishing around the source code to figure out how to build this sucker. It just worked. Even though the tests fail, I’m up and running in a few seconds. The presence of a pom.xml file is a signal that I don’t have to spend time rifling through someone’s custom build. And I now know that I’ll be able to make changes to the code easily using m2eclipse.
Once I figure out that I’m probably going to have to get into the source code and make some changes to bring this thing up to speed with CouchDB, I can easily fork the couchdb4j project on GitHub. My fork lives at http://github.com/tobrien/couchdb4j/tree/master.
Because the CouchDB project maintains simple API documentation on a wiki, it is easy to make the appropriate changes to couchdb4j to bring it up to speed with the changes in the CouchDB View API and the CouchDB Document API. I now have a fork of couchdb4j that is 0.9.1 compatible. If other people want to use my changes, they can freely clone my repo or pull in the specific changes I made. I haven’t made a pull request, because I don’t know if the changes I made are aligned with the interests of the couchdb4j project.
Point Two: Git made it easier for me to make an instant decision to fork and customize to satisfy my own requirements. I didn’t have to stop and figure out the community dynamics. Because GitHub makes it so easy to fork an existing repository, I didn’t have to ask permission or chime in with “Hey, is anyone interested in XYZ?” I scratched my own itch, and if the person who maintains the original project finds my changes interesting, he or she can pull them into the original repository.
Within about an hour, I have an experiment with a customized version of couchdb4j that has been upgraded to 0.9.1 compatibility. Because couchdb4j followed the Maven conventions and because its maintainers decided to use Git, it was easy.