Maven 3.2.1 Provides a New Cure for Dependency Hell

If you’ve used Maven long enough you’ve seen this pattern.

  • You work somewhere that breaks up development into several independent groups.
  • Different teams have very different standards for dependencies and project organization.
  • One team sends over a JAR file that has transitive dependencies on everything from testing frameworks to unnecessary JDBC drivers.

I call this pattern the “Someone Needs to Train that Team on Maven” pattern because that’s exactly what needs to happen. It usually shows up at large enterprises set up to support multiple levels of development, where one team works on a project that supplies a client to another team. Say one team is developing a REST service and, as a convenience, supplies a JAR that contains a simple model object and some code to interact with the REST service.

Easy, right? Wrong. A good developer will make sure that this client artifact has as few dependencies as possible. Lean dependency signatures are key in large enterprises. If the interaction with that other system is via REST, then there’s no reason to include backend code to interact with a database or the code that actually implements the service. If the other team has a limited understanding of Maven dependencies, there will be trouble. You will get a client JAR that happens to include everything and the kitchen sink, and your 8 MB WAR file bloats up to 200 MB because it includes several versions of Spring (even though you don’t use Spring).

In Maven, dependency hell is often not due to the tool itself. It is self-inflicted, and it quickly infects your entire organization’s Maven projects. One bad project, or one developer with a weak understanding of when and where to declare dependencies, can create a disaster that bloats the dependency tree of every project that consumes that artifact. Anyone who touches an artifact with bloated transitive dependencies gets bloated dependencies.

Jason van Zyl at Takari points to the solution.  

In Maven 3.2.1, you can exclude all of the transitive dependencies of a dependency with a single wildcard exclusion. This means that, if someone sends you a JAR artifact with an awful POM, you can cut the problem off at the root.
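Here is a minimal sketch of what that looks like in a POM. The com.example.service:service-client coordinates and version are made up for illustration; the `*` wildcards for groupId and artifactId are the part that Maven 3.2.1 accepts in an exclusion.

```xml
<dependency>
  <!-- Hypothetical client JAR from the other team; coordinates are placeholders -->
  <groupId>com.example.service</groupId>
  <artifactId>service-client</artifactId>
  <version>1.0.0</version>
  <exclusions>
    <!-- Wildcard exclusion: drop every transitive dependency of this artifact -->
    <exclusion>
      <groupId>*</groupId>
      <artifactId>*</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```

Keep in mind that this cuts everything, so anything the client JAR genuinely needs at runtime has to be declared explicitly in your own POM.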

I’ve seen some people do similar things by declaring a dependency as “provided” in dependency management, but this is both time-consuming and incorrect. The thinking here is that you can selectively cut out transitive dependencies by declaring each one as “provided”. That hacks at Maven’s model to get around a shortcoming. Maven 3.2.1 has a more elegant solution to what I see as an unfortunate reality for most large-scale Maven projects.

And, that unfortunate reality is that most people using Maven have a limited understanding of how dependencies work. This is an ailment which is easily fixed with training.

I love these posts about Maven

The post begins with “I hate maven.”, and it goes on and on. This is a person who is trying to use the Sonar plugin, and who hates Maven so much he can’t bring himself to understand the idea of running a repository manager, or even that he should think about upgrading to a version of Maven that was released two years ago.

Part of my mission is to help people understand Maven, but when I see a chap like this, someone so dead set against investing even a tiny particle of effort in getting things to work, I like to sit back and just watch. I like to see just how bad it can get.

The solution to this dude’s woes is simple: use a repository manager (there are several, they are all free) and upgrade to the latest version of Maven. You can see people in that thread giving him hints, and you can see the resistance.
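For what it’s worth, pointing Maven at a repository manager usually comes down to a mirror entry in ~/.m2/settings.xml. This is only a sketch, assuming a repository manager running at a made-up address: the id, name, and repo.example.com URL are placeholders and have nothing to do with that thread.

```xml
<settings>
  <mirrors>
    <mirror>
      <!-- Placeholder id and name; pick whatever suits your organization -->
      <id>internal-repo-manager</id>
      <name>Internal repository manager (hypothetical)</name>
      <!-- Route requests for all repositories through the repository manager -->
      <mirrorOf>*</mirrorOf>
      <!-- Assumed URL; replace with the public group URL of your own instance -->
      <url>http://repo.example.com/repository/public/</url>
    </mirror>
  </mirrors>
</settings>
```

With a mirror like this in place, Maven resolves artifacts through the repository manager instead of contacting remote repositories directly.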

couchdb4j is a Case in Point for Git and Maven

Yesterday, I decided it was time to test out some ideas about storing content in CouchDB. I just wanted to get some preliminary numbers on performance, but I also wanted to see how the thing would scale after loading 10 GB worth of data. So, I went about this by…

  1. Downloading couchdbx, which is a one-click distribution of CouchDB for Mac OS X – Once you download this distribution, unpack the archive, copy the app to Applications, and run it, CouchDB loads as an application and runs on the default port. Within 2 minutes, you’ve got a running instance of CouchDB.
  2. Searching for a good Java library… – Right, so even though CouchDB is really all about HTTP and JSON, I still need to find an easy way to call CouchDB from a Java program. I settled on couchdb4j: it has a simple API, and it does everything I need it to do.

At this point, I noticed that couchdb4j has a Git repository at: http://github.com/mbreese/couchdb4j. I clone the repository to my local system, I run “mvn clean install”… the tests fail. After some investigation, I realize that the tests are failing because I happen to be running the latest release of CouchDB, 0.9.1, and couchdb4j only works with 0.8.

Point One: I didn’t have to go fishing around the source code to figure out how to build this sucker. It just worked. Even though the tests fail, I’m up and running in a few seconds. The presence of a pom.xml file is a signal that I don’t have to spend time rifling through someone’s custom build. And, I now know that I’ll be able to make changes to the code easily using m2eclipse.

Once I figure out that I’m probably going to have to get into the source code to make some changes to bring this thing up to speed with CouchDB, I can easily fork the couchdb4j project on GitHub and create my own fork at http://github.com/tobrien/couchdb4j/tree/master.

Because the CouchDB project maintains simple API documentation on a Wiki, it is easy to make the appropriate changes to couchdb4j to bring it up to speed with the changes in the CouchDB View API and the CouchDB Document API. I now have a fork of couchdb4j that is 0.9.1 compatible. If other people want to use my changes, they can freely clone my repo, or they can pull in the specific changes I made. I haven’t made a pull request, because I don’t know if the changes I made are aligned with the interest of the couchdb4j project.

Point Two: Git made it easier for me to make an instant decision to fork and customize to satisfy my own requirements. I didn’t have to stop and figure out the community dynamics. Because GitHub makes it so easy to fork an existing repository, I didn’t have to ask permission or chime in with “Hey, is anyone interested in XYZ.” I scratched my own itch, and if the person who maintains the original project finds my changes interesting, he or she can pull them into his or her own.

Within about an hour, I have an experiment with a customized version of couchdb4j that has been upgraded to 0.9.1 compatibility. Because couchdb4j followed the Maven conventions and because they decided to use Git, it was easy.

Unintentionally stirring a bee’s nest (by suggesting Maven)

I just had an odd exchange. Someone has a great piece of open source software that I totally depend upon. It’s a complex beast of a thing, and I wanted to a.) express gratitude, b.) offer some help with the build. You see, the build for this particular system is an Ant build script with a preamble of instructions, and the project itself is this megalith of code in one big src/ directory. Every time I want to use some new component that is in development, I have to download someone’s tarball, uncompress the thing, and then fish around for JAR artifacts to upload to central. Some of the JARs that are included have specific Subversion revision numbers in the JAR file name. This makes using the binary artifacts from this particular project a royal pain in the neck.

I follow the project and I’m invested in the code, so I thought it might make sense to *ask* if there was *any* interest in migrating the build to Maven on the grounds that it might make it easier for people to contribute and participate. Now note, I didn’t volunteer to switch the build to Maven, I simply “asked if there was any interest”, and not even on a development list. I asked one of the main contributors directly because I didn’t want to ruffle feathers.

What I got in return was this total tirade against Maven. How it isn’t flexible enough, how this particular person wanted to “send an invoice” to whoever was responsible for Maven because he had wasted so much time on it. Ending with the quote: “If our not using Maven as a build system is a problem for those who do then it’s not our problem but the problem of Maven for not being flexible enough.”
In other words, my very diplomatic inquiry was met with “#$@! off”.

Not “it’s been a while since I’ve looked at Maven, here are the problems I had, if you can get it working alongside this Ant build, be my guest…”. Or even, “No, I’m not interested in that. I had some problems in the past, and I don’t think it makes sense to distract from the current development.” I didn’t even get a chance to make an argument on the “merits”. So here it is…

The Argument

  1. If I can’t go to your website and figure out how to check out the source code and build in 5 minutes, your project is a pain in the neck to contribute to. The casual contributor has no incentive to learn how your build system works.
  2. If, in order to use your library, I have to go download some archive, unpack it and then futz around with JAR artifacts, your project is a PITA to use.
  3. I don’t even care if you use Maven, all I really care about is that you publish regular SNAPSHOT builds to a repository. When I see some development release of a JAR that is wrapped in a tarball, contains a README file, and is bundled with other JARs, I end up having to upload all this noise to a repository manager, crafting my own groupIds out of thin air.
  4. Yes, I understand, you hate Maven because it called you a bad name two years ago, and you didn’t know anyone qualified at the time to help you. If you had problems in the past, it was probably because you didn’t understand the tool. No offense. I can probably help out there.
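For a project that does build with Maven, the SNAPSHOT publishing asked for in item 3 can be sketched as a distributionManagement section in the POM. The repository id and URL below are placeholders I made up; the id would need to match a server entry in settings.xml that holds the deployment credentials.

```xml
<distributionManagement>
  <snapshotRepository>
    <!-- Placeholder id; must match a <server> id in settings.xml -->
    <id>example-snapshots</id>
    <!-- Assumed URL of a snapshot repository on some repository manager -->
    <url>https://repo.example.org/snapshots/</url>
  </snapshotRepository>
</distributionManagement>
```

With the project version ending in -SNAPSHOT, running “mvn deploy” from a nightly or continuous build publishes to that repository, and consumers never have to touch a tarball.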

If you disagree with some of the assumptions, that’s another matter.

Maven needs more opinion…

When I hear that someone has blogged about some general Maven hatred, I cringe and expect to read a post that consists of 30% incorrect assumptions about how Maven should be used, 50% ignorance of the most basic concepts, and 20% truth.

What can be done:

  • The Maven Users list needs to become a bit more opinionated for first time users. If someone enters into the discussion with the following statement:

    “I’m attempting to publish a directory full of JARs to my local repository using the Install plugin.”

    The first reaction should be, “No, use a repository manager.” Not, “Let me find seventeen ways to help you do a series of backflips to get Maven to do something it was never intended to do.” Maven isn’t the general Swiss-army knife tool that many approach it as. While it *can* be made to do anything you want it to do, there are core assumptions that shouldn’t be challenged. Half of the criticism we deal with is from people who were never told: “Don’t try this, don’t try to use Maven for this.”

  • The Maven community needs a better FAQ. I’m of the opinion that the lack of a good FAQ is directly related to the difficulty of the APT format. If Maven had a better FAQ, we’d have fewer people approaching it with the wrong assumptions.

Development Release of Archetype Chapter

I’m throwing together a quick chapter on the Maven Archetype plugin, which was recently renovated and re-released with some interesting new features (like the ability to generate an archetype from an existing project). I’m starting with instructions for how to generate projects with archetypes, and I’ll eventually put in some instructions for best practices for creating archetypes.

This chapter on Maven Archetypes hasn’t reached Draft status yet; it is in a pre-alpha stage. I’m publishing works in progress because I believe that transparency in writing benefits both the author and the community. A book is much more than the pages (or web pages) it is printed on, and the true meaning of a book is captured in both the content and conversation it provokes. As this is a pre-alpha release of a chapter, don’t worry about reporting typos. Expect them until a quality beta version of this chapter is released. If you do care to provide any feedback, tell me what you want to read. If, after reading this pre-alpha chapter, you are longing to know how to X, Y, or Z, go over to our Get Satisfaction page and file a suggestion or an idea. We’re very interested in the feedback.

Don’t expect the Archetype chapter to be in pre-alpha for weeks and weeks. One thing I’m particularly uninterested in is leaving readers with cliffhanger endings – sections that provide 95% of the essential information only to leave them with a table that hasn’t been completed or a section that was written in a hurry. This is a new practice of “Agile Writing”, and I’ve taken care to publish complete sections. While the enumeration of third-party plugins isn’t complete and this chapter lacks a section on generating archetypes, the paragraphs and third-level sections that have been published are in this version because I didn’t want to sit on the content for weeks and weeks.

Xpect ah lott of tipos inh this chapther(, but don’t report ‘em yet).

Buy the New Maven Book Today (Support Free Books)

I know it seems like something of a contradiction, but if you are happily reading the free Maven book from Sonatype (and I know tens of thousands of you are), I’d urge you to buy a printed copy of the book. Even if you buy it as a gift for someone else. Here are some good reasons to purchase this Maven book:

  • Maven is a part of your everyday routine and having a printed copy would help you track down that missing option or best-practice.
  • You like the idea of seeing a free, open book developed in a transparent manner for all to see, and you want to support that effort.
  • You are trying to convince skeptical colleagues to use Maven. Throwing down a thick O’Reilly book gives your argument more weight than the free online version.
  • O’Reilly works with printers that have a net-positive impact on the number of trees planted (more on that in a later post).


Written from a User’s Perspective

I’ve been both a Maven evangelist and Maven’s most vicious critic. I’m the first to admit that Maven is and continues to be one of the most frustratingly under-documented tools, but instead of complaining to no end about this, I’ve invested months (and months) of my own time into helping write the latest Maven Book. It is a vast improvement over both the online documentation and the previous two books: the original Developer’s Notebook from O’Reilly and the Better Builds book from Mergere. This latest book was written to be both a good introduction and a proper reference, and we’ve published what I would consider to be a good stopping point (about half way to the complete reference I would like to ultimately publish).

There has been some good support for the book so far. A number of readers have written in to tell me that they thoroughly enjoy the book and that they have found it very informative and useful given the paucity of good Maven documentation on the web. Sure, there’s a very pronounced self-selection bias in the sample of feedback, but, unlike all the other books I’ve written, the feedback from this one is overwhelmingly positive. I haven’t received a negative reaction yet (although I’m sure this blog post will trigger one).

You Decide the Fate of Free Books

If you like the idea of free, open books, I’d urge you to order a copy from Amazon. We were up to #5 on the Java list, and I’d like to enlist your help in bumping us up to #1. Maven is the most popular build framework, and if you use it and want us to continue to publish and write free books, I’d urge you to buy the book.

Earlier in this post, I mentioned that this printed version is the approximate half-way point for the larger reference I would like to see published on Maven and the associated “constellation” of tools. If you enjoy the work we’ve completed so far, buy a printed copy.

Hadoop and the Inscrutable build.xml

Seriously, take a look at this build.xml. Where do you even begin? Here’s the grokking process:

  1. build.xml references ${basedir}/build.properties. Go look for a build.properties.
  2. Look at README.txt… No help whatsoever…
  3. Look at NOTICE.txt… Again, just the ASF notice…
  4. Look in conf/ for build.properties… Nope…
  5. Look in index.html, see a redirect to index.html in the docs/ dir (hmmm, maybe that’s a doc about how to set up the development environment).
  6. Look at docs/index.html… errr.. nope, that’s just a web page.
  7. OK, all I was trying to do was build from source, but I guess it is time to google “hadoop build.properties”
  8. Result is this wiki page about IntelliJ (hmmm.. I don’t have IntelliJ).

Alright, I spent 10 minutes flailing about. This is the accessibility of a project that decides to craft a 1300-line Ant build script with references to ${user.home}: none. There’s no casual contribution to this project. Anyone have a playbook for getting this thing to compile (and not just the Java bits, either)?