Get on with it, Programming Books are Wikis

I used to write books. It was fun, it took forever. It drove me crazy, and it involved feeling great about a printed artifact. Years of struggle and focus yield a perfectly printed, bound bit of physical awesomeness.

The first book I published was with O’Reilly. It was great. (Look, I’m writing a book, how awesome is that.) Twelve months later after a series of battles with 500-page Microsoft Word documents and nights and weekends of effort I ended up with a book. At the time, 2004-2005, we still had a lot of these things called “book stores” and I was still able to go to a book store and browse a section of Java books.

Try that today?

The Java section in your local bookstore is five books. The programming section isn’t even a full bookcase, it is stuffed between the (enormous) section devoted to Adobe Photoshop and a selection of books about IT management. In some smaller markets you’d be hard pressed to find a programming section at all. Unlike the last century, very few of us are thinking:

“Oh, I really need a book on Java, I’ll hop into the car and drive to the local overcrowded mall, fight my way past advertisements for the Nook, drop $49.99 on a programming book. I’ll take that sucker home and add it to my collection of programming books.”

Think again, today’s programmer goes through the following process:

“Hmmm, I want to use Clojure, first I’m going to Google ‘clojure book’ to see what is out there. I’m going to stumble upon the website and I see that there is an online reference. You know $29.99 for a Clojure book that is probably out of date isn’t worth it for me. This stuff moves too fast for publishers to keep up. I’m just going to keep on Googling for specifics. I just got done throwing away all my programming books, why add the clutter.”

This is the cue for a publishing executive to pop up and talk about the power of electronic publishing, Kindles, Nooks, iPhones, iPads and the like. Yes, there are people making money selling programming e-books, but it still doesn’t feel appropriate for the format. The last time I purchased a “programming book” to read on my iPad the formatting was screwy because the length of lines required for code listings was wider than the physical device. It was also a pain in the neck to copy examples between the iPad and the computer.

Yesterday’s programming book is today’s collection of GitHub repositories, blog posts, files, Stack Overflow posts, and the occasional quick start guide. If there’s a book really worth buying, it is the exception, as publishers like Packt and Apress tend to put out a cattle call for just about anyone with a pulse to write.

Today’s book is a web page, a web page that is consumed in response to a Google search. Google is, in many cases, the primary reader and the only reader you should optimize for. Today’s “book” isn’t even a book at all, there are no linear sections, and there’s no guarantee that the reader is going to read chapter 1 before chapter 2. A great programming book is no longer a “read it from start to finish” introduction; it is a series of recipes with little or no “voice”.

So, you can continue to waste time thinking about formatting and typesetting and whether the pre-print proofs are perfect enough to send off to a printer. Or, you can just realize that today’s book is a wiki, let’s get on with it.

Use the Creative Commons License for Free Books

One important thing to consider if you are planning on writing a free book is the license for the work. Traditional software licenses have some clauses that are not relevant to books or electronic media. Lessig’s Creative Commons makes sense because he wrote it with books in mind. So where a license like Apache or GPL talks about “binaries” and “source”, Creative Commons talks about “works” being “published”. Even though the books I write have DocBook source code which is compiled into binary output, the Creative Commons talks about “Work”.

Here’s the definition of “Work”, which I find amusing because it mentions “circus performers”. I often wonder if one of the Creative Commons folks put “circus performer” in the definition of Work on a dare:

“‘Work’ means the literary and/or artistic work offered under the terms of this License including without limitation any production in the literary, scientific and artistic domain, whatever may be the mode or form of its expression including digital form, such as a book, pamphlet and other writing; … blah blah blah …; or a work performed by a variety or circus performer to the extent it is not otherwise considered a literary or artistic work.”

I write books under the Creative Commons Attribution Non-commercial No Derivatives license (by-nc-nd), which is very simply explained on the Creative Commons web site as:

“This license is the most restrictive of our six main licenses, allowing redistribution. This license is often called the “free advertising” license because it allows others to download your works and share them with others as long as they mention you and link back to you, but they can’t change them in any way or use them commercially.”

Most of the books I write have some sort of commercial interest associated with them. For example, I want to make money as a consultant or as a trainer. Or, the sponsoring company wants to sell a product or training. Although I know the next part of this sentence will appear invisible to Free Software Foundation people, I’ve found it necessary to have a protective No Derivatives clause for technical documents, as I’ve already had instances where content I’ve written has magically appeared in someone else’s (lucrative) training material.

If you’ve written a technical book, you are also very familiar with the idea that there is no control. Someone is going to take your content, muck around with it, and provide it as a free download on any one of a hundred sites designed to pirate books. Any selection of a license is really just aimed at people who are going to follow the rules.


Here are the clauses I choose, and why…

Attribution – Just makes sense. Not controversial with anyone, right?

No Derivatives – I’m a bit old-fashioned, I feel like a book is a complete work. And, I don’t want people picking and choosing parts of the book to publish. I’m also not particularly keen about people jumping up and creating a derivative work that rides off of a work I’ve helped create.

Non-commercial – This has a few meanings in the context of a technical book. It means that no one but the originating author (or sponsor) can sell the book as a part of a commercial offering. Someone else could sell the book at a reasonable cost like printing plus materials plus labor costs, but most people don’t want someone to take their book and start selling it in some expensive package deal without at least some discussion about licensing.

The other part of Non-commercial is that someone can’t purchase ten copies of your book and then use it as a basis for a training class. This is tough to enforce, and I’m not sure you could realistically enforce this.

If you are more into the GPL side of life, you could drop the No Derivatives / Non-commercial clauses, and go with a Share Alike clause.


If you are thinking about starting a free book, think about the different clauses that are appropriate to your situation and choose wisely. Some of the clauses I listed above are far too restrictive for most open source work, but, if you are starting to write a book, you’ll want to think about these issues.

Royalties and Reality

This post should be something everyone who has ever received a royalty check should read. I’m not going to say I’ve ever received a royalty check that was anything less than truthful, but I do resonate with the idea that publishers don’t understand how to pay royalties on digital downloads. (Oh, and for some reason, most publishers only give you half royalties on digital downloads vs. print books.)

Take books as an example: when I get a royalty statement it often has a line item for “online sales” that is some sort of number that captures the number of sales of a particular book. What does this mean in the context of a subscription service? While I’m sure I could get a straight answer from the publisher, it is usually such little money that it isn’t worth the time or effort.

This story is interesting because it suggests that the recording industry knowingly plays a cat and mouse game with artists, hiding true sales numbers behind opaque royalty statements (which I will also admit always look as if they were printed by an old dot-matrix printer). The larger issue here is that the publisher often holds all of the power in the relationship between producer and publisher.

Free Books Build an Audience, Just ask the Founder of SAS

From today’s NYTimes article on SAS and the challenges it faces from IBM here is an interesting scene. Goodnight built an audience sending free books. Granted, this was before the advent of electronic booksellers, but I’m fascinated by the idea of Goodnight packing up free book shipments and sending them to potential customers. It is exactly the strategy that today’s technology companies could learn from. Send your audience something free, they will remember you forever.

From the article:

“THE company traces its roots to a time when computing was costly and for the few. Originally called Statistical Analysis System, it was founded in 1976 by Mr. Goodnight and three colleagues from the agricultural statistics department at North Carolina State University. Its techniques were initially used to calculate the intricacies of soil, weather, seed varieties and other factors to improve crop yields.

“To build an audience, Mr. Goodnight spent nights packing up boxes of computer tapes and manuals, which he sent to university and corporate researchers. Soon, companies wanted him and his academic colleagues to develop software tools tailored for industry. In 1976 at a users’ conference, 300 or so people showed up, many from business.”

The Right Price for a Free Book

I’ve got an interest in the answer to the following question:

What is the right price for a 300-page book that is freely available online?

If you have an opinion, please vote in this poll:

Some Price Points

Before you vote, here are some data points assuming that you are using a service like Lulu or a short-run printer:

  • The cost to print and bind a single 300-pager at Kinkos will approach $35.
  • The cost to print 100 copies of a 300-pager at Lulu will be around $10.00 per unit for the content creator.
  • The cost to print 100 copies of a 300-pager at a local printer with a turn-around time of 3-5 days is $8.75 per unit.
  • The cost to print 100 copies of a 300-pager at a short-run printer with a turn-around time of 4 weeks is about $6.60 per unit.
  • The cost to print 1000 copies of a 300-pager at a short-run printer will approach something like $4.70 per unit.

Depending on how you do it, distribution costs are probably around $3.00 per unit, and I’m also assuming that the customer pays for shipping.

Some Initial Thoughts

$39.99 is not the right price for a free book, but neither is the correct price the per unit cost. I think the answer is in between, but I’m wondering what people think about this question. There are certain books that people will pay a premium for even though they are available online: The Subversion Book, Maven: The Definitive Guide. Books like these are on foundational tools that people need to use, and I also think that a portion of the audience purchasing the print book might not be aware that the book is available online for free.

As free books become more popular as an option for technical authors, I’m looking for some way to price them. My initial guess is that a good price for a printed 300-page book that is available online is about $14.99. Anyone have any thoughts? Your answers will influence some decisions I’m trying to make on the subject.
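To make the question concrete, here is a rough sketch in Python of what is left over per copy at a given price. It uses only the print costs listed above and the roughly $3.00 per unit distribution estimate; the function and variable names are mine, amounts are kept in cents to avoid rounding surprises, and it ignores anything not mentioned above (service fees, taxes, returns), so treat it as back-of-the-envelope arithmetic rather than a real pricing model.

```python
# Approximate per-unit print costs quoted above, in cents (USD).
PRINT_COST_CENTS = {
    "Kinkos, one copy": 3500,
    "Lulu, 100 copies": 1000,
    "local printer, 100 copies": 875,
    "short-run printer, 100 copies": 660,
    "short-run printer, 1000 copies": 470,
}

# Rough per-unit distribution cost quoted above (customer pays shipping).
DISTRIBUTION_CENTS = 300


def margin_cents(price_cents, print_cost_cents, distribution_cents=DISTRIBUTION_CENTS):
    """Return what is left per copy after printing and distribution."""
    return price_cents - print_cost_cents - distribution_cents


def margins_cents(price_cents):
    """Return the per-copy margin for every printing option at one price."""
    return {
        option: margin_cents(price_cents, cost)
        for option, cost in PRINT_COST_CENTS.items()
    }


if __name__ == "__main__":
    # Evaluate the $14.99 guess against each printing option.
    for option, left in margins_cents(1499).items():
        print(f"{option:32s} {left / 100:8.2f}")
```

At the $14.99 guess, a 100-copy Lulu run leaves about $1.99 per copy, while a 1000-copy short-run printing leaves about $7.29. The answer to the pricing question depends heavily on how the book is printed.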

Some Thoughts on Lulu

Ok, I’ve been using Lulu for book distribution for a few weeks. First reactions follow.

Where Lulu Works

  • Domestic Distribution (US) – I’m frequently sending 40-book shipments throughout the US and Europe. Domestic distribution is a big win for Lulu. Although the site itself states that it should take between 3 and 5 days to print a book, I find that it is consistently taking about 3 days to print an order. With 2-day delivery, I’m finding that my packages get to where I’m sending them on time without an issue.
  • Ease of Publishing – They’ve made it easy to publish books. That’s it. If you are looking to publish a book, there’s likely no easier interface. They provide the right feedback at the right time, and they give you the resources to get your PDF ready for pre-print.

Where Lulu Fails

  • Lulu’s Site is Slow – From what I can tell, the management hasn’t put a big emphasis on site performance. Lulu’s site is constantly taking seconds (sometimes minutes) to load pages.
  • Lulu’s Order Process has Gaps – Did you make a mistake on the shipping address? Think you can just go back and modify the address immediately after your order has been placed? Think again. If you catch Customer Service at the right moment, you might be able to save yourself right away, but more often than not you are going to have to go through an email-only support channel. When you’ve just placed a $500 order for books only to have put it on the wrong credit card or shipped it to the wrong address, you’ll be looking for a quick way to either cancel or change an order. They don’t have it. There are serious gaps in the user interface; there should be ways to fix immediate ordering errors.
  • European Distribution Unreliable – I’ve had European orders sitting in fulfillment for more than two weeks. No follow up, no one letting me know where the order is, nothing. Some of the orders I’ve placed have arrived on time, but most take forever. While I’m happy with the domestic distribution, I’m never using Lulu for European distribution again. I’m not paying a premium above printing costs for bad service and missed deadlines.

It works, I’m not going to stop using Lulu for smaller orders of around 20-30 books, but I’m likely going to start going direct to a printer and handling my own distribution. Since I’m talking about short-run lot sizes of 100-250, I can get the per unit cost of the book down to a price point that makes sense.

Self-publishing experiments (Lulu vs. BookSurge)

Jury is still out, I’ve decided to do some self-publishing experiments so I can get a sense of what is out there. I’ve uploaded a book to Lulu (I’m not telling you which one, and it is still a private book, so you can’t buy it.) Some initial reactions…

  • It is very affordable to self-publish – I’m surprised at how little it costs to print a book wholesale. In fact, if you are not concerned about turn-around time, uploading a PDF to Lulu and ordering a one-off copy of your own book is about half the cost of printing a document directly at Kinkos.
  • Both Lulu and BookSurge allow you to use your own ISBNs. If I do start using either service for distribution, I’m assuming that it is preferable to have your own ISBNs…. we’ll see.
  • Signing up for BookSurge generated an almost immediate call from a sales agent, where Lulu is all about self-service. Also BookSurge looks like it has more of an upfront sign-up fee. I’m going to try both services over the next year, but I think I’ll start with Lulu.
  • I was looking for a 7″ x 9″ format which is about one inch wider than US Trade. Lulu doesn’t offer this size, the best fit I could get was Royal Quarto @ 7.444″ x 9.681″. I guess that will do, but I’ll have to figure out how that plays out. The problem with US Trade size is that the 6″ width is going to mean that I have to trim code examples to fewer than 80 characters wide. We’ll see….

Open Source Writing: Part I: A Few Problems with Publishing…

If you are just tuning in, Common Java Cookbook is an experiment in transparent, open writing. I’m trying to develop this book and make frequent releases every one to three days. The idea behind this book is that open source writing should be no different than open source software. This is the first post in a series that explores some of the reasons why I’ve decided to commit myself to open, transparent writing. This post focuses on the problem. What is wrong with the current approach to computer “books”? What is wrong with the current relationship between the author and the publisher?

Problem: Driven by the Physical Artifact

While most writing projects are governed by the limitations of the book as a physical artifact, books like Maven: The Definitive Guide and Common Java Cookbook choose to fully embrace the idea that a book is electronic documentation unaffected by the constraints introduced by the printing process. Most programming books you encounter today have to have a practical deadline after which no changes are introduced. In other words, if you are writing a book that needs to be printed in lots of five thousand and shipped to book stores, your process is always affected by the idea of the book as a static, physical object. You have to “finish” the book by a set deadline. Updates and radical changes to a book which has already been printed tend to decrease book sales, and (quite often) the original authors retain no rights for redistribution online.

This attachment to the physical object is driven by the economic realities of the publishing industry, but it creates an odd situation when you are writing about a rapidly moving open source project. There is a large disconnect between how we develop open source software and how we write books about open source software. Successful open source projects usually don’t have a set release date, software like Maven is released when it is ready. Imagine how awful open source would be if everyone had to run around like headless chickens to cut a CD for something like Apache HTTPD. Imagine if a Maven release vote were predicated by “People, if we don’t send the Maven ZIP file to the CD factory by next week, they might cancel our contract. Can I get three +1 votes, now.” It just seems odd that we have to dance around publisher deadlines when we are writing books about collaborative, unpredictable, schedule-less open source projects.

Problem: Deteriorating Economic Model

Take, as an example, the Jakarta Commons Cookbook. I wrote this book between 2002 and 2003, and I probably invested about an entire year in the effort. It was my first book, so progress was very, very slow. The book was published, I felt great about the process. I think every first-time author has this initial excitement about having published a book. I didn’t write the book for acclaim, I wrote it because it was my way of giving back to the community. A year passes, and you get the sales figures back and you, the naive author, are impressed that five thousand people bought the book. You get a flood of email from people who have read the book, maybe 10% are fuming mad at typos and the other 90% is just happy to have read the book. The publisher has a totally different view, 5,000 copies is actually viewed as a quarter success, the publisher would have liked to sell 10,000. While you feel great about the idea of a community of 5,000, the publisher is lukewarm about the idea of printing a second edition.

Right right right, 5,000 is a loser? Visualize 5,000 people in a line all holding $20…. If that’s a failure, if that doesn’t justify a second printing, then something is wrong with the model. These days, publishers don’t like to commit to books that are not going to move a significant number of copies. It is becoming more and more difficult to sell a good book to a publisher because as the open source world continues to evolve every topic becomes a niche topic with a limited audience.

Problem: Where’s my community….

When you sell 5,000 copies of a book, you certainly get feedback both good and bad… But, you don’t get the customer relationships. You don’t get a chance to interact, and you certainly don’t establish any sort of persistent HTTP 1.1 connection with your readership. Publishers provide some tools to enable this support: forums, blogs, etc. If you’ve grown used to the “intimacy” and unstructured creative anarchy of open source communities, you’ll feel a bit stifled. Efforts like Jono Bacon’s The Art of Community are an attempt to address this, and publishers like Pragmatic have done a good job of creating that sense of community… But, as an author, you will want to either create that community yourself or (better yet) integrate that community with the community that has already developed around the project you are supporting.

Publishers serve an important curation function: they provide the necessary work to ensure that the book meets the production standards that have come to be expected of a book, but they often don’t do a great job organizing a community. Just like an open source project manages software production, I think authors and open source projects should manage a community of readers. Publishers used to be a necessary intermediary, but as the importance of the book as a physical artifact continues to decrease, I think we’re going to see authors take more initiative and publish works online.

Should I Stop Authoring and Start Pirating?

Andrew Savikas of O’Reilly writes about piracy on O’Reilly Radar in “When Authors Ask Us About the Consequences of ‘Piracy’”. He quotes Nat Torkington:

Fantastic! There’s absolutely nothing you can do about it, and unless you see sales dipping off then I don’t think there’s anything you *should* do about it. The HF books work really well as books, so at best the torrents act as advertisements for the superior print product (not often you can say that with a straight face). At worst most of your downloads are going to people who wouldn’t have bought the book at cover price and who will, if they enjoy it, rave about it to others. [emphasis added]

So long as the royalty checks are strong, take BitTorrent as a sign of success rather than a problem. A wise dog doesn’t let his fleas bother him.

Everyone should go out and pirate my books. BitTorrent is a sign of success. Success for whom? For the author? Not this one. Piracy brings up some foundational issues that make me question the entire idea of ever signing another publishing contract which yields any rights for future electronic distribution.

Publishing is an Exchange: Freedom for Distribution

When I sign a contract, I’m agreeing to let the publisher distribute the book. I agree not to go out and print the book myself or distribute the thing online. Even though, if I published the work online, I’d get ten times the audience, the ability to get real-time feedback on what content works and what content doesn’t, and I’d be able to build a real community around the effort possibly recruiting others to contribute and edit. My books are available on Safari, but it isn’t like I get access to a Google Analytics account that’ll tell me what content works and what doesn’t. O’Reilly has a community site called O’Reilly Network, but it isn’t particularly vibrant, nor does it do a great job of book promotion. (Update: I’m trying as much as I can to change this.) You publish with a publisher more for credibility and distribution than anything else. You give up quite a bit of freedom to distribute the content yourself.

Stopping Piracy is a Fool’s Errand, but…

When one of my books is pirated, I no longer have any say in the content or format of the book. When the book is pirated, my gut instinct is to send a note to the lawyers and ask them to send a cease and desist order. Nat and Andrew are right, there really is nothing you can do about piracy, you can’t stop Torrents, and unless you work for the RIAA and you are evil, you shouldn’t try. I’m very much aware that piracy increases the distribution of the title, and because I’m not motivated by royalty money, I, like Nat, have a similar view. Piracy ain’t that bad, but if someone out there is free to pirate, and if my publisher isn’t going to go after them aggressively, then, well…. why should I bother continuing to sacrifice my own freedom to distribute the content myself?

Don’t get me wrong, I have no “beef” with Nat or Andrew. I think Andrew is great. He’s one of the primary reasons why O’Reilly has a reputation for great content. But, Nat’s response is frustrating for this author, because it is essentially saying…

…Only Authors Have to Respect Publishing Contracts…

As a content creator, it makes me wonder why I don’t just distribute (pirate) the content myself. It also makes me wonder why O’Reilly doesn’t just give away the content for free in a slightly less produced form. If the presence of a pirated version of the book doesn’t affect sales, then I’d like to get in on this and start having some more control over the pirated content. I would love to start distributing some of the content I’ve produced online myself. If they can pirate, why not I?

Go back to the idea that you, the author, sign a contract granting O’Reilly exclusive distribution rights. You have to respect that contract, if you didn’t you would be sued by your publisher. You would be in breach of contract. So what happens when your book is past the prime in terms of sales, and you want to update the content…. sorry, you can’t. The book is still under contract.

A Concrete Example: A Dying Book

Let’s take a title like “Jakarta Commons Cookbook”. It was a successful title by my standards. I can’t tell you how many books we sold, but it was enough to generate a flurry of great feedback and some continuing interest in Jakarta Commons. (I think it sold maybe 6-7 thousand copies, maybe more, I didn’t really keep track.) I wrote this book as a resource for the community, and I hoped that it would be the first edition in a line of many editions which would be easy to produce and relatively cheap to edit. I had a picture of a book which would have a multi-year lifetime – an update every two years.

I’ve asked them for permission to update the title, I even have authors lined up who would help with the effort, but it isn’t on O’Reilly’s list for a second edition. Hell, I’d even be interested in coding a Web 2.0 collaborative recipe creation site around the thing. People like Henri have stepped up and volunteered to help/take the thing over. But, every time I’ve asked for permission, I’ve been met with silence. No one has said, “No, we’re not going to update Jakarta Commons Cookbook”, no one has said, “Sure, we’ll update that in a few years”. Meanwhile, technical content becomes less and less relevant every single year, and the effort to update the title increases over time. I understand why they don’t, the sales figures don’t rise to the level of a second edition. Or, at least, that’s my guess.

The problem here is that the book has been pirated already, there is a copy circulating online. If I started getting involved in the piracy of the book myself, I wouldn’t be affecting sales (at least not print sales), but I would be in violation of my contract.


…if O’Reilly Media looks the other way when a book is pirated, then why should I bother honoring the contracts I sign with them for every book they publish? I understand the sentiment, but I’m not particularly excited when I see someone take a liberty which I sacrificed to my publisher. While I’ll admit that piracy can help sales, I’d be happier if I hadn’t yielded online distribution rights to “the pirates”.

I’m not writing this to be difficult, I’m writing this to illustrate the sort of Faustian bargain that writers get into when they cede online distribution rights in an age of piracy.