Fixing Payment Systems with Competition

This Target hack is a BFD. I’m at the mall this weekend because I’m a very last-minute shopper and it was the only time I could find to shop. My wife calls me because she gets this email from Chase which I’ll paraphrase here:

You got hacked.  Lolz!  It ain’t our fault, really.  So sorry. So so sorry. Oh, BTW we’re putting new limits on how you can use your card in the middle of Christmas week because of Target. Hey hope this doesn’t screw you up, but I hope you weren’t planning on spending more than $100 a day with us.   Happy holidays.

Think about this for longer than a few minutes, think about how this affects millions of customers, and then you’ll realize that this Target hack could potentially ding a percent or two off of this holiday season for a few retailers.

When we look back at this time, we’re going to laugh at how silly our approach to payment systems was from about 1980 – 2013.  I think that the Target hack is likely just the beginning, but it is clear that (even with strict PCI-compliance) we need a radical change in payment.

Problems with Payment

  1. Our credit cards (at least in the US) are the technology equivalent of a cassette tape. While I’m running around town with a smartphone that can read my fingerprint whenever I shop, I’m still using the equivalent of an 8-track tape to pay for everything. Instead of moving toward a system that uses my location and my fingerprint, we’re just walking around with wallets that are no more secure than an envelope labeled “My Credit Card Numbers.” Steal my wallet, and you’ve got my credit card numbers… there’s a better way.
  2. We still have this irrational belief in the signature (and checkout clerks still eyeball them). This is our idea of identity verification – here’s a quill pen, why don’t you just sign this.  Now wait… there’s enough reliable location data flowing from my phone to enable every checkout clerk to say, “Welcome to the store Mr. O’Brien” without me saying anything.  The store should know I’m there already, and the technology also exists to have the store take care of payment authorization every time I pick something up. My phone could generate a piece of data that cryptographically proves not just who I am, but where I am and what time it is, down to the microsecond, verified against several GPS satellites (a toy sketch of what that token might look like follows this list).
  3. Online payment systems that offer more security are tiny in comparison to the 50,000-pound gorillas that dominate the system.  No one uses these systems. Add up the value of all the innovative payment companies in the Bay Area (Square, PayPal, + a thousand others), and you still don’t touch the $6.9 trillion total volume of Visa.  That’s $6.9 trillion flowing through millions of point-of-sale terminals (or “all the money”). Someone needs to figure out how to upgrade that instead of creating yet another payment system to trial in San Francisco and New York.
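
To make item 2 a little more concrete, here is a toy sketch (in Python) of what that kind of payment token could look like: a payload binding a customer ID, a location, and a microsecond timestamp, signed with a secret shared between the phone and the payment provider. Everything here is hypothetical (the key handling, the field names, and especially the idea of trusting the phone’s own clock and GPS fix); it’s meant only to illustrate the shape of the data, not any real payment provider’s protocol.

    import hashlib
    import hmac
    import json
    import time

    # Hypothetical secret provisioned to the phone by the payment provider.
    DEVICE_SECRET = b"not-a-real-key"

    def payment_token(customer_id, latitude, longitude):
        """Build a one-time token that binds who, where, and when."""
        payload = {
            "customer": customer_id,
            "lat": latitude,
            "lon": longitude,
            # Microsecond timestamp; a real system would want an attested,
            # GPS-derived clock rather than whatever the phone reports.
            "timestamp_us": int(time.time() * 1_000_000),
        }
        body = json.dumps(payload, sort_keys=True).encode("utf-8")
        signature = hmac.new(DEVICE_SECRET, body, hashlib.sha256).hexdigest()
        return {"payload": payload, "signature": signature}

    # The point-of-sale terminal forwards the token to the provider, which
    # re-computes the HMAC and sanity-checks the location and timestamp
    # before approving the charge.
    print(payment_token("obrien", 40.7506, -73.9935)["signature"][:16])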

When I wrote about payment systems in 2010, the universal warning everyone was throwing at me was, “Don’t expect anything to change in the short-term.  The retail industry moves slowly, and no one wants to make the capital investment necessary to upgrade point-of-sale.”  At the time I was talking to a senior manager at a well-known payment company based in the Bay Area about NFC payment systems.  According to him, the future was now; a revolution was upon us.  It wasn’t.

The solution

1. Ensure real competition in the payment processing space. Huge payment providers like the ones that have logos in your wallet have a history of using confidentiality agreements with vendors and transaction fees as tools to lock out the competition. For example, merchants have not been allowed to offer discounts for different kinds of payment methods.  Whether or not this continues to happen after the interchange fee settlement is up for debate, but we need to make sure that new technologies are not locked out of the physical point-of-sale space.

2. Put all the risk on payment providers.  If you provide a card or technology that people can use for payment, put all of the responsibility for a compromise on the payment provider. This will motivate payment providers to move away from the insecure methods of payment we use today. Your credit card won’t just be a series of easy-to-copy numbers; it will make use of the technology we have available. This would also force dramatic changes to PCI.  “Storing a credit card #” at a merchant would go away, and instead your transactions would look more like PayPal’s authorization process for recurring payments.

With real competition, the payment processors that can control risk will be able to offer a significantly lower cost to the retailer, and retailers will provide the necessary motivation for consumers to adopt the more secure technology.  If Square has the best risk management and fraud prevention technology available, a retailer should be able to offer a 1-2% discount to customers who pay with Square. Competition (not regulation) is the way out of this mess.

Whirr

Whirr + Spot Prices + Thanksgiving Weekend means that I can run large m1.xlarge instances on the cheap.
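
For the curious, the recipe looks roughly like the properties file below. This is a sketch, not a tested config: the property names match what I remember from the Whirr docs (whirr.hardware-id, whirr.aws-ec2-spot-price, etc.), so check the docs for your release before trusting any of them, and the spot bid is just an example number.

    # hadoop.properties (sketch, untested): a small Hadoop cluster on
    # m1.xlarge spot instances. Verify property names against the Whirr
    # docs for your release.
    whirr.cluster-name=cheap-hadoop
    whirr.instance-templates=1 hadoop-namenode+hadoop-jobtracker,5 hadoop-datanode+hadoop-tasktracker
    whirr.provider=aws-ec2
    whirr.identity=${env:AWS_ACCESS_KEY_ID}
    whirr.credential=${env:AWS_SECRET_ACCESS_KEY}
    whirr.hardware-id=m1.xlarge
    # Spot bid in dollars per hour; instances run only while the spot
    # price stays under this bid (the number here is just an example).
    whirr.aws-ec2-spot-price=0.10

Then whirr launch-cluster --config hadoop.properties and wait.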

<griping>Also, Whirr is essential, but the project has a sort of “forgotten Maven site” feel about it. It’s annoying when an open source project has several releases, but no one bothers to republish the site.  It’s even more annoying when the “Whirr in 5 minutes” tutorial takes 60 minutes because it doesn’t work.</griping>

(An imperfect) Space-inspired OSS Project Analogy

At the risk of sounding like a raving lunatic, I decided to come up with a space-inspired taxonomy for characterizing OSS projects.  I came up with this after kicking around GitHub over the weekend trying to make sense of some new projects. Recently there’s been a huge influx of corporate-sponsored OSS projects that are released with a lot of fanfare.  While there’s a lot of good stuff happening in OSS land, it is also difficult to figure out which projects are truly “vibrant” open source projects and which are simply one engineer’s solo project.  While GitHub makes it easy to track things like the network graph, the number of forks, etc., these metrics are still something of a popularity contest. When a company puts 40 projects in a GitHub account, what I’d appreciate is some upfront statement: “These are our four major OSS projects, and the rest of the repositories with silly names are just small projects that plug into our own infrastructure.”

As an exercise in lunacy, I decided to throw together a very flawed OSS project size/health/community analogy using space.   You can classify OSS projects as follows:

Comets: Periodic Celebrities / OSS Projects that Won’t Last

Maybe there is an open source project that is suddenly very hot, but you can tell it’s not going to last very long.  I compare these projects to comets.  The latest little Javascript utility may streak through industry news for a few weeks and then fizzle out.  Many startups in the OSS space think of themselves as a new planet, when in reality they are just a comet with excessive mass.   The thing about the OSS industry news cycle is that comets often dominate it as if they were full-fledged planets (because you can pay for coverage).   We’re all so used to the planets we already know… so when a comet comes into town everyone flips out.

Some comets burn out; some comets show up every few months or years, make a lot of noise, attracting attention and contributors, but ultimately return to the desolate reaches of the Oort cloud for a while.  I’d name some OSS comets, but then this post would attract a whole army of comment haters. If you find yourself attracting a community, losing a community, then attracting a new community, then losing it – you are in a dangerous orbit and you are a comet.

Asteroids: Where is everybody? Who’s running this project?

One-person OSS projects.  Projects that are not completely connected to a community.   Projects that look substantial on radar, but appear to be abandoned upon closer inspection.  Projects not large enough to attract a community (or, in this case, an atmosphere).   The majority of GitHub is a series of asteroid belts.

Think RubyGems: some gems are so important they are moons of a planet (activesupport), or even planets in a system depending on your perspective (rails), but a lot of RubyGems are just one-person forks of someone else’s codebase floating around without a lot of discussion.   If you’ve ever found yourself trying to contribute to an OSS project only to find no response, there’s a good chance that you’ve stumbled upon an asteroid.

If you work for a company that just dumps OSS projects out there but doesn’t provide much in the way of support, you are effectively generating more asteroids.  Asteroids can be very useful to a consumer of OSS, but when you take on an asteroid, when you start mining that asteroid for minerals, you own the whole thing. If it breaks you have to fix it.  Also, if your healthy project (your planet) depends on an asteroid, you better keep track of it, or it’s going to impact you at some later date.

Planets: OSS Projects with an Ecosystem

Tomcat is a planet, and on the Tomcat planet live thousands of developers.  If something starts going wrong with planet Tomcat, a whole army of people shows up to fix problems.   If someone wants to do something drastic to the planet, a whole community (the planet and its associated moons) shows up and registers an opinion. Taking care of a planet is tough work because there are so many interested parties.

This is the ideal size and scope for an OSS project.  Something large enough to attract a population, something large enough to sustain an atmosphere.  Yes, your planet is going to go through seasons of activity and inactivity, but there will always be signs of life on your project (as long as you do things like monitor the climate and make the necessary adjustments).

Moons: Your planet’s plugins.

Plugins for larger projects are moons.  Maybe.  Moons can gain so much velocity that they need to be rocketed into separate planetary orbits.    Maven plugins == moons.  Gradle plugins == moons.  Can’t think of anything more interesting to say about moons, so I’m moving on…

Systems:  Substantial OSS Projects Revolving Around a Central Idea or Project

Apache httpd is a system (maybe); Rails is a system (but it dominates the Ruby galaxy).    Node.js is a system in the Javascript galaxy.

While Hadoop itself may have been a planet at one time, you can consider the entire Hadoop ecosystem to be its own system.    Or, maybe Hadoop is a planet in the Map/Reduce system.  Or maybe Hadoop started out as a planet, quickly aggregated many moons, hit a sort of ignition point, and became a star itself?

This may be where the whole analogy breaks down because if Hadoop is a star, what then is Hive?  A planet? You know what, I don’t know. It’s an analogy and it’s imperfect.  Maybe HDFS is like a singularity that tunnels between dimensions.

Now I’m just being facetious.  You get the gist.

Galaxies –  Galaxies are often more than just a project; they are an entire collection of systems.  For example, Hadoop is in the Java galaxy.    Maybe there is a PHP galaxy or a Javascript galaxy.

Listen and you’ll hear the Cosmic Background Radiation: that’s the constant bickering between proponents of BSD-style licenses and proponents of the GPL.

What is Dark Matter? Some people are convinced that OSS is dominated by corporate influence.  This influence is often very visible, but it is also something that is difficult to keep track of because it has a weak interaction with mailing lists.

What then is the Apache Software Foundation?   The Apache Software Foundation is like the Federation.  It spans many systems and dominates certain galaxies.  Except they often have a hard time deciding where to go next because none of the ships has a captain. Sulu can stand up at any time and say, “Kirk, I’m going to have to -1 that order.”  (That was a joke, ASF people… that was a joke.)

Here, watch a YouTube video of Carl Sagan…

A Web Developer from 2001 Wouldn’t Even Recognize this World

I work with people much younger than I am, but the reality I’m discussing in this article is from just 12 years ago. It feels like another era entirely. This is especially true if you develop anything that touches the web.

When I started my career it was all about web applications that involved full round-trips to a server. You had a browser (or a WAP phone), your browser made a request for a web page, waited a few seconds (a few seconds!), and you got a fully assembled HTML page in return. It didn’t matter, because the Web was still so full of novelty that we were happy just to be able to do things like read the news online. Maybe your local newspaper had a website; most likely they didn’t. There was no YouTube. Web pages weren’t really connected together in the way they are now. Back then it wasn’t like loading TheStreet.com required a bunch of asynchronous calls out to social networks to populate Like buttons – there were no social networks. It was just HTML and images, and it took forever. It was fine.

My first two jobs were developing an in-house cross-promotional tool for an online gaming company named Kesmai in Charlottesville in 1997, and then working for TheStreet.com in New York starting in 1999. Web “applications” at that time were just an inch beyond putting some scripts in cgi-bin. At Kesmai it was Perl-based CGI scripts. Between Kesmai and TheStreet I was working on systems that used a proprietary Netscape server product. And at TheStreet.com we were using Dynamo behind Apache, so we had JHTML and Droplets, and that was my first encounter with a site that had to scale. We had a TV show on Fox, and maybe something like 600-700 people could use the site at the same time. (Again, that was huge back then. How times have changed.) Everything was template-based. Servlets were around, maybe, but I don’t really remember diving into the Servlet API and JSPs until Struts came along, maybe in 2001.

Back then, companies like Forbes.com, which I moved to after TheStreet.com, invested a crazy amount of money in hardware infrastructure. There was still a lot of proprietary software involved in the core of a web site – expensive CMS systems, etc. Open source was around (yes, we ran Apache), but it wasn’t like it is now. You likely paid a hefty sum of money for a large portion of your production stack. Around 2001 and 2002, a small group of people were starting to focus on speed, and the way you achieved speed at scale back then? Drop a few million on a couple of big Sun servers. It worked. It seems old-fashioned now, but as a developer I’d work with the operations team (then as now, the operations team didn’t know much about Java), and you’d help them size heaps and figure out how to make the best use of 64 GB of RAM on an E450. You’d run multiple JVMs on the thing, and someone might install something like Squid to cache things at the web layer.

Back then, you could touch the servers. They were shipped to your offices. Companies like Sun and SGI invested a lot of money to make servers look all fancy. These things were purple and had blue LEDs (remember, high-brightness blue LEDs were, at one point, really new to us all). I remember seeing advertisements for servers in programming magazines. Now if you look back at these, it’s as strange as seeing an advertisement for a station wagon in a National Geographic from 1985. These days, I don’t even know who makes the hardware that runs the sites I work on, and with the exception of the database server, I don’t even care if it is a physical machine. Back then, everybody got all excited about the E4500 that was in the foosball room.

There was no memcached, there was no AJAX, there was no AngularJS, there was no REST, SOAP was new and you probably didn’t use it yet, and there was no Google Analytics (remember, Google was still a tiny startup at the time). I remember having a discussion about Ruby in 2001 with a colleague who was excited by it, but Rails didn’t exist yet. Perl and PHP were around (they’ve been around forever), but you really weren’t confronted with systems that spanned several languages. Javascript was around, but you probably weren’t going to dare use it because it wasn’t like there were any standards between browsers. HTML5, huh? This was back when people still used the blink tag. Need to crunch a huge amount of data? Well, first of all, huge is relative, and you didn’t have things like Hadoop. It just didn’t exist yet. Big Data? Yeah, back then if you had 5-10 GB you were running a huge database. Huge. XML was still a really good idea. Flash was about to really take off.

If we could travel back in time and snatch a web developer from 2001 and drop them into 2013, they’d flip out. They’d look at your production network and wonder what happened. We’d have to tell them things like, “you’ve missed a bunch of things, this kid at Harvard created a system to keep track of friends in PHP and that changed everything. Google now runs the world. Also, the site ‘Dancing Hamster’ isn’t what it used to be.”

I look at people who started working in 2007 or 2008 and I think about how strange it is that, to them, none of this is new – because I’m still living in 1993. I’m still amazed at the functionality of Gopher in 1993.

And you can thank Mark Smith for this YouTube video…

How Java Programmers “Feel” in 2013

My summary of general Java sentiment after attending JavaOne 2013.

“Everyone’s all excited that Java didn’t die. Yay! We made it!”

Ok, that’s not fair, how about:

“Everyone’s excited that Java has newfound energy and that Twitter ultimately had to migrate everything to Java. I mean, Twitter is using Java! The fact that twenty-somethings are using the platform makes us feel a lot less old. Thanks.”

Ok, let’s try again…

“Everyone is excited that the Ruby on Rails kids are playing defense, that Oracle is paying a lot of attention to Java, and Java EE officially doesn’t suck anymore. Let’s go.”

I’m going to go with that last option.  While it is true that Oracle is taking the platform in an incrementally more commercial direction (you can’t get 1.6.0_51 without a subscription, and there are some new tools only available to paying subscribers), the platform does appear to be healthier than ever.   My own theory is that Oracle is doing a much better job enabling people like Reinhold and Gupta to innovate than Sun Microsystems ever could have.   There was a lot of pain between the Setting Sun and this New Java Renaissance, but we’re here.

After several industry luminaries predicted the death of Java, we’re still here and not only that, we’re still innovating.  We’re beyond that difficult period of ex-Sun employees griping about Oracle’s takeover and we’re now moving on to things like Java 8.  I wouldn’t have said this four years ago, but I’m generally optimistic about Java as a language and a platform.

My 7th, 8th, and 9th Tweets Ever – Still Somewhat Accurate

Now that Twitter is going public, I was wondering what my first few tweets were in April 2007.  Some of them were funny.  My 6th tweet was: “Abandoning Continuum – Maven Blows” and my 14th tweet was: “OMG, like, using Twitter I can keep up with all my coolest friendz.   This is so totally awesome!  Twitter you are my hero”

But my 7th through 9th tweets were:

2007-04-23 twitter is for jerks
2007-04-23 twitter is for the hundred or so people who revolve around the SFBay area and give a crap
2007-04-23 oh twitter, can’t get enough of twitter, blahblahblah – twitter is a web20 dream….

With all the publicity that Twitter is getting this week I’m coming back to the same sentiment.  Twitter, while useful and interesting, has overplayed its hand of late.  It’s a great marketing tool, don’t get me wrong, but it’s also an echo chamber of super narcissists (myself included; I mean, I do have a blog, don’t I?).

GigaOm’s SQL on Hadoop Report…

GigaOM released a report about “SQL-on-Hadoop” which talks about some of the trends I discussed at Strata last month.  They have a completely different perspective on the problem, one that I think is mostly informed by vendors in the topic area, namely Cloudera and the work being done on Impala. They do mention a few Apache projects, and some of those “risky startups”.

Was it sponsored by Cloudera?   Not sure, but if you do an exact search for the phrase “MapReduce does not gracefully handle many concurrent requests”, you’ll get a link to the PDF on the Cloudera site.   Clearly, they liked what they read enough to make a paid report freely available on their website.   Seems to suggest some relationship, no?   There is nothing wrong with this; a very common practice in content marketing is to approach an analyst firm and sponsor content (and then ask them to call the competition “risky startups”).   Was it sponsored?  I don’t know.

Also, one thing I love about analyst reports is statements that take no risk and make no predictions.  Here’s an example:

“Their technology bets are risky but, if obviously superior, could allow them to stage a market coup.”

That’s as bold as saying, “Some companies are proposing innovative solutions which, if successful, could change everything.”  How does this stuff get approved by an editor?

That being said, maybe they are right? Those startups are risky: Drawn-to-Scale is kaput.

Don’t give me this “We Used to Do Big Things” Crap. We do.

I was having a discussion yesterday with someone who decried our country’s lack of a manned space program. I agree, the fact that the US doesn’t have a manned spaceflight program is a step back. But then he said, “our generation hasn’t done anything like land someone on the Moon… we used to do big things.”   I looked at him funny because…

…he’s just plain wrong.  The Linux kernel trumps the moonshot in terms of both engineering effort and societal impact by a few orders of magnitude.  The kernel is the largest, most complex collaborative effort in the history of the species. That may sound somewhat grandiose, but it’s very much true. The Linux kernel is over 17 million lines of code and is growing at an average rate of 3,500 lines per day. Nearly 1,300 developers contribute to Linux, with versions like 2.6.25 generating more than 12,000 patches. The Linux kernel powers over 93% of the TOP500 supercomputers. The kernel is at the heart of Android, which has a nearly 60% share of the mobile operating system market with 1.5 million device activations a day. The kernel also powers millions of servers at companies that have transformed the way we consume information and communicate with one another, such as Yahoo, Google, Facebook, and Twitter.

This is a silly game to play, but if I had to choose between a Saturn V with three dudes and a lunar landing module strapped to the top of it and a DVD with the Linux kernel, I’d point to the DVD as having the bigger impact. I’d also argue that the benefits of today’s innovations are more equitably distributed and are not driven by global nuclear conflict.  That’s a good thing. Instead of generating dusty artifacts for museums in DC, we’re busy creating software that people can use. I’ve singled out the kernel because I’m convinced it’s at the center of several transformative shifts.

It is also the operating system that will send us back to the Moon (on a private SpaceX rocket).  Here’s a quote from Robert Rose at this year’s Embedded Linux Conference:

Linux is used for everything at SpaceX. The Falcon, Dragon, and Grasshopper vehicles use it for flight control, the ground stations run Linux, as do the developers’ desktops. SpaceX is “Linux, Linux, Linux”, he said.

We still do big things.  So keep your sappy 60s sentimentalism to yourself and submit a patch.

How much faster is EC2 High I/O storage?

I was running some simple experiments with sysbench on EC2 instances.   I wanted to see how fast a new provisioned-IOPS EBS volume (1,000 IOPS) was vs. the ephemeral storage (SSDs) on a High I/O instance.

The process was simple:  Fire up Ubuntu 12, install sysbench, mkfs.ext3 on the volume, run sysbench prepare, and then fire off a series of rndrw tests with sysbench with the number of threads going from 2^1 to 2^(whenever it broke).   Here’s the interesting graph.   The yellow line is the I/O performance of /dev/xvdb on a hi1.4xlarge and the blue line is the I/O performance (rndrw) of a high-performance EBS volume @ 1000 IOPS.

Testing SSD storage on EC2 versus a high-performance EBS volume (1000 IOPS)
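
For reference, the driver loop looked roughly like the sketch below. This is a reconstruction rather than the exact script I ran: the flags are sysbench 0.4’s fileio options, and the mount point, file size, runtime, and thread ceiling are placeholders.

    import subprocess

    # Rough reconstruction of the benchmark loop: prepare a file set on the
    # volume under test, then run random read/write (rndrw) passes with a
    # doubling thread count. Flags are sysbench 0.4.x fileio options; the
    # mount point, file size, runtime, and thread ceiling are placeholders.
    MOUNT_POINT = "/mnt/test"      # where the EBS or ephemeral volume is mounted
    FILE_TOTAL_SIZE = "16G"

    def sysbench(*args):
        cmd = ["sysbench", "--test=fileio",
               "--file-total-size=" + FILE_TOTAL_SIZE] + list(args)
        return subprocess.run(cmd, cwd=MOUNT_POINT,
                              capture_output=True, text=True).stdout

    sysbench("prepare")
    threads = 2
    while threads <= 256:
        out = sysbench("--file-test-mode=rndrw",
                       "--num-threads=" + str(threads),
                       "--max-time=60",
                       "--max-requests=0",
                       "run")
        # sysbench prints a "Requests/sec executed" line in its summary.
        for line in out.splitlines():
            if "Requests/sec" in line:
                print(threads, "threads:", line.strip())
        threads *= 2
    sysbench("cleanup")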

I’m sure there’s some magic tweak I could have used in /etc/fstab to improve performance, but from what I can see, the answer to this question is that the High I/O instance buys you about an order of magnitude improvement.