Super Fun Time at O’Hare (or You don’t understand, it’s a tall thing…)

I have odd traveling habits that I’ve explored in previous posts.  For those of you unfamiliar, I’ll summarize as succinctly as possible: I hate air travel. Every time I travel, it is as if the world conspires to provide me with more evidence that I’m on to something – that air travel hates me back.

Last night it was no different as I attempted to make my way from Chicago to Antwerp for a Java conference – Devoxx ’12.   In this story, I fight for my Irish heritage, I stand up for tall people, and I witness legal drama up close and personal.   All without ever leaving Chicago.

First, it was the cab ride.   Every cab I take from Evanston to O’Hare has the same properties:  1. It smells awful, just awful, I don’t even want to think about what it smells like, 2. The cab is full of religious talismans and mystic amulets, as if the driver relies more on prayer than on regular automotive maintenance, 3. The driver consistently takes insane risks (driving 95 mph on the Kennedy or running a train signal), 4. The check engine light is on, always, and 5. The car never really stops at a red light, it glides because the tires are bald.

Right, so flying already “activates me”, I’m already a mess because I’ve been thinking about how much I hate flying for a few days, and the ride to the airport is statistically two billion trillion times more dangerous.   “Oh, wow, you handled that skid quite well.” or “No, really, slow down, I’m in no rush to get to the airport. Thanks.”   There is an upside: every time I successfully reach the airport, I feel invigorated.  On a rational level I know I just survived the riskiest part of my trip.

The problem with “retargeting”….

You might not know what retargeting is (some people call it remarketing).  Here’s a definition for those of you who haven’t worked in the advertising industry…

The internet knows everything about you. Actually, the internet knows more about you than you know about yourself.  Do you look at the news at 7 PM on a Saturday and then swiftly transition to the ESPN site to check football scores? Guess what?  The Internet already knew that because you’ve been tagged, tracked, and monitored for years.

The Internet knows you so well it can predict what you are going to do. There are several companies selling APIs that advertisers can call out to that predict what sort of website you are going to be looking at in 20 minutes.   Or what your income is?  Or how likely you are to purchase a home in the next six months?   Every time you browse the web, there’s a crazy amount of data flowing back and forth around several advertising exchanges about you, what you are doing, what you are going to be doing, and who you are.     Advertising on the web is about modeling your behavior, predicting it, and then refining those predictions based on history.

The Internet understands that right after you read this blog entry, you are going to check Facebook on your phone, then you might decide to watch a movie on Netflix. You and the billions of other users on the web right now are predictable, and advertising companies plug your data into these amazingly complex models that pull in awe-inspiring amounts of data at speeds that would seriously freak you out.   All this Big Data stuff we’re talking about – advertising drove that because you leave a crazy data trail everywhere you go.

Back to this retargeting idea, this is how it works… those shoes you just looked at. You know the ones you just dropped in a shopping cart and got so, so close to buying. Close enough that the site even noticed that you entered 10 digits of your 16 digit credit card before you realized that there was absolutely no space in the budget for extras… Well, those shoes are going to follow you around the web for the next few hours reminding you of the potential purchase.   That retailer is hoping that you’ll reconsider the purchase again (and again, you’ll see those shoes 10 times that day, if you pay close attention).  Your recent browsing history will haunt you on the web, and this is retargeting.

Companies pay a real premium for retargeting, and it works.  I’ve seen the results of spending money on retargeting.    There’s something to be said for having the ability to reel people back into a site.     But, there’s a problem…  I keep on seeing retargeting ads targeted at me that make no sense at all.

Here’s a sample of the problems I experience with ads “retargeted” at me:

  • You don’t know me very well, do you?  An advertisement for a Young and Christian singles site haunts my browsing experience because of retargeting.  Every time I see this ad, I wonder what sort of crazy algorithmic edge case I’ve uncovered in AdRoll.  If you know me for longer than 60 minutes, you’ll realize just how badly targeted this advertisement is.   First, I’m married, and second, let’s just say I’d have little use for a site like this.   Someone somewhere is paying some nominal fee for AdRoll to display this ad to me.
  • I’m already a customer, stop retargeting advertisements to me.  I’m confronted with advertisements for companies I’ve already purchased from – this one is the most noticeable.  For example, I’m a customer of SEOmoz – I pay SEOmoz money every single month already.   Yet, the SEOmoz retargeting ad follows me around the web every time I use the site.   The same is true for Allstate – I visit the Allstate web site and pay a bill, and all of a sudden Allstate is trying to sign me up as a customer.    This strikes me as a massive waste of money for these companies.
  • Political advertisements that understand what news I just read.   This is the creepiest one of them all.  I’ll read a news story about unemployment, and without fail Romney and Obama are retargeting advertisements to me that talk about unemployment.   There’s something strange about the fact that a campaign can now have access (albeit indirect) to this vast amount of data on preferences and browsing history.   It is at the same time both a powerful political outreach tool and the most Orwellian approach to campaigning you could think of.  It reminds me of a picture of a sign I saw on Reddit today.
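
The “already a customer” complaint is the easiest of the three to picture in code. What follows is a minimal sketch, not a description of how AdRoll or any real ad exchange works: the class name `RetargetingSketch`, its methods, and the in-memory maps are all hypothetical stand-ins. It tags a visitor with the last advertiser they browsed, then declines to chase anyone that advertiser already bills.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

// Hypothetical sketch: none of these names come from a real ad platform.
final class RetargetingSketch {

    // Last advertiser each visitor browsed, keyed by visitor id.
    private final Map<String, String> lastAdvertiserByVisitor = new HashMap<>();

    // "visitor|advertiser" pairs for people the advertiser already bills.
    private final Set<String> existingCustomers = new HashSet<>();

    void recordVisit(String visitorId, String advertiser) {
        lastAdvertiserByVisitor.put(visitorId, advertiser);
    }

    void recordCustomer(String visitorId, String advertiser) {
        existingCustomers.add(key(visitorId, advertiser));
    }

    // Picks the advertiser whose ad should follow the visitor around, or
    // nothing if there is no history or the visitor is already a customer.
    Optional<String> chooseAd(String visitorId) {
        String advertiser = lastAdvertiserByVisitor.get(visitorId);
        if (advertiser == null || existingCustomers.contains(key(visitorId, advertiser))) {
            return Optional.empty();
        }
        return Optional.of(advertiser);
    }

    private static String key(String visitorId, String advertiser) {
        return visitorId + "|" + advertiser;
    }

    public static void main(String[] args) {
        RetargetingSketch ads = new RetargetingSketch();
        ads.recordVisit("tim", "shoe-store");
        System.out.println(ads.chooseAd("tim").orElse("no ad")); // shoe-store

        ads.recordCustomer("tim", "insurer");
        ads.recordVisit("tim", "insurer");
        System.out.println(ads.chooseAd("tim").orElse("no ad")); // no ad
    }
}
```

The check itself is a single set lookup. The sketch simply assumes the advertiser has shared its customer list with whoever serves the ads, which may be the hard part in practice.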

If I’m seeing problems in the algorithms that target ads to me, I wonder how widespread these issues are.  I know retargeting works, I’m not questioning the value of the practice, but I wonder how good these models really are.   Maybe, just maybe, the internet knows me better than I know myself… I don’t think so.

A real interaction with Comcast

analyst Shivangi has entered room

Shivangi: Hello TIM, Thank you for contacting Comcast Live Chat Support. My name is Shivangi. Please give me one moment to review your information.

Shivangi: Hi, I would be more than happy to assist you in completing your order. How are you doing today?

TIM: good

Shivangi: I am glad to know that you are doing good.

Shivangi: As I can see that you would like to sign up for Digital Starter. Am I correct?

TIM: yes

Shivangi: Thank you

Shivangi: Are you an existing customer?/

TIM: yes, business class

TIM: what’s next?

Shivangi: Thank you

Shivangi: Thank you for being our valuable customer.

TIM: Ok, really, I’m falling asleep, you’ve thanked me quite a bit, but can I get my service turned on

Shivangi: Thank you

TIM: Ok, was that sarcastic?

TIM: Cause I’m trying to get my service turned on

Shivangi: I mean earlier that thank you as because you thanked me.

TIM: What is going on here? What do I need to do?

Shivangi: I would like to inform you that as it is the business class so you need to call our toll free number 1-800-934-6489.

TIM: What?

Shivangi: I really apologize for the inconvenience caused to you.

TIM: So, I filled out all these forms for nothing

TIM: Just to be thanked repeatedly

Shivangi: I understand, Tim.

TIM: Well thanks for the thanks

Where’s my 20 EB Holographic Disk Tube (or WTF is up with Storage)

Maybe my appetite for disk is outpacing the industry, but shopping for external storage is frustrating. First, let me frame the problem. I have little faith in hard drives these days. Traditional hard drives are made of platters that rotate at 7200 RPM very close to a tiny arm with a magnetic reader on the end. Yes, we’ve perfected this technology to the point where we can all rely on this thoroughly impractical idea, but it also means that once every year or two, you’ll have a drive that suffers mechanical failure or just stops working.

SSDs are a better idea, but they still seem too new to rely on. They are also still far too expensive to consume. Call me cheap, but I’m not ready to fork over the money for SSDs, and everyone I know who has had an SSD-based laptop has suffered a debilitating one week disk failure.

The problem is that hard drives die, and my hard drives tend to expire about once a year. Maybe I use them too much, maybe I have dirty power, I don’t know. What I do know is that the technology is an embarrassing mess. For all the reported elegance of the Apple products that are in my life, I still find myself spending way too much time assessing the current state of storage technology when I need to purchase an external hard drive.

And, I’m also surprised that capacity hasn’t seemed to keep up with demands. I’m sure most people see a 750 GB MacBook Pro and they think, “That’s a lot of storage, there’s no way I’d ever fill that up.” Since I record a lot of audio and produce a fair amount of video, I have an entirely different view. I see 750 GB and I think… I’ll fill that up in a few months.

Three problems: I eat through storage like I’m doing video production (because I’m doing video production), I’m wary of disk drives so I have to back up everything like I have OCD (because I have OCD), and I’m a cheap bastard so I don’t yet own a RAID 5 20 TB storage device. But, I’m also waiting for something bigger…

Years ago I was hearing about research into holographic storage technologies that promised to trigger a revolution in the space, but I’m still seeing incremental improvements. Every time I go to Best Buy, I always make sure to ask if they have any holographic 20 Exabyte disk tube device. They usually don’t understand.

John Yeary’s Documentation Advice triggers a Documentation Flashback

John Yeary wrote some useful suggestions for code documentation review over on his blog. He has reasonable expectations for a Java developer, and I think more people should write code-level documentation, but unless you are shipping a public API of some sort, the reality of the industry is that no one writes documentation – ever.

Which is bad, you should write documentation. At least you should make an attempt at writing documentation before your project manager glares at you and tells you to get back to implementing requirements.

Anyway, it reminded me of a former employer that had a very curious approach to documentation. We weren’t allowed to write any. On top of not being allowed to write documentation, all method and variable names had to be descriptive. The first time the development lead had to talk to me about this, I seriously thought he was joking:

Him: “Ok, Tim. Um, that work you did yesterday, it’s great stuff, but you did something I’d like you to stop doing.”

Tim: “What, you don’t want me to use JdbcTemplate?”

Him: “No, that’s not it… it’s the documentation. Our process doesn’t allow documentation, you need to make the code more descriptive.”

Tim: “That algorithm is complex, it’s recursive and a bit tricky. I just put a note in there to make sure that…”

Him: “Use a different algorithm, we can’t depend on it. All code has to be self-documenting.”

Tim: “Ok, sure, I’ll make the code self-documenting, boss.”

At University, I had a double major in Electrical Engineering and Sarcasm. A week passes, and all of my code turns into this:

public Patient loadsMedicalRecordAndUsesHibernateInsteadOfSQL( int patientId ) {
  HibernateSession session = loadHibernateSessionFromThreadLocal();
  Patient patient = session.load( Patient.class, patientId );
  return patient;
}

public static int calculateVisitsUsingAlgorithmToAccountForMultipleRecordSystems( Patient p, Hospital h, int visits ) {
  int numberOfVisitsCumulative = visits;
  numberOfVisitsCumulative +=  findNumberOfVisitsInHospitalRecordSystem( p );
  if( h.hasRelatedInstitutions() ) {
    int counterForRelatedInstitutions = 0;
    for( counterForRelatedInstitutions = 0;
         counterForRelatedInstitutions < h.getRelatedInstitutions().size();
         counterForRelatedInstitutions++ ) {

      Hospital relatedInstitution = h.getRelatedInstitutions().get( counterForRelatedInstitutions );
      numberOfVisitsCumulative =
        calculateVisitsUsingAlgorithmToAccountForMultipleRecordSystems( p,
              relatedInstitution, numberOfVisitsCumulative );
    }
  }

  return numberOfVisitsCumulative;
}

Just imagine pages and pages of methods with names like “retrievingExactImmunizationScheduleWhileAccountingForNationalHolidaysInPatientLocale” or “calculatingBloodPressureUsingMeasurementsFromLastVisitToRelatedInstitution”.

What surprised me is that my boss liked what he saw. I turned my code into self-documenting code by taking the documentation and putting it into all the class, method, and variable names. It was paralyzing for the team, really. My sarcastic attempt to prove a point turned into policy, or sorry “callMySarcasticAttemptAtTryingToProveAPointTurnedIntoPolicy”.
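
For contrast, here is a minimal sketch of how the recursive visit count above might read with a short name and a Javadoc comment carrying the explanation. It is not the code from that project: the `VisitCounter` class, the nested `Hospital` stand-in, its `localVisits` field, and `countVisits` are all hypothetical, and the `Patient` parameter is dropped so the example compiles on its own.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; the types here are stand-ins, not the original classes.
final class VisitCounter {

    /** Stand-in for a hospital: one patient's visits here, plus related institutions. */
    static final class Hospital {
        private final int localVisits;
        private final List<Hospital> relatedInstitutions = new ArrayList<>();

        Hospital(int localVisits) {
            this.localVisits = localVisits;
        }

        Hospital addRelated(Hospital other) {
            relatedInstitutions.add(other);
            return this;
        }
    }

    /**
     * Counts a patient's visits at a hospital and, recursively, at every
     * institution related to it.
     *
     * <p>Assumes the related-institution graph has no cycles; a cycle would
     * make this recurse until the stack overflows.
     *
     * @param hospital the institution to start from
     * @return total visits across the hospital and everything reachable from it
     */
    static int countVisits(Hospital hospital) {
        int visits = hospital.localVisits;
        for (Hospital related : hospital.relatedInstitutions) {
            visits += countVisits(related);
        }
        return visits;
    }

    public static void main(String[] args) {
        Hospital clinic = new Hospital(5);
        Hospital regional = new Hospital(3).addRelated(clinic);
        Hospital main = new Hospital(2).addRelated(regional);
        System.out.println(countVisits(main)); // 10
    }
}
```

The caveat about cycles is the kind of note a method name cannot carry, however long it gets.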

Kids, don’t do what I did, remember your keyphrase

I’ve had a long, sordid history with PGP, and the problem is that anyone searching for my key hits this wall of old keys.  This is what happens when you lose your keyphrase over and over again: you end up with a trail of PGP mess that you are unable to revoke.   Don’t let this happen to you, kids. Use a password wallet.

The problem with not remembering your PGP keyphrase is that you leave a trail of PGP mess.

An Anti-pattern + A Rant: Maven Relative Sibling Modules

I see a lot of people’s builds.   I’m often paid to parachute into a place and fix someone’s awful misinterpretation of Maven.  Being “A Build Fixer” means that you often get the systems that drove others to give up and move on.   One of the worst problems to clean up after is the Maven build that uses modules referencing relative directories.   This GitHub project is an anti-pattern example alongside a longish rant about a practice that often turns into a disaster to clean up.

An old CERN article of mine, translated to Finnish

Oskari Laine in Helsinki, Finland reached out to me and asked if he could translate an old article of mine into Finnish.  This is the result.   I can’t read Finnish myself, but I did sing a piece by Rautavaara last year, so the only thing I could do with this is sing it.   But, if you read Finnish, here are the goods.

Data’s Next Steps: An Interview with Steve O’Grady of RedMonk

This is an interview from February with Steve O’Grady of RedMonk.   Steve spends a significant portion of his day talking to vendors and open source developers in the Big Data space; he also talks with a number of companies trying to deal with fantastically large data sets.   In this short interview I ask him to give me a sense of the state of Big Data in 2012.