Monday, November 5, 2012

Lies, Damned Lies, and Statistics: 7 Ways to Improve Reception of Your Data

* Now with a bonus 8th suggestion! Thanks to Stephen Alexander.

"There are three kinds of lies: lies, damned lies and statistics." I hate this quote almost as much as "The first thing we do, let's kill all the lawyers", and for essentially the same reason. Both are applied wildly out of context, to the point that the meaning assigned to each is almost the exact opposite of what the original intended. Shakespeare was not expressing a hatred for lawyers. His character, the comic villain Dick the Butcher, was saying that getting rid of the lawyers would be a great way to start the utopia dreamed of by murderers and thieves. Shakespeare was not espousing the virtues of lawyers, as some have claimed; but he certainly wasn't saying that killing all the lawyers would be good for humanity either. There's nuance in the comedic moment.

The same is true for the "lies" quote. It is from a Mark Twain article (later included with a series of articles to form a book), in which he was supposedly quoting former British Prime Minister Benjamin Disraeli. There is much debate over who really originated the saying, but that's not my concern here. What bothers me is the haphazard application of the saying, as if quoting it were sufficient proof that data-based decision making is unworthy. Of course, all decisions are based on data. Even a hunch is, essentially, a data point.

Whether or not it bothers me, we live in a world where statistical data is both revered and disdained. If the data supports your idea, it's great! If it doesn't, it is suspicious. In the data-obsessed world of IT Service Management, of which I am one of the chief obsessors, we need to keep some perspective when it comes to statistical information. This became relevant to me one day when decisions were being made around me that were based not on statistical data, but on a series of anecdotes. Because the anecdotes appeared to share similar themes, it was assumed that we could, and should, make high-impact decisions based on them. The statistical data was presumed to be irrelevant because the anecdotes indicated that it was missing critical information.

It hit me that I was not entirely right, and the others were not entirely wrong. There is significant nuance involved, where the two types of data (statistical and anecdotal) are both needed. That led me to consider some suggestions regarding how we position "data" in the context of ITSM decision making.
  1. Decisions are based on data. All conclusions are inherently based on data of some sort, some qualitative and some quantitative. In the absence of trusted, useful statistical data, decisions will be based on anecdotes, whether or not they represent truth.
  2. Statistical data must move from an untrustworthy state to a trustworthy state. It cannot and will not be used for decision making until the decision-maker trusts the data. We cannot assume that, because we have numbers, the intended audience for the numbers will believe them.
  3. Don't get bent out of shape when your numbers are not immediately received as Truth.
  4. Presenting data consistently is far more important than the precision of the data. Be persistent and consistent in how you present and interpret data. This cannot be stressed enough. There is no such thing as perfect data, so stop looking for it. I used to constantly change the data I presented, hoping that the "next version" would catch on, that everyone else would suddenly get it. The opposite happened. When the data presented is changing all the time, you come across as someone with something to hide. Your credibility is shot.
  5. Find out how your data is being received. "Build it and they will come" does not apply to metrics. Ask intended recipients what they think of the data as presented. Is there a way to make it clearer? Are there concerns about data accuracy?
  6. Your data presentation must be actionable. Take action on your data, and teach others how to take action based on the data. If the information is not actionable, it will be ignored and mistrusted.
  7. Anecdotes provide great information. Complaints are amazing opportunities to focus your data queries. For example, I found that not all requests were coming through the Help Desk, so the data regarding quality of Help Desk service was not complete. Before I could make decisions based on the data as captured, I had to understand why requests were not coming through the Help Desk. Of course, you also need to find out whether that is a good or bad thing, but that's another discussion.
  8. * Your data must be relevant to the recipient and the context. Often overlooked, but essential to the credibility of your data. Before publishing or presenting your data, make sure you can answer this important question: Why will my intended audience care or find this relevant? If you don't have a clear and concise answer (no more than one brief sentence), your data is probably not ready for consumption.
What would you add to the list?


  1. I suppose this goes along with "actionable", but the data (or, hopefully, "information", since by the time it is in a presentation it has become or is becoming that) should be relevant to the audience.

    Meaning, perhaps the data (or information) is actionable, but not by whoever is receiving the presentation or report.

    Another take could be that the data is actionable but the action doesn't need to be taken immediately. Say capacity is going to be maxed out in... 5 months. Sure, there is an action (or set of actions), but it is a bit further off in the distance. Doubtless there are circumstances where it is a good idea to talk about things that need to be done 3 months (or more) down the road. However, if the presentation is a weekly "Operational Status" meeting, is that type of data/information relevant to that meeting? I would think that meeting should focus on more immediate, pressing needs.

    So - "relevance" - both in audience and setting.
