
Downhill Skiing and the Reliability of Measurement

March 11, 2014

Maybe this post is a week or two late, but I can’t stop thinking about this topic. I, like many at Select International, was sucked into the recent Winter Olympics in Sochi, Russia. Every day in our office, I would have discussions with someone about the USA medal count, how the athletes performed, or Bob Costas’ conjunctivitis. I was enthralled. I am not ashamed to admit that I like curling. I even enjoyed tuning in every day to ice dancing to see what outrageous “get-up” Johnny Weir would be wearing.

But aside from the pageantry, the interesting cultural differences, and the unique nature of Olympic athletic competition, I realized while watching the women’s downhill skiing competition that what really fascinated me was more of a psychometric consideration. Let me explain.

One way to view an event like downhill skiing is to compare athletes and try to determine who is the more proficient skier. In fact, this is the intention of the event, and it is how most people view the competition. However, I viewed the competition through a different lens. I mean, is it really fair to say that one skier is ‘better’ than another given that the top five or so skiers were separated by about half a second on a course that is more than a mile and a half long? Literally, their performance was so similar that the difference in their finishing times was about as long as it takes you to blink.

With such fine-grained differences in performance, I turned my attention to the test. I asked myself the question, “Is the test of performance that these world-class athletes are taking reliable enough to differentiate the top athlete from the second-best athlete?” Essentially, for this to be the case, the test would have to have almost no measurement error whatsoever: there could be no sources of unreliability if a test is to measure to the hundredth of a second across such a long distance. Does the downhill track have the characteristics of a “super-reliable” test able to discriminate at this level? In other words, I asked myself whether the downhill skiing event told me ANYTHING about the relative proficiency of one Olympic athlete over another.

Sadly, the answer to that question is no.

In order for a test to measure to this level of precision, there can be no sources of non-performance-related variance across participants. I realized very early in the women’s downhill that this was not the case. The first athlete down the hill put up a great time. She skied flawlessly. The slopes were fresh. Then, throughout the competition, the conditions shifted. Skiers fell and put ruts in the slope, wind conditions changed throughout, and the temperature was shifting and changing the characteristics of the snow. Even the announcers alluded to the fact that the people skiing later in the competition were not skiing the same course. In the end, I just could not conclude that the test was reliable enough to infer that the variance in performance was due to the skill of the athletes as opposed to the variance in the test from participant to participant. While the two women who tied for gold are much better skiers than me and anyone who will ever read this blog, there is not sufficient evidence to suggest that either is a better skier than the woman who won bronze or the women who finished just behind her. The test simply is not reliable enough for such an inference.
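One common way to put a number on this idea comes from classical test theory, where reliability is the share of observed score variance that reflects true differences between participants rather than measurement error. The sketch below is only an illustration of that ratio: the function name `reliability` and the spreads of 0.2 and 0.3 seconds are made-up assumptions, not measurements from the Sochi race.

```python
def reliability(true_sd: float, error_sd: float) -> float:
    """Classical test theory reliability: true variance / (true + error variance)."""
    true_var = true_sd ** 2
    error_var = error_sd ** 2
    return true_var / (true_var + error_var)


# Hypothetical numbers: suppose the top skiers truly differ by about 0.2 s (SD),
# while ruts, wind, and snow changes add about 0.3 s (SD) of noise per run.
noisy_course = reliability(true_sd=0.2, error_sd=0.3)

# A perfectly consistent course adds no error at all.
perfect_course = reliability(true_sd=0.2, error_sd=0.0)

print(round(noisy_course, 2), perfect_course)
```

Under these assumed numbers, less than a third of the spread in finishing times would reflect skill. The smaller the true differences between participants, the less error a test can tolerate before it stops ranking them meaningfully.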

Bringing this line of thinking back to a business application, I think about how we interview job candidates. Just as the downhill race course is a test of skiing performance, a job interview is a test of a candidate’s fitness for a job. Thinking about the unreliability in the downhill race course, what can we learn about interviewing? The goal of the downhill is to determine the best skier; the goal of the interview is to determine the best candidate. The downhill shows us very clearly that if we change the nature of the test, it affects the performance of the participant. If we held the downhill on ten consecutive days, it is not inconceivable that ten different people would win it. How well are your interviews measuring? Can you say that you are not adding ruts to the course? Are you changing the wind conditions? It is important to stay diligently focused on making sure every job candidate races exactly the same race course. Asking different candidates different questions and creating a different experience for different people affects the conclusions you draw about the relative fit of those candidates. This will impact your employment decisions in a very real way.
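The "ten consecutive days" thought experiment is easy to try as a small simulation. This is a rough sketch under stated assumptions: the function `simulate_winners`, the five "true" finishing times, and the 0.3-second noise level are all hypothetical values chosen for illustration, and the course noise is assumed to be independent and normally distributed for each skier.

```python
import random


def simulate_winners(true_times, noise_sd, days, seed=0):
    """Return the index of the winning skier for each simulated race day."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    winners = []
    for _ in range(days):
        # Observed time = true ability + that day's course noise for that skier.
        observed = [t + rng.gauss(0.0, noise_sd) for t in true_times]
        winners.append(observed.index(min(observed)))
    return winners


# Hypothetical "true" times in seconds: five skiers 0.1 s apart.
true_times = [101.5, 101.6, 101.7, 101.8, 101.9]

inconsistent_course = simulate_winners(true_times, noise_sd=0.3, days=10)
consistent_course = simulate_winners(true_times, noise_sd=0.0, days=10)

print("distinct winners, inconsistent course:", len(set(inconsistent_course)))
print("distinct winners, consistent course:", len(set(consistent_course)))
```

With no noise, the skier with the best true time wins every day. Once the course adds noise larger than the gaps between skiers, the winner can change from day to day even though nobody's ability has changed. The same logic applies to an interview whose questions and conditions vary from candidate to candidate.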

Your goal as an interviewer should be to be more like a speed skating venue. The conditions of the track are exactly the same for every participant. Variance in this event is much more attributable to the athlete (sorry, Shani Davis). As you interview candidates, take steps to make sure your course is exactly the same for everyone.


Ted Kinney, PhD is the VP of Research and Development for PSI. An Industrial/Organizational psychologist, Dr. Kinney leads a team of selection experts and developers in the creation of, and ongoing research into, the most efficient and effective selection methodologies and tools. He is a trusted advisor to many international companies across all industries. He has particular expertise in behavioral interviewing, turnover reduction, effective selection strategy, and executive assessment.