Navigating the Road to High-Quality Data: Validate

October 25, 2017

Now in our final installment of the ‘evaluating data’ portion of the Navigating the Road to High-Quality Data blog series, we’re ready for a road trip.

Let’s pause to celebrate all the effort you’ve invested in building a quality vehicle along the way.

Before you set the cruise control and start soaking in the scenery, prepare for one last ‘diagnostic test’ to ensure a smooth ride: review the data to verify that screening, data quality checks, and survey structure are performing as expected. Why? Because although research is science-based, our participants have an element that makes them unpredictable: they are human.

When first launching a survey, we recommend a soft launch to a readable base size (a minimum of 50 completes in most cases) to check the data. Keep in mind that soft-launch base sizes are small, often with a less balanced composition than the full dataset, so findings should be treated as an indicator light that prompts further investigation. Here are some areas to assess:

  • Scan the results. Compare reported market share, vendor share, and known benchmarks for revenue or spend against what you would expect to see. Keep in mind that incongruence between a hypothesis and the survey results doesn’t necessarily mean an issue has occurred in the research. Consider a few questions to assess:
    • Could the market have shifted?
    • Was this sample frame different from your secondary data?
    • If the answer to both is “no,” revisit screening to ensure accuracy.
  • Review survey length. Look for extremes in the survey data and any correlation to oddities. For example, does the market share of a key vendor look low, but only among businesses of a certain size, or among people who influence decisions rather than make them? LOIs (lengths of interview) above 30 minutes, the tipping point for lower participant engagement, should prompt a review for unnecessary content.
  • Assess attention checks. While estimates of a “standard” rate of inattention vary, several sources put it near Research Now’s average of 4-5%. In a 2005 study, J.A. Johnson cites a rate of 3.5% for inattentive responses, and researcher Steve Gittleman, Ph.D., president and CEO of MKTG, Inc., suggests 3-5% is the natural level of inattention. While highly technical or complex surveys may see more attention check failures, a check that a majority of participants fail may indicate an issue with the check itself. A quick failure-rate tally, like the sketch after this list, makes this easy to spot.
  • Audit knowledge checks. If a knowledge check has a high failure rate, consider reviewing the question itself. Conversely, if few people failed a knowledge check but open-ended responses imply confusion, consider adding a new check aimed at the issue.
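
To make the LOI and attention-check reviews concrete, here is a minimal soft-launch triage sketch in Python (pandas). The export file and column names (loi_minutes, failed_attention_check) are hypothetical; adapt them to your survey platform’s data.

```python
import pandas as pd

# Hypothetical soft-launch export; the schema is an assumption, not a
# real platform format.
df = pd.read_csv("soft_launch_export.csv")  # ~50 completes

# Flag extreme interview lengths for review: 30 minutes is the engagement
# tipping point noted above, and very short LOIs suggest speeding.
median_loi = df["loi_minutes"].median()
df["loi_flag"] = (df["loi_minutes"] > 30) | (df["loi_minutes"] < median_loi / 3)
print(f"{df['loi_flag'].sum()} of {len(df)} completes flagged on LOI")

# Compare the attention-check failure rate to the ~4-5% norm cited above.
fail_rate = df["failed_attention_check"].mean()
print(f"Attention-check failure rate: {fail_rate:.0%}")
if fail_rate > 0.10:
    print("Well above the 4-5% norm; review the check itself before cutting cases.")
```

With a soft-launch base of only 50, treat these flags as prompts for manual review, not automatic deletions.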

One more check: verify technical components. Do participants see the questions they should see, and skip the questions they should not? Verify that participants are able to engage with the survey as intended, then measure their performance using the stats on the survey itself – called paradata – to determine whether a suspected issue stems from participant quality or survey quality. Paradata includes data about the survey instrument, such as time stamps, keystroke data, and abandon rate, and is helpful in revealing issues with question design.
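
Here is a hedged sketch of that skip-logic and paradata audit, again in pandas. The question IDs (q2_influences_decisions, q10_vendor_share) and the status/last_question_answered fields are hypothetical stand-ins for whatever your platform exports.

```python
import pandas as pd

df = pd.read_csv("soft_launch_export.csv")

# Skip-logic audit: respondents who answered "No" at the gate should have
# no answer recorded for the gated follow-up question.
should_skip = df["q2_influences_decisions"] == "No"
leaked = df[should_skip & df["q10_vendor_share"].notna()]
print(f"{len(leaked)} respondents saw a question their path should have skipped")

# Simple paradata check: where do abandoners drop off? A cluster at one
# question often points to a design problem with that question.
abandons = df.loc[df["status"] == "abandoned", "last_question_answered"]
print(abandons.value_counts().head())
```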

Validating and cleaning data requires balance

Over-cleaning, or hyper-cleaning, occurs when single behaviors are used to delete cases from the dataset, resulting in a homogeneous dataset – one that is likely more homogeneous than the market the survey is trying to understand. In market research, apply Voltaire’s advice: “don’t let the perfect be the enemy of the good.”

Conversely, under-cleaning a dataset – or not cleaning it at all – means outliers may be included that can warp the data. For example, share of wallet can be skewed by a single unreasonable entry from one participant. Finding this in the soft launch enables you to adjust programming – for instance, by adding a sensible cap on the input – prior to collecting more data.
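
One way to strike that balance is sketched below, under stated assumptions: a hypothetical share_of_wallet column plus hypothetical quality flags (loi_flag, failed_attention_check, straightlined). It caps extreme values (winsorizing) rather than deleting respondents, and removes a case only when multiple independent flags agree.

```python
import pandas as pd

df = pd.read_csv("soft_launch_export.csv")

# Winsorize: clip share-of-wallet answers to the 1st-99th percentile so one
# unreasonable entry cannot warp the average, without deleting the case.
low, high = df["share_of_wallet"].quantile([0.01, 0.99])
df["share_of_wallet_clean"] = df["share_of_wallet"].clip(low, high)

# Remove a case only when it trips multiple independent flags; deleting on
# any single behavior risks the over-cleaning described above.
# ("straightlined" is another hypothetical quality flag.)
flags = df[["loi_flag", "failed_attention_check", "straightlined"]].sum(axis=1)
cleaned = df[flags < 2]
print(f"Removed {len(df) - len(cleaned)} of {len(df)} cases")
```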

When I was learning to drive (an undisclosed number of years ago), one of the key driver’s education lessons was combating icy patches that may take your car off the path. The instructor asked students to overcorrect – hard – to the right, and then fix the vehicle. One over-correction to the right, counter-correction to the left. Brake. (Panic.) It was a scary session, but it taught an important lesson: overcorrection can be detrimental – and the same is true in data cleaning.

Enjoy the ride

Ultimately, the destination – high-quality data that fuels impactful business decisions – can’t be reached without an accurate voice from the audience the decision is about. Just as you would strive to embark on a road trip with a vehicle in top condition, the research design and results must be equally well-maintained. Once everything checks out, you’re ready. Get your coffee, gas up, make the final adjustments on your dashboard. You have a destination to get to!

This blog is the final installment of a 4-part blog series by Research Now. To check out parts 1, 2, and 3, click the hyperlinks.

Free Guide: Navigating the Road to High-Quality Data

At Research Now, we strive to follow our Research Quality GPS to ensure our clients, our research team, and our participants all have a smooth journey, collect the highest quality data possible, and avoid any bumps in the road. Navigating the three main stages of your research journey – designing, screening, and evaluating data – equips research projects to obtain the highest quality data. To learn best practices in each of the three data collection areas, download your free guide by clicking below.

Download Guide
