In the world of research and evaluation, we really just want to cut to the chase and find out how big an impact our program had. Seriously! We did all this amazing work with our program, we put in the effort required to collect data using carefully thought-out instruments, and heck, we even dutifully entered all that data into the database…why can’t we see the results right away? Well, it’s relatively simple: data do not become actionable information (that is, information you can actually use to make decisions) without added effort.
It’s a lot like baking a cake. Would you ever wonder where your slice is before the ingredients were combined and it was baked in the oven? Of course not! Similar to cake preparation, there is a whole phase of analytic work (mixing of ingredients) that must take place before the analysis can be done (baking) and the results (cake) are ready for consumption.
A quick side note: as anyone who has been handed a bunch of analytic results can tell you, there is still a lot of work to be done to translate those results into clear and engaging tables, charts, presentations and reports, but that is (figuratively) the icing on the cake and we’ll cover that at another time.
For now, we want to focus on what happens during the mixing. What gets done to data before they are ready to go into the oven? Here I’ve compiled a fairly comprehensive list of things to look for and do with raw data prior to the actual analyses. It is by no means exhaustive, nor does it apply to every data situation, but it’s a great place to start the discussion. ~Bon Appétit!
Do a quick review to see if the data ‘make sense’, if the data are complete, and if the data are what was expected.
Review the analysis plan (the recipe!) and make sure what is planned is feasible with the data you have.
Check the data structure to make sure it is set up correctly for the type of analysis that is planned. Reconfigure if necessary.
Know what the unique ID variable in the dataset is.
Identify and get rid of any duplicate or test observations.
Examine the inclusion/exclusion criteria for the project and make sure all observations in the dataset actually belong there.
Check each variable (frequencies if categorical, univariate statistics if continuous): is it complete, and is it formatted correctly?
Check for skip patterns and exclude responses to questions if they should have been skipped over.
Collapse, recode, and create new variables as needed for the final analyses.
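To make a few of these steps concrete, here is a minimal sketch in plain Python (standard library only). It is one possible way to approach the mixing, not a prescribed method: the records, the field names (`id`, `age`, `smoker`, `cigs_per_day`), the `TEST` ID, and the helper functions are all hypothetical and invented for illustration. It covers dropping duplicate and test observations, recoding a variable, enforcing a skip pattern, and checking each variable.

```python
from collections import Counter
from statistics import mean

# Hypothetical raw survey records; every field name and value is made up.
records = [
    {"id": "001", "age": 34, "smoker": "yes", "cigs_per_day": 10},
    {"id": "002", "age": 51, "smoker": "no", "cigs_per_day": 5},      # should have been skipped
    {"id": "002", "age": 51, "smoker": "no", "cigs_per_day": 5},      # duplicate ID
    {"id": "TEST", "age": 99, "smoker": "yes", "cigs_per_day": 0},    # test observation
    {"id": "003", "age": None, "smoker": "No ", "cigs_per_day": None},  # missing age, messy coding
]


def drop_duplicates_and_tests(rows, id_field="id", test_ids=("TEST",)):
    """Keep the first row seen for each unique ID and drop known test IDs."""
    seen = set()
    kept = []
    for row in rows:
        row_id = row[id_field]
        if row_id in test_ids or row_id in seen:
            continue
        seen.add(row_id)
        kept.append(row)
    return kept


def apply_skip_pattern(rows, gate_field, gate_value, skipped_field):
    """Blank out skipped_field wherever gate_field is not gate_value."""
    cleaned = []
    for row in rows:
        row = dict(row)  # copy so the raw data stay untouched
        if row[gate_field] != gate_value:
            row[skipped_field] = None
        cleaned.append(row)
    return cleaned


def frequency(rows, field):
    """Frequency table for a categorical variable."""
    return Counter(row[field] for row in rows)


def univariate(rows, field):
    """Basic univariate summary for a continuous variable, ignoring missing values."""
    values = [row[field] for row in rows if row[field] is not None]
    return {
        "n": len(values),
        "missing": len(rows) - len(values),
        "min": min(values),
        "max": max(values),
        "mean": mean(values),
    }


# 1. Remove duplicate and test observations.
clean = drop_duplicates_and_tests(records)
# 2. Recode: standardize the categorical coding ("No " becomes "no").
clean = [{**row, "smoker": row["smoker"].strip().lower()} for row in clean]
# 3. Skip pattern: non-smokers should not have a cigarettes-per-day answer.
clean = apply_skip_pattern(clean, "smoker", "yes", "cigs_per_day")

# 4. Check each variable before any real analysis.
print(frequency(clean, "smoker"))
print(univariate(clean, "age"))
```

In practice a statistical package or a data-frame library would handle most of this for you; the point of the sketch is only that each checklist item becomes a small, repeatable step you can rerun whenever new data arrive.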
The views expressed on the Institute for Community Health blog page are solely those of the blog post author(s), and do not necessarily reflect the views of ICH, the author’s employer or other organizations with which the author is associated.