The May 13 Group

Nov 10 2021

Data Storytelling starts with Data Story Finding

You may have heard a data expert or two talking about data storytelling. But before you can tell a story, you need to find a story. This post walks through some strategies on how to do just that.

In today’s post:

  • The Graph is NOT the Story.
  • What is data storytelling?
  • Being able to find good stories is as important as good storytelling.
  • Sometimes the story just hits you in the face.
  • Putting the data into a line graph.
  • Surrounding the data with context.
  • Disaggregating interesting data points.
  • By viewing the chart through a single data point.
Freshspectrum cartoon by Chris Lysy.
"All the data was boring, so I just added some Pusheen cartoons to liven up the presentation."
"This chart is boring so I'll just eat this donut."

The Graph is NOT the Story.

I’ve heard lots of people say that data visualization is storytelling. But I always thought that was a bit disingenuous.

For me, data visualization is not storytelling, it’s story illustrating. The story itself is always much bigger and more meta than the chart or graph could ever hope to become.

It’s why Marvel can make millions upon millions of dollars adapting comic books into movies: the superhero stories don’t just make good books, they also make really good blockbuster action movies.

What is data storytelling?

While interpretations vary, most experts describe data storytelling as the ability to convey data not just in numbers or charts, but as a narrative that humans can comprehend.

The next chapter in analytics: data storytelling – MIT Sloan

There are some people in this world who tell fascinating stories.

I had a sociology professor in grad school who told some amazing stories. Talking about interviewing in opium huts or playing underground poker under the watchful eyes of the local police captain. But those great stories came from a life that was rich with experience.

Not all datasets are story rich. And while you might be able to package any data into a narrative format, that won’t make it a good story.

Good stories don’t exist just because someone knew how to tell them. They exist on their own, and we need to find them before we can visualize them.

Freshspectrum cartoon by Chris Lysy.
"I'm sorry, but the only stories I get from this infographic are that you really like donut charts and don't understand the data."

Being able to find good stories is as important as good storytelling.

In a lot of ways story finding is really just data analysis.

A good analyst has an ability to find stories in datasets. While they might not be able to package the story, they can often pull up a chart or graph and walk you through what they see.

Finding good stories in datasets is a skill that most graphic designers do not have, because it’s a skill that takes years of practice. It’s the reason that my workshop focuses on helping data people become designers and not the other way around. I find it easier to teach someone who can find data stories how to package them into stories than to show someone who can design well how to find data stories.

But no matter where you fit in that spectrum, here are some strategies for finding the stories in your data.

Sometimes the story just hits you in the face.

Not all data stories require a lot of additional insight to find.

Take this chart from the CDC’s COVID Data Tracker. It shows the rates of COVID-19 cases by vaccination status. The big story is pretty simple: unvaccinated people are at a greater risk of testing positive for COVID-19 and at an even greater risk of dying from COVID-19. And when we see an overall case spike, that difference gets amplified.

Chart showing rates of COVID-19 cases by Vaccination Status from April 4, 2021 to September 4, 2021.
https://covid.cdc.gov/covid-data-tracker/#rates-by-vaccine-status
Chart captured on November 10 from the CDC’s COVID Data Tracker.

Putting the data into a line graph.

Narrative is often defined as a sequence of events. And given that line graphs are really representations of data over time, they make for really solid storytelling devices.

You can find stories by putting your data into line graphs. Since the graph walks the data through time, your goal is to talk through the parallel narrative. What does a spike in your line graph signify? What about a dip?

Since people are going to read your line graphs from left to right, annotations offer the chance to lay out the story point by point.

Infographic created by Chris Lysy using data provided by the St Louis Fed.

Surrounding the data with context.

In research and evaluation we use a lot of descriptive statistics. Means, medians, and standard deviations can be helpful when trying to interpret a dataset. But descriptive stats often take data out of the original context.

One easy way to find stories in data is to add the context back into the picture. Yes, if the average is important, visualize the average. But if your dataset is not too large, which is true of many research and evaluation datasets, showing all the data gives you more to draw upon.

For instance, it’s one thing to tell the story that your program is performing above average. It’s another story entirely to say that you are performing better than all other programs for a particular indicator.

Oregon Outdoor School evaluation infographic
Infographic created alongside the Oregon Outdoor School evaluation team; this is an example version using fake data.

Disaggregating interesting data points.

If you have a percentage, step back and look at the underlying frequencies. Every percentage started with a numerator and a denominator; look at those numbers. Do this even if you have to estimate them from the percentage.
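As a quick sketch of that estimation step, the arithmetic is simple. The function and numbers below are hypothetical, purely for illustration:

```python
# Estimate the count behind a reported percentage, given the group size.
def estimate_numerator(percentage: float, denominator: int) -> int:
    return round(denominator / 100 * percentage)

# e.g., if a hypothetical report says 47% of 2,200 surveyed children lacked access:
print(estimate_numerator(47, 2200))  # 1034
```

Even a rough count like this is often a more human-scale story than the percentage alone.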

UNICEF Infographic -
Before COVID, 47% of children lacked access to essential services (education and/or health)
COVID has added 150 million children. 
To put that number in context: that's more than the total populations of the United Kingdom, Spain, and Canada combined.
According to an analysis by Save the Children and UNICEF.
For more data visit: data.unicef.org/covid-19-and-children
Infographic created by Chris Lysy based on data provided by UNICEF

By viewing the chart through a single data point.

If you are having trouble finding a larger story, sometimes it’s helpful to focus on a single data point. If every point is a person, try to see the data through that person’s eyes. What does the data say about their experiences? Whenever possible, this is also a place for exploring supporting qualitative data.

I know I’ve done this in the past but I couldn’t find an example of my own to share. So here is an example from a USAID infographic. The data source for this infographic is certainly not individualized. But the infographic switches the perspective when talking through the data.

Infographic: Learning out of Poverty - Education is foundational to human development and has a clear multiplier effect with benefits in health, broad-based economic growth and poverty reduction.

A child born to an educated mother is more than 2x as likely to survive to age five.
Educated mothers are 50% more likely to immunize their children than mothers without an education.
Every extra year of school increases productivity by 10-30%
A girl who completes basic education is 3x less likely to contract HIV/AIDS
Educated women re-invest 90% of their income in their family. Men invest 30-40%
But still today:

1 in 4 women around the world cannot read this sentence
Girls make up 53% of the children out of school
98% of people who can't read live in developing countries.
Sources: The Global Campaign for Education and RESULTS Educational Fund, Make It Right, Ending the Crisis in Girls' Education 2007 | Literacy Matters Fact Sheet | Van der Graag and Tan, The Benefits of Early Childhood Education Programs: An Economic Analysis, World Bank (1998) | The Global Campaign for Education and RESULTS Educational Fund, Make It Right, Ending the Crisis in Girls' Education 2007 | Sperling, Gene and Barbara Herz, What Works in Girls' Education: Evidence and Policies from the Developing World, Council on Foreign Relations (2004) | The Global Campaign for Education and RESULTS Educational Fund, Make It Right, Ending the Crisis in Girls' Education 2007 | UNESCO. Global Monitoring Report 2011: The hidden crisis: Armed conflict and education. France: UNESCO Publishing, 43.
USAID Learning out of Poverty Infographic

What other story finding approaches have you used in your own work?

If you have an approach I would love to hear it. Just leave me a comment below.

Written by cplysy · Categorized: freshspectrum

Nov 09 2021

Designing a Prettier and More Effective Dashboard with Excel

Shawna Rohrman, Ph.D., is the Evaluation Manager for the Cuyahoga County Office of Early Childhood and its public-private partnership, Invest in Children. She enrolled in our Dashboard Design course and is sharing how she uses her new skills in real life. Thanks for sharing, Shawna! –Ann

—–

Using a dashboard has been central to my work as a program evaluator.

My office funds several early childhood programs that all differ in their program content, performance indicators, and outcomes.

As the person who reviews each program’s quarterly report showing progress on each of their performance indicators, I am also often asked to report overall performance for our office—for example, total number of families served or number of home visits made.

This can be unwieldy when looking across many reports, and it’s useful to have a document that allows us to assess progress across all the programs at once.

When I enrolled in Ann’s Dashboard Design course, my goal was to build on an existing document, making it easier to read and identify successes and areas for improvement.

From a Basic Many-Paged Table in Word…

Initially, our office used a table in a Word document to track quarterly performance across programs.

It served the basic function of being able to see, in one file, how each program was doing each quarter. But it was lacking in a few areas.

One was that, although the annual targets for indicators were clearly marked in red and there were quarterly totals, there was no annual or year-to-date total to compare to the target.

Additionally, although it was very helpful to have all the performance data in one place, it wasn’t especially easy to see trends from quarter to quarter, and the table was split across two pages.

…To a One-Page Visual Overview of Key Performance Metrics

The first thing I did to make data tracking easier was move to Excel.

Even before taking Ann’s Dashboard Design course, I knew Excel was the smarter choice just for the ability to use formulas.

I also worked with my colleagues—the main audience of this internal performance-monitoring dashboard—to determine what features would be most useful. We came up with a few that make the dashboard much more user-friendly.

First, we chose a few key indicators to include on a cover page (pictured below). This allowed us to see the most critical data for each program all on one page, rather than having to scroll or flip through several pages.

In this Excel workbook the cover page is followed by separate worksheets, each showing one program’s data on their full list of performance indicators, which is helpful when we are taking a deeper dive into one program’s work.

Second, we all agreed the dashboard needed year-to-date totals to compare with the yearly targets.

This is especially helpful for some indicators, like number of individuals served, where many people continue to participate in a program from quarter to quarter.

Adding up the quarterly number served would count longer-term participants more than once; the unduplicated total is essential for understanding whether the program is meeting its contract target.
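That double-counting problem can be sketched in a few lines of Python. The rosters and participant IDs here are invented, not from the actual dashboard:

```python
# Hypothetical quarterly rosters of participant IDs.
quarters = {
    "Q1": ["A01", "A02", "A03"],
    "Q2": ["A02", "A03", "A04"],
    "Q3": ["A03", "A04", "A05"],
}

# Summing quarterly counts double-counts returning participants.
naive_total = sum(len(ids) for ids in quarters.values())  # 9

# An unduplicated total counts each participant exactly once.
unduplicated = len(set().union(*quarters.values()))  # 5

print(naive_total, unduplicated)
```

The gap between the two numbers grows with participant retention, which is exactly why the unduplicated figure matters for contract targets.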

I took what I learned in Ann’s Dashboard Design course and added a third feature to visualize progress toward the yearly target: checkboxes and progress bars.

The checkboxes allowed us to see whether, at the end of each quarter, the program was on track to meet the yearly target. So, for example, a program would have to exceed 50% of the performance target at the end of Q2 (halfway through the year) in order to be “on track.”

The progress bar shows exactly what percent of the yearly goal has been achieved year-to-date. I used helper cells outside the print area to determine whether the checkboxes would be filled or empty.
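The on-track comparison behind those checkboxes can be sketched like this. The figures are hypothetical; in the dashboard itself, Excel helper cells perform the equivalent comparison:

```python
def on_track(ytd_actual: float, yearly_target: float, quarter: int) -> bool:
    """True if year-to-date performance meets the pro-rated target,
    e.g., at least 50% of the yearly target by the end of Q2."""
    return ytd_actual >= yearly_target * (quarter / 4)

print(on_track(260, 500, 2))  # True: 260 is past the 250 midpoint
print(on_track(100, 500, 1))  # False: 100 is short of the 125 mark
```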

Finally, we found it helpful to use sparklines (another tool learned in Ann’s class!) to succinctly show how performance changed from quarter to quarter.

In 2020, the second quarter was an especially unusual time as programs adjusted to the start of the pandemic. Seeing dips and spikes during that time helped us get a quick sense of what was working and what was not, and we were able to use that information to drill down with program staff.

The Outcome: More Effective Use of Data in Decision-Making

Even with just these few changes (and using a program nearly everyone can access!), our new performance monitoring dashboard has made it so much easier for our team to review quarterly progress in one place and visualize how our system of early childhood programs is working for children and families in the county.

The dashboard has become a quarterly staple at our staff meetings, where we review as a group and use the data to generate next steps.

It is also easy to share with senior leadership, so they can see at a glance the important work our programs are doing.

Written by cplysy · Categorized: depictdatastudio

Nov 05 2021

The Data Cleaning Toolbox

 

The end goal of collecting data is to eventually draw meaningful insights from said data. However, the transition from raw data to meaningful insights is not always a linear path. Real-world data are messy. Often, data will be incomplete, inconsistent, or invalid. Therefore, it is imperative that data be cleaned to correct for these errors, or “realities,” prior to analysis. Otherwise, analyzing messy data will result in incorrect interpretations and unnecessary headaches. 

This guide is designed with real-world data in mind. Data are prone to human error, and this guide will help you correct those errors, as well as provide tips on how to minimize them in the future. Why is this important? Because data cleaning is time-consuming. It is not uncommon to spend 50% or more of your analysis time on data cleaning and preparation. 

By reducing the amount of time required to clean data, through the methods outlined in this guide, your time can be better spent on analysis and drawing insight from the data. 


Identify the Problem

Before data cleaning, it is critical to identify problems within the data set. Sometimes these issues are apparent, such as wonky date formats or missing data. Other times, these issues are more obscure and hidden. This is often the case with open text responses, which can include slight spelling errors or extra spaces. 

A quick and dirty method to identify some of these issues is to insert the data into an Excel data table [Insert > Table > Select Data Range]. The data table envelops the full data set and automatically adds filters to each column (Note: ensure that you have selected the correct column headings). Now you can simply click a column’s filter arrow, which provides a full list of all unique values.

Numerical data are not immune to data entry issues either. However, these issues usually result in nonsensical results (e.g., a value of 8 for a question with a scale from 1 to 5) or outliers. It is possible to use the data table approach above to find incorrect numerical data entries too. This can be efficient for Likert scale questions coded as numbers. However, this approach can be laborious when data span larger ranges (e.g., height and weight data). For these data, a quick scatterplot can be used to visualize the data.
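For readers who work in Python rather than Excel, the filter-arrow check has a close pandas equivalent. The responses below are invented for illustration:

```python
import pandas as pd

# Hypothetical open-text responses with hidden inconsistencies.
responses = pd.Series(["Female", "female", "F", "Male", "Male "])

# Like clicking a data table's filter arrow: list every distinct value
# with its frequency, exposing casing, spelling, and stray-space variants.
print(responses.value_counts())
```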

With data issues identified, you can begin cleaning the data. The following sections of this guide will address common data issues and how to clean them using Excel.

Common Data Issues

The following section identifies several common issues with data quality. Each issue will be discussed separately with tips on how to identify these issues and, importantly, how to address these issues.

The data issues that will be addressed include: 

  • Missing data 

  • Date data 

  • Inconsistent data 

  • Invalid data 

  • Duplicate data 


Missing data

Missing data may be negligible in some instances but have the potential to cause serious issues during the analysis phase. Negligible instances include a few blanks (i.e., literal blank cells) that are not calculated in summary statistics, such as sums and averages; these usually have a minor impact, if any, on analysis. If, however, you have used numerical placeholders (e.g., 0 or 99 are common), these placeholders can mistakenly be included in analysis and significantly change the results. 

In the table below, each column contains identical data. The only difference is how missing data are treated: blanks, NA, 0, and 99. At first glance, all looks fine, but issues emerge when we begin to analyze the data. The subsequent table summarizes each column without accounting for missing data codes, and we can see there is variation between most approaches. 

We can clearly see that numerical placeholders can cause issues in calculating summary statistics. Both NA and blanks produced the correct result in this example. However, leaving cells blank may raise a few questions. Are the cells blank because data are missing? Or are they blank because of an error in the data entry process? 

The best approach for handling missing data is to communicate effective data entry protocols with all data entry personnel. However, this is not always possible before receiving the data. Therefore, you can use the following method to correct your missing data entries. 

Method 1: Find & Replace 

  • ‘Find & Replace’ can be used to quickly standardize missing values within a spreadsheet. For example, if 99 is used to denote missing data, highlight the full spreadsheet and do a ‘Find & Replace’ (CTRL + H). ‘Find’ the 99 values within the selected range and ‘Replace’ with NA. 
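A rough pandas equivalent of this step, using made-up ages (NaN is used here as pandas' standard missing-value marker):

```python
import numpy as np
import pandas as pd

# Hypothetical ages where 99 was used as a missing-data placeholder.
df = pd.DataFrame({"age": [34, 99, 27, 99, 41]})

# The Find & Replace step: swap the placeholder for a true missing value.
df["age"] = df["age"].replace(99, np.nan)

# Summary statistics now skip missing entries instead of averaging in 99s.
print(df["age"].mean())  # 34.0
```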

Tips for handling missing data

  • Use “Blanks” or NA as the default cell value for missing data. There are a few options when working with missing data. We suggest that missing values be left “Blank” or that NA be used as a placeholder. The reason we suggest two options is that analysis, especially analysis external to Excel, requires different handling of missing values. For example, R statistical software expects missing values to be coded as NA, but in SPSS, numerical fields cannot have text values. Understand the requirements of any statistical software you may be using outside of Excel, and select the most appropriate option. Within Excel itself, either “Blanks” or NA works well in most situations. 

  • Avoid using numerical placeholders where possible. If you have agreed on using a numerical placeholder, the analyst, and anyone working with the data, should be made aware of the fact. The placeholder should also make sense. For example, if you have an age variable, using 99 as a placeholder could cause problems as 99 could be a valid age. In this instance, using a different placeholder would be necessary. 

  • Be consistent. Regardless of how the data are entered, a single, agreed upon format should be the default. If communicated properly, any code value could be used to denote missing data. 


Date data

Date formatting can cause major headaches when working with data. Dates can be coded in myriad formats. Further, within Excel, it is not uncommon to have dates formatted as text or numbers. When this is the case, any attempt at sorting the data by date or subsetting by date range becomes infinitely more difficult. 

In our work, it is common to receive data where dates have been coded in two or more formats. The more formats, the more difficult the data cleaning process becomes. With small datasets, manually fixing date formatting is an option. However, as the data set becomes larger, this option becomes less desirable. Re-entering hundreds of dates manually is both time consuming and prone to human error. 

As shown below, text-formatted dates can be disguised within the data set. These may be difficult to detect when data are extensive. Therefore, there are a few steps that should be taken immediately when dealing with date data. 

  1. Highlight the date column and right-click the highlighted area. Select ‘Format Cells…’ and convert all cells to a consistent date format. 

    • Tip: Select a different format than that currently displayed. If all cells change to the new format, your dates are all in the date format and you can move forward with your analysis. 

  2. Convert your data to a data table if you have not done so already. Click the arrow within the date heading to view all dates. If the data are all dates, they will be aggregated into Year and Month sub-categories. Text will display separately. 

Following the previous two steps should help you identify any issues in date formatting within your data set. If you find issues, it is time to fix these date inconsistencies. This task can be approached in a few different ways. As mentioned already, small datasets can be fixed manually. However, this is rarely feasible, and the following methods will be more pertinent to most real-world data. 

Method 1: VALUE function 

  • The VALUE function converts text from a recognized format (e.g., a number or date) into a numeric value. This approach is both fast and effective in dealing with dates that are entered as text. However, text needs to be spelled correctly for the VALUE function to work properly. 

Method 2: Find & Replace 

  • Sometimes a simple ‘Find & Replace’ is all that is required to clean date data. This is most effective when dates have a similar structure but have inconsistencies in the delimiter between month, day, or year values. Simply highlight the column with date data and do a ‘Find & Replace’ (CTRL + H). For example, a period may have been used instead of a slash: ‘Find’ the ‘.’ within the date range and ‘Replace’ with ‘/’. 
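Both date-cleaning steps have rough pandas counterparts. The dates below are invented, and the delimiter fix assumes a period was used where a dash belongs:

```python
import pandas as pd

# Hypothetical date column mixing formats and delimiters.
raw = pd.Series(["2021-11-05", "11/05/2021", "2021.11.05"])

# The Find & Replace step: normalize the stray delimiter.
normalized = raw.str.replace(".", "-", regex=False)

# The VALUE-style step: parse each entry into a real date;
# anything unparseable becomes NaT, pandas' missing-date marker.
dates = normalized.map(lambda s: pd.to_datetime(s, errors="coerce"))
print(dates.tolist())
```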

Tips for handling date data 

  • Be consistent. Regardless of the date format, convert all dates to the same format. 

  • Convert text string data to a date serial number if possible. Text data will cause issues when sorting dates and will break any formulae that use dates. 

  • Communicate an agreed-upon format for all dates to data entry personnel. 


Inconsistent data

Inconsistencies in data often result from open text responses. This is where spelling and capitalization play an important role in data entry and analysis. Excel is often quite good at treating the same word with differing capitalization as one value (e.g., Female, female, FEMALE). However, this is not the case with external statistical software like R. Further, misspellings and differing nomenclature can cause even more issues (e.g., F, Fem, woman). 

Inconsistent data can be addressed using one or more of the following methods. For more variable or complex text entries, other software (e.g., OpenRefine) can be leveraged for cleaning the data. 

Method 1: PROPER, UPPER, and LOWER Functions 

  • Using the PROPER, UPPER, or LOWER functions can help correct text data that vary in capitalization. The PROPER function capitalizes the first letter of each word, while the UPPER and LOWER functions convert all letters to upper case or lower case, respectively. To use these functions, simply enter the formula as “=PROPER(reference cell)”, replacing PROPER with UPPER or LOWER as needed. The reference cell is the cell you want to correct (e.g., B2 for the entry in Row 2, B3 for the entry in Row 3, and so on). 
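In pandas, the string methods `str.title`, `str.upper`, and `str.lower` play the same roles as PROPER, UPPER, and LOWER; `str.strip` also removes stray spaces, a common hidden inconsistency. The entries below are invented:

```python
import pandas as pd

# Hypothetical entries that vary only in casing and spacing.
names = pd.Series(["female", "FEMALE", "Female "])

# Strip stray spaces, then title-case (the PROPER equivalent).
cleaned = names.str.strip().str.title()
print(cleaned.unique())  # ['Female']
```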

Method 2: Find & Replace 

  • Sometimes words are misspelled or entered in a format that does not align with the other data entries. Depending on the amount of variation within the data, a simple ‘Find & Replace’ could work. 

Method 3: Sort, Filter, and Correct 

  • While the first two methods will work for most cases, sometimes it is more feasible to make manual edits. With data in a table format, you can sort the data alphabetically and filter by specific values; this will allow you to target inconsistencies in the data. With the data sorted and filtered, you can easily make manual edits to the inconsistent data entries. Just ensure that any manual edits are accurate. 

  • Note that with manual edits, there is still an increased probability of human error. The more “human” manipulation of the data, the more likely an error occurs that you cannot easily catch (e.g., accidentally copying and pasting “female” over “male” by selecting one too many cells). If possible, limit the amount of manual editing within your Excel worksheets. 

 

Tips for handling inconsistent data 

  • Be consistent. Convert all entries into a single, pre-determined format and be consistent throughout the spreadsheet. 

  • Be careful when using data tables. Data tables do not highlight differences between the different spellings of a word. Excel may handle differences in capitalization well, but external statistical software does not. Convert everything to the same format to improve data quality across all software and platforms. 

  • Communicate data entry requirements to all personnel. Consistently entered data is markedly easier to work with. 


Invalid data

Invalid data usually stem from one of two causes: (1) incorrect data entry, or (2) errors in Excel functions. Unless you are familiar with the acceptable range of values for a given variable or question, invalid data can go undetected. For example, a questionnaire may ask respondents to rate their satisfaction on a scale of 1 to 5. In this case, a value of 0 or 6 would be invalid. However, without prior knowledge and context, these values may go undetected, and results from subsequent analyses will be inaccurate. 

In the above example, the issue is the result of a data entry error. This issue can be addressed in a few different ways depending on whether the error is consistent. 

Errors may be more extreme than the previous example and fall completely outside the acceptable range. For these data, different approaches will need to be implemented to best address the underlying issue. Consistent errors bring into question the validity of the entire data set, while one or two errors can be chalked up to human error. 

 

Method 1: Check Data Ranges 

  • For numerical data, it is easy to check the range of a given data set. Simply highlight the column of the desired variable and a few summary statistics will be provided in the bottom right corner of the Excel worksheet (Note: you may need to right click and customize the status bar). Valuable statistics include the average, minimum value, and maximum value. You can immediately detect abnormalities in the data if the data values are outside the expected range. 
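The same status-bar check, sketched in pandas with invented ratings on a 1-to-5 scale:

```python
import pandas as pd

# Hypothetical satisfaction ratings; the valid scale is 1 to 5.
ratings = pd.Series([4, 5, 2, 8, 3])

# The status-bar statistics: a max of 8 immediately flags invalid data.
print(ratings.min(), ratings.max(), ratings.mean())  # 2 8 4.4
```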

Dealing with invalid data often requires assumptions and judgment calls. This becomes increasingly difficult as the number of invalid entries grows. The following flow chart provides questions that should be asked when evaluating data. When invalid data are encountered, work through the steps and make corrections as needed.

Tips for handling invalid data 

  • Apply a function to bring data into the expected range if appropriate. This assumes data were entered incorrectly on a consistent basis. 

  • Remove invalid data if the invalid responses are few and random. 

  • Question the data validity if invalid data are extensive. Communicate with data entry personnel to determine if there were errors made during data entry. 


Duplicate data

Duplicate data are repeated records. These duplications may result from multiple data pulls of a given database, double entries during the data entry process, or duplicate data submissions. It is important to identify duplicates prior to analysis. Duplicate values, when left unchecked, can skew the results of analysis by inflating data counts and influencing averages and other statistical measures. 

The end goal of this section is to eliminate all identical records except for one. In Excel, this process is relatively straightforward. 

 

Method 1: Remove Duplicates 

  • To remove duplicate values, first highlight the range of cells from which you want to remove the duplicates. In the ribbon above the spreadsheet select Data and Remove Duplicates. The Remove Duplicates menu will appear (shown below), and you can select the columns from which you want to remove duplicates. 
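For the Python-inclined, `drop_duplicates` mirrors Excel's Remove Duplicates. The records below are hypothetical:

```python
import pandas as pd

# Hypothetical records where ID 102 was entered twice.
df = pd.DataFrame({
    "id":    [101, 102, 102, 103],
    "score": [7,   9,   9,   5],
})

# Keep one copy of each identical record.
deduped = df.drop_duplicates()
print(len(df), len(deduped))  # 4 3
```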

Tips for handling duplicate data 

  • Identify whether the data are at risk of having duplicates. Were the same data pulled multiple times? Have several people worked on the data entry process? 

  • If in doubt, run a quick Remove Duplicates check to determine if there are duplicates in the data. You may want to test this before cleaning the data fully. During the data cleaning process, you may inadvertently code values differently, turning true duplicates into apparent non-duplicates. 


This guide outlines common real-world data issues and approaches for handling them. While I outline the process for dealing with these issues after the fact, it is more time-effective to deal with them during the data collection phase. Consistency in data entry is crucial for accurate analysis, and open communication with data entry personnel is key to achieving it.

Despite best efforts, data will rarely arrive in a fully clean state. Most errors arise from human error – going slow and checking data consistency and accuracy along the way will drastically reduce headaches down the road. Hopefully this guide will ease the process of getting raw data into a usable state.



 

Written by cplysy · Categorized: evalacademy

Nov 05 2021

Evaluation Roundup – October 2021

Welcome to our monthly roundup of new and noteworthy evaluation news and resources – here is the latest.

Have something you’d like to see here? Tweet us @EvalAcademy!

New and Noteworthy — Reads

Impact assessment and evaluation tools handbook

LIAISON, an EU-funded research and innovation project, recently funded the development of a handbook that contains tools for evaluation and impact assessment of any initiative involving interactive innovation. They define ‘interactive innovation’ as “the collaboration between various actors to make the best use of complementary types of knowledge (scientific, practical, organizational, etc.) in view of co-creation and diffusion of solutions/opportunities ready to implement in practice.” The handbook contains 37 tools to evaluate and assess the impact of interactive innovation. Each tool is simply explained in a step-by-step format and contains the purpose, background, and logic for the tool.

Evaluation of International Development Interventions

The Independent Evaluation Group of the World Bank has published a document titled Evaluation of International Development Interventions: An Overview of Approaches and Methods. It was produced to help evaluators broaden their methodological repertoire so they can better match methods to evaluation questions, particularly in a world of increasing complexity. For each approach or method used in international development evaluation, the overview provides a brief description, common variations, steps for applying it, advantages and disadvantages, and a list of additional resources.

A guide for using administrative data to examine long-term outcomes

The Office of Planning, Research and Evaluation of the U.S. Department of Health and Human Services recently published a guide on using administrative data as a potentially low-cost way to track the long-term effects of policy or program interventions. The guide is intended for evaluation teams, including funders, sponsors, and evaluation research partners, and is structured around three phases of effort: 1) consider the value and practicality of long-term follow-up; 2) prepare for long-term follow-up by identifying and satisfying the necessary legal and human subjects research requirements; and 3) assess the data to determine whether they are suitable for answering the proposed questions.

A guide for developing an RFP for evaluation services

We’ve all come across poor RFPs, and RFPs for evaluation services are frequent offenders. If you commission evaluations, or know someone who does, take a look at this guide created by Public Profit. It outlines key questions an RFP should address, documents that should be shared, and logistics to consider throughout the procurement process.

Network training and toolkit

Converge has a thorough toolkit that includes templates and guides for network leaders. The templates and guides can be used throughout the network design and development process. It includes network charter templates, framing questions, group agreements, an inventory of tech tools, and a lot more!

New and Noteworthy — Courses & Events

EVAL21

  • Organized by: American Evaluation Association

  • Date: November 8 – 12, 2021

  • Type: Virtual Conference

Participatory Evaluation: Community-Based Assessment and Strategic Learning Practices

  • Organized by: Tamarack Institute

  • Date(s): November 17, 2021

  • Type: Virtual Workshop

Applying the “L” in MEL (Measurement, Evaluation & Learning)

  • Organized by: Clear Horizon Academy

  • Date: November 26, 2021

  • Type: Online Course

Evaluation Management

  • Organized by: EnCompass Learning Center

  • Date: December 7, 9, 14, & 16

  • Type: Online Course

Written by cplysy · Categorized: evalacademy

Nov 05 2021

From data to actionable insights

 

As evaluators, we are rarely organizational decision-makers; our job is to provide those decision-makers with actionable insights. In this article I highlight how you can translate data into meaningful findings, or insights, so you can help decision-makers drive action within their organizations.


Asking the right questions

The process of deriving actionable insights from data starts with asking the right evaluation questions. What do your clients need to answer to tell their stories? Evaluation questions are the starting point of any analysis, and the answers are the end point. Because these questions anchor your analysis, it is crucial to dedicate time with your clients to iron them out; they provide needed context for all results garnered from the analysis.

If you are struggling with writing evaluation questions, we have previously written about how to write evaluation questions (with sample evaluation questions). Refer to these articles for more details on establishing effective evaluation questions. 

Start with the data that you have

Make life easier on yourself: start with the data that you have. Data collection takes time. Rather than expending a bunch of resources on data collection, evaluate whether current data sources are sufficient to answer the evaluation questions.  

However, there will be instances when the data required to address the evaluation questions do not exist. In this case, you may need to develop data collection tools (e.g., surveys, interviews) to collect relevant data. Keep things simple and focus on the evaluation questions. Anchoring the data and analysis in your evaluation questions maintains focus, limiting the ability for a project’s scope to creep beyond what was originally agreed upon.  

Which data collection tool you select will depend on the evaluation question and your evaluation design or approach. That said, surveys are usually a quick and cost-effective method for collecting data. For example, your evaluation question may ask: “To what extent do patients have a positive experience with primary care programs and services?” To answer this question, we could design a patient survey that asks about satisfaction with specific programs and services, or overall satisfaction with primary care. Every survey question should directly address the overall evaluation question; a few direct questions are better than many tangential or unrelated ones.

While this is a simplified example, a survey is not limited to answering a single evaluation question. Survey tools can be designed to capture data for one or many evaluation questions. The key is to make sure all questions align with your evaluation questions; this will focus the survey and capture data relevant to the overall goal of your evaluation. 

“Garbage in, garbage out”

The results of any data analysis are only as good as the data themselves. If data quality is not ensured, the results of your analysis will be suspect and likely invalid. It is critical that data are scrutinized prior to analysis to establish confidence in the results and insights drawn from the analysis. Therefore, prior to analysis, data quality needs to be evaluated on: 

  • Completeness – are the data sufficiently complete to address your evaluation questions? 

  • Accuracy – do the data correctly reflect the real-world values they are meant to capture? 

  • Consistency – do data reflect the same information within and across data sources? 

  • Validity – do the data align with pre-determined conditions/formats? 

  • Uniqueness – is each record represented only once within a given data set? 

  • Timeliness – are data up to date to adequately address your evaluation questions? 

Data will rarely meet all dimensions of data quality right away. Some issues can be fixed with simple data cleaning (e.g., correcting minor typos and standardizing date formats); other times, data points may need to be excluded from the analysis. Either way, it is crucial that the data meet all dimensions of data quality before analysis to ensure accurate results.
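The dimensions above can be screened programmatically before analysis begins. Below is a minimal Python sketch (not a prescribed tool) that flags rows failing basic completeness, uniqueness, and validity checks; the field names, records, and the 1–5 satisfaction scale are all hypothetical.

```python
from datetime import date

# Hypothetical survey records; field names and values are illustrative only.
records = [
    {"id": 1, "age": 34, "visit_date": "2021-09-14", "satisfaction": 4},
    {"id": 2, "age": None, "visit_date": "2021-09-15", "satisfaction": 5},   # incomplete
    {"id": 2, "age": 51, "visit_date": "2021-09-15", "satisfaction": 2},     # duplicate id
    {"id": 3, "age": 28, "visit_date": "15/09/2021", "satisfaction": 7},     # bad date, out of range
]

def quality_report(rows):
    """Flag rows failing basic completeness, uniqueness, and validity checks."""
    issues = []
    seen_ids = set()
    for i, row in enumerate(rows):
        # Completeness: no missing values.
        if any(v is None for v in row.values()):
            issues.append((i, "incomplete"))
        # Uniqueness: each id appears only once.
        if row["id"] in seen_ids:
            issues.append((i, "duplicate id"))
        seen_ids.add(row["id"])
        # Validity: dates parse as ISO 8601; satisfaction sits on a 1-5 scale.
        try:
            date.fromisoformat(row["visit_date"])
        except ValueError:
            issues.append((i, "invalid date format"))
        if row["satisfaction"] is not None and not 1 <= row["satisfaction"] <= 5:
            issues.append((i, "satisfaction out of range"))
    return issues

print(quality_report(records))
```

Running a report like this before analysis turns the bullet list above into a concrete gate: any flagged row is cleaned or excluded before results are produced.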

Data to information

Spreadsheets, regardless of their size and complexity, only store data. That is, a spreadsheet does not provide any meaningful information until the data are structured and organized in a meaningful way. Analysis takes the data building blocks and structures them into something useful (i.e., information). This information will, again, be tied back to the evaluation questions outlined prior to the analysis. 

Information may be summarized as numbers (e.g., proportions, tables) or images (e.g., charts, infographics). How information is structured and presented are dependent on the context of the evaluation questions asked. The key is to provide information that is simple and easy to interpret. 

[Sample “information”: example chart omitted]

Information should focus on meanings. What do the data illustrate? How does the information connect to the evaluation questions? This is accomplished by focusing the information. That is, focus on one major point per piece of information. By narrowing the focus, you are better able to communicate that information with decision-makers.
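As a toy illustration of structuring data into information, the sketch below condenses hypothetical 5-point satisfaction responses into one focused number tied back to an evaluation question; all figures are invented.

```python
from collections import Counter

# Hypothetical responses on a 5-point satisfaction scale (5 = very satisfied).
responses = [5, 4, 4, 5, 3, 2, 4, 5, 1, 4]

counts = Counter(responses)  # raw data -> tallies
total = len(responses)

# One focused piece of information: the proportion satisfied (scores of 4 or 5).
proportion_satisfied = (counts[4] + counts[5]) / total
print(f"{proportion_satisfied:.0%} of respondents were satisfied")  # → 70% of respondents were satisfied
```

The single percentage, not the raw list of scores, is what a decision-maker can actually read against the evaluation question.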

Actionable insights

Now that data have been converted into information, it is time to take that information and transform it into actionable insight. Actionable insights come from taking the information gleaned from an analysis and getting at the “so what?” 

 Getting at the “so what?” is not always easy. But there are a few approaches to move insight to actionable insight, including: 

  • Segmenting (or grouping) the results 

  • Using data visualizations to support the results 

  • Comparing to benchmarks (e.g., time series, norms) 

  • Adding additional context 

Segmentation 

Segmenting data into discernable groups can help get at the “so what?” Segments, such as demographics, split the results of the analysis into comparable groups. Which segments you investigate are dependent on the evaluation questions asked. 

Looking within an organization? Segment by department to derive insight into potential departmental differences. 

Looking at financial literacy outcomes? Segment by age or gender to derive insight into potential learner differences. 

Segmenting the information derived from your analysis may help identify patterns in the results. Patterns may identify important differences between segments that will allow for the client to better develop an action plan. 
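A minimal sketch of segmentation, using made-up (department, score) pairs: grouping the same results by department surfaces a difference that the overall average hides.

```python
from collections import defaultdict

# Hypothetical results: (department, satisfaction score); names and scores are illustrative.
results = [
    ("Clinic A", 5), ("Clinic A", 4), ("Clinic A", 4),
    ("Clinic B", 3), ("Clinic B", 2), ("Clinic B", 2),
]

# The overall average blends the two departments together.
overall = sum(score for _, score in results) / len(results)
print(f"Overall: {overall:.2f}")

# Segmenting by department reveals the gap between them.
by_segment = defaultdict(list)
for segment, score in results:
    by_segment[segment].append(score)

for segment, scores in sorted(by_segment.items()):
    print(f"{segment}: {sum(scores) / len(scores):.2f}")
```

Here the overall mean sits near the middle while the per-department means diverge, which is exactly the kind of pattern a client can act on.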

 

Data visualizations 

Data visualizations, such as charts and infographics, do not inherently provide actionable insights. However, they can provide additional support for the key findings of an analysis. Effective data visualizations can highlight key messages within the data and help identify areas for action. 

Take this result: 80% of patients were satisfied with their last visit.  

On its own, we have only one piece of the story. Did the remaining 20% of patients feel neutral about their last visit? Or were they very dissatisfied? Pairing the statement with a chart of the full response distribution provides that context: knowing that 20% of patients were dissatisfied is likely to spur more action than knowing they felt neutral. 
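As a stand-in for a real charting library, this sketch renders a text bar chart of one hypothetical breakdown behind the “80% satisfied” figure; the neutral/dissatisfied split is invented.

```python
# Hypothetical breakdown of responses behind "80% satisfied"; values are invented.
distribution = {"Satisfied": 80, "Neutral": 5, "Dissatisfied": 15}

# Minimal text bar chart: one block per 5 percentage points.
lines = []
for label, pct in distribution.items():
    bar = "█" * (pct // 5)
    lines.append(f"{label:>12} | {bar} {pct}%")

print("\n".join(lines))
```

Even this crude chart shows at a glance whether the non-satisfied remainder leans neutral or dissatisfied, which is the distinction that drives action.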

 

Benchmarks 

Further insights may be gleaned from benchmarks. These may be internal (e.g., comparing between time points) or external (e.g., comparing to standard norm). Using benchmarks can get at the “so what?” and provide valuable context to the results of an analysis. 

Looking at the previous example, exploring the results over time could provide additional context. For example, if 100% of patients were satisfied with their last visit in 2020 and 80% of patients were satisfied with their last visit in 2021, we can immediately identify a decrease in patient satisfaction. However, if 60% of patients were satisfied with their last visit in 2020, we would likely see a different response from the client. Providing results with the additional context of a benchmark has the potential to turn information into an actionable insight. 
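The benchmark comparison above can be sketched as a small helper; the function name is my own and the figures are the illustrative ones from the example.

```python
def compare_to_benchmark(current: float, benchmark: float) -> str:
    """Frame a result against a benchmark as a change in percentage points."""
    change = current - benchmark
    if change < 0:
        return f"down {abs(change):.0f} percentage points from the benchmark"
    if change > 0:
        return f"up {change:.0f} percentage points from the benchmark"
    return "unchanged from the benchmark"

# 80% satisfied in 2021, compared against two possible 2020 benchmarks.
print(compare_to_benchmark(80, 100))  # → down 20 percentage points from the benchmark
print(compare_to_benchmark(80, 60))   # → up 20 percentage points from the benchmark
```

The same 80% result reads very differently against the two benchmarks, which is the point: the comparison, not the raw number, is what makes the insight actionable.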

 

Additional context 

As evaluators, it is not necessarily within the scope of our role to expand beyond what is provided in the data. Sometimes the data do not fully lend themselves to actionable insights. These cases require additional context beyond the data.  

At this point, it is time to hand the results off to your client. Your client will have a better understanding of internal operations, processes, or biases within their organization. Their expertise can provide additional context not apparent from the data alone and the client can come up with their own conclusions based on the results. 


The roadmap for transforming data into actionable insights starts and ends with asking the right evaluation questions. These questions guide the entire analysis process, moving data to information and information to actionable insight. The goal is to derive meaning from data and answer the “so what?” question, helping organizations target areas for action.



Sources:

How to Write Good Evaluation Questions 

Evaluation Question Examples 

How to Conduct Interviews 

Scope Creep: When to Indulge It, and When to Avoid It  

Dial Down Your Data 

7 Tips for Better Data Visualizations 

How (and Whether) to Write Recommendations

What Is Data Quality and Why Is It Important?  

 

Written by cplysy · Categorized: evalacademy

Copyright © 2026 · The May 13 Group
