A Guide for Social Impact Professionals
You’ve probably heard it before—we’re living in the golden age of data. But what does that mean?
It means that there’s more data available now than ever before. The digital world has created opportunities to track metrics and behaviors at a mind-boggling scale. If you’re overwhelmed by numbers that reach into the quintillions, think about it this way:
At the beginning of 2020, the number of bytes in the digital universe was about 40 times the number of stars in the observable universe.
Data is being leveraged across every sector—from fraud prevention in banking to predictive analytics in public health to even creating individualized education plans for students. And the list goes on and on.
One sector has fallen behind: philanthropy. Too often the focus on short-term impact has deterred organizations from investing in technology or infrastructure that could accelerate progress. But social impact organizations cannot afford to ignore data any longer—it holds too much potential.
Of course, data doesn’t mean much on its own. It’s only a piece of the puzzle in how your organization makes meaningful change. But if you aren’t thinking strategically about how you collect, manage, and analyze data, it’s time to start.
This guide will help you map out a data strategy that fits your organization’s capacity and your mission.
Let’s dig in.
Asking “What is data science?” is a little like asking “What is social impact?” It’s a broad term, but at its heart, data science is the practice of taking raw data and extracting insights using a variety of techniques.
You don’t have to know a lot of technical jargon, but here are a few terms that will help you get oriented in the conversation.
Data analytics is the process of analyzing raw data to find trends and answer questions.
Machine learning (ML) allows computer programs to learn from data.
Artificial intelligence (AI) is the ability of a computer to do tasks that are usually done by humans because they require cognitive ability, memory, learning, or decision making.
Data science (when not referring to the field as a whole) is the process of using machine learning and data analytics to find hidden patterns and predict future results.
Natural language processing (NLP) is the branch of artificial intelligence focused on giving computers the ability to understand and respond to text and spoken words.
1
Too often in the world of social impact, grantmakers, nonprofits, and researchers use different terminology to refer to the same things. Naming conventions may not sound like a big deal, but when it comes to labeling and tracking data, they can be a huge roadblock.
Creating a data strategy forces organizations to be intentional about how they standardize their naming conventions. Teams cannot perform meaningful data analysis without establishing what data they’re using and how they’ll refer to it. This standardization makes it much easier to see where impacts overlap and intersect.
As an example, imagine three organizations dedicated to reducing hunger. One measures their impact by “clients served,” another measures “meals provided,” and the third measures “food packed by the pound.” Though all three missions align, when they look to compare their data, they will likely have trouble because their definitions of “impact” differ.
Through analysis of existing measures and terms, data science can help the social impact sector create clearer standards. This will enable more collaboration and deeper understanding of impact across organizations.
Credit: Impact Genome Project
Right now, the Impact Genome Project from Mission Measurement is working to standardize social impact data across the industry. They have created an Impact Registry™, a searchable database with impact and beneficiary data on more than 2 million nonprofit programs in the U.S. and Canada. This sort of tool can help the philanthropic sector build toward more effective data-driven solutions by identifying and standardizing the core components of measurement and reporting.
2
It’s no secret that the social impact sector has work to do when it comes to reducing bias and advancing equity. But the first step is being able to see the problem clearly.
When it comes to addressing bias, data is power. One powerful example is the report on funding for Native American communities and causes. By gathering and analyzing data over a 15-year time period, the report authors were able to spotlight an incredible discrepancy in foundation giving:
From 2002 to 2016, only 0.4% of philanthropic dollars explicitly benefitted Indigenous communities (though Indigenous people account for 2% of the U.S. population).
This one data point has helped reshape how the philanthropic sector thinks about supporting Indigenous communities. It’s helped spur new investments and plans of action. Plus it gives advocates a metric to point to as they call for change.
For organizations dedicated to equity, data is an essential component. When teams can objectively identify bias, they have a much better chance of creating lasting change.
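To make the size of that gap concrete, here is a minimal sketch of the arithmetic behind it. It uses only the two percentages quoted above; the function name `representation_ratio` is our own label for illustration, not a term from the report.

```python
def representation_ratio(share_of_funding: float, share_of_population: float) -> float:
    """Return a group's funding share divided by its population share.

    A value of 1.0 means funding is proportional to population.
    A value below 1.0 means the group receives less than a proportional share.
    """
    if share_of_population <= 0:
        raise ValueError("share_of_population must be positive")
    return share_of_funding / share_of_population


# Figures quoted above: 0.4% of philanthropic dollars, 2% of the U.S. population.
ratio = representation_ratio(0.4, 2.0)
print(f"Funding is {ratio:.0%} of a proportional share")
```

Under this simple measure, the funding described in the report works out to roughly one fifth of what a population-proportional share would be.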
3
Many nonprofit organizations are working to help solve big, complex problems. From climate change to food insecurity to education access to racial justice, many of these issues intersect in both expected and unexpected ways. Data science can help illuminate opportunities to connect efforts across programs, organizations, or even causes.
These relationships are particularly important for organizations applying an intersectional lens to their work. No issue exists in a vacuum. Vulnerable populations often experience multiple layers of injustice at once. Effective solutions embrace this complexity.
Making these connections takes work, but data science can help. For example, Wellcome Trust, which focuses much of its funding on health, did a topic modeling project with their neuroscience team to extract data-driven topics for research funding and visualize the connections or gaps between programs.
This map helped illuminate the gaps and connections between project topics. Credit: Wellcome Trust
The Wellcome Trust team was able to use the map to understand existing connections and identify opportunities for growth.
Though organization leaders and employees may think they have a good understanding of how their work fits together, using data visualization tools can help spotlight unexpected overlaps or gaps, which is incredibly useful for planning future programs.
4
If you’re working in social impact, one big question that likely drives your work is: how are you moving the needle?
You want to know that the work you do is making a meaningful difference. Data science and analytics can help you more accurately track the outputs and outcomes of your efforts. The last thing you want is to keep investing in solutions if they aren’t making a real impact.
But measuring impact is not a simple yes or no question—are we making a real impact? It’s often much more complex. A strong data strategy can help you parse out the details. Perhaps some of your programs are more effective than others. Or maybe they’re more effective for certain populations.
DataKind, a nonprofit dedicated to harnessing the power of data science and AI, helped Llamau, a homelessness charity in Wales, understand their impact. The Llamau team wanted to know who was benefiting from their services most, which programs had the most success, and what kind of improvements could be made.
Through this analysis, the Llamau team found that people who were male, youth offenders, and had previously been in the care system had the least successful outcomes. With that information, the team began working to tailor programming and training to better serve this population’s needs.
Getting a granular understanding of how programs do or do not serve community members will help organizations invest resources in the right initiatives. Plus it will give them the feedback they need to retool what isn’t working.
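A granular view like this often starts with something as simple as comparing outcome rates across groups. The sketch below is one possible way to do that. The records are made up for illustration and are not Llamau's data, and the field names `care_history` and `successful` are assumptions.

```python
from collections import defaultdict


def success_rate_by_group(records, group_key):
    """Return {group value: share of records with a successful outcome}."""
    totals = defaultdict(int)
    successes = defaultdict(int)
    for record in records:
        group = record[group_key]
        totals[group] += 1
        if record["successful"]:
            successes[group] += 1
    return {group: successes[group] / totals[group] for group in totals}


# Hypothetical participant records, for illustration only.
records = [
    {"care_history": True, "successful": False},
    {"care_history": True, "successful": True},
    {"care_history": True, "successful": False},
    {"care_history": True, "successful": False},
    {"care_history": False, "successful": True},
    {"care_history": False, "successful": True},
    {"care_history": False, "successful": False},
    {"care_history": False, "successful": True},
]

rates = success_rate_by_group(records, "care_history")
for group, rate in rates.items():
    print(f"care_history={group}: {rate:.0%} successful")
```

A gap between groups is a prompt for further investigation rather than a conclusion on its own, especially when the groups are small.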
5
No matter what your organization’s mission is, there are likely others out there working toward the same vision. Of course you likely know how your mission fits into a larger conversation, but what about your impact? A good data strategy can help you understand how the outcomes of your work fit into broader efforts to make change.
Part of this story is understanding how your work intersects or overlaps with other nonprofits, but also public sector agencies, research institutions, and private corporations.
Wellcome Trust used a combination of API search, web scraping, and natural language processing to assess how their work fit into broader efforts to combat the COVID-19 pandemic. To assess the effectiveness of the research they funded, they measured how often the research was mentioned in scientific evidence from government agencies and acknowledged in general literature around COVID-19. This gave them a clear sense of whether the programs they supported were playing a part in the wider efforts.
Taking the time to recognize how your organization's impact fits into a wider landscape can prevent you from duplicating existing efforts and can inform how you engage with others doing similar work.
6
For organizations of any size, the balance between program cost and effectiveness is always a consideration. You want to understand how the resources you invest create positive outcomes for the community.
Employing data science can help you get an accurate picture of the relationship between cost and effectiveness. This is essential information to help you map out your future strategy and create programs that maximize your investments.
Innovations for Poverty Action (IPA) used data science to evaluate the effectiveness of a deworming treatment in improving health and school attendance worldwide. This relatively inexpensive intervention showed incredible outcomes, both in the long and short term. It was remarkably successful in decreasing school absences, and it improved health for treated and untreated children.
Having clear data to connect program costs to outcomes will help organizations of any size be strategic in how they divvy up resources and set priorities.
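One common starting point for connecting costs to outcomes is a cost-per-outcome figure for each program. The sketch below shows the calculation. The program names and numbers are hypothetical, and it assumes every program counts the same kind of outcome, which is not a given in practice.

```python
import math


def cost_per_outcome(total_cost, outcomes):
    """Return dollars spent per outcome; infinite if no outcomes were recorded."""
    if outcomes <= 0:
        return math.inf
    return total_cost / outcomes


# Hypothetical programs, for illustration only.
programs = {
    "Program A": {"cost": 50_000, "outcomes": 400},
    "Program B": {"cost": 120_000, "outcomes": 600},
    "Program C": {"cost": 15_000, "outcomes": 0},
}

# Rank programs from lowest to highest cost per outcome.
ranked = sorted(
    programs,
    key=lambda name: cost_per_outcome(programs[name]["cost"], programs[name]["outcomes"]),
)

for name in ranked:
    value = cost_per_outcome(programs[name]["cost"], programs[name]["outcomes"])
    print(f"{name}: ${value:,.2f} per outcome")
```

The ranking is only as meaningful as the outcome definition behind it, so it helps to agree on shared definitions before comparing programs this way.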
7
Too often grantmakers hold back progress because they’re stuck in outdated processes. When funders move slowly to review grant applications and award funds, they can hamstring community organizations, who are often in the process of responding to urgent needs. If they have to wait for a long, drawn-out review process, so do the individuals they serve.
Data science can smooth this entire process out and enable everyone to move faster. Submittable’s Accelerated Review uses AI and machine learning to help grantmakers streamline their review process. Accelerated Review allows you to build a model based on reviews performed by your team. In short, you’re training the computer to help review applications the same way your team does.
What does the process look like?
Your team reviews a sample of about 200 applications.
Submittable builds a model around that data, learning how your reviewers score and rate applicants.
Before applying the model in full, it’s evaluated for accuracy on a small batch of applications. If necessary, the model can be retrained on more reviews.
Automating your review process this way means that you can review thousands or even millions of applications at an incredible speed and with amazing accuracy. This greatly reduces fatigue and human error. Simply put, automated review enables your team to do more work with fewer staff by leveraging machine learning to help highlight key information.
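The evaluation step in the process above can be pictured as comparing a model's decisions with your reviewers' decisions on a small held-out batch. The sketch below is a generic illustration of that idea and is not Submittable's implementation. The decision labels and the 0.9 agreement threshold are assumptions chosen for the example.

```python
def agreement_rate(model_decisions, reviewer_decisions):
    """Return the share of applications where the model matches the human reviewer."""
    if len(model_decisions) != len(reviewer_decisions):
        raise ValueError("Both lists must cover the same applications")
    if not model_decisions:
        raise ValueError("At least one application is required")
    matches = sum(m == r for m, r in zip(model_decisions, reviewer_decisions))
    return matches / len(model_decisions)


def needs_retraining(model_decisions, reviewer_decisions, min_agreement=0.9):
    """Flag the model for more training data if agreement falls below the threshold.

    min_agreement is a hypothetical threshold, not a recommended value.
    """
    return agreement_rate(model_decisions, reviewer_decisions) < min_agreement


# Hypothetical held-out batch of five applications.
model = ["advance", "decline", "advance", "advance", "decline"]
human = ["advance", "decline", "decline", "advance", "decline"]

print(f"Agreement: {agreement_rate(model, human):.0%}")
print(f"Retrain on more reviews: {needs_retraining(model, human)}")
```

A single agreement figure can hide uneven performance across groups of applicants, so it is worth checking the same measure within subgroups as well.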
No matter your organization size or mission, you need a data strategy. We’ve laid out nine steps to help you build the right approach for you and your team.
1
Data can be an incredibly useful tool, but it doesn’t mean much if you don’t use it. Before you create a plan for how you will collect and analyze data, you want to know what you’re working toward.
Are you trying to:
Understand your impact more deeply?
Measure how you deliver on your promises?
Compare outcomes across programs?
Ensure equity?
Predict future need?
Assess your marketing strategies?
Retool your approach to fundraising?
These are just a few ideas. The key is to sit down with your team to decide what you hope to achieve. Be as specific as possible.
For instance, if you want to use data to ensure equity, clarify what “equity” means. Are you looking to address inequity across race, gender, class, sexual orientation, geography, or another classification? Drill down into your goals to crystallize your focus and get everyone on the same page. This will determine next steps.
2
Understanding your capacity is essential to creating a data strategy that you can actually stick to. You don’t want to set goals that are wildly unrealistic for the resources you have. But you also don’t want to assume that limited resources mean you can’t leverage insights from data.
If you don’t have a data scientist on staff, you can consider other options. For instance, can your existing team accomplish the goals you set? Would they need additional training or tools?
Another option is to hire contractors to build the framework or model that your team can manage in the long term. Or, if your team doesn’t have the bandwidth to take on some of this work, you could have a contractor do a one-time assessment, which will provide a snapshot to help guide your program strategy.
Maybe you realize that your lofty goals don’t quite match your current resources. This is important to know. Not only will it help you scale your expectations in the near term, it will also encourage you to think about what it will take to build a more robust data strategy for the future.
3
Building the right structure is probably the most important thing you can do to be successful with data. No matter what your goals or capacity, your ability to access and utilize your data hinges on the structure you put in place around it.
Data scientist Andrew Spott puts it this way when he describes building structure: “If you start here, everything else is easier. If you fall short here, everything else is hard.”
But what does this look like in action?
Before you start collecting data, you want to be intentional about the way you set up application forms and questions. This is really where the rubber meets the road in terms of setting your team up for success.
Clarify definitions and create consistent labels and fields so that you can easily compare data across projects and programs. For example, if the label “name” refers to an individual’s name on one form and an organization name on another, when you try to compare that data, you’ll run into trouble. This is how you create those clear standards and taxonomies mentioned earlier.
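One way to enforce consistent labels is to keep a single mapping from each form's field labels to a shared vocabulary, and to reject anything that is not in it. The sketch below illustrates the idea with the "name" example above. The form IDs and canonical field names are hypothetical.

```python
# Hypothetical mapping from (form, label) to one shared field vocabulary.
CANONICAL_FIELDS = {
    ("individual_form", "name"): "applicant_name",
    ("organization_form", "name"): "organization_name",
    ("individual_form", "zip"): "postal_code",
    ("organization_form", "postal code"): "postal_code",
}


def normalize_record(form_id, record):
    """Rename a form submission's labels to canonical field names.

    Raises KeyError for unmapped labels so that new, undefined fields
    are caught early instead of silently entering the data set.
    """
    normalized = {}
    for label, value in record.items():
        key = (form_id, label.strip().lower())
        if key not in CANONICAL_FIELDS:
            raise KeyError(f"No canonical field for {label!r} on {form_id}")
        normalized[CANONICAL_FIELDS[key]] = value
    return normalized


person = normalize_record("individual_form", {"Name": "Ada", "ZIP": "59801"})
org = normalize_record("organization_form", {"name": "Food Bank"})
print(person)
print(org)
```

With a mapping like this, the two different meanings of "name" end up in separate fields, and data from both forms can be compared without guesswork.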
You’ll also want to think about the questions themselves. Multiple choice questions will provide data that’s much easier to work with compared to short- or long-answer. Though you may lose some richness when you eliminate narrative-style questions, if you don’t have the capacity for natural language processing (NLP), you won’t be able to derive bigger insights from them.
If you choose to use multiple choice, try to be as thorough as possible in your list of options. Creating an “other” field can seem like a great catchall, but if you rely on it too much, it may be a roadblock later on when you try to interpret that data.
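A quick way to tell whether an "other" option is carrying too much weight is to track what share of answers land there. The sketch below is one simple check, with made-up responses. What counts as "too much" is a judgment call for your team.

```python
from collections import Counter


def other_share(responses, other_label="other"):
    """Return the share of responses that chose the catchall option."""
    if not responses:
        return 0.0
    counts = Counter(response.strip().lower() for response in responses)
    return counts[other_label] / len(responses)


# Hypothetical answers to a multiple choice question.
responses = ["Food", "Other", "Housing", "other "]
print(f"{other_share(responses):.0%} of respondents chose 'other'")
```

If the share is high, reading a sample of the write-in answers can suggest which options to add to the list.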
As you think about building the right structure to support your goals, you want to think about how you will store and manage the data you collect. Does your grant management software have the capabilities you need? Can you collect, store, and analyze data across programs?
Centralizing your data as much as possible will help you turn information into actionable insights. Submittable’s Data Sharing allows organizations to access their Submittable data in Snowflake. This means teams can access a single source of truth and pull their data into business intelligence tools for comprehensive analysis.
Whatever tools you use, make sure they can work together. You don’t want to end up with data spread out across too many applications or spreadsheets.
You also need to think about who on your team will have access to and manage the data. Creating consistency here is essential. Everyone involved should be trained on how to generate labels and forms that keep the data clean. Clarify definitions and roles, and be sure to touch base often.
4
As you shape your strategy and structure, you want to think back to your goals. Ask yourself, who is this data for? Remember, you’re not collecting data just for the sake of collecting data. You want to create a framework that allows your team to draw actionable insights from the information. In short, data is a means, not an end.
Though your goals should drive your decisions, you want to build with the future in mind. Ask yourself what you might want to better understand in the future. Because if you don’t collect some of that data now, you’ll be starting from scratch later on.
Let’s say down the road you decide you want to segment your funding based on gender. If you haven’t been collecting that data, you won’t have the capability to understand your funding through that lens. Remember: identifying gaps or biases often hinges on having the right data, so be intentional and thorough in how you decide what data to collect.
Err on the side of collecting more data than you need right now. But always be mindful not to place too much of a burden on applicants. Make it as easy and streamlined as possible for them to provide the information you need.
5
Some of your goals might require you to reach outside your organization's data set. This can help you understand wider trends, plus it can give you data to compare against. Outside data can be key in placing your work into a broader context.
If you collect data on racial demographics from your grantees, you might want to compare that to the overall makeup of the community you serve to see if there are any discrepancies in who applies for and receives funding.
Luckily there is plenty of open-source data, which is free and open to the public. For example, you can easily access U.S. Census Data or datasets from the World Health Organization (WHO).
But it’s not just government agencies. Some large foundations share their datasets as well. For instance, the Annie E. Casey Foundation maintains the Kids Count Data Center, which tracks key indicators around child well-being.
Private companies are getting in on this too. Through big data philanthropy, corporations are providing nonprofits access to their incredibly robust data capabilities to help further social good. Mastercard’s Center for Inclusive Growth is an example of a company using their existing data capabilities to derive insights and make a meaningful impact.
Decide what kind of outside data would help guide your work and see if it’s already available. You’d be surprised how much data you can find once you start looking.
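Once you have outside data in hand, the demographic comparison described earlier in this step comes down to lining up two sets of shares. The sketch below shows one way to do it. The group names, grantee counts, and community shares are all made up; in practice the community shares would come from a source such as census data.

```python
def share_gaps(grantee_counts, community_shares):
    """Return {group: grantee share minus community share}.

    Positive values mean a group is overrepresented among grantees
    relative to the community; negative values mean underrepresented.
    """
    total = sum(grantee_counts.values())
    if total == 0:
        raise ValueError("grantee_counts must contain at least one grantee")
    gaps = {}
    for group, community_share in community_shares.items():
        grantee_share = grantee_counts.get(group, 0) / total
        gaps[group] = grantee_share - community_share
    return gaps


# Hypothetical figures, for illustration only.
grantee_counts = {"Group A": 70, "Group B": 20, "Group C": 10}
community_shares = {"Group A": 0.50, "Group B": 0.30, "Group C": 0.20}

gaps = share_gaps(grantee_counts, community_shares)
for group, gap in gaps.items():
    print(f"{group}: {gap:+.0%} relative to community share")
```

Running the same comparison separately for applicants and for funded grantees can help show whether a discrepancy arises in who applies or in who is selected.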
6
Many social impact organizations are collecting sensitive and personally identifying information. This makes it all the more important to prioritize privacy and security right from the start.
Sometimes smaller nonprofits overlook this part of their data strategy because they believe that their mission or capacity exempts them from compliance. Not only is this untrue, but it can undercut your overall objectives. If your lax data security puts vulnerable populations more at risk, you’re likely acting at odds with your mission. Protecting this data is another way to follow through for the communities you serve.
The first step is knowing your legal responsibilities for encrypting and securing this data. Here are some rough guidelines about when to pay attention to which laws:
If you collect health data, you need to adhere to HIPAA
If you collect data from individuals in the EU, you need to adhere to GDPR
If you collect data from individuals in California, you need to adhere to CCPA
In general, even in situations where you are not legally bound to, it’s a good practice to encrypt and secure the data you collect.
Don’t feel like you need to reinvent the wheel as you build your cybersecurity strategy. Leverage tools from other organizations that know the processes well, such as these guides from NTEN and the National Council of Nonprofits.
When it comes to data, you don’t need to approach it with a scarcity mindset, but you do need to be mindful of how you share and use it. Data justice should be a part of your plan. Communities should never feel like data is being extracted from them without permission or used to discount their voices.
Be clear from the start with community members how you intend to use the data you collect and whether they will have access to it. If you do intend to share data more broadly, be open about this and be sure to anonymize it, which can require more than just removing personally identifying information.
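To see why removing names is not always enough, consider that a combination of ordinary fields (such as ZIP code, age band, and gender) can still single out one person. A common check, related to the idea of k-anonymity, is to find the smallest group of records that share the same combination. The sketch below uses made-up records and field names, and passing this check does not by itself make a data set safe to share.

```python
from collections import Counter


def smallest_group_size(records, quasi_identifiers):
    """Return the size of the smallest group sharing the same quasi-identifier values.

    A result of 1 means at least one record is unique on those fields
    and could potentially be re-identified.
    """
    if not records:
        return 0
    groups = Counter(
        tuple(record[field] for field in quasi_identifiers) for record in records
    )
    return min(groups.values())


# Hypothetical records with names already removed.
records = [
    {"zip": "59801", "age_band": "18-24", "gender": "F"},
    {"zip": "59801", "age_band": "18-24", "gender": "F"},
    {"zip": "59801", "age_band": "25-34", "gender": "M"},
    {"zip": "59802", "age_band": "25-34", "gender": "M"},
]

k_full = smallest_group_size(records, ["zip", "age_band", "gender"])
k_coarse = smallest_group_size(records, ["age_band", "gender"])
print(f"Smallest group with ZIP included: {k_full}")
print(f"Smallest group with ZIP dropped: {k_coarse}")
```

In this example, dropping or coarsening the ZIP code removes the unique records, which illustrates the kind of trade-off between detail and privacy that anonymization involves.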
7
When it comes to leveraging the data you collect, it can be helpful to start with data analytics before you dive into more complex data science.
What’s the distinction?
Data analytics is more about drawing conclusions from data, uncovering trends, and creating data visualizations that can help you make more strategic decisions. You might try to analyze how you’ve allocated your funding across geographical locations or whether certain populations are engaging more with your programs.
You might be able to create some simple visualizations on your own. Or you may think about enlisting an experienced analyst to help you create more complex visualizations. On the other hand, if you want to predict future needs, understand natural language, or discover hidden relationships, you likely need data science (and a dedicated data scientist to help).
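As an example of the analytics end of that spectrum, the question of how funding is allocated across geographic locations can be answered with a simple aggregation. The sketch below uses hypothetical grants and region names, and its output could feed directly into a bar chart in whatever visualization tool you use.

```python
from collections import defaultdict


def funding_share_by_region(grants):
    """Return {region: share of total dollars awarded}."""
    totals = defaultdict(float)
    for grant in grants:
        totals[grant["region"]] += grant["amount"]
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {region: amount / grand_total for region, amount in totals.items()}


# Hypothetical grants, for illustration only.
grants = [
    {"region": "North", "amount": 30_000},
    {"region": "South", "amount": 10_000},
    {"region": "North", "amount": 20_000},
    {"region": "East", "amount": 40_000},
]

shares = funding_share_by_region(grants)
for region, share in sorted(shares.items(), key=lambda item: item[1], reverse=True):
    print(f"{region}: {share:.0%} of funding")
```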
8
The act of measuring something can distort the very thing being measured. Goodhart’s Law explains the distortion this way: when a measure becomes a target, it ceases to be a good measure.
Let’s look at a simple example of how Goodhart’s Law works. Imagine a hospital where doctors are rated based on how many patients they save. In theory, this might make sense. But if the doctors know that this is how they’re being measured, they might try to avoid treating patients with lower chances of survival. Suddenly the metric is not as meaningful. You’re no longer measuring how often doctors are successfully treating a patient, you’re tracking how well they can avoid seeing the sickest people.
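The hospital example can be put into numbers. In the sketch below, two doctors provide exactly the same quality of care, but one turns away high-risk patients. The caseloads and survival probabilities are invented purely to illustrate the effect.

```python
def survival_rate(caseload):
    """Return the expected survival rate for a caseload.

    caseload is a list of (patient_count, survival_probability) pairs.
    """
    total_patients = sum(count for count, _ in caseload)
    expected_survivors = sum(count * probability for count, probability in caseload)
    return expected_survivors / total_patients


# Hypothetical caseloads. Survival odds depend only on patient risk,
# not on which doctor provides the treatment.
treats_everyone = [(80, 0.95), (20, 0.50)]  # 80 low-risk and 20 high-risk patients
avoids_high_risk = [(100, 0.95)]            # high-risk patients are turned away

rate_everyone = survival_rate(treats_everyone)
rate_avoider = survival_rate(avoids_high_risk)
print(f"Doctor who treats everyone: {rate_everyone:.0%}")
print(f"Doctor who avoids high-risk patients: {rate_avoider:.0%}")
```

The doctor who avoids the sickest patients scores higher on the metric without being any better at medicine, which is exactly the distortion Goodhart’s Law describes.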
In the social impact sector, if you build your data strategy around specific metrics and you make those metrics public, you’ll no longer be measuring those metrics. You’ll be measuring an organization’s ability to meet those targets.
Although transparency is important, you want to be mindful about how you talk about the data you aim to collect. The last thing you want is to disincentivize organizations from serving the most vulnerable populations because they are worried about how their outcomes will be measured.
9
You’ve collected and labeled the data. You’ve leveraged outside data sources, identified trends, created visualizations, and maybe even used complex data science. Now it’s time to put the insights you’ve gained into action.
Sit down with your team to go over your findings and identify the big takeaways. Does the data help you see the areas where you can expand or shift priorities? Hopefully, you’ve found some clarity around the questions you sought to answer.
As you dig in, be prepared for unexpected or even disheartening insights. Sometimes you find that the interventions you thought would have a lasting effect are not working as well as you hoped. This is great information to have. It gives you the chance to build on what’s working well and to re-strategize what’s not.
Organizations large and small sometimes have to confront hard truths. A recent evaluation of the Bill & Melinda Gates Foundation’s Alliance for a Green Revolution in Africa (AGRA), for example, found that the effort hasn’t delivered on its promises.
When it comes to making adjustments, remember it’s easier to make multiple small course corrections than it is to make large pivots. Look at your data early and often to make sure you are on the right track. For example, AGRA went 15 years before its first macro level performance review. If the team had done smaller reviews earlier on, they might have been able to course correct sooner.
Across the social impact sector, people are talking about the importance of making data-driven decisions. And it’s true: to build effective solutions, we need to start with the data. But without a sound strategy to collect, secure, and analyze it, the data won’t mean much.
Whether you’re part of a small nonprofit, a large foundation, or somewhere in between, you can’t afford to ignore data any longer. If you haven’t already, it’s time to get started.
No matter where you are in your data journey, Submittable can help you and your team tap into the world of data science.