by Claire Bénard, Interim Data and Analytics Director at Impact on Urban Health
Data in the charity sector is often used for monitoring and evaluation. It’s used to measure whether we’re getting the right results. But data isn’t just a tool for monitoring and evaluation. It’s also a tool for social change.
Those who know me will know I have been passionate about this topic for some time. First, I should clarify that promoting other use cases for data is not about denying the critical importance of monitoring and evaluating. Data has a great role to play in measuring what works and what doesn’t to ensure that organisations scale up the right interventions. I am advocating for an AND, not an OR.
I have been lucky enough to come across inspiring examples of using data for social change. Volunteering for DataKind UK gave me the opportunity to witness (and sometimes be part of) impactful projects. I was also a fellow of the uptake.org data programme. Spending six months with innovative data practitioners expanded my horizons on the possibilities of using data for different use cases.
In this blog, I will take a deep dive into two examples that I think can inspire others to use data and Machine Learning (ML) for social change. Being realistic, I’ll also talk about common barriers I’ve witnessed or experienced, echoing Dulcie Vousden’s great blog.
Wefarm
Wefarm is a start-up that aims to connect 100 million small-scale farmers to the knowledge, inputs, and output markets they need.
Using ML to power knowledge sharing with SMS
When I joined in 2018, Wefarm operated as a free peer-to-peer Q&A system on SMS that supported four languages. A farmer could send an SMS to a short code and our system would analyse the message and send it to other users of the platform with the knowledge to answer it.
Machine Learning was used to:
- Detect the language of the message
- Classify the type of message (whether it was a question or a response)
- Check if it was spam
- Recommend members in the community that would be likely to answer
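The four steps above can be pictured as a small pipeline. The sketch below is only an illustration of how such stages might chain together, not Wefarm's actual system: every function is a trivial rule-based stand-in for a trained model, and the keyword lists, the member records, and the choice to route only non-spam questions are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class RoutedMessage:
    language: str
    kind: str  # "question" or "response"
    is_spam: bool
    recipients: list = field(default_factory=list)


def detect_language(text):
    # Stand-in for a trained language detector (hypothetical keyword list).
    swahili_hints = {"shamba", "mbegu", "mahindi"}
    return "sw" if swahili_hints & set(text.lower().split()) else "en"


def classify_kind(text):
    # Stand-in for a question/response classifier.
    return "question" if text.strip().endswith("?") else "response"


def looks_like_spam(text):
    # Stand-in for a spam model (hypothetical phrases).
    lowered = text.lower()
    return any(phrase in lowered for phrase in ("send money", "you have won"))


def recommend_members(members, language, text):
    # Stand-in for a recommender: same language and a matching topic.
    lowered = text.lower()
    return [
        m["name"]
        for m in members
        if m["language"] == language and any(t in lowered for t in m["topics"])
    ]


def process_message(text, members):
    language = detect_language(text)
    kind = classify_kind(text)
    spam = looks_like_spam(text)
    recipients = []
    # Assumption: only questions that pass the spam check are forwarded.
    if kind == "question" and not spam:
        recipients = recommend_members(members, language, text)
    return RoutedMessage(language, kind, spam, recipients)


# Made-up member records, for illustration only.
members = [
    {"name": "Member 1", "language": "en", "topics": {"maize"}},
    {"name": "Member 2", "language": "sw", "topics": {"maize"}},
]
result = process_message("How do I protect my maize from pests?", members)
print(result.recipients)
```

In a real system each stand-in would be replaced by a model trained on labelled messages, but the order of the steps would look similar.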
Using ML to build safer online communities
With its exponential growth, the platform became attractive to bad actors. While our two million legitimate members were getting answers to their questions, a small number of malicious users were spreading scams and abusing users.
With millions of messages entering the platform each month, manual monitoring was not an option and bad actors had learned to get around our rule-based alert system. We had gathered labelled data from manually reviewing some suspicious profiles. This gave us the opportunity to train a Machine Learning model to identify malicious users.
The data had significant limitations, which meant the precision of the model was low (about 12%). In other words, when the model flagged 100 users as potentially malicious, only 12 actually were. We therefore designed a process with a human in the loop. While this was not as efficient as an automated block, it was still about 400 times more efficient than our previous system.
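To make the precision figure concrete, here is a minimal sketch of the arithmetic. The function names are my own, and the only numbers used are the ones quoted above (12 confirmed malicious users out of 100 flagged). It also shows what that precision means for a human reviewer: roughly how many flagged profiles they need to check for each malicious user they confirm.

```python
def precision(confirmed_malicious, flagged):
    """Share of flagged users who turn out to be malicious."""
    if flagged <= 0:
        raise ValueError("flagged must be positive")
    return confirmed_malicious / flagged


def reviews_per_confirmed_case(precision_value):
    """Average number of flagged profiles reviewed per malicious user found."""
    if not 0 < precision_value <= 1:
        raise ValueError("precision must be in (0, 1]")
    return 1 / precision_value


model_precision = precision(confirmed_malicious=12, flagged=100)
print(model_precision)  # 0.12
print(round(reviews_per_confirmed_case(model_precision), 1))  # 8.3
```

A reviewer checking about eight profiles to find one bad actor is far from an automated block, which is why the human in the loop was needed.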
ML made our bad-actor detection more efficient, saved us money that would otherwise have gone on carrying scam messages across our network, and, more importantly, contributed to building a trusted community where our members could thrive.
Impact on Urban Health
At Impact on Urban Health, we believe that we can remove obstacles to good health by making urban areas like inner-city London healthier places for everyone to live. By combining a varied range of data sources, robust evidence, lived experience, and practical interventions, we seek to understand the causes of complex health issues, and we partner with other organisations to address them.
Using open data to identify vulnerability
One of our programmes tackles the adverse health effects of air pollution. We worked with Advance Pro Bono to create a vulnerability score based on the demographic and health characteristics of an area. We know that younger and older people, as well as people suffering from heart and lung conditions, are particularly at risk of suffering from pollution exposure.
Overlaying vulnerability and exposure to target interventions
Areas of high exposure do not necessarily correlate with areas of high vulnerability. There are, however, some areas that score highly on both. Mapping them and drawing the scatterplot of both scores was an effective way to inform our operational decisions.
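As a rough sketch of what overlaying the two scores can look like in code: the example below assumes each area has a vulnerability score and an exposure score scaled between 0 and 1, and picks out the areas that are high on both. The area names, the scores, and the 0.7 cut-offs are made up for illustration and are not taken from our dashboard or methodology.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AreaScores:
    name: str
    vulnerability: float  # 0 to 1, higher means more vulnerable residents
    exposure: float  # 0 to 1, higher means more exposure to air pollution


def priority_areas(areas, vulnerability_cutoff=0.7, exposure_cutoff=0.7):
    """Return the names of areas that score highly on both dimensions."""
    return [
        area.name
        for area in areas
        if area.vulnerability >= vulnerability_cutoff
        and area.exposure >= exposure_cutoff
    ]


# Hypothetical scores, for illustration only.
areas = [
    AreaScores("Area A", vulnerability=0.9, exposure=0.2),
    AreaScores("Area B", vulnerability=0.3, exposure=0.8),
    AreaScores("Area C", vulnerability=0.8, exposure=0.9),
    AreaScores("Area D", vulnerability=0.4, exposure=0.3),
]
print(priority_areas(areas))  # ['Area C']
```

Areas A and B each score highly on one dimension only, which is exactly the pattern a scatterplot makes visible at a glance.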
The full dashboard, with an explanation of the methodology, is available online.
Some barriers to using data for good
Talking about success is rewarding, but we need to remember how challenging it can be to develop data-powered technologies and decision-making tools.
The difficulty of getting senior leadership buy-in
As practitioners, it is our responsibility to ensure that the work we do is adding value to the users’ day-to-day experience. Our job is to engage our stakeholders early, understand their needs, and design products that are easy to use.
However, your best efforts can be wasted if there is no senior leadership buy-in. Without a senior champion, data projects get under-resourced and deprioritised, making it hard for the data team to thrive, no matter how talented they are.
The jack-of-all-trades data person
In many charities, there is a tendency to hire only one person and expect them to demonstrate the value of a data function.
Having one person in charge of a function means that they have a broad and unclear job description. They are data engineer, database manager, data analyst, data scientist and Business Intelligence analyst all at once, and they are often responsible for all data in the charity, ranging from website analytics to service delivery. This, combined with the temptation to hire a junior member of staff to avoid committing too much, makes for a challenging environment to develop in.
Fighting extreme expectations to find and define a good use-case
Some data projects get discredited by the assumption that they will not bring new insights to decision-making. Surprisingly, I have even seen this happen in fundraising, which you would expect to be a rather data-friendly space.
On the other hand, some data projects are set up to fail because of improbably high expectations from senior leaders. They are bound to disappoint if you expect them to revolutionise your service delivery.
Most of the time, reality is somewhere in between. Having an informed discussion about what is achievable helps dramatically with scoping useful projects. The bad-actor detection model I described is a good illustration. If the expectation had been that ML would offer an automated bot that blocked all offensive users, 12% precision would have been a catastrophic failure. If we had decided that because our data wasn’t perfect it was useless, we would have missed out on a 400x improvement in efficiency. Collaborating on the design of the feature and refining the scope as we learned was key to a successful implementation.
We are biased. Bias is not just in the data but also in the way it is used, analysed, and visualised. When working with vulnerable populations, the impact of biased decision-making can be devastating. This has led some organisations to steer clear of new technologies altogether.
While this is wise in some cases, there is also a cost to not making the most of data for social good. At Impact on Urban Health, we want to be at the cutting-edge of data practices and we consider that making responsible use of data is a condition to successfully innovate.
Practically, we are working towards upskilling our analysts, creating processes to ensure we are having the right conversations about the limitations of our work, and seeking external advice from a diverse group of people. This is a journey and we look forward to sharing our learnings with the broader sector.