Cornell Bowers College of Computing and Information Science

CS PhD Student First to Use Web-based Data to Study Health in Africa

In the first study of its kind, researchers have collected and analyzed health-related internet search terms from all 54 countries in Africa, finding that searches such as “Does garlic cure AIDS?” can reveal pockets of disease prevalence, cultural stigmas and urgent needs for accurate health information.

Among other implications, the research identified ways to refine in-person surveys to gain a better understanding of health needs, and ways to augment health websites to fight the spread of misconceptions.

“Shedding light on the varied and complex health information needs of individuals in Africa is the big thing we were able to contribute,” said Rediet Abebe, a doctoral student in computer science and first author of “Using Search Queries to Understand Health Information Needs in Africa,” which will be presented at the Association for the Advancement of Artificial Intelligence Conference on Web and Social Media, June 11-14 in Munich, Germany.

Abebe is now working with organizations and the health ministries of some African nations to share her findings.

Collecting online data from the African continent, which is far less studied than Europe and North America, can help address a gap in understanding its people’s needs. For Abebe – who is from Ethiopia and is an organizer of Mechanism Design for Social Good, a research group dedicated to using algorithmic and computational approaches to improve access to opportunity – narrowing this gap is a critical goal.

“A lot of communities that are underrepresented or marginalized are missing in our data sets,” she said. “The problem here is that if you don’t have a lot of ethically collected, high-quality comprehensive data, it makes it very difficult to identify the specific needs of communities, and to create interventions or targeted educational programming that can address those needs.”

To generate the data, the researchers obtained all Bing search queries containing the words malaria, HIV, AIDS, tuberculosis or TB from all 54 countries in Africa between January 2016 and June 2017. They anonymized the data to protect privacy and extracted the most common topics, which they then analyzed by region and user demographics.
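For readers curious about what such keyword filtering might look like in practice, here is a minimal sketch in Python. It is not the researchers' code: the pattern table, the function names `label_query` and `count_by_country`, and the sample queries are all hypothetical, and the study's actual matching rules, anonymization steps and topic extraction are not described in this story. The sketch only shows the general idea of tagging queries by disease keyword and tallying them by country.

```python
import re
from collections import Counter

# Hypothetical keyword patterns, for illustration only; the study's real
# matching rules are not given in the article.
DISEASE_PATTERNS = {
    "malaria": re.compile(r"\bmalaria\b", re.IGNORECASE),
    "hiv/aids": re.compile(r"\b(hiv|aids)\b", re.IGNORECASE),
    "tuberculosis": re.compile(r"\b(tuberculosis|tb)\b", re.IGNORECASE),
}


def label_query(query):
    """Return the set of disease labels whose keywords appear in the query."""
    return {
        label
        for label, pattern in DISEASE_PATTERNS.items()
        if pattern.search(query)
    }


def count_by_country(records):
    """Count matching queries per (country, disease) pair.

    records: iterable of (country, query_text) tuples that carry no
    user identifiers.
    """
    counts = Counter()
    for country, query in records:
        for label in label_query(query):
            counts[(country, label)] += 1
    return counts


# Made-up sample records, not data from the study.
sample = [
    ("Ethiopia", "does garlic cure AIDS"),
    ("Kenya", "malaria symptoms in children"),
    ("Kenya", "TB and HIV treatment"),
    ("Ghana", "football scores"),
]
counts = count_by_country(sample)
```

Simple word matching like this has obvious limits. A query about "hearing aids," for example, would be tagged as HIV/AIDS-related, so a real pipeline would need further filtering.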

They found several topics they expected, such as HIV symptoms, antiretroviral drugs and breastfeeding. But they also found subjects they did not expect, such as searches about whether moringa seeds, blackseed oil or coconut can cure AIDS. In regions with a high prevalence of the disease, they also found relatively high numbers of searches related to the stigma surrounding AIDS, such as queries about workplace protections and ethical quandaries.

“We know that when a disease is stigmatized it might lead to more risky behavior and lower rates of testing,” Abebe said. Identifying areas where stigmas were common could help better target education or other interventions, she said.

The researchers also examined how users’ information needs were being met online. When people type in search terms relating to natural remedies, they’re far more likely to get recommendations for blogs or untrustworthy websites than high-quality sources, such as the National Institutes of Health or the Centers for Disease Control and Prevention, Abebe said.

This may be because of bias from search engines, or because those official sites don’t mention natural cures. Adding a page to high-quality websites listing common misconceptions, including evidence of why these natural remedies are ineffective, could help steer people to better sources, Abebe said.

“The information that people are getting online related to HIV/AIDS, malaria and tuberculosis varies significantly,” she said, “and in particular there is the potential for misinformation to spread.”

Internet users make up less than half of Africa’s population. Still, online data could supplement door-to-door surveys, which are costly and usually represent only a small slice of the population, and government studies, which can be overly broad.

The online data also provides information – particularly about stigmatized topics – that could help the people conducting surveys elicit more accurate replies.

“It’s very difficult to go up to someone and say, ‘Can you tell me all the misconceptions you have about this disease?’ They would say they have none,” Abebe said. “If you already have a sense of some of the misconceptions that exist, that might not be information you had before, so you can use that as a cheat sheet to complement your surveys.”

The paper was co-authored with Shawndra Hill and Jennifer Wortman Vaughan of Microsoft Research, H. Andrew Schwartz of Stony Brook University and Rockefeller Foundation fellow Peter M. Small. The research was partly supported by Facebook, Google and the MacArthur Foundation.