JustOneGiantLab: Quantifying communities to analyze a pandemic

Published 22 May 2020 by Cherise Fong

If JOGL’s OpenCovid19 Initiative came out of a crisis, the multidisciplinary collaborations that are forming among its members around a growing number of projects prove that this open online platform is much more than a flash mob. What does the data say?

Marc Santolini, one of the co-founders of Just One Giant Lab (JOGL) with bio-community leader Thomas Landrain and computational biologist Léo Blondel, has spent the past two years at the Center for Research and Interdisciplinarity (CRI) in Paris investigating collaborative learning and problem-solving using network science and data-driven approaches, with the goal of developing tools to foster collective intelligence.

In the case of the OpenCovid19 Initiative, he has been leading a Metastudy team to examine how this open community has been self-organizing in the two months since its launch. At the same time, the team is developing an algorithmic recommendation system to optimize matches between project needs and contributor skills and resources.

“The idea is that behind JOGL there is a network of actors,” Marc explains. “These actors are connected to projects they follow, to people they interact with, to skills, needs and lots of other objects that are on the platform. This network of actors is called a heterogeneous information network. The algorithm predicts potential connections on this heterogeneous network. In other words, we can predict a connection between two people that doesn’t yet exist, but based on the structure of the network, we can say that it should very probably exist. We can also make recommendations based on certain meta-paths within the network that we want to push, for example, matchmaking skills with needs, or between similar projects.”
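To make the idea of a meta-path concrete, here is a minimal sketch in Python. It is not JOGL's algorithm: the users, skills, projects and the simple path-counting score are all hypothetical, chosen only to illustrate how a "user → skill → project need" path can suggest a connection that does not yet exist.

```python
from collections import defaultdict

# Hypothetical toy network: none of these names or links come from JOGL.
user_skills = {
    "alice": {"epidemiology", "python"},
    "bob": {"design"},
    "carol": {"python", "machine-learning"},
}
project_needs = {
    "cough-classifier": {"machine-learning", "python"},
    "camp-simulator": {"epidemiology", "python"},
    "dashboard": {"design"},
}
# Existing user -> project links, which should not be recommended again.
members = {"alice": {"camp-simulator"}, "bob": set(), "carol": set()}


def metapath_scores(user_skills, project_needs):
    """Count user -> skill -> project paths where a skill matches a need."""
    skill_to_projects = defaultdict(set)
    for project, needs in project_needs.items():
        for skill in needs:
            skill_to_projects[skill].add(project)

    scores = defaultdict(dict)
    for user, skills in user_skills.items():
        for skill in skills:
            for project in skill_to_projects[skill]:
                scores[user][project] = scores[user].get(project, 0) + 1
    return scores


def recommend(user, k=2):
    """Return up to k not-yet-joined projects, ranked by meta-path count."""
    scores = metapath_scores(user_skills, project_needs)[user]
    candidates = [(p, s) for p, s in scores.items() if p not in members[user]]
    # Highest score first; ties broken alphabetically for stable output.
    candidates.sort(key=lambda pair: (-pair[1], pair[0]))
    return candidates[:k]


print(recommend("carol"))
print(recommend("alice"))
```

A production system would likely weight several meta-paths and learn those weights from observed links; counting a single path type is just the simplest version of the idea.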

Visualization of linked JOGL projects in April 2020. © JOGL

Prototyping on the go, the team has already applied this methodology to gain insight from the OpenCovid19 Slack workspace, where the TeamChatViz tool showed that new members were often bottlenecked shortly after onboarding. Now, new members are automatically joined to channels where they are most likely to make connections—with people they might not have met otherwise.

“We’re trying to build an algorithm that avoids echo chambers and maximizes collective intelligence on the platform.”

– Marc Santolini

CRI colleagues Marc Santolini and Bastian Greshake Tzovaras, director of the Open Humans community for self-research in personal health data, were recently awarded a Nesta collective intelligence grant to animate Open Humans communities on patient-led research (people tracking and analyzing their own symptoms, diabetics building their own hardware…). JOGL will help these individuals better organize into more specialized self-research communities by developing a dedicated recommendation system and testing its impact against randomized recommendations.
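One way such a comparison against randomized recommendations could be evaluated is a permutation test on acceptance rates. The sketch below is an assumption about method, not the team's protocol, and the outcome data are invented for illustration (1 means a recommendation was followed, 0 means it was ignored).

```python
import random


def acceptance_rate(outcomes):
    """Fraction of recommendations that were followed."""
    return sum(outcomes) / len(outcomes)


def permutation_test(treated, control, n_permutations=2000, seed=0):
    """One-sided permutation test for treated rate > control rate.

    Returns the observed difference in rates and the share of random
    relabelings that produce a difference at least as large.
    """
    rng = random.Random(seed)
    observed = acceptance_rate(treated) - acceptance_rate(control)
    pooled = list(treated) + list(control)
    n_treated = len(treated)
    at_least_as_large = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = (acceptance_rate(pooled[:n_treated])
                - acceptance_rate(pooled[n_treated:]))
        if diff >= observed:
            at_least_as_large += 1
    return observed, at_least_as_large / n_permutations


# Invented outcomes: 100 algorithmic and 100 randomized recommendations.
algorithmic = [1] * 30 + [0] * 70
randomized = [1] * 15 + [0] * 85

observed_diff, p_value = permutation_test(algorithmic, randomized)
print(f"difference in acceptance rate: {observed_diff:.2f}, p = {p_value:.4f}")
```

A small p-value would suggest that the algorithmic recommendations are followed more often than chance alone would explain, which is the kind of evidence a randomized baseline makes possible.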

Powered by peer review

Consistent with this lateral approach, JOGL’s micro-grants for OpenCovid19 projects are awarded based on peer reviews by fellow members of the OpenCovid19 community. This participatory process was co-designed by Chris Graham and Elliot Lawton, based on the original architecture of JOGL’s Co-Immune program. The latter was also used for OpenCovid19’s sibling community, Helpful Engineering.

“In academia, grant funding is slow, peer review is slow, and honestly, all science is slow unless you focus on a very small area—and this is all sped up by collaboration and by increasing the number of people to remove the burden on individuals,” says Chris Graham. “At JOGL, the open grant system, and an analysis of the collective thoughts on projects that this brings, has let us assign funding ethically to a list of fantastic projects that both science and community agree with, and it’s already helping us thrive.”

“I believe in open science and the sharing of ideas collectively,” he continues. “We all have a dream that in the future, scientists will be using the Web for collaboration and to achieve their goals more efficiently, breaking down many hierarchical barriers that exist traditionally and getting straight to the source—whether it be funding, sharing ideas or peer review.”

Peer-reviewed results of JOGL micro-grants Round 2. © JOGL

While JOGL is currently collaborating with Kap Code to create an open database on Covid-19, Kaggle has already made available the COVID-19 Open Research Dataset (CORD-19), which includes more than 63,000 scholarly articles about Covid-19, Sars-CoV-2 and related coronaviruses—followed by “a call to action to the world’s artificial intelligence experts to develop text and data mining tools that can help the medical community develop answers to high priority scientific questions”.

At CRI, Marc Santolini, along with a postdoctoral associate in his lab who studies “the rise and fall of research fields” from disruption to decay, plans to fully investigate #CoronaResearchDynamics within CORD-19, for example in terms of peer citations and projects realized, in order to compare and contrast them with collaborative relationships within JOGL.

“Were they more collaborative or more competitive?” Marc probes. “Can we visualize a comparison between the community of thousands that we managed to weave, who succeeded in collaborating and realizing projects, versus more traditional collaborations and accomplishments in the academic world? In short, we want to analyze the differences between institutional and non-institutional approaches during this sprint in Covid research. What are the particular strengths of open communities?”

Predicting from crowdsourced patterns

Beyond these data-driven metascience approaches on the JOGL community itself, many projects have emerged on the platform that focus on researching and developing data collection methods, analyses, models and simulations to describe and forecast the Covid-19 pandemic.

One of the most ambitious data analysis projects on JOGL (also on the data-oriented CoronaWhy platform) is data scientist John Urbanik’s Computational Epidemiology Modeling Toolkit (#epimodelingtoolkit): an open source toolkit and data exchange for epidemiologists to develop simulation models for the evolution of Covid-19 in specific situations. It’s similar to parallel efforts by contributors to Kaggle’s Covid-19 forecasting challenge, or the academic team behind EpidemicForecasting.org.

Two other OpenCovid19 Initiative projects, both awarded JOGL micro-grants, tackle pandemic problems using de-identified crowdsourced data on a community scale. Their open source approach contrasts with more top-down institutional applications.

Quantified Flu is an ongoing project led by Bastian Greshake Tzovaras and Mad Price Ball of Open Humans, to aggregate, visualize and analyze raw sensor data from wearable health devices (Fitbit, Oura Ring, Google Fit, Apple Watch), which symptom self-tracking members can also access and choose to share. On the institutional side, the government-endorsed COVID Symptom Study iOS/Android app, launched by a genetic epidemiologist at King’s College London and his company Zoe, currently invites millions of citizens in the UK and the U.S. to track their symptoms in real time, allowing the gatekeeping academics to potentially predict the probability of Covid-19 infection.

CoughCheck is an AI project to develop a smartphone app that analyzes the sound of your cough to inform its algorithmic prediction. Following a highly successful launch on JOGL in March, the project now counts 47 members and 27 followers, while team leader Hernán Morales Durand is collaborating with Open Humans to collect and store audio samples and other community health data from participating self-researchers. In an institutional parallel, the COVID-19 Sounds App developed by the University of Cambridge solicits people worldwide to volunteer the sounds of their cough, breathing and voice, either through a mobile app or directly on their website, to contribute to its own academic research.

Simulating the invisible curve

One of JOGL’s inherent strengths as a global, open and collaborative platform is its ability to extend its tentacles beyond biohacking, beyond academia, beyond digital social media and into the dirty, messy offline chaos of real-world communities. These exceptionally dense, marginalized communities of urban slums, migrant worker dormitories and overpopulated refugee camps remind us that social distancing is a luxury that not everyone can afford. While these populations are among the most vulnerable to a Covid outbreak, they are often unaccounted for or overlooked by official predictions.

A couple of months ago, data scientist Billy Zhao, co-founder of the AI for Good London community, assembled an international, multidisciplinary team including collaborators from Italy to Ethiopia to focus specifically on studying and modeling the potential spread of Covid-19 in the Moria refugee camp in Greece. In April, the team entered (and won) the international Hack from Home hackathon with a proof of concept for their AI for Good Simulator. The project has since joined JOGL’s OpenCovid19 Initiative, as the team reaches out to an expanded global community.

With thousands of tents pitched in an area of less than one square kilometer that is grossly overcrowded with some 19,000 refugees from Syria and Afghanistan, Moria is the largest refugee camp in Europe. “Some people have been there for years, living in tense situations,” says Billy. “They have a deep distrust in authority, because they should have gotten off the island of Lesbos to go to the mainland ages ago, but the government failed them as a promise. In some cases, there isn’t much community engagement where the camp residents don’t have mobile phones. So it’s hard to get in contact with them, as we really want to find out people’s attitudes towards different possible interventions, to estimate how many people are going to actually follow the rules, what things will be feasible. Right now they’re not even sure if there’s enough land or capacity for quarantine facilities.”

Fortunately, the AI for Good Simulator project’s core team includes Alice Piterova, who has a strong connection with humanitarian actors after working with Techfugees, and Joel Hernandez, an NGO worker who has been working on the ground in Moria for the past several years. So far, Billy has recruited about 20 volunteers through Help with COVID and Data Science for Social Good, with hopefully more soon to come through JOGL. The volunteers are divided into three sub-teams: user research, mathematical epidemiology modeling, and data visualization for the final dashboard. The team is also in regular contact with epidemiologists at the London School of Hygiene and Tropical Medicine.

Together, they are working to design and develop a site-specific epidemiological model to support NGOs in mobilizing actions and inform sound policies by local authorities in a timely manner. “Because once the first death shows up within the camp, the virus is probably everywhere, it might be already too late,” warns Billy.

For the AI for Good Simulator, he also believes that it’s worth taking the time and doing the research to get things right. The team is currently comparing three different models, including a compartment model and an agent-based model based on the academic study of a 2014 cholera outbreak in the Dadaab refugee camp in Kenya.
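For readers unfamiliar with compartment models, the following is a generic SEIR sketch in Python (susceptible, exposed, infectious, recovered). It is not the AI for Good Simulator: the transmission, incubation and recovery rates are hypothetical placeholders rather than calibrated values, and only the population of about 19,000 is taken from the figure cited above for Moria.

```python
def simulate_seir(population, initial_infected, beta, sigma, gamma, days, dt=0.1):
    """Integrate a basic SEIR model with fixed-step Euler updates.

    beta:  transmission rate per day (hypothetical value)
    sigma: rate of leaving the exposed state, 1 / incubation period
    gamma: recovery rate, 1 / infectious period
    Returns one (day, S, E, I, R) tuple per simulated day.
    """
    s = float(population - initial_infected)
    e = 0.0
    i = float(initial_infected)
    r = 0.0
    history = [(0, s, e, i, r)]
    steps_per_day = round(1 / dt)
    for day in range(1, days + 1):
        for _ in range(steps_per_day):
            new_exposed = beta * s * i / population * dt
            new_infectious = sigma * e * dt
            new_recovered = gamma * i * dt
            s -= new_exposed
            e += new_exposed - new_infectious
            i += new_infectious - new_recovered
            r += new_recovered
        history.append((day, s, e, i, r))
    return history


def peak_infectious(history):
    """Return the (day, S, E, I, R) row with the most infectious people."""
    return max(history, key=lambda row: row[3])


# Two illustrative scenarios: no intervention versus reduced contact.
baseline = simulate_seir(19_000, 1, beta=0.5, sigma=0.2, gamma=0.2, days=200)
reduced = simulate_seir(19_000, 1, beta=0.3, sigma=0.2, gamma=0.2, days=200)

for label, run in (("baseline", baseline), ("reduced contact", reduced)):
    day, _, _, infectious, _ = peak_infectious(run)
    print(f"{label}: peak of about {infectious:.0f} infectious on day {day}")
```

A compartment model like this treats the camp as one well-mixed population, which is exactly the kind of assumption an agent-based model relaxes by simulating individual residents and their movements.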

“These three different models are all an extraction of the same reality, but because they have different assumptions, they each have a different makeup that tells you a different story,” says Billy. “So when the models agree from all these angles, you see that this is probably the best intervention we should go for. But when the models disagree, you can actually pick them apart and say, this assumption leads to this, this assumption leads to that, and maybe we need to think about it a little bit more.

“Right now we are still in this first phase of exploring, getting in touch with these different NGOs and trying to figure out their needs, what’s currently being done. There are also academic organizations that are thinking about using high-resolution satellite imagery to see where the congestion spots are within the camp… So there are more AI applications to be done down the road.”

As these projects gain followers on JOGL, the collaborative platform continues to refine its recommendations, matchmaking members and fostering synergies for a sustainable future beyond Covid-19.

“With the heat of the crisis now behind comes the challenge of stabilizing and sustaining such open collaborative projects in the longer term,” says Marc Santolini. “New members of communities that scaled up quickly can easily get lost, and smart onboarding strategies are key to sustaining such efforts. Creating an architecture of attention with recommender systems is key, but their design needs to take into account the specific needs associated with various phases of a project cycle: team building and ideation, implementation, documentation. The JOGL team is now collaborating with social scientists, computer scientists, project managers and user experience specialists to help design this architecture. Ironically, collective intelligence is at the core of its own design.”

Join JOGL’s OpenCovid19 Initiative