Tracking COVID-19 Symptoms and Impact in Real Time: A Survey-Based System
Federal Reserve Bank of Minneapolis
Containing the coronavirus and supporting those afflicted require that high-quality data be collected at unprecedented speed. In recent weeks, policymakers and researchers have turned to new sources of data that reflect “traces” of virus activity in online applications, such as Google searches or personal health apps. This approach allows frequent updating, but these data may not be representative of the general population, so it is not known how well suited they are to addressing key health and economic issues. A viable and better alternative is a daily survey administered through a partnership between the government and the private sector. This survey would combine established population survey methods from social science and public health with private-sector advances in electronic data collection. Such a survey could reveal, in a very timely manner, whether and where infection rates might be rising as well as how the public is faring under the social and economic restrictions in place.
- The importance of having accurate and timely data on the epidemic and its impacts cannot be overstated, as these data will be critical to safely reopening the economy. The dominant containment approach since the onset of the pandemic has been extensive movement restrictions. These have come at an unprecedented cost of forgone economic output and employment disruption. Experts caution that the pandemic may require 18 to 24 months to run its full course, and that a vaccine in under 12 months is unlikely. In this environment, timely data on potential virus re-emergence could facilitate safely relaxing movement restrictions, which would help to curtail some of the economic losses. Relaxing restrictions might occur at the place level (open up counties where virus load is low) or the person level (allow more movement among those who are low risk for virus complications). Either approach requires timely, detailed data on potential virus re-emergence as well as the impact of restrictions on affected individuals and communities, in order to tailor aid and incentives to comply with ongoing restrictions.
- A survey tailored to data needs around COVID-19 can provide badly needed information that is not available elsewhere. The past three weeks have seen heroic efforts to use data to inform our nation’s response to COVID-19. Existing surveys (like the Pew American Trends Panel, the Understanding America Study from USC, and the Board of Governors’ Survey of Household Economics and Decisionmaking) have quickly added questions to assess the impact of COVID-related restrictions. Tech companies are beginning to share some of their big data on behavior and interactions to help track the outbreak, including through the partnership between Google and Apple. But both types of efforts entail critical shortcomings that a tailored approach could overcome. Existing survey efforts that can be quickly redirected to new questions are typically small in scope and can tell us little about geographic and population variation. Additionally, the sample properties of much big data are unknown. This is because apps and other internet-enabled devices provide data on an “opt-in” basis: only people who choose to use the app or buy the device are reflected in the data. The volume of data may be large, but non-representativeness may limit its value. To take a popular example, internet-enabled thermometers can track body temperature across locations in the U.S., but given that such thermometers retail for around $70, it is unclear whether these data are representative of the full range of U.S. households.
- Rapid implementation of a large-scale but brief survey would provide a nationally representative view of the pandemic's progression and impact, as well as the effect of the measures taken to confront the crisis. Such a survey would also allow researchers and policymakers to zoom in to get information at the local level. A survey may overcome some of the problems that widespread medical screening is encountering: namely, we already have the technology and know-how to deploy a large survey, and it can be accomplished without using medical resources that might be better used elsewhere. Surveys can also be accomplished without exposing people to potential virus contact. Finally, surveys can go beyond disease spread to measure how the restrictions imposed to combat it affect mental health and household finances. This is important because extensions of existing surveys or big data archives tend to provide a limited picture of virus impacts, since they were not designed to collect a full range of indicators at once.
- How would this work? A successful survey would need to be frequently administered to a large group of people for an extended period. The epidemic is fast-moving, and restrictions change quickly. Daily data are ideal for monitoring the spread of the virus across groups and places, and also for tracking how people are affected by the threat of the disease and the restrictions imposed by policy. Large samples allow for tailored analysis at the state, metro area, and urban/rural levels as well as for distinct estimates for local populations by age, ethnic and racial background, and other dimensions. The American Community Survey, our nation’s largest survey, interviews over 2 million households annually in order to get data with these distinctions. A daily survey of 2 million respondents would provide the desired geographic and population detail, but such a large survey may not be essential to obtaining this information. Polling organizations often generate representative statistics for small areas across the U.S. with sample sizes in the tens of thousands. Large samples may also help address concerns about whether the data can distinguish emerging COVID-19 cases from common symptoms due to other causes. In principle, this survey would allow the use of big data techniques to precisely detect small shifts in symptom clusters in large populations. Continuing the survey over an extended period (18 to 24 months, the potential course of the full pandemic) will enable the ongoing monitoring that is key to informed decision-making.
- The survey would track indicators of mental health distress and financial hardship along with COVID-19 symptoms. Tracking the impact of containment restrictions is just as important as tracking the spread of the virus through symptoms. Including questions to quickly gauge mental and financial health would provide policymakers with real-time information on how their specific communities are faring financially and emotionally. This information is critical for targeting support to the vulnerable populations who are likely to suffer disproportionate economic, social and health consequences from the virus. Surveys in social science already routinely measure these outcomes. Although some new questions will need to be developed quickly to reflect the new environment, many questions with vetted language and well-known properties are already available in existing surveys. There are many examples of government surveys collected by private firms under contract. These partnerships extend strong privacy protections to respondents. By contrast, private firms collecting their own data often offer privacy assurances, but no legal protection or professional ethical standard guarantees them.
- A survey of this nature would require a public-private partnership. The largest population surveys administered by our government statistical agencies (like the Census Bureau’s American Community Survey, mentioned above, and the Behavioral Risk Factor Surveillance System, administered to 400,000 respondents by the CDC in partnership with state health departments) require a year to survey their full sample of half a million to two million respondents. In contrast, private sector companies use mobile apps and websites to reach half a million to two million people daily. Furthermore, they are able to process these data quickly using the newest technology. To take just two examples, the Pokémon Go app was downloaded by 1 billion users worldwide and processed data for over 100 million daily participants at its height, and it has been widely reported that Facebook and Google regularly conduct experiments on large numbers of users, although the extent and results of these are not released publicly. Private sector companies can also enable the collection of higher quality data, for example using a health app that monitors body temperature or heart rate. But government agencies are vital to this effort, for example to include hard-to-reach and underrepresented populations, and to offer the legal protections for these data that are already part of existing health surveys. Hard-to-reach populations can be surveyed by phone if they are first intentionally recruited using reliable address-based sampling frames, or through community partners and advertising partnerships, all of which have been used in the 2020 Census outreach.
- A prototype of this kind of survey will launch in April through a partnership between the Data Foundation and the National Opinion Research Center (NORC). Additional information about the survey, including the full questionnaire, can be found at covid-impact.org. The prototype will be more limited in geographic scope and frequency, but it will collect weekly information for the U.S. as well as a combination of 18 states and large cities. The full proposed survey instrument will be administered, asking questions from all three areas: physical health, mental health, and financial well-being. For example, respondents will be asked to self-report their body temperature and underlying health problems. They will also be asked about social connections both prior to and after the onset of COVID-19, and about their financial and work situation. Respondents will come from a combination of a standing survey panel (the NORC AmeriSpeak panel) and direct recruitment from a national address registry like that used by the Census Bureau for sampling sub-national geographic areas. Complete micro-data will be available free of charge at covid-impact.org in late April 2020, along with a range of tabulations on all questions by geographic area.
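The sample-size trade-off raised under "How would this work?" can be illustrated with a rough calculation. The sketch below computes the approximate 95% margin of error for an estimated proportion under simple random sampling. The 5% symptom prevalence, the even split of respondents across 50 states, the daily totals, and the function name are illustrative assumptions, not features of the proposed survey; an actual design would also need to account for weighting and design effects.

```python
import math


def margin_of_error(n, p=0.05, z=1.96):
    """Half-width of an approximate 95% confidence interval for a proportion.

    Uses the normal approximation under simple random sampling.
    n: number of respondents; p: assumed true proportion (hypothetical 5%);
    z: critical value (1.96 for roughly 95% confidence).
    """
    if n <= 0:
        raise ValueError("sample size must be positive")
    return z * math.sqrt(p * (1 - p) / n)


# Hypothetical daily totals, split evenly across 50 states (an assumption).
for total in (20_000, 50_000, 2_000_000):
    per_state = total // 50
    moe = margin_of_error(per_state)
    print(f"total={total:>9,}  per state={per_state:>6,}  margin of error=+/-{moe:.2%}")
```

Because the margin of error shrinks with the square root of the sample size, quadrupling the sample only halves it. This is one reason a sample in the tens of thousands may already support state-level estimates, while finer geographic or demographic breakdowns call for much larger samples.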
What this Means:
The unprecedented nature of the coronavirus crisis presents many challenges to policymakers at the federal, state and local levels. Having accurate and timely information is key to being able to respond to the problems generated, assess the effectiveness of the response, and plan ahead. This survey addresses three major informational needs in the presence of COVID-19. First, data on the underlying health of the general population can signal additional potential outbreaks and inform models of disease progression under various containment alternatives. It may also substitute for random population testing. Second, the people who suffer most from containment restrictions, in terms of their mental and financial health, often aren’t the same as the people who suffer from the illness itself. The proposed survey would improve tracking of the former, who are currently not directly represented in any major data collection effort. And finally, the scale and frequency of the survey would make this information available at the same levels at which many policy decisions are made and social services delivered: our counties, cities, states, and communities.