The purpose of a survey is to make an inference from a sample of respondents to a population. The historic gold standard approach involves drawing a probability sample of the population. In this case, every member of the population has a known, non-zero chance of being in the sample. A non-probability survey is one where there is no random selection. While some evidence suggests that estimates from non-probability samples are not as accurate as those from probability samples (MacInnis et al. 2018, Bradley et al. 2021, Mercer and Lau 2023), others show that non-probability samples can offer accurate estimates (Gelman et al. 2016, Enns and Rothschild 2021, Holliday et al. 2021).

We first discuss the CHIP50’s approach to data collection and inference, and then we discuss validation efforts and the advantages of CHIP50 data.


CHIP50 Approach

CHIP50 relies on non-probability samples. As such, we emphasize the following points.

  1. Sample Type. There is substantial variation in the quality of non-probability (and probability) surveys. The American Association for Public Opinion Research’s Task Force Report on non-probability sampling states, “Researchers and other data users may find it useful to think of the different nonprobability sample approaches as falling on a continuum of expected accuracy of the estimates” (Baker et al. 2013). The report also states, “non-probability sampling is a collection of methods rather than a single method, and it is difficult if not impossible to ascribe properties that apply to all non-probability sampling methodologies.” It is therefore misleading to make blanket statements about “non-probability surveys.”
  1. Recruitment. CHIP50 survey participants are recruited by 22 vendors who use a variety of strategies and incentives to maintain online respondent panels. Respondents are channeled to surveys through PureSpectrum, an online polling platform that works with panel providers from all 50 states and D.C. PureSpectrum provides initial respondent identification, deduplication, and screening for quality, demographic qualifications, and location. Using the PureSpectrum API, the CHIP50 team has developed a system to launch and monitor multiple survey projects in all states and D.C., based on pre-specified quotas and qualifications (see below). The CHIP50 team monitors recruitment in real time.
  1. Sample Quotas and Weights. CHIP50 surveys employ extensive sampling quotas, at both the national and state levels, as well as post-stratification weights (a prevalent practice for both non-probability and probability samples). These efforts increase the representativeness of the samples relative to the population on the measured variables (Vehovar et al. 2016).
    • For sample quotas at both the national and state levels, we use gender (Male, Female), age (18-24, 25-34, 35-44, 45-54, 55-64, 65-99), and race (African American, Asian, Hispanic, White, Other) categories. Each national and state-level quota is determined by the corresponding most recent U.S. Census Bureau data. We also collect education level, employment, household income, children (count, gender, age), and relationship data without applying quotas. Depending on the anticipated sample size and convenience of recruitment in each state, we use nested/interlocked quotas for gender, age, and race to achieve a more representative sample. Examples of states to which we apply such quota interlocking include California, Texas, New York, and Florida. Examples of states where we apply quotas without interlocking include Alaska, Wyoming, Delaware, and Rhode Island (i.e., states in which online sampling is more difficult due to their smaller populations).
    • To generate post-stratification weights, CHIP50 uses U.S. Census Bureau data for population demographics including race/ethnicity, age, gender, education, and geographic region. We use NCHS urban-rural classification data for urbanicity. For national-level weights, CHIP50 also uses geographic regions and interlocking gender-by-age-by-race categories. For state-level analyses, weights are generated using non-interlocking state-level demographic categories.
    • A second set of national and state weights is produced to match the population on all of the parameters above, and additionally includes 2020 vote choice and turnout. These weights are appropriate for politically sensitive analyses where partisanship bias is an especially important concern (Pew Research Center 2021).
  1. Measurement. Response validity refers to the extent to which survey answers are “obtained under conditions of appropriate motivation, adequate effort, and honesty on the part of…respondents” (Lovett et al. 2022). Concerns about response validity are particularly relevant in (probability and non-probability) surveys that use an online response mode (Callegaro et al. 2014).
    • CHIP50 surveys weed out bots by including a CAPTCHA test.
    • To enhance response validity (and identify and remove bogus respondents), CHIP50 excludes any respondent who fails one of two basic attention checks that straightforwardly ask respondents to choose a specific response option. The remaining respondents are evaluated based on their performance on a series of quality checks, including: (1) duplicate responses; (2) prevalence of item non-response; (3) short survey completion time; (4) straight-lining consecutive item responses; (5) selecting a large number of occupations; (6) selecting a large number of religions; (7) reporting a household with over 20 family members; (8) reporting over 10 activities outside their home in the past 24 hours; (9) identifying as a “Democrat” and “very conservative,” or a “Republican” and “very liberal”; and (10) giving a meaningless answer to an open-ended question requesting a state name. Respondents who fail multiple quality checks are removed from the data. This filtering affects a relatively small percentage of respondents who completed the survey (1%-3%). The plurality of removed completed responses (typically around 25%) resulted from failed attention checks.
  1. Inferences. Strictly speaking, statistical significance tests and confidence intervals are technically inappropriate for non-probability samples. Yet CHIP50 follows common practice in using these approaches, following the dictum “to be more openly accepting of the reality of using a standard statistical inference approach as an approximation in non-probability settings” as long as the nature of how the sample is drawn is clear (Vehovar et al. 2016; also see Groves et al. 2009, pp. 409-410).
  1. Reporting.
    • We report the completion rate (i.e., the proportion of respondents who complete the survey among those who begin it) and the screening completion rate (i.e., the proportion of respondents who meet the data quality threshold among those who complete the survey). Generally, about 15%-20% of those who start the survey do not finish (attrition).
    • All surveys employ the recruiting, quota, and weighting strategies explained above.
    • Any deviations from the protocol described on this page will be reported on this page.
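The post-stratification weighting described above typically rests on raking (iterative proportional fitting): weights are repeatedly adjusted so that the weighted sample margins match the population margins on each dimension in turn. CHIP50’s production weighting pipeline is not reproduced here; the following is a minimal Python sketch in which the categories, target shares, and toy sample are all hypothetical.

```python
from collections import defaultdict

def rake(respondents, targets, iterations=50, tol=1e-8):
    """Iterative proportional fitting: adjust weights until the weighted
    sample margins match the population margins on every dimension."""
    weights = [1.0] * len(respondents)
    for _ in range(iterations):
        max_change = 0.0
        for dim, target in targets.items():
            # Current weighted margin on this dimension
            margin = defaultdict(float)
            for r, w in zip(respondents, weights):
                margin[r[dim]] += w
            total = sum(weights)
            # Scale each weight by (target share) / (current share)
            for i, r in enumerate(respondents):
                factor = target[r[dim]] / (margin[r[dim]] / total)
                weights[i] *= factor
                max_change = max(max_change, abs(factor - 1.0))
        if max_change < tol:  # all margins match; stop early
            break
    # Normalize so weights average to 1
    mean_w = sum(weights) / len(weights)
    return [w / mean_w for w in weights]

# Hypothetical toy sample that over-represents women and younger adults
sample = [
    {"gender": "F", "age": "18-44"}, {"gender": "F", "age": "18-44"},
    {"gender": "F", "age": "45+"},   {"gender": "M", "age": "18-44"},
    {"gender": "M", "age": "45+"},
]
# Hypothetical population margins (e.g., from Census data)
targets = {
    "gender": {"F": 0.51, "M": 0.49},
    "age": {"18-44": 0.45, "45+": 0.55},
}
w = rake(sample, targets)
```

In production, raking is usually done with a dedicated package (e.g., the `survey` package in R or `ipfn` in Python), which also supports the interlocked gender-by-age-by-race cells mentioned above; the logic is the same.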
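The quality-check screening described under Measurement amounts to flagging each respondent on a battery of checks and removing those who fail more than one. The sketch below illustrates that rule for a few of the listed checks; the field names, the 180-second speed threshold, and the one-failure allowance are illustrative assumptions, not CHIP50’s actual parameters.

```python
def quality_flags(resp):
    """Return the list of quality checks a respondent fails.
    Field names and thresholds are hypothetical."""
    flags = []
    if resp["completion_seconds"] < 180:    # (3) too-short completion time
        flags.append("speeding")
    if resp["straightline_run"] >= 10:      # (4) straight-lining responses
        flags.append("straightlining")
    if resp["household_size"] > 20:         # (7) implausible household size
        flags.append("household")
    if (resp["party"], resp["ideology"]) in {  # (9) contradictory politics
        ("Democrat", "very conservative"),
        ("Republican", "very liberal"),
    }:
        flags.append("party_ideology")
    return flags

def keep(resp, max_failures=1):
    """Retain respondents who fail at most max_failures quality checks."""
    return len(quality_flags(resp)) <= max_failures

respondents = [
    {"completion_seconds": 600, "straightline_run": 2,
     "household_size": 4, "party": "Democrat", "ideology": "liberal"},
    {"completion_seconds": 90, "straightline_run": 15,
     "household_size": 25, "party": "Republican", "ideology": "very liberal"},
]
retained = [r for r in respondents if keep(r)]  # second respondent removed
```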
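The approximation described under Inferences can be made concrete: treat the weighted sample as if it were a probability sample, while discounting precision for the unequal weights. One common way to do this (an illustrative choice here, not necessarily CHIP50’s) uses Kish’s effective sample size, n_eff = (Σw)² / Σw², in a standard confidence interval for a proportion.

```python
import math

def weighted_proportion_ci(values, weights, z=1.96):
    """Approximate 95% CI for a weighted proportion, treating the weighted
    non-probability sample as if it were a probability sample.
    Precision is discounted via Kish's effective sample size."""
    total = sum(weights)
    p = sum(w for v, w in zip(values, weights) if v) / total
    n_eff = total ** 2 / sum(w ** 2 for w in weights)  # Kish's n_eff
    se = math.sqrt(p * (1 - p) / n_eff)
    return p, (p - z * se, p + z * se)

# Toy example: binary outcomes with unequal post-stratification weights
values = [1, 1, 0, 1, 0, 0, 1, 0]
weights = [0.8, 1.2, 1.0, 0.5, 1.5, 1.0, 0.9, 1.1]
p, (lo, hi) = weighted_proportion_ci(values, weights)
```

The more the weights vary, the smaller n_eff becomes relative to the raw sample size and the wider the interval — which is why heavily weighted non-probability estimates carry more uncertainty than their nominal n suggests.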


CHIP50 Validation and Advantages

  1. Validation. Extensive efforts have been made to assess the accuracy of estimates from CHIP50 surveys.
    • Estimates closely match COVID-19 vaccination rates produced by the CDC (Green et al. 2023), and a major national probability survey (Quintana-Mathe et al. 2024).
    • Estimates closely match COVID-19 infection rates as tracked by Johns Hopkins University (which relies on the CDC and state-level estimates) (Quintana-Mathe et al. 2024) and wastewater estimates (Santillana et al. 2024).
    • Estimates closely match the two-party vote share in the 2020 elections as estimated by major election polls (Radford et al. 2022).
    • Estimates closely match other surveys or administrative data across domains including BLM protests (Simonson et al. 2024), experiencing symptoms of depression (Baum et al. 2024), and gun purchasing (Lacombe et al. 2022).
    • CHIP50 data have been used for more than 100 public reports, more than a dozen published op-eds and commentaries, and more than 25 peer-reviewed academic publications (see https://www.chip50.org/papers). The data have also been used by a large number of researchers who are not affiliated with the CHIP50 team, and have been featured in over 1,000 newspaper, radio, and television news reports (e.g., Wall Street Journal, Bloomberg, The New York Times, The Washington Post, Los Angeles Times, PBS NewsHour, NPR, Politico, CBS News, CNN, MSNBC).
  1. Advantages of CHIP50. The CHIP50 approach to data collection gives it five attributes that are rare in other data collections.
    • State Data. The CHIP50 data allow for inferences at the state level. This is a crucial attribute given that much of politics and public health plays out through distinct state-level dynamics (Grumbach 2022, Woolf 2022).
    • Subgroups. The CHIP50 data allow for inferences about smaller population subgroups (e.g., based on race/ethnicity). This is a crucial attribute given the common exclusion of smaller strata from data collections (Welles 2014).
    • Over-time Data. CHIP50 data are available over extended time periods, starting in April 2020. This allows for tracking social, political, economic, trust, and public health trends.
    • Collaborative. CHIP50 invites other researchers to apply for space on surveys (https://www.chip50.org/proposals/how-to-submit). Researchers gain access to state-level data and/or a large sample to study topics that advance their agendas.
    • Low cost. CHIP50 makes state- and subgroup-level survey research accessible to a far wider range of researchers than traditional survey instruments, because CHIP50 is inexpensive compared to traditional probability surveys: CHIP50 data costs are approximately 99% lower than those of comparable probability samples.