Methodology
TrueViews: Methodology for Generating Estimates of Public Opinion at a Variety of Geographic Levels
TrueViews displays estimates of public opinion on many issues and at numerous geographic levels. We draw on data from 18 large-scale surveys of the American public. Specifically, we use the 2009-2023 Cooperative Election Studies (Vavreck and Rivers, 2008; Ansolabehere and Rivers, 2013) and the 2019-2021 UCLA/Nationscape Surveys (Tausanovitch et al., 2019). In all, these surveys include information on the preferences of over 1 million Americans.
Survey | Year | Sample Size |
---|---|---|
Cooperative Election Study | 2009 | 13,800 |
Cooperative Election Study | 2010 | 55,488 |
Cooperative Election Study | 2011 | 20,150 |
Cooperative Election Study | 2012 | 54,535 |
Cooperative Election Study | 2013 | 16,400 |
Cooperative Election Study | 2014 | 56,200 |
Cooperative Election Study | 2015 | 14,250 |
Cooperative Election Study | 2016 | 64,600 |
Cooperative Election Study | 2017 | 18,200 |
Cooperative Election Study | 2018 | 60,000 |
UCLA/Nationscape Survey | 2019 | 149,474 |
Cooperative Election Study | 2019 | 18,000 |
UCLA/Nationscape Survey | 2020 | 311,647 |
Cooperative Election Study | 2020 | 61,000 |
UCLA/Nationscape Survey | 2021 | 33,373 |
Cooperative Election Study | 2021 | 25,700 |
Cooperative Election Study | 2022 | 60,000 |
Cooperative Election Study | 2023 | 24,500 |
We use these survey data on the issue preferences of 1.06 million survey respondents as the foundation for estimating the average preferences of Americans at a variety of geographic levels. First, we break our surveys into four time periods that roughly correspond to presidential terms: 2009-2012, 2013-2016, 2017-2020, and 2021-2023. We provide estimates of the issue preferences of each geographic unit for each of these four time periods.
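As a rough illustration of this grouping, the sketch below assigns each survey in the table to one of the four periods and totals the sample sizes. The abbreviations `CES` and `Nationscape` and the function names are ours, chosen for the example; only the years, sample sizes, and period boundaries come from the text.

```python
from collections import defaultdict

# (survey, year, sample size) rows copied from the table above.
SURVEYS = [
    ("CES", 2009, 13_800), ("CES", 2010, 55_488), ("CES", 2011, 20_150),
    ("CES", 2012, 54_535), ("CES", 2013, 16_400), ("CES", 2014, 56_200),
    ("CES", 2015, 14_250), ("CES", 2016, 64_600), ("CES", 2017, 18_200),
    ("CES", 2018, 60_000), ("Nationscape", 2019, 149_474), ("CES", 2019, 18_000),
    ("Nationscape", 2020, 311_647), ("CES", 2020, 61_000),
    ("Nationscape", 2021, 33_373), ("CES", 2021, 25_700),
    ("CES", 2022, 60_000), ("CES", 2023, 24_500),
]

# Period boundaries as stated in the text (inclusive on both ends).
PERIODS = [(2009, 2012), (2013, 2016), (2017, 2020), (2021, 2023)]


def period_for(year):
    """Return the label of the period that contains `year`."""
    for start, end in PERIODS:
        if start <= year <= end:
            return f"{start}-{end}"
    raise ValueError(f"year {year} falls outside every period")


def respondents_by_period(surveys):
    """Sum the sample sizes of all surveys that fall in each period."""
    totals = defaultdict(int)
    for _name, year, sample_size in surveys:
        totals[period_for(year)] += sample_size
    return dict(totals)


totals = respondents_by_period(SURVEYS)
grand_total = sum(totals.values())
```

Summing the table this way gives a little under 1.06 million respondents in total, with the 2017-2020 period contributing by far the largest share because it includes the two largest Nationscape waves.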
To develop more precise and representative estimates of the preferences of Americans at the subnational level, we use a strategy introduced by Park, Gelman, and Bafumi (2004) and Lax and Phillips (2009) called multilevel regression and poststratification (MRP). This approach incorporates information about respondents’ demographics and geography to estimate public opinion in each geographic subunit. Specifically, each individual’s survey responses are modeled as a function of demographic and geographic predictors, partially pooling respondents across geographic units to an extent determined by the data. Thus, all individuals in the survey yield information about demographic and geographic patterns, which can be applied to every subunit’s estimate. Several studies have found that MRP models yield accurate estimates of public opinion in states, congressional districts, and cities (Park, Gelman, and Bafumi, 2004; Lax and Phillips, 2009; Warshaw and Rodden, 2012; Tausanovitch and Warshaw, 2013).
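To give a sense of what partial pooling means, the sketch below uses the textbook precision-weighted approximation for a simple varying-intercept model: a unit's estimate is a compromise between its own sample mean and the overall mean, and the compromise leans toward the overall mean when the unit has few respondents. This is a simplified illustration, not the model used here. In a full MRP model the pooling is toward the regression prediction for the unit, and the variance parameters (here the hypothetical arguments `sigma_y` and `sigma_alpha`) are estimated from the data rather than supplied by hand.

```python
def partially_pooled_mean(group_mean, n, grand_mean, sigma_y, sigma_alpha):
    """Precision-weighted compromise between a unit's mean and the grand mean.

    group_mean  -- raw sample mean among the unit's respondents
    n           -- number of respondents in the unit
    grand_mean  -- mean across all respondents
    sigma_y     -- within-unit standard deviation (assumed known here)
    sigma_alpha -- between-unit standard deviation (assumed known here)
    """
    if n == 0:
        # No respondents: the estimate falls back entirely on the grand mean.
        return grand_mean
    data_precision = n / sigma_y ** 2
    prior_precision = 1.0 / sigma_alpha ** 2
    return (data_precision * group_mean + prior_precision * grand_mean) / (
        data_precision + prior_precision
    )


# A unit with only 4 respondents is pulled halfway toward the grand mean
# under these illustrative variance values ...
small_unit = partially_pooled_mean(0.8, 4, 0.5, sigma_y=0.5, sigma_alpha=0.25)
# ... while a unit with 400 respondents stays close to its own mean.
large_unit = partially_pooled_mean(0.8, 400, 0.5, sigma_y=0.5, sigma_alpha=0.25)
```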
Specifically, we estimate an MRP model of Americans' policy preferences at the level of ZIP Code Tabulation Areas (ZCTAs). In general, our MRP model is similar to the one in Tausanovitch and Warshaw (2013). There are two stages to the MRP models. In the first stage of each model, we estimate each individual’s policy preferences as a function of his or her demographics (race, gender, and education) and geographic location. We assume that the “geographic effects” in the model are a function of a vector of factors that previous studies have found to influence constituency preferences. Specifically, the ZCTA random effects are modeled as a function of the state in which the ZCTA falls, its county, its median household income, its population density, the percent of residents who are veterans, the percent of couples who are in same-sex relationships, the percent non-white, the percent with a graduate degree, the percent of people in college, and the two-party vote share in the previous presidential election (Ansolabehere and Rodden, 2012; Chen and Stephanopoulos, 2020; Voting and Election Science Team, 2018, 2020). We model county-level random effects based on the proportion of various religious groups in the county (U.S. Religion Census).
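One way to write down a first-stage model of this general shape is sketched below. The notation is ours and several details are assumptions made for illustration: we assume a binary survey item with a logit link, additive demographic effects with no interactions, and normally distributed random effects. Here $X_z$ stands for the ZCTA-level covariates listed above and $W_c$ for the county-level religious composition.

```latex
% Individual level: respondent i with race r[i], gender g[i],
% education e[i], living in ZCTA z[i].
\Pr(y_i = 1) = \operatorname{logit}^{-1}\!\left(
    \beta_0
    + \alpha^{\text{race}}_{r[i]}
    + \alpha^{\text{gender}}_{g[i]}
    + \alpha^{\text{edu}}_{e[i]}
    + \alpha^{\text{zcta}}_{z[i]}
\right)

% ZCTA level: ZCTA z lies in county c[z] and state s[z].
\alpha^{\text{zcta}}_{z} \sim \mathcal{N}\!\left(
    \alpha^{\text{state}}_{s[z]} + \alpha^{\text{county}}_{c[z]} + X_z \gamma,\;
    \sigma^2_{\text{zcta}}
\right)

% County level: effects depend on religious composition W_c.
\alpha^{\text{county}}_{c} \sim \mathcal{N}\!\left(
    W_c \delta,\; \sigma^2_{\text{county}}
\right)
```

The variance terms $\sigma^2_{\text{zcta}}$ and $\sigma^2_{\text{county}}$ are estimated from the data, which is what determines how strongly sparsely sampled ZCTAs are pooled toward the prediction implied by their covariates.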
We incorporate respondents’ sampling weights into our model using the approach in Gelman et al. (2024), which proposes a joint regression of the outcome and the sampling weight in the MRP model.
The second stage is poststratification. In this stage, we use the multilevel regression to predict public opinion in each demographic-geographic subtype. The estimates for each demographic-geographic subtype are then weighted by the share of each stratum in the actual population (see note 1). Finally, these weighted predictions are summed to produce an estimate of the issue preferences of the mass public in each ZCTA.
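The poststratification step for a single ZCTA amounts to a population-weighted average of the cell-level predictions. The sketch below shows that arithmetic with made-up numbers; the cell labels, predictions, and counts are hypothetical, and in practice the predictions would come from the fitted multilevel model and the counts from the census cross tabulation.

```python
def poststratify(cell_predictions, cell_counts):
    """Weight each cell's predicted opinion by its share of the population.

    cell_predictions -- {cell: predicted share holding the opinion}
    cell_counts      -- {cell: number of residents in that cell}
    """
    total = sum(cell_counts.values())
    if total == 0:
        raise ValueError("no population in any cell")
    return sum(
        cell_predictions[cell] * count for cell, count in cell_counts.items()
    ) / total


# Hypothetical cells keyed by (race, gender, education) for one ZCTA.
predictions = {
    ("white", "female", "college"): 0.60,
    ("white", "male", "no college"): 0.40,
}
counts = {
    ("white", "female", "college"): 300,
    ("white", "male", "no college"): 100,
}

zcta_estimate = poststratify(predictions, counts)
```

Because the weights come from the census rather than the survey, a demographic group that is over- or under-represented among respondents still contributes in proportion to its actual share of the ZCTA's population.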
Lastly, we aggregate our estimates of the mass public's preferences in each ZCTA to a variety of other geographic units using the Geographic Correspondence Engine from the Missouri Census Data Center. Specifically, we aggregate our estimates to the level of:
- Cities
- County Subdivisions
- Counties
- States
- School Districts
- State Legislative Districts
- Congressional Districts
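The aggregation step can be sketched as a weighted average over the ZCTAs that overlap each target unit. The example below assumes that each crosswalk row gives the population living in the intersection of a ZCTA and a target unit, and that this population is used as the weight. The text does not specify the weighting scheme, so treat that as an assumption; the ZCTA codes, unit names, and function name are hypothetical.

```python
from collections import defaultdict


def aggregate_to_units(zcta_estimates, crosswalk):
    """Population-weighted average of ZCTA estimates within each target unit.

    zcta_estimates -- {zcta: estimated opinion in that ZCTA}
    crosswalk      -- iterable of (zcta, unit, population in the overlap)
    """
    weighted_sum = defaultdict(float)
    weight_total = defaultdict(float)
    for zcta, unit, population in crosswalk:
        if zcta not in zcta_estimates:
            continue  # skip ZCTAs that have no estimate
        weighted_sum[unit] += zcta_estimates[zcta] * population
        weight_total[unit] += population
    return {
        unit: weighted_sum[unit] / weight_total[unit]
        for unit in weighted_sum
        if weight_total[unit] > 0
    }


# Hypothetical inputs: ZCTA "00002" is split between two counties.
estimates = {"00001": 0.60, "00002": 0.40}
rows = [
    ("00001", "County A", 300),
    ("00002", "County A", 100),
    ("00002", "County B", 50),
]

unit_estimates = aggregate_to_units(estimates, rows)
```

A ZCTA that straddles a boundary contributes to every unit it overlaps, in proportion to the population it shares with each one.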
1. We use estimates of the cross tabulation of the population over 25 by sex, education, and race/ethnicity from the 5-year ACS samples, which we downloaded from NHGIS (Manson et al., 2021).