This package provides an R interface to the United States Census Bureau’s data API. The primary focus, as per its name, is pulling information from the detailed tables of the American Community Survey. However, it will work for pulling data from any dataset in the Census API. A great resource for understanding the complexities of Census data is https://censusreporter.org/.
Installation
You can install the development version from GitHub with:
# install.packages("remotes")
remotes::install_github("higherX4Racine/hercacstables")
Fetching data from the API
The main way that one uses hercacstables
to interact with the Census API is the fetch_data()
function.
A vanilla call to fetch_data()
Here is a modestly complicated use of the function without any setup or post-processing.
POPS_AND_HOUSEHOLDS <- hercacstables::fetch_data(
# the API works one year at a time
year = hercacstables::most_recent_vintage("acs", "acs1"),
# the API can only query one data source at a time
survey_type = "acs", # the American Community Survey
table_or_survey_code = "acs1", # the 1-year dataset of the ACS
# the API fetches values for one or more instances of a specific geography
for_geo = "state", # fetch values for entire states
for_items = c(
"11", # the District of Columbia
"72" # Puerto Rico
),
# the codes for the specific data that we are fetching
variables = c(
"NAME", # the geographic area's name
"B01001_001E", # the total number of people
"B11002_002E", # people in family households
"B11002_012E" # people in nonfamily households
)
)
NAME | state | Group | Index | Value | Year |
---|---|---|---|---|---|
District of Columbia | 11 | B01001 | 1 | 678,972 | 2023 |
District of Columbia | 11 | B11002 | 2 | 406,283 | 2023 |
District of Columbia | 11 | B11002 | 12 | 235,088 | 2023 |
Puerto Rico | 72 | B01001 | 1 | 3,205,691 | 2023 |
Puerto Rico | 72 | B11002 | 2 | 2,613,461 | 2023 |
Puerto Rico | 72 | B11002 | 12 | 554,557 | 2023 |
Best practice
The previous example is good enough for a README file. An approach centered on reusability is better for actual practice.
Identifying which items to fetch
People working with the Census often know the topic that they are interested in, but not the specific variables that provide information about that topic. The hercacstables
package has utilities that help you search for Census variables if you don’t already know which ones you want to use.
Search for groups with keywords
A good first step is to search for tables that could be relevant with the search_in_columns()
function and the built-in METADATA_FOR_ACS_GROUPS
.
EDUCATION_TABLES <- hercacstables::search_in_columns(
hercacstables::METADATA_FOR_ACS_GROUPS,
Group = "\\d$", # "Group" values need to end in a digit.
Universe = "population", # "Universe" values need to include "population".
Description = "enroll", # "Description values need to include "enroll"
`-Description` = c("sex", # but not "sex", that's too much detail
"computer", # or "computer", also too detailed
"quarters", # or "quarters", also too detailed
"insur", # or "insur", not health care enrollments
"allocation") # or "allocation", these are metadata
)
Group | Universe | Description | ACS1 | ACS5 |
---|---|---|---|---|
B14001 | Population 3 years and over | School Enrollment by Level of School for the Population 3 Years and Over | TRUE | TRUE |
B14006 | Population 3 years and over for whom poverty status is determined | Poverty Status in the Past 12 Months by School Enrollment by Level of School for the Population 3 Years and Over | TRUE | TRUE |
B14007 | Population 3 years and over | School Enrollment by Detailed Level of School for the Population 3 Years and Over | TRUE | TRUE |
C14002 | Population 3 years and over | School Enrollment by Level of School by Type of School for the Population 3 Years and Over | TRUE | FALSE |
C14003 | Population 3 years and over | School Enrollment by Type of School by Age for the Population 3 Years and Over | TRUE | FALSE |
Unpack variables for a group
Once you know the group that you are interested in, you will want to see what information is captured in each of its rows. The unpack_group_details()
function searches the built-in METADATA_FOR_ACS_VARIABLES
for all of the variables of one group. It then expands the Census’s description of each variable into columns. Users will need to identify the concepts that are captured by each column.
The following example unpacks the variables in the B14001 table.
UNPACKED_B14001 <- hercacstables::unpack_group_details("B14001")
Group | Index | Variable | A | B |
---|---|---|---|---|
B14001 | 1 | B14001_001E | ||
B14001 | 2 | B14001_002E | Enrolled in school | |
B14001 | 3 | B14001_003E | Enrolled in school | Enrolled in nursery school, preschool |
B14001 | 4 | B14001_004E | Enrolled in school | Enrolled in kindergarten |
B14001 | 5 | B14001_005E | Enrolled in school | Enrolled in grade 1 to grade 4 |
B14001 | 6 | B14001_006E | Enrolled in school | Enrolled in grade 5 to grade 8 |
B14001 | 7 | B14001_007E | Enrolled in school | Enrolled in grade 9 to grade 12 |
B14001 | 8 | B14001_008E | Enrolled in school | Enrolled in college, undergraduate years |
B14001 | 9 | B14001_009E | Enrolled in school | Graduate or professional school |
B14001 | 10 | B14001_010E | Not enrolled in school |
Shortcuts
The package also includes functions for common special cases. They are wrappers for fetch_data()
with some arguments hard-coded.
Decennial populations by race and ethnicity
One example is pulling trends of racial/ethnic populations from the decennial census for some specific level of geography.
POPS_BY_RACE <-
hercacstables::fetch_decennial_pops_by_race(
for_geo = "state", # one cannot fetch the whole nation from 2000 or 2010
for_items = "*" # so we pull the data for every state
) |>
dplyr::count( # then compute the nationwide populations for each vintage
.data$`Race/Ethnicity`,
.data$Vintage,
wt = .data$Population,
name = "Population"
) |>
tidyr::pivot_wider( # and reshape the table for display
names_from = "Vintage",
values_from = "Population"
)
Race/Ethnicity | 2000 | 2010 | 2020 |
---|---|---|---|
All | 285,230,516 | 312,471,327 | 334,735,155 |
American Indian and Alaska Native | 2,069,446 | 2,247,427 | 2,252,011 |
Asian | 10,126,044 | 14,468,054 | 19,621,465 |
Black or African American | 33,952,901 | 37,690,511 | 39,944,624 |
Hispanic or Latino | 39,068,564 | 54,166,049 | 65,329,087 |
Native Hawaiian and Other Pacific Islander | 353,874 | 481,653 | 622,109 |
Some other race | 468,155 | 605,291 | 1,692,341 |
Two or more races | 4,604,792 | 5,967,844 | 13,551,323 |
White | 194,586,740 | 196,844,498 | 191,722,195 |