Skip to contents

Let’s say that you are a data intern for a wastewater agency in the southeastern part of Wisconsin. Your mentors have asked you to find insights from the American Community Survey.

Specifying your community

They tell you that the wastewater commission’s area of responsibility is essentially identical to the local school district’s, Racine Unified.

You look for geographical levels related to schools in hercacstables::METADATA_FOR_ACS_GEOGRAPHIES.

SCHOOL_GEOGRAPHIES <- dplyr::filter(hercacstables::METADATA_FOR_ACS_GEOGRAPHIES,
                                    grepl("school", .data$`Geographic Level`))

This table tells us two things. First, the geographic level that you should examine is “school district (unified)”. Second, you also need to specify the state when you’re fetching data about it. With hercacstables::fetch_data(), and tools from the tidyverse, you can find the FIPS codes for unified school districts in Wisconsin that have the word “Racine” in them.

RACINE_DISTRICTS <- "NAME" |>
    hercacstables::fetch_data(
        variables = _,
        year = 2020,                           # the most recent decennial
        for_geo = "school district (unified)", # see above
        for_items = "*",                       # this wildcard gets every item
        survey_type = "dec",                   # not actually the ACS *flex*
        table_or_survey_code = "pl",           # this data set is similar to ACS
        state = 55L                            # the Badger State
    ) |>
    dplyr::filter(
        stringr::str_detect(.data$NAME, "Racine")
    )
NAME state school district (unified) Year
Racine School District, Wisconsin 55 12360 2020

Finding relevant data

The first thing that you could do is to check for data related to plumbing. If you were doing this without R, you might visit data.census.gov and search for “ACS plumbing”.

Groups with relevant data

With hercacstables, you would search within its glossary about ACS groups:

PLUMBING_GROUPS <- hercacstables::METADATA_FOR_ACS_GROUPS |>
    dplyr::filter(
        grepl("plumbing",         # search for the term "plumbing"
              .data$Description,  # within the "Description" column
              ignore.case = TRUE) # ignore capitalization
    )
Group Universe Description ACS1 ACS5
B25016 Occupied housing units Tenure by Plumbing Facilities by Occupants per Room TRUE TRUE
B25047 Housing units Plumbing Facilities for All Housing Units TRUE TRUE
B25048 Occupied housing units Plumbing Facilities for Occupied Housing Units TRUE TRUE
B25049 Occupied housing units Tenure by Plumbing Facilities TRUE TRUE
B25050 Occupied housing units Plumbing Facilities by Occupants per Room by Year Structure Built TRUE TRUE
B99259 Housing units Allocation of Plumbing Facilities TRUE TRUE

It appears that the keyword “plumbing” appears in 6 different ACS groups.

Specific variables about plumbing

Let’s say that “Tenure1 by plumbing facilities” jumps out to you as potentially interesting.

GROUP <- "B25049"

There hercacstables also provides glossary about variables. You can search it to see details about each variable in group B25049.

PLUMBING_VARIABLES <- hercacstables::METADATA_FOR_ACS_VARIABLES |>
    dplyr::filter(
        .data$Dataset == "ACS5", # use the 1-year dataset only
        .data$Group == GROUP     # pull all of the variables for this group
    )
Dataset Group Index Variable Details
ACS5 B25049 1 B25049_001E
ACS5 B25049 2 B25049_002E Owner occupied
ACS5 B25049 3 B25049_003E Owner occupied , Complete plumbing facilities
ACS5 B25049 4 B25049_004E Owner occupied , Lacking plumbing facilities
ACS5 B25049 5 B25049_005E Renter occupied
ACS5 B25049 6 B25049_006E Renter occupied , Complete plumbing facilities
ACS5 B25049 7 B25049_007E Renter occupied , Lacking plumbing facilities

Unpacking variable details

You can see that group B25049 reports the number of households that have, or do not have, plumbing facilities. It further breaks those households down into renters and owner-occupants. The raw glossary from the Census, and in hercacstables::METADATA_FOR_ACS_VARIABLES, packs all of that information into the “Details” column.

It would be more useful if we actually had separate columns for the types of tenure and plumbing facilities. You can use hercacstables::unpack_group_details() to do this, in combination with tools from the tidyverse.

You might also decide that you don’t the rows that report the total number of households (row 1) and subtotals by tenure (2 and 5). You’ll have that data anyway from 3, 4, 6, and 7.

PLUMBING_VARIABLES <- GROUP |>
    hercacstables::unpack_group_details() |>
    dplyr::filter(
        .data$Dataset == "ACS5"
    ) |>
    dplyr::rename(
        Tenure = "A",
        Plumbing = "B"
    ) |>
    dplyr::filter(
        dplyr::if_all(c("Tenure", "Plumbing"),
                      \(.) (nchar(.) > 0))
    )
Dataset Group Index Variable Tenure Plumbing
ACS5 B25049 3 B25049_003E Owner occupied Complete plumbing facilities
ACS5 B25049 4 B25049_004E Owner occupied Lacking plumbing facilities
ACS5 B25049 6 B25049_006E Renter occupied Complete plumbing facilities
ACS5 B25049 7 B25049_007E Renter occupied Lacking plumbing facilities

Fetch data

You are finally ready to pull a some census data about plumbing! For starters, you could compare the local area to the whole county, the state, its census region, and the whole country. Once again, we’ll use hercacstables::fetch_data() and the tidyverse.

YEAR <- hercacstables::most_recent_vintage("acs", "acs5")

RAW_PLUMBING <- tibble::tribble(
    ~ geo,                       ~ items, ~ other,
    "us",                        "1",     list(),
    "region",                    "2",     list(),
    "state",                     "55",    list(),
    "county",                    "101",   list(state = 55),
    "school district (unified)", "12360", list(state = 55)
) |>
    purrr::pmap(
        \(geo, items, other) rlang::inject(
            hercacstables::fetch_data(
                variables = c("NAME", PLUMBING_VARIABLES$Variable),
                year = YEAR,
                for_geo = geo,
                for_items = items,
                survey_type = "acs",
                table_or_survey_code = "acs5",
                !!!other
            )
        )
    )
NAME us Group Index Value Year
United States 1 B25049 3 82628718 2023
NAME region Group Index Value Year
Midwest Region 2 B25049 3 19041735 2023
NAME state Group Index Value Year
Wisconsin 55 B25049 3 1655255 2023
NAME state county Group Index Value Year
Racine County, Wisconsin 55 101 B25049 3 56159 2023
NAME state school district (unified) Group Index Value Year
Racine School District, Wisconsin 55 12360 B25049 3 38779 2023

The data in RAW_PLUMBING do not have much meaning without the information in PLUMBING_VARIABLES. Fortunately, we can use the tidyverse to join the two tables together.

PLUMBING <- RAW_PLUMBING |>
    purrr::map(
        \(.) dplyr::select(.,
                           "Group",
                           "Index",
                           Geography = "NAME",
                           Households = "Value"
        )
    ) |>
    purrr::list_rbind() |>
    dplyr::right_join(
        PLUMBING_VARIABLES,
        by = c("Group", "Index")
    ) |>
    dplyr::select(
        "Geography",
        "Tenure",
        "Plumbing",
        "Households"
    )
knitr::kable(PLUMBING)
Geography Tenure Plumbing Households
United States Owner occupied Complete plumbing facilities 82628718
United States Owner occupied Lacking plumbing facilities 263319
United States Renter occupied Complete plumbing facilities 44349427
United States Renter occupied Lacking plumbing facilities 241401
Midwest Region Owner occupied Complete plumbing facilities 19041735
Midwest Region Owner occupied Lacking plumbing facilities 55380
Midwest Region Renter occupied Complete plumbing facilities 8561887
Midwest Region Renter occupied Lacking plumbing facilities 43001
Wisconsin Owner occupied Complete plumbing facilities 1655255
Wisconsin Owner occupied Lacking plumbing facilities 5250
Wisconsin Renter occupied Complete plumbing facilities 781225
Wisconsin Renter occupied Lacking plumbing facilities 4298
Racine County, Wisconsin Owner occupied Complete plumbing facilities 56159
Racine County, Wisconsin Owner occupied Lacking plumbing facilities 124
Racine County, Wisconsin Renter occupied Complete plumbing facilities 22809
Racine County, Wisconsin Renter occupied Lacking plumbing facilities 17
Racine School District, Wisconsin Owner occupied Complete plumbing facilities 38779
Racine School District, Wisconsin Owner occupied Lacking plumbing facilities 47
Racine School District, Wisconsin Renter occupied Complete plumbing facilities 17242
Racine School District, Wisconsin Renter occupied Lacking plumbing facilities 12

Finally, you can do some tidyverse-only magic to look at the different rates of plumbing for renters versus owners.

RATES_OF_PLUMBING <- PLUMBING |>
    dplyr::mutate(
        dplyr::across(c("Tenure", "Plumbing"),
                      \(.) stringr::str_extract(., "^\\S+"))
    ) |>
    tidyr::pivot_wider(
        names_from = "Tenure",
        values_from = "Households"
    ) |>
    dplyr::mutate(
        `All Households` = .data$Owner + .data$Renter
    ) |>
    tidyr::pivot_longer(
        cols = c("Renter", "Owner", "All Households"),
        names_to = "Tenure",
        values_to = "Households"
    ) |>
    tidyr::pivot_wider(
        names_from = "Plumbing",
        values_from = "Households"
    ) |>
    dplyr::mutate(
        Households = .data$Complete + .data$Lacking,
        Rate = .data$Complete / .data$Households
    ) |>
    dplyr::select(
        "Geography",
        "Tenure",
        "Rate"
    ) |>
    tidyr::pivot_wider(
        names_from = "Tenure",
        values_from = "Rate"
    )
Geography Renter Owner All Households
United States 99.46% 99.68% 99.60%
Midwest Region 99.50% 99.71% 99.64%
Wisconsin 99.45% 99.68% 99.61%
Racine County, Wisconsin 99.93% 99.78% 99.82%
Racine School District, Wisconsin 99.93% 99.88% 99.89%

Three things might jump out at you. The first is that almost every household in the US has complete plumbing facilities. The second thing is that plumbing access seems a little different for renters and owners. The third thing is that who has more access to plumbing is different in Racine than at larger geographic levels. At the scale of the nation, region, and state, renters are a little less likely to have complete access to plumbing. In contrast, renters in Racine County and the Racine Unified School District are actually a little more likely to have access to plumbing.