Load Current Population Survey (CPS) microdata into R using the Census Bureau Data API, including basic monthly CPS and CPS ASEC microdata.
Note: This product uses the Census Bureau Data API but is not endorsed or certified by the Census Bureau.
For a Python version of this package, check out PyCPS.
To install cpsR, run the following code:
install.packages("cpsR")
To install the development version of cpsR, run the following code:
# install.packages("devtools")
devtools::install_github("matt-saenz/cpsR")
In order to use cpsR functions, you must supply a Census API key in one of two ways:
1. Using the key argument (manually)
2. Using environment variable CENSUS_API_KEY (automatically)
Using environment variable (or env var, for short) CENSUS_API_KEY is strongly recommended because it keeps your key out of your scripts. This matters if you plan to share your code with others (like in the example below), since you should keep your key secret.
You can set up env var CENSUS_API_KEY in two steps:
First, open your .Renviron file. You can do so by running:
# install.packages("usethis")
usethis::edit_r_environ()
Second, add your Census API key to your .Renviron file like so:
CENSUS_API_KEY='your_key_here'
This enables cpsR functions to automatically look up your key by running:
Sys.getenv("CENSUS_API_KEY")
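A minimal sketch for confirming that the setup worked, using only base R. Note that .Renviron is read when R starts, so a session that is already running will not see a newly added key until it is restarted or the file is re-read with readRenviron(). The helper name has_census_key is hypothetical and not part of cpsR, and the path assumes the user-level .Renviron file.

```r
# Hypothetical helper (not part of cpsR):
# TRUE if CENSUS_API_KEY is set to a non-empty value.
has_census_key <- function() {
  nzchar(Sys.getenv("CENSUS_API_KEY"))
}

# Re-read the user-level .Renviron without restarting R (assumed location).
renviron <- path.expand("~/.Renviron")
if (file.exists(renviron)) {
  readRenviron(renviron)
}

if (!has_census_key()) {
  message("CENSUS_API_KEY is not set; add it to .Renviron or pass `key` explicitly.")
}
```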
library(cpsR)
library(dplyr)
library(purrr)
# Simple use of the basic monthly CPS
sep21 <- get_basic(
  year = 2021,
  month = 9,
  vars = c("prpertyp", "prtage", "pemlr", "pwcmpwgt")
)
sep21
#> # A tibble: 103,858 × 4
#> prpertyp prtage pemlr pwcmpwgt
#> <int> <int> <int> <dbl>
#> 1 2 80 5 1361.
#> 2 2 85 5 1411.
#> 3 2 80 5 4619.
#> 4 2 80 5 4587.
#> 5 2 42 1 3677.
#> 6 2 42 1 3645.
#> 7 1 9 -1 0
#> 8 2 41 1 3652.
#> 9 2 32 7 4117.
#> 10 2 67 1 2479.
#> # ℹ 103,848 more rows
sep21 %>%
  filter(prpertyp == 2 & prtage >= 16) %>%
  summarize(
    pop16plus = sum(pwcmpwgt),
    employed = sum(pwcmpwgt[pemlr %in% 1:2])
  ) %>%
  mutate(epop_ratio = employed / pop16plus)
#> # A tibble: 1 × 3
#> pop16plus employed epop_ratio
#> <dbl> <dbl> <dbl>
#> 1 261765646. 154025931. 0.588
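To make the weighting logic above easier to check by hand, here is a small base R sketch of the same calculation on made-up data. The function name epop_ratio and the toy values are illustrative assumptions, not real CPS records. The filters mirror the example above: prpertyp equal to 2, age 16 and over, and pemlr of 1 or 2 counted as employed.

```r
# Hypothetical helper (not part of cpsR): weighted employment-population ratio.
epop_ratio <- function(df) {
  # Same universe as the dplyr example: prpertyp == 2 and age 16+
  universe <- df[df$prpertyp == 2 & df$prtage >= 16, ]
  pop16plus <- sum(universe$pwcmpwgt)
  # Sum weights only for people with pemlr of 1 or 2
  employed <- sum(universe$pwcmpwgt[universe$pemlr %in% 1:2])
  employed / pop16plus
}

# Made-up rows for illustration only
toy <- data.frame(
  prpertyp = c(2, 2, 2, 1, 2),
  prtage   = c(80, 42, 41, 9, 32),
  pemlr    = c(5, 1, 1, -1, 7),
  pwcmpwgt = c(1000, 3000, 3000, 0, 3000)
)

epop_ratio(toy)
#> [1] 0.6
```

The child row (prpertyp of 1) drops out of the universe, leaving a weighted population of 10,000 and weighted employment of 6,000.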
# Pulling multiple years of CPS ASEC microdata
asec <- map_dfr(2020:2021, get_asec, vars = c("h_year", "marsupwt"))
count(asec, h_year, wt = marsupwt)
#> # A tibble: 2 × 2
#> h_year n
#> <int> <dbl>
#> 1 2020 325268182.
#> 2 2021 326195440.
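The same pattern can be adapted to stack several months of basic monthly data. The sketch below is one possible approach in base R, not something cpsR provides: the wrapper get_basic_months, its fetch argument, and the added month column are all hypothetical. By default it calls get_basic with the year, month, and vars arguments shown earlier, so it needs a Census API key and network access when used that way.

```r
# Hypothetical wrapper (not part of cpsR): pull several months and stack them.
# `fetch` defaults to cpsR::get_basic; any function that accepts
# year, month, and vars can be swapped in (for example, a stub for testing).
get_basic_months <- function(year, months, vars, fetch = cpsR::get_basic) {
  pulls <- lapply(months, function(m) {
    df <- fetch(year = year, month = m, vars = vars)
    df$month <- m  # tag each row with the month it was pulled for
    df
  })
  do.call(rbind, pulls)
}

# Usage (requires cpsR and a Census API key, so it is left commented out):
# q3 <- get_basic_months(2021, 7:9, vars = c("prtage", "pwcmpwgt"))
```

Passing the fetching function as an argument keeps the stacking logic testable without calling the API.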