Importing Census Data from IPUMS -- adding weights

I'm trying to import census data from IPUMS into R but am not sure how to account for weights.

  1. I extracted 41 variables spanning 2000-2020. The extract is described by usa_00001.xml (data dictionary attached).

  2. I reviewed the codebook for the imported data set to narrow down the variables for my analysis and decided to focus on family structure, income, race/ethnicity, and education. Any variables I judged not useful were dropped from the new data set (data_clean1).

Variables kept: year (year of census), stateicp, hhincome, nmothers, nfathers, nchild, hispan, race, educd, inctot, educd_mom, educd_pop, inctot_mom, and inctot_pop.

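library(ipumsr)    # read_ipums_ddi(), read_ipums_micro()
library(dplyr)     # %>%, select(), rename(), filter(), mutate()
library(dataMaid)  # makeCodebook() (assumed to come from dataMaid)

# Read the DDI (metadata) file, then the microdata it describes
ddi <- read_ipums_ddi("usa_00001.xml")
data <- read_ipums_micro(ddi)
# Generate a PDF codebook for the full extract
makeCodebook(data, replace=TRUE, output = "pdf")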
data_clean1 <- data %>%
  select(YEAR, STATEICP, HHINCOME, NMOTHERS, NFATHERS, NCHILD, HISPAN, RACE, EDUCD, INCTOT, EDUCD_MOM, EDUCD_POP, INCTOT_MOM, INCTOT_POP) %>%
  rename(
    'Year'='YEAR',
    'State_ID' = 'STATEICP',
    'Household_Income' = 'HHINCOME',
    'NMothers' = 'NMOTHERS',
    'NFathers' = 'NFATHERS',
    'NChild' = 'NCHILD',
    'Hispanic' = 'HISPAN',
    'Race' = 'RACE',
    'Education' = 'EDUCD',
    'Income_Total' = 'INCTOT',
    'Education_M' = 'EDUCD_MOM',
    'Education_F' = 'EDUCD_POP',
    'Income_Total_M' = 'INCTOT_MOM',
    'Income_Total_F' = 'INCTOT_POP') %>%
  filter(Race %in% c(1:2)) %>%
  filter(Education %in% c(002, 062, 063, 064, 081, 101, 114, 116)) %>%
  filter(Income_Total %in% c(1:1184000)) %>%
  filter(Household_Income %in% c(1:2260000)) %>%
  mutate(Hispanic = factor(Hispanic, 
                      levels = c(0, 1, 2, 3, 4, 9),
                      labels = c("Not Hispanic", "Mexican", "Puerto Rican", "Cuban", "Other", "Not Reported")
                      )) %>%
  mutate(Race = factor(Race,
                      levels = c(1, 2),
                      labels = c("White", "Black/African American")
                      )) %>%
  mutate(Education = factor(Education,
                      levels = c(002, 062, 063, 064, 081, 101, 114, 116),
                      labels = c("No Schooling Completed", "High School Graduate or GED", "Regular High School Diploma", "GED or Alternative Credential", "Associate's Degree", "Bachelor's Degree", "Master's Degree", "Doctoral Degree")
                      ))

How do I account for weights? Do I need to keep some of the variables I dropped? Or should I use tidycensus instead?
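
For context, IPUMS USA extracts normally carry a person weight (PERWT) and a household weight (HHWT), and neither appears in the select() call above, so data_clean1 has no weight column left to use. The sketch below shows how a person weight is typically applied to get weighted means and weighted counts. It runs on a made-up toy data frame rather than the real extract: the values are invented, and only the column names (YEAR, RACE, INCTOT, PERWT) mirror IPUMS. It covers point estimates only; standard errors would need a survey-design package and the design variables, which is not shown here.

```r
# Toy stand-in for an IPUMS USA extract. All values are invented for
# illustration; only the column names mirror IPUMS variables.
toy <- data.frame(
  YEAR   = c(2000, 2000, 2000, 2010, 2010, 2010),
  RACE   = c(1, 2, 1, 1, 2, 2),
  INCTOT = c(30000, 20000, 50000, 40000, 25000, 35000),
  PERWT  = c(100, 50, 150, 120, 80, 100)  # persons represented by each row
)

# Weighted mean of total personal income within each census year.
# Each row counts PERWT times instead of once.
by_year <- sapply(
  split(toy, toy$YEAR),
  function(d) weighted.mean(d$INCTOT, d$PERWT)
)

# Weighted population counts by year and race: sum of PERWT in each cell.
pop <- xtabs(PERWT ~ YEAR + RACE, data = toy)

print(by_year)
print(pop)
```

Applied to the real extract, the same idea would mean adding PERWT (and HHWT for household-level measures such as household income) to the select() call before filtering, then passing that column as the weight.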



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
