R: arithmetic on separate dataframes with row titles

I imported some census data from two different years and did some cleanup. Not all of the variables are numeric, but I want to do some arithmetic (a difference) and then save the result into a new dataframe. Here is some sample data.

df1 <- data.frame(country = c("Brazil", "Columbia", "Hong Kong"), ward_1 = c(35,25,33), ward_2 = c(22,33,44), continent = c("South America", "South America", "Asia"))
df2 <- data.frame(country = c("Brazil", "Hong Kong", "Columbia"), ward_1 = c(45,62,26), ward_2 = c(77,55,67), continent = c("South America", "Asia", "South America"))

Is there a function that can match the entries in the first column and then do the arithmetic on columns 2 and 3? Or should the first column be sorted alphabetically before doing any arithmetic?

How do you do arithmetic when there are non-numeric variables in the dataframe?


Solution 1:[1]

Start by using match() to find, for each country in df1, the row position of the same country in df2, then subtract using that index. No sorting is needed.

i <- match(df1$country, df2$country)
df1$ward_1 - df2$ward_1[i]
#> [1] -10  -1 -29
df1$ward_2 - df2$ward_2[i]
#> [1] -55 -34 -11

Created on 2022-03-01 by the reprex package (v2.0.1)
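
To see what the index holds, here is a minimal sketch using the sample data from the question. The stopifnot() guard is an addition for illustration, not part of the original answer; it assumes every country in df1 should also appear in df2.

```r
df1 <- data.frame(country = c("Brazil", "Columbia", "Hong Kong"), ward_1 = c(35, 25, 33), ward_2 = c(22, 33, 44), continent = c("South America", "South America", "Asia"))
df2 <- data.frame(country = c("Brazil", "Hong Kong", "Columbia"), ward_1 = c(45, 62, 26), ward_2 = c(77, 55, 67), continent = c("South America", "Asia", "South America"))

# For each country in df1, the row position of the same country in df2
i <- match(df1$country, df2$country)
i
#> [1] 1 3 2

# match() returns NA for a country that is absent from df2,
# so stop early rather than silently producing NA differences
stopifnot(!anyNA(i))
```

Indexing df2 with i reorders its rows to line up with df1, which is why the subtraction works without sorting either data frame.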

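To compute the differences for several columns at once, Map or mapply can iterate over the corresponding columns of both data frames in parallel. The \(x, y) shorthand for function(x, y) requires R 4.1 or later.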

Map(\(x, y) x - y, df1[2:3], df2[i, 2:3])
#> $ward_1
#> [1] -10  -1 -29
#> 
#> $ward_2
#> [1] -55 -34 -11

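
For comparison, a sketch of the mapply variant on the same sample data. Unlike Map, which always returns a list, mapply simplifies equal-length results to a matrix by default.

```r
df1 <- data.frame(country = c("Brazil", "Columbia", "Hong Kong"), ward_1 = c(35, 25, 33), ward_2 = c(22, 33, 44), continent = c("South America", "South America", "Asia"))
df2 <- data.frame(country = c("Brazil", "Hong Kong", "Columbia"), ward_1 = c(45, 62, 26), ward_2 = c(77, 55, 67), continent = c("South America", "Asia", "South America"))

i <- match(df1$country, df2$country)

# One call per pair of columns; the results are bound into a matrix
m <- mapply(function(x, y) x - y, df1[2:3], df2[i, 2:3])
m
#>      ward_1 ward_2
#> [1,]    -10    -55
#> [2,]     -1    -34
#> [3,]    -29    -11
```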

The resulting list can then be coerced to a data.frame.

as.data.frame(Map(\(x, y) x - y, df1[2:3], df2[i, 2:3]))
#>   ward_1 ward_2
#> 1    -10    -55
#> 2     -1    -34
#> 3    -29    -11

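
If the new data frame should also keep the non-numeric columns, one possible approach is to detect the numeric columns with is.numeric instead of hard-coding positions 2:3. This is a sketch under the assumption that both data frames share the same column names; the names num and result are made up for the example.

```r
df1 <- data.frame(country = c("Brazil", "Columbia", "Hong Kong"), ward_1 = c(35, 25, 33), ward_2 = c(22, 33, 44), continent = c("South America", "South America", "Asia"))
df2 <- data.frame(country = c("Brazil", "Hong Kong", "Columbia"), ward_1 = c(45, 62, 26), ward_2 = c(77, 55, 67), continent = c("South America", "Asia", "South America"))

i <- match(df1$country, df2$country)

# Names of the numeric columns in df1; everything else is treated as a label
num <- names(df1)[sapply(df1, is.numeric)]

# Start from the label columns of df1, then add the differences
result <- df1[setdiff(names(df1), num)]
result[num] <- df1[num] - df2[i, num]
result
#>     country     continent ward_1 ward_2
#> 1    Brazil South America    -10    -55
#> 2  Columbia South America     -1    -34
#> 3 Hong Kong          Asia    -29    -11
```

Subtracting two data frames of the same dimensions works element by element, so the non-numeric columns only need to be set aside before the arithmetic and reattached afterwards.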

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Rui Barradas