R - How to create new data frame based on matching rows [closed]

I currently have a dataframe which looks like this:

I am trying to make a new dataframe so I only have one row for each country and so I can then add a new column for percentage change between 1990 and 2019. I am very new to R, any help or hints would be appreciated.

I used the filter command

dataframe <- dataframe %>% filter(Year == "1990" | Year == "2019")

to remove all the years in between. I also tried diff(dataframe$percentagecolumn), which uses the default lag of 1 to find the difference between consecutive rows, but it doesn't create a new dataframe with only one row per country.



Solution 1:[1]

Without a reproducible example this is hard to answer exactly, so I used the gapminder data to illustrate an approach to your problem:

Data

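library(dplyr)      # select(), filter(), mutate(), %>%
library(tidyr)      # pivot_wider()
library(gapminder)  # example data set

gapminder %>% select(country, year, lifeExp) %>%
  filter(year %in% c(1952, 1977)) %>%
  pivot_wider(names_from = year, values_from = lifeExp) %>%
  mutate(difference = abs(`1952`- `1977`))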

Output

# A tibble: 142 x 4
   country     `1952` `1977` difference
   <fct>        <dbl>  <dbl>      <dbl>
 1 Afghanistan   28.8   38.4       9.64
 2 Albania       55.2   68.9      13.7 
 3 Algeria       43.1   58.0      14.9 
 4 Angola        30.0   39.5       9.47
 5 Argentina     62.5   68.5       6.00
 6 Australia     69.1   73.5       4.37
 7 Austria       66.8   72.2       5.37
 8 Bahrain       50.9   65.6      14.7 
 9 Bangladesh    37.5   46.9       9.44
10 Belgium       68     72.8       4.8 
# ... with 132 more rows

Translated to your (presumed) dataframe that might be:

dataframe %>% select(Year, Country, percentagecolumn) %>%
  filter(Year %in% c(1990, 2019)) %>%
  pivot_wider(names_from = Year, values_from = percentagecolumn) %>%
  mutate(percentage_difference = abs(`1990`- `2019`))
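
Note that abs() gives the absolute difference between the two years. If what you want is the relative change from 1990 to 2019, a minimal sketch could look like the following. The toy data frame and its values are made up for illustration, the column names are only assumed from the question, and percentage_change is a hypothetical name:

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: column names follow the question, values are invented.
dataframe <- data.frame(
  Country = c("A", "A", "B", "B"),
  Year = c(1990, 2019, 1990, 2019),
  percentagecolumn = c(10, 15, 40, 30)
)

result <- dataframe %>%
  select(Country, Year, percentagecolumn) %>%
  filter(Year %in% c(1990, 2019)) %>%
  # One row per country, with one column per year ("1990" and "2019").
  pivot_wider(names_from = Year, values_from = percentagecolumn) %>%
  # Relative change from 1990 to 2019 in percent; the sign shows the direction.
  mutate(percentage_change = (`2019` - `1990`) / `1990` * 100)

result
```

With this toy data, country A comes out at 50 and country B at -25. If percentagecolumn already holds percentages, the plain difference is a change in percentage points, so pick whichever of the two matches what you need to report.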

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Gnueghoidune