'How to group by data in one column into multiple columns keeping rows
| Identifier | Value |
|---|---|
| 511016 | 75.72911 |
| 511016 | 79.01783 |
| 511016 | 74.87570 |
| 511029 | 72.75873 |
| 511029 | 74.41798 |
| 511029 | 78.56112 |
The dataset consists of two columns (as above) but 77,000 rows. The first column is the 'name' vector and the second the value. I need to transform the data so that the first column has just one value for the identifier and after this, the columns take all the values that the respective identifier has. Like this:
| Identifier | Value 1 | Value 2 | etc... |
|---|---|---|---|
| 511016 | 75.72911 | 79.01783 | |
| 511029 | 72.75873 | 74.41798 |
I have been able to use group_keys to produce a single column with the identifiers but cannot get the values to match in this way. Any help is appreciated.
Solution 1:[1]
This should do it:
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
library(tidyr)
dat <- structure(list(Identifier = c(511016L, 511016L, 511016L, 511029L,
511029L, 511029L),
Value = c(75.72911, 79.01783, 74.8757, 72.75873,
74.41798, 78.56112)), row.names = c(NA, 6L), class = "data.frame")
dat %>%
group_by(Identifier) %>%
mutate(obs = row_number()) %>%
pivot_wider(names_from = "obs",
values_from = "Value",
names_prefix="value")
#> # A tibble: 2 × 4
#> # Groups: Identifier [2]
#> Identifier value1 value2 value3
#> <int> <dbl> <dbl> <dbl>
#> 1 511016 75.7 79.0 74.9
#> 2 511029 72.8 74.4 78.6
Created on 2022-04-29 by the reprex package (v2.0.1)
Solution 2:[2]
library(dplyr)
library(tidyr)
df <- tribble(~Identifier, ~Value,
511016, 75.72911,
511016, 79.01783,
511016, 74.87570,
511029, 72.75873,
511029, 74.41798,
511029, 78.56112
)
df %>%
group_by(Identifier) %>%
mutate(row_id = row_number()) %>%
pivot_wider(id_cols = "Identifier", names_from = row_id,
values_from = Value,
names_glue = "{.value}{row_id}") %>%
ungroup()
Solution 3:[3]
Assuming your df has two columns, One similar Option is:
library(dplyr)
library(tidyr)
df %>%
group_by(Identifier) %>%
mutate(name = paste(colnames(df[2]), row_number())) %>%
pivot_wider(names_from = name, values_from = Value)
Identifier `Value 1` `Value 2` `Value 3`
<int> <dbl> <dbl> <dbl>
1 511016 75.7 79.0 74.9
2 511029 72.8 74.4 78.6
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | DaveArmstrong |
| Solution 2 | |
| Solution 3 | TarJae |
