'Clarification on Reshaping DataFrame from LONG to WIDE in R with Gather/Spread

I am sorry to come back on a topic that has several threads on Stack, but I am trying to Reshape a dataset from LONG to WIDE, using Tidyverse, and either the Gather/Spread function as well as the pivot_wider function, and I am lost. Here it is a sample of the subset I am using for test

    structure(list(pid = structure(c(1L, 1L, 1L, 1L, 1L, 2L), .Label = c("1", 
"2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", 
"14", "15", "16", "17", "18", "19", "20", "21", "22", "23", "24", 
"25", "26", "27", "28", "29", "30", "31", "32", "33", "34", "35", 
"36", "37", "38", "39", "40", "41", "42", "43", "44", "45", "46", 
"47", "48", "49", "50", "51", "52", "53", "54", "55", "56", "57", 
"58", "59", "60", "61", "62", "63", "64", "65", "66", "67", "68", 
"69", "70", "71", "72", "73", "74", "75", "76", "77", "78", "79", 
"80", "81", "82", "83", "84", "85", "86", "87", "88", "89", "90", 
"91", "92", "93", "94", "95", "96", "97", "98", "99", "100", 
"101", "102", "103", "104", "105", "106", "107", "108", "109", 
"110", "111", "112", "113", "114", "115", "116", "117", "118", 
"119", "120", "121", "122", "123", "124", "125", "126", "127", 
"128", "129", "130", "131", "132", "133", "134", "135", "136", 
"137", "138", "139", "140", "141", "142", "143", "144", "145", 
"146", "147", "148", "149", "150", "151", "152", "153", "154", 
"155", "156", "157", "158", "159", "160", "161", "162", "163", 
"164", "165", "166", "167", "168", "169", "170", "171", "172", 
"173", "174", "175", "176", "177", "178", "179", "180", "181", 
"182", "183", "184", "185", "186", "187", "188", "189", "190", 
"191", "192", "193", "194", "195", "196", "197", "198", "199", 
"200", "201", "202", "203", "204", "205", "206", "207", "208", 
"209", "210", "211", "212", "213", "214", "215", "216"), class = "factor"), 
    timewave = structure(c(1L, 2L, 3L, 4L, 5L, 1L), .Label = c("1", 
    "2", "3", "4", "5", "6", "7", "8"), class = "factor"), dev_icd = structure(c(1L, 
    1L, 1L, 1L, 1L, 2L), .Label = c("No", "Yes"), class = "factor"), 
    lab_bnp = c(388, 199, 387.5, 318, 154, 949.4)), row.names = c(NA, 
6L), class = "data.frame")

And here there the 2 commands I came up to

test.wide2 <- test.long2 %>%
          pivot_wider(id_cols = pid, 
                      names_from = timewave, 
                      values_from = c(dev_icd, lab_bnp), 
                      names_sep = "")

Or also

test.wide <- test.long2 %>%
         group_by(pid) %>%
         gather("dev_icd", "lab_bnp", 
                key = variable, value = number ) %>%
         unite(combi, variable, timewave) %>%
         spread(combi, number)

Neither is working as I would expect, I get a lot of NA or NULL values and do not understand what is my mistake and the correct procedure. Any help not only in fixing the issue but mainly in understanding the Reshaping logic/philosophy would be greatly appreciated



Solution 1:[1]

Thanks to help from Merjin van Tiborg, I have finally nailed down the issue. The correct command to have PID on rows and columns for dev-icd and lab_bnp repeated by timewave number is as follows:

    test.wide <- hf.longsmall %>%
  pivot_wider(id_cols = c("pid", "timewave"), 
              names_from = timewave, 
              values_from = c(dev_icd, lab_bnp), 
              names_sep = "_t")

That is equivalent to the following

    test.wide1  <- hf.longsmall %>% 
  group_by(pid, timewave) %>%
  mutate(row = row_number()) %>%
  tidyr::pivot_wider(names_from = timewave, 
                     values_from = c(dev_icd, lab_bnp), 
                     names_sep = "_t") %>%
  select(-row)

I have got the following warning --> "Values in values_from are not uniquely identified; output will contain list-cols", and that was due to a real (and dangerous) mistake of duplicated PID during data entry. Anyhow, I could understand the issue only using the group_by option reported above.

Thank you all for the patience

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Diego