How can I combine a list and a data.frame of different sizes and lengths in R?

library(dplyr)
library(rvest)
library(tidyr)  # provides pivot_wider()

link1 <- "https://somon.tj/adv/7985721_2-komn-dom-grandzavod/"
link2 <- "https://somon.tj/adv/7866644_5-komn-kvartira-3-etazh-79-m2-a-sino/"

house_link <- c(link1, link2)

house_features = lapply(house_link, function(link) {
  page_data <- tryCatch(read_html(link), error = function(e) e, warning = function(w) w)
  
  if(!inherits(page_data, "error")) {
    data.frame(
      link = link,
      parameters = page_data %>% html_nodes(".label") %>% html_text(trim = TRUE),
      values = page_data %>% html_nodes(".info") %>% html_text(trim = TRUE)
    )
 
  } else {
    NULL
  }
})

do.call(rbind, house_features) %>% 
  group_by(link, parameters) %>%
  mutate(parameters = if_else(row_number() > 1, paste(parameters,row_number()), parameters)) %>% 
  pivot_wider(id_cols = link, names_from = parameters, values_from = values)

But when I add one more variable called pricing, the code produces an error. I am new to R and do not know how to troubleshoot this.

if(!inherits(page_data, "error")) {
    data.frame(
      link = link,
      parameters = page_data %>% html_nodes(".label") %>% html_text(trim = TRUE),
      values = page_data %>% html_nodes(".info") %>% html_text(trim = TRUE)
    )
    list(
      pricing = page_data %>% html_nodes("h1") %>% html_text(trim = T)
    )
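
One likely reason the attempt above misbehaves is that an R function returns only the value of its last evaluated expression, so the data.frame is built and then discarded in favour of the list. A minimal sketch of that behaviour, using a hypothetical toy function (two_objects is not part of the original code):

```r
# Hypothetical toy function mirroring the structure of the attempt
two_objects <- function() {
  data.frame(x = 1:2)      # created, then silently discarded
  list(pricing = "Title")  # last expression, so this is the return value
}

result <- two_objects()

# Only the list comes back; the data.frame never leaves the function
is.data.frame(result)
names(result)
```

When every element returned by lapply is such a list, the later rbind step no longer has the link, parameters, and values columns that the pipeline expects.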

I thought one way to do this might be to store link2 = link in the list as well, and later join the two objects using link2 as the identifier.
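
The join idea can be sketched offline as follows. This is only an illustration under assumptions: the two data frames stand in for scraped results, their values are made up, and the key column is simply called link in both tables rather than link2.

```r
library(dplyr)
library(tidyr)

# Hypothetical stand-in for the scraped parameters/values (long format)
features <- data.frame(
  link = c("page-a", "page-a", "page-b"),
  parameters = c("Rooms", "Floor", "Rooms"),
  values = c("2", "1", "5"),
  stringsAsFactors = FALSE
)

# Hypothetical stand-in for the pricing scraped separately, one row per link
pricing <- data.frame(
  link = c("page-a", "page-b"),
  pricing = c("Title A", "Title B"),
  stringsAsFactors = FALSE
)

# Reshape to one row per link, then attach pricing by the shared key
wide <- features %>%
  pivot_wider(id_cols = link, names_from = parameters, values_from = values) %>%
  left_join(pricing, by = "link")
```

Keeping pricing in its own one-row-per-link table avoids any length mismatch with the parameters, at the cost of an extra join step.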



Solution 1:[1]

Thanks to the community here, the problem is solved. In short, the pricing variable should be added to the data.frame itself and then included in pivot_wider, a function I have only just started exploring.

house_link <- c(link1, link2)
house_features = lapply(house_link, function(link) {
  page_data <- tryCatch(read_html(link), error = function(e) e, warning = function(w) w)
  
  if(!inherits(page_data, "error")) {
    data.frame(
      link = link,
      parameters = page_data %>% html_nodes(".label") %>% html_text(trim = TRUE),
      values = page_data %>% html_nodes(".info") %>% html_text(trim = TRUE),
      # a single h1 value is recycled across all rows for this link
      pricing = page_data %>% html_nodes("h1") %>% html_text(trim = TRUE)
    )
 
  } else {
    NULL
  }
})

do.call(rbind, house_features) %>% 
  group_by(link, parameters) %>%
  mutate(parameters = if_else(row_number() > 1, paste(parameters,row_number()), parameters)) %>% 
  pivot_wider(id_cols = c(link, pricing), names_from = parameters, values_from = values)
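
The reshaping step can be tried without network access. The sketch below uses a made-up one_page data frame in place of a scraped page and writes the de-duplicated names to a separate column called key (an assumption of this example, not the original code). It relies on data.frame() recycling a length-1 pricing value across all rows, and lists pricing in id_cols so that it survives the pivot.

```r
library(dplyr)
library(tidyr)

# Hypothetical stand-in for one scraped page
one_page <- data.frame(
  link = "page-a",
  parameters = c("Rooms", "Floor", "Rooms"),
  values = c("2", "1", "3"),
  pricing = "Title A",  # length 1, recycled to three rows
  stringsAsFactors = FALSE
)

wide <- one_page %>%
  group_by(link, parameters) %>%
  # number repeated parameter names so each column name is unique
  mutate(key = if_else(row_number() > 1, paste(parameters, row_number()), parameters)) %>%
  ungroup() %>%
  pivot_wider(id_cols = c(link, pricing), names_from = key, values_from = values)
```

Note that the recycling only works when the h1 selector yields a single value per page. If it matched no node, or several nodes whose count does not divide the number of rows, data.frame() would stop with an error about differing numbers of rows.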

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Akbar Ato