Code acting differently inside of a function in R

I've got this code in R that separates a dataframe containing tweets by the day they were posted. I'm seeing a weird interaction: if I run the code outside of the function, it works perfectly fine. However, running it within the function only outputs a single row into the output frame, whereas there should be many more. This has me stumped, as I cannot find anything else about it.

function(dataFrame, output){
  dates <- c("2022-04-15","2022-04-16","2022-04-17",
              "2022-04-18","2022-04-19","2022-04-20")
  t <- nrow(dataFrame)
  for(i in 1:t){
    currentchar <- toString(dataFrame$name[i])
    print(currentchar)
    currentFrame <- get(toString(dataFrame$name[i]))
    counts <- c(0,0,0,0,0,0)
    
    frameLength <- nrow(currentFrame)
    for(j in 1:frameLength){
      time <- currentFrame$created_at[j]
      date <- substr(time, 1, 10)
      print(date)
      print(j)
      num <- switch(date,
                     "2022-04-20" = (counts[6] <- counts[6] + 1),
                     "2022-04-19" = (counts[5] <- counts[5] + 1),
                     "2022-04-18" = (counts[4] <- counts[4] + 1),
                     "2022-04-17" = (counts[3] <- counts[3] + 1),
                     "2022-04-16" = (counts[2] <- counts[2] + 1),
                     "2022-04-15" = (counts[1] <- counts[1] + 1))
    }
    
    for(k in 1:6){
      f <- data.frame(toString(characters_list[i]), dates[k], counts[k])
      colnames(f) <- c('name', 'date', 'count')
      output <<- rbind(f, output)
    }
  }
}

When run outside of the function, the code produces the result I want, but when run within it, it produces a single row of the dataframe. Any help is appreciated, thanks.



Solution 1:[1]

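It is because of scoping. Inside the function, `output` is a local copy of the argument. The line `output <<- rbind(f, output)` reads that local copy, which never changes, but writes the result to a variable named `output` outside the function. Every iteration therefore overwrites the outer `output` with the original frame plus one new row, which is why you end up with a single row. The local variables themselves are discarded when the function exits.

Your function also doesn't return any value. The return value is what the caller gets back after the function call, so build `output` locally with `<-` and return it.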
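
A minimal sketch may make the effect easier to see. The names `append_rows` and `out` are made up for illustration and are not part of the original code. The right-hand side reads the local argument, while `<<-` writes to the variable of the same name in the global environment:

```r
# Hypothetical demo: `append_rows` and `out` are illustrative names only.
out <- data.frame(n = integer(0))

append_rows <- function(out) {
  for (i in 1:3) {
    # The right-hand side reads the local argument `out`, which is never
    # updated; `<<-` stores the result in the global `out` instead.
    out <<- rbind(data.frame(n = i), out)
  }
  nrow(out)  # number of rows in the local copy
}

local_rows <- append_rows(out)
local_rows  # 0: the local copy never grew
nrow(out)   # 1: the global holds only the row from the last iteration
```

Replacing `<<-` with `<-` and returning the local data frame would give all three rows.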

# assign the function to a name
myfunction <- function(dataFrame, output){
  dates <- c("2022-04-15","2022-04-16","2022-04-17",
              "2022-04-18","2022-04-19","2022-04-20")
  t <- nrow(dataFrame)
  for(i in seq_len(t)){  # seq_len() skips the loop when there are no rows
    currentchar <- toString(dataFrame$name[i])
    print(currentchar)
    currentFrame <- get(toString(dataFrame$name[i]))
    counts <- c(0,0,0,0,0,0)
    
    frameLength <- nrow(currentFrame)
    for(j in seq_len(frameLength)){  # safe even if the frame is empty
      time <- currentFrame$created_at[j]
      date <- substr(time, 1, 10)
      print(date)
      print(j)
      num <- switch(date,
                     "2022-04-20" = (counts[6] <- counts[6] + 1),
                     "2022-04-19" = (counts[5] <- counts[5] + 1),
                     "2022-04-18" = (counts[4] <- counts[4] + 1),
                     "2022-04-17" = (counts[3] <- counts[3] + 1),
                     "2022-04-16" = (counts[2] <- counts[2] + 1),
                     "2022-04-15" = (counts[1] <- counts[1] + 1))
    }
    
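    for(k in 1:6){
      # note: `characters_list` is not an argument; as in the question,
      # it is looked up in the global environment
      f <- data.frame(toString(characters_list[i]), dates[k], counts[k])
      colnames(f) <- c('name', 'date', 'count')
      # use `<-`, not `<<-`, so the local `output` accumulates the rows
      output <- rbind(f, output)
    }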
  }
  # return `output` as the result
  output
}

# now, you can call and assign the function's result at the same time:
result <- myfunction(dataFrame, output)

# and `result` now holds what you expected `output` to contain at the end of the calculation.
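
As a possible alternative to the nested loops, the per-day counts could be computed with `table()` on a factor with fixed levels. This is only a sketch under assumptions: `count_by_day` and `frames` are hypothetical names, and it assumes the per-person data frames are passed in as a named list rather than fetched with `get()`.

```r
# Sketch only: `count_by_day` and `frames` are hypothetical names.
count_by_day <- function(frames, dates) {
  per_frame <- lapply(names(frames), function(nm) {
    # The first 10 characters of created_at hold the date (YYYY-MM-DD).
    day <- substr(as.character(frames[[nm]]$created_at), 1, 10)
    # A factor with fixed levels keeps days that have zero tweets.
    counts <- table(factor(day, levels = dates))
    data.frame(name = nm, date = dates, count = as.vector(counts))
  })
  # Stack the per-person results and return them to the caller.
  do.call(rbind, per_frame)
}

# Made-up example data, just to show the shape of the input.
dates <- c("2022-04-15", "2022-04-16", "2022-04-17")
frames <- list(
  alice = data.frame(created_at = c("2022-04-15 10:00:00",
                                    "2022-04-15 12:30:00",
                                    "2022-04-17 08:15:00")),
  bob = data.frame(created_at = "2022-04-16 09:00:00")
)
result <- count_by_day(frames, dates)
```

Because everything the function needs arrives through its arguments and the result leaves through the return value, no global assignment is involved.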

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Gwang-Jin Kim