What are the ways to execute the same code in turn for each category of data in R?

Suppose there is some code; however simple or complex it is, what matters is that it works on data that has a grouping variable. As a reproducible example, take the iris dataset: iris$Species has three categories (setosa, versicolor, virginica). Suppose I want a simple action: the correlation between the numeric variables for each species separately. I can do

dt <- as.data.table(iris)[Species == "virginica"] `(data.table)`

and then run cor() on the result. But imagine a situation where the code is complex and huge (so large that it makes no sense to post it as an example) and the grouping variable (groupvar) has many categories.

What are the ways to make the code run in turn for each category? That is, the code is first processed for groupvar=1; once that finishes, it processes the rows that belong to groupvar=2, and so on for every category in groupvar until all of them have been processed.

Using the same iris example:

[1] setosa
some calculation code...
[2] virginica
some calculation code...
[3] versicolor
some calculation code...

Because if there are thousands of categories, I can't set a filter for each one manually. What can be done to make the code work on all categories of the grouping variable separately? Thanks for your valuable help.

r


Solution 1:[1]

If I understood correctly, you can split the data by the grouping variable and apply the same code to each piece:

# tidyverse
library(tidyverse)
iris %>% 
  group_split(Species) %>% 
  map(~cor(select(.x, where(is.numeric))))
#> [[1]]
#>              Sepal.Length Sepal.Width Petal.Length Petal.Width
#> Sepal.Length    1.0000000   0.7425467    0.2671758   0.2780984
#> Sepal.Width     0.7425467   1.0000000    0.1777000   0.2327520
#> Petal.Length    0.2671758   0.1777000    1.0000000   0.3316300
#> Petal.Width     0.2780984   0.2327520    0.3316300   1.0000000
#> 
#> [[2]]
#>              Sepal.Length Sepal.Width Petal.Length Petal.Width
#> Sepal.Length    1.0000000   0.5259107    0.7540490   0.5464611
#> Sepal.Width     0.5259107   1.0000000    0.5605221   0.6639987
#> Petal.Length    0.7540490   0.5605221    1.0000000   0.7866681
#> Petal.Width     0.5464611   0.6639987    0.7866681   1.0000000
#> 
#> [[3]]
#>              Sepal.Length Sepal.Width Petal.Length Petal.Width
#> Sepal.Length    1.0000000   0.4572278    0.8642247   0.2811077
#> Sepal.Width     0.4572278   1.0000000    0.4010446   0.5377280
#> Petal.Length    0.8642247   0.4010446    1.0000000   0.3221082
#> Petal.Width     0.2811077   0.5377280    0.3221082   1.0000000

# base
COLS <- sapply(iris, is.numeric)
l <- split(iris, iris$Species)
lapply(l, function(x) cor(x[COLS]))
#> $setosa
#>              Sepal.Length Sepal.Width Petal.Length Petal.Width
#> Sepal.Length    1.0000000   0.7425467    0.2671758   0.2780984
#> Sepal.Width     0.7425467   1.0000000    0.1777000   0.2327520
#> Petal.Length    0.2671758   0.1777000    1.0000000   0.3316300
#> Petal.Width     0.2780984   0.2327520    0.3316300   1.0000000
#> 
#> $versicolor
#>              Sepal.Length Sepal.Width Petal.Length Petal.Width
#> Sepal.Length    1.0000000   0.5259107    0.7540490   0.5464611
#> Sepal.Width     0.5259107   1.0000000    0.5605221   0.6639987
#> Petal.Length    0.7540490   0.5605221    1.0000000   0.7866681
#> Petal.Width     0.5464611   0.6639987    0.7866681   1.0000000
#> 
#> $virginica
#>              Sepal.Length Sepal.Width Petal.Length Petal.Width
#> Sepal.Length    1.0000000   0.4572278    0.8642247   0.2811077
#> Sepal.Width     0.4572278   1.0000000    0.4010446   0.5377280
#> Petal.Length    0.8642247   0.4010446    1.0000000   0.3221082
#> Petal.Width     0.2811077   0.5377280    0.3221082   1.0000000

Created on 2022-03-23 by the reprex package (v2.0.1)
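
For code that is longer than a single cor() call, one possible approach is to wrap it in a function that takes the rows of one group, and then apply that function to every group. The following is only a minimal base R sketch: analyse_group is a hypothetical name, and its body is a placeholder for the real (complex) calculation, which is assumed to return a one-row data frame.

```r
# Hypothetical stand-in for the long, complex analysis. It receives the rows
# of ONE group and returns a one-row data frame of results.
analyse_group <- function(d) {
  data.frame(
    n = nrow(d),
    mean_sepal_length = mean(d$Sepal.Length),
    cor_sepal_petal = cor(d$Sepal.Length, d$Petal.Length)
  )
}

# Split the data by the grouping variable, however many levels it has.
groups <- split(iris, iris$Species)

# Run the same function on each group in turn.
results <- lapply(groups, analyse_group)

# Stack the per-group results into one data frame and keep the group name.
summary_tbl <- do.call(rbind, results)
summary_tbl$Species <- names(results)
print(summary_tbl)
```

Because the grouping is done by split(), nothing has to be filtered by hand, so the same call should work whether the grouping variable has three levels or thousands. If the real code returns something other than a one-row data frame, the list in results can be kept as is instead of being combined with rbind.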

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Yuriy Saraykin