Generate an ID based on one condition
I have 100 PDF medical reports for different people. I loaded each report into a list in R; each element has two text columns containing a lot of different information. I only want the reports for gallbladder tissue, so I want to create an ID for the whole report, not only for the rows that contain the word "gallbladder". Then I want to keep only the gallbladder reports to extract further information. This is how each element of the list looks (they contain much more information):
list[[1]]
report text text_2
1 name andres
1 tissue gallbladder
1 rut 11455698
list[[2]]
report text text_2
2 name ana
2 tissue liver
2 rut 5556678
I want to create the ID according to the tissue (gallbladder), applied to every row of the report:
list[[1]]
report text text_2 ID
1 name andres 1
1 tissue gallbladder 1
1 rut 11455698 1
list[[2]]
report text text_2 ID
2 name ana 0
2 tissue liver 0
2 rut 5556678 0
Then I want to keep only the reports where ID == 1.
I tried many approaches, but I only get the ID for the matching row, not for the whole report:
list[[1]]
report text text_2 ID
1 name andres 0
1 tissue gallbladder 1
1 rut 11455698 0
list[[2]]
report text text_2 ID
2 name ana 0
2 tissue liver 0
2 rut 5556678 0
Maybe you have some ideas! Thank you!
Solution 1:[1]
We can loop over the list with lapply and create the ID column by checking whether any value in the 'text_2' column equals "gallbladder". any() returns a single TRUE/FALSE, which is recycled across every row of that report's data frame. The logical value is then coerced to binary with as.integer or simply with unary +.
list2 <- lapply(list, function(x)
  transform(x, ID = +(any(text_2 == "gallbladder"))))
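The question also asks how to keep only the reports with ID == 1, which the snippet above does not cover. A minimal sketch of one way to do that follows, using base R's Filter. The sample data and the names `reports`, `flagged`, and `gallbladder_reports` are assumptions made up for illustration, modeled on the two reports shown in the question.

```r
# Hypothetical sample data mirroring the two reports in the question
reports <- list(
  data.frame(report = 1, text = c("name", "tissue", "rut"),
             text_2 = c("andres", "gallbladder", "11455698")),
  data.frame(report = 2, text = c("name", "tissue", "rut"),
             text_2 = c("ana", "liver", "5556678"))
)

# Flag every row of a report with 1 if any row of that report says "gallbladder"
flagged <- lapply(reports, function(x)
  transform(x, ID = +(any(text_2 == "gallbladder"))))

# Keep only the list elements whose ID is 1 on every row
gallbladder_reports <- Filter(function(x) all(x$ID == 1), flagged)
```

If 'text_2' can contain missing values, any() may return NA for a report with no match, so passing na.rm = TRUE to any() is probably the safer choice in that case.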
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | akrun |
