Using group_by to determine median of second column after sorting/filtering for a specific value in first column?
I have a huge dataset which has been difficult to work with.
I want to find the median of a second column but only based on one value in the first column. I have used this formula to find general medians without specifying/sorting by the specific values in the first column:
df %>% group_by(column1) %>% summarise(Median = median(column2))
However, there is a specific value in column1 I am hoping to sort by and I only want the medians of the second column based on this first value. Would I do something similar to the below?
df %>% group_by(column1, specificvalue) %>% summarise(Median = median(column2))
Is there an easier way to do this? Would it be easier to make a new dataframe based on the specific value in the first column? How would that be done so that I could have column 1 only include the specific value I want but the rest of the rows included so I can easily determine the median of column2?
Thanks!!
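One possible approach, sketched here under assumptions rather than taken from an accepted answer: since only one value of column1 matters, `group_by()` is not needed. Filtering the rows first with dplyr's `filter()` and then calling `summarise()` gives the median of column2 for that value alone. The sample data and the name `specific_value` below are hypothetical stand-ins for the real dataset and the value of interest, and `na.rm = TRUE` is an optional addition that only matters if column2 contains missing values.

```r
library(dplyr)

# Hypothetical example data; the column names follow the question
df <- data.frame(
  column1 = c("a", "a", "b", "b", "b"),
  column2 = c(1, 3, 10, 20, 60)
)

# "b" stands in for the specific value of interest (an assumption)
specific_value <- "b"

# New data frame holding only the rows where column1 matches;
# all other columns are kept as they are
df_subset <- df %>%
  filter(column1 == specific_value)

# Median of column2 for those rows only
result <- df_subset %>%
  summarise(Median = median(column2, na.rm = TRUE))

# Base R equivalent that does not need dplyr
base_median <- median(df$column2[df$column1 == specific_value], na.rm = TRUE)
```

Here `df_subset` is the "new dataframe" asked about: column1 contains only the chosen value, and the remaining columns of those rows are untouched. If the medians for every value of column1 are wanted after all, the original `group_by(column1)` version still applies.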
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
