Determine mean/median/interquartile range of age for two separate groups
I have a dataset in Stata with variables age and carrier, an indicator for carrier of a particular disease.
Using univar age I am able to get some descriptive statistics of age for the whole dataset, but now I want to compare the mean, median, and interquartile range between carriers and non-carriers. Is there some way to do this?
I have tried one line so far:
univar age if carrier = 1
which resulted in an invalid syntax error, r(198).
I had expected descriptive statistics of age when carrier is 1.
Solution 1:[1]
Sample Data
clear
set obs 100
gen age = runiformint(18,70)
gen carrier = runiformint(0,1)
Summary Stats
There are several ways to get summary statistics in Stata, but one way is to use the tabstat command:
tabstat age, by(carrier) statistics(n mean sd min p25 median p75 max iqr)
Summary for variables: age
Group variable: carrier

 carrier |         N      Mean        SD       Min       p25       p50       p75       Max       IQR
---------+------------------------------------------------------------------------------------------
       0 |        52  43.96154  16.45667        19        30      39.5        59        70        29
       1 |        48   48.4375  14.24692        20        39        49      60.5        69      21.5
---------+------------------------------------------------------------------------------------------
   Total |       100     46.11  15.52183        19        33        44      59.5        70      26.5
----------------------------------------------------------------------------------------------------
See help tabstat for additional statistics options.
Edited to mimic output of univar.
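If only the statistics named in the question are needed, the tabstat call can be trimmed down. The following is a minimal sketch under that assumption: the seed value is arbitrary and only there to make the simulated data repeatable, and nototal and format() are standard tabstat options used here for tidier output.

```stata
* Rebuild sample data like the answer's (the seed is an assumption,
* chosen only so that repeated runs give the same draws)
clear
set seed 12345
set obs 100
gen age = runiformint(18,70)
gen carrier = runiformint(0,1)

* Report only mean, median, and interquartile range of age per group;
* nototal suppresses the overall row, format() rounds the display
tabstat age, by(carrier) statistics(mean median iqr) nototal format(%9.2f)
```

Each row of the resulting table is one value of carrier, so the two groups can be compared side by side.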
Solution 2:[2]
You'd have to search quite hard for univar if you had not heard of it already. It's community-contributed and dates from 1997 and 1999:
STB-51  sg67.1 . . . . . . . . . . . . . . . . . . . . . . . Update to univar
        (help univar if installed) . . . . . . . . . . . . . . J. R. Gleason
        9/99 pp.27--28; STB Reprints Vol 9, pp.159--161
        improvements and new options to univar

STB-36  sg67 . . . . . . . . . . . . . . . Univariate summaries with boxplots
        (help univar if installed) . . . . . . . . . . . . . . J. R. Gleason
        3/97 pp.23--25; STB Reprints Vol 6, pp.179--183
        command that offers a streamlined display of univariate summaries,
        including, optionally, text-mode boxplots
Its help indicates that you need the by() option. Here's a reproducible example:
. sysuse auto, clear
(1978 automobile data)
. univar mpg, by(foreign)
-> foreign=Domestic
                                        -------------- Quantiles --------------
Variable       n     Mean     S.D.      Min      .25      Mdn      .75      Max
-------------------------------------------------------------------------------
     mpg      52    19.83     4.74    12.00    16.50    19.00    22.00    34.00
-------------------------------------------------------------------------------

-> foreign=Foreign
                                        -------------- Quantiles --------------
Variable       n     Mean     S.D.      Min      .25      Mdn      .75      Max
-------------------------------------------------------------------------------
     mpg      22    24.77     6.61    14.00    21.00    24.50    28.00    41.00
-------------------------------------------------------------------------------
Like @JR96, I recommend tabstat here.
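As for the original error: in Stata, equality in an if qualifier is tested with ==, while a single = is the assignment operator, which is the likely cause of r(198) in the question. Below is a minimal sketch using the built-in summarize command and the auto data, so nothing needs to be installed. The scalar name iqr is purely illustrative. For the question's data the qualifier would be if carrier == 1, and the same qualifier should carry over to univar, assuming it accepts if like most Stata commands.

```stata
* Built-in example data, as in the univar example above
sysuse auto, clear

* == tests equality; a single = here would be read as assignment
summarize mpg if foreign == 1, detail

* summarize, detail leaves the quartiles in r(); IQR is p75 minus p25
scalar iqr = r(p75) - r(p25)
display "mean = " r(mean) "   median = " r(p50) "   IQR = " iqr
```

Repeating the command with if foreign == 0 gives the other group, though tabstat with by() does both in one call.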
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | |
| Solution 2 | Nick Cox |
