'SAS proc freq of multiple tables into a single one
I have the following dataset in SAS;
City grade1 grade2 grade3
NY A. A. A
CA. B. A. C
CO. A. B. B
I would "combine" the three variables grades and get a proc freq that tells me the number of grades for each City; the expected output should therefore be:
A. B. C
NY 3. 0. 0
CA. 1. 1. 1
CO. 1. 2. 0
How could I do that in SAS?
Solution 1:[1]
Quite a few steps but it gives the expected result.
*-- Creating sample data --*;
data have;
infile datalines delimiter="|";
input City $ grade1 $ grade2 $ grade3 $;
datalines;
NY|A|A|A
CA|B|A|C
CO|A|B|B
;
*-- Sorting in order to use the transpose procedure --*;
proc sort data=have; by city; quit;
*-- Transposing from wide to tall format --*;
proc transpose data=have out=stage1(rename=(col1=grade) drop= _name_);
by city;
var grade:;
run;
*-- Assigning a value to 1 for each record for later sum --*;
data stage2;
set stage1;
val = 1;
run;
*-- Tabulate to create val_sum --*;
ods exclude all; *turn off default tabulate print;
proc tabulate data=stage2 out=stage3;
class city grade;
var val;
table city,grade*sum=''*val='';
run;
ods select all; *turn on;
*-- Transpose back using val_sum --*;
proc transpose data=stage3 out=stage4(drop=_name_);
by city;
id grade;
var val_sum;
run;
*-- Replace missing values by 0 to achieve desired output --*;
proc stdize data=stage4 out=want reponly missing=0;run;
City A B C
CA 1 1 1
CO 1 2 0
NY 3 0 0
Solution 2:[2]
In general:
- Transpose data to a long format
- Use PROC FREQ with the SPARSE option to generate the counts
- Save the output from PROC FREQ to a data set
- Transpose the output from PROC FREQ to the desired output format
*create sample data;
data have;
input City $ grade1 $ grade2 $ grade3 $;
cards;
NY A A A
CA B A C
CO A B B
;;;;
*sort;
proc sort data=have; by City;run;
*transpose to long format;
proc transpose data=have out=want1 prefix=Grade;
by City;
var grade1-grade3;
run;
*displayed output and counts;
proc freq data=want1;
table City*Grade1 / sparse out=freq norow nopercent nocol;
run;
*output table in desird format;
proc transpose data=freq out=want2;
by city;
id Grade1;
var count;
run;
Solution 3:[3]
Here is a way to do it in two steps: a sort step and a data step.
proc sort data=have; by city; run;
data count (drop grade1-grade3);
set have;
* create an array of all your grades;
array grade(3) 3 grade1-grade3;
by city;
*set the count to zero for each city;
if first.city then do;
A = 0;
B = 0;
C = 0;
end;
* use a do loop to count the grades;
do i = 1 to 3;
if grade(i) = 'A' then A + 1;
else if grade(i) = 'B' then B + 1;
else if grade(i) = 'C' then C + 1;
end;
run;
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Tom |
| Solution 2 | Reeza |
| Solution 3 | Laura Daniel |
