'SAS - Programatically Identifying Numeric Variables that represent Date/Time/Datetimes
I'm trying to find a way to identify the subset of numeric variables that represent dates, times or datetimes from an arbitrary dataset that contains many variables of both numeric or character type. The goal is to have a macro or template that I could run against any given dataset. I believe that the SAS formats for these variables should correctly represent what the variables are, but I'm trying to see if I can avoid having to create some code that parses the text of the SAS format for any of the applicable format names.
I was originally thinking it should be possible to do something like the following to get the SAS formats for a dataset and then do various things based on the columns that come back as date/time/datetime.
PROC SQL;
CREATE TABLE lib_X.Dataset_A AS
SELECT tbl_CL.LIBNAME
,tbl_CL.MEMNAME AS TABLE_NAME
,tbl_CL.NAME AS COLUMN_NAME
,tbl_CL.FORMAT AS SAS_Format
,tbl_CL.TYPE AS SAS_Type
,tbl_CL.varnum
,tbl_CL.LENGTH
FROM DICTIONARY.COLUMNS AS tbl_CL
WHERE tbl_CL.MEMNAME = %UPCASE("&SAS_DatasetNm")
AND tbl_CL.LIBNAME = %UPCASE("&SAS_LibNm");
QUIT;
Followed by:
DATA lib_X.Dataset_A;
SET lib_X.Dataset_A;
IF FCN_FORMAT_TYPE(SAS_Format) = "DATETIME"
THEN ...;
RUN;
Where the FCN_FORMAT_TYPE function is some function that examines the SAS format and returns which of the 4 categories of SAS Format (character, numeric, date/time, or ISO 8601) the variable is, but there doesn't seem to be an existing function that does this that I can find. Perhaps there's another approach that would work?
Currently running on SAS 9.4 M2 on a Linux server. The code will primarily be used via batch file, but possibly via Enterprise Guide as well (EG 7.1 or 6.1)
Solution 1:[1]
Dominic's answer gave me an idea about how to solve it, using the same general approach, but putting it in a PROC FCMP function instead of a macro.
PROC FCMP OUTLIB=LIB_X.FCMP_FUNCS.Format_Category_ID;
FUNCTION Format_Category(VAR_Format $, VAR_Type $) $ 16
GROUP = 'Format Category ID'
LABEL = 'Pass SAS variable type and format as CHAR and it will return what sort of variable it is.
Breaks down date/time/dt formats into independent categories.';
LENGTH FORMAT_CATEGORY $16;
SELECT;
WHEN (MISSING(VAR_Format)) FORMAT_CATEGORY = 'MISSING_FORMAT';
WHEN (VAR_Type = 'CHAR') FORMAT_CATEGORY = 'CHARACTER_FORMAT';
WHEN (PRXMATCH("/S370FZDT/", VAR_Format)) FORMAT_CATEGORY = 'NUMERIC_FORMAT';
WHEN (PRXMATCH("/8601/", VAR_Format)) FORMAT_CATEGORY = 'ISO_8601_FORMAT';
WHEN (PRXMATCH("/(DT|DATEAMPM|DATETIME)/", VAR_Format)) FORMAT_CATEGORY = 'DATETIME_FORMAT';
WHEN (PRXMATCH("/(HHMM|HOUR|MMSS|TIME|TOD)/", VAR_Format)) FORMAT_CATEGORY = 'TIME_FORMAT';
WHEN (PRXMATCH("/(DAT|DAY|YY|DOW|JUL|MON|QTR|WEEK|YEAR)/", VAR_Format)) FORMAT_CATEGORY = 'DATE_FORMAT';
WHEN (VAR_Type = 'NUM') FORMAT_CATEGORY = 'NUMERIC_FORMAT';
OTHERWISE FORMAT_CATEGORY = 'UNKNOWN';
END;
RETURN(FORMAT_CATEGORY);
ENDSUB;
It's definitely not bullet-proof, but it worked for my purposes and will hopefully be useful to others. The biggest caveat I'd add is that handling user-defined formats will be touch and go since it's simply parsing the SAS format which is passed as a character variable, so if the name overlaps with the regular expressions I'm using, all bets are off.
Solution 2:[2]
We wrote an fcmp function that does exactly this, taking ALL of the DATE / DATETIME / TIME formats from the SAS documentation.
The source is on github so you can easily add any missing formats: https://github.com/sasjs/core/blob/main/fcmp/mcf_getfmttype.sas
Docs: https://core.sasjs.io/mcf__getfmttype_8sas.html
Example macro invocation:
%put fmt_type=%sysfunc(mcf_getfmttype(TIME9.));
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Sven the Mediocre |
| Solution 2 | Allan Bowe |
