An email I sent to a colleague describing how to use the program below:
Attached is a program that helps you look for inconsistencies in how the character data was entered (e.g. "Not applicable" vs. "N/A" vs. "Not Applicable"). It also shows how many records are missing a value, in case you want to apply a global N/A or something of that nature.
The program will work on any SAS dataset. It automatically runs a frequency of values on every character (string) variable in a given dataset. Some of the longer, more narrative text fields you’ll likely just want to ignore. (When it comes time for analysis someone will have a lot of fun reading those long explanations and coming up with ways to code them. Did you ever have the pleasure of studying content analysis or text mining?)
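As a rough illustration of what each report contains (this is not the attached program itself), a single PROC FREQ call over the `_character_` variable list produces a similar set of one-way tables. In this sketch, `sashelp.class` is only a stand-in: it is a sample dataset that ships with SAS, not one of the project datasets.

```sas
/* Sketch only: sashelp.class stands in for one of your own datasets */
proc freq data=sashelp.class;
  /* _character_ expands to every character variable in the dataset */
  tables _character_ / nocum nopercent;
run;
```

When a variable has missing values, PROC FREQ prints a "Frequency Missing" count under its table, which is where the missing-record numbers come from.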
The program creates an individual report per dataset, but I bound them into a single PDF to make it easier to send to you.
Please let me know if you have any questions. The most important things to remember are:
- DO NOT EDIT THE MACRO IF YOU ARE NOT 100% SURE WHAT TO DO
- Keep an original copy of the program to go back to on the off chance that a stray keystroke breaks something.
All you need to do to get this program to work is to assign a library (i.e. point SAS to the datasets) and tell SAS what directory to dump your PDFs into. For convenience, the output directory is set with a global macro variable at the top. (I try to set parameters above the bulk of the code to help maintain the integrity of the actual program.)
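The last section of the program executes the macro against a given dataset. You can comment out a single statement by putting an asterisk at the start of it (the comment runs to the next semicolon), or block out several lines using /* and */. For macro calls such as %charFreq, the /* */ form is the safer choice, because SAS may still act on a macro call that sits inside an asterisk comment.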
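For instance, with two calls in the last section, one of them could be switched off as sketched here. The dataset names are placeholders, and `LIBNAME.BAR` is a hypothetical second dataset that does not appear in the original program.

```sas
/* Skip this dataset for now:
%charFreq(dset=LIBNAME.FOO);
*/

/* Still runs (LIBNAME.BAR is a hypothetical dataset name) */
%charFreq(dset=LIBNAME.BAR);

* An asterisk comment ends at the next semicolon so the OPTIONS statement below is skipped;
* options mprint symbolgen;
```

Block comments do not nest in SAS, so a /* ... */ pair should not be wrapped around lines that already contain one.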
Note: This program was developed on a 32-bit Windows system.
*------------------------------------------------------------;
* Set options and parameters ;
*------------------------------------------------------------;
options mprint symbolgen;
options nonumber nodate nocenter nosource spool;
*------------------------------------------------------------;
* Path to output ;
*------------------------------------------------------------;
%let pdfDest=S:\FEBSTAT\Reports\Misc Reports;
*------------------------------------------------------------;
* Set library ;
*------------------------------------------------------------;
libname febstat "S:\FEBSTAT\Datasets";
*------------------------------------------------------------;
* The macro (essentially a subroutine) - DO NOT MODIFY ;
* The "dset" variable is where you assign the dataset you ;
* wish to process. ;
*------------------------------------------------------------;
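%macro charFreq(dset=);
  %local lib mem printLoop;
  /* Split LIB.MEMBER; a one-level name is taken to be in WORK */
  %if %index(&dset.,.) %then %do;
    %let lib = %scan(&dset.,1,.);
    %let mem = %scan(&dset.,2,.);
  %end;
  %else %do;
    %let lib = work;
    %let mem = &dset.;
  %end;
  dm 'odsresults; cancel';
  /* Blank out variable labels (this permanently changes the dataset) */
  proc datasets library = &lib. nolist;
    modify &mem.;
    attrib _all_ label='';
  quit;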
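  /* List the character variables (type=2), leaving out the ID fields */
  proc contents data=&dset. varnum
      out=foo(where=(type=2 and name not in('IDNUM','SUBINIT'))
              keep=name type varnum)
      noprint;
  run;
  /* Put the variables back in dataset order */
  proc sort data=foo;
    by varnum;
  run;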
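  /* Default to zero so the loop below is skipped when there are no
     character variables */
  %local N;
  %let N = 0;
  data _null_;
    set foo(keep=name) end=last;
    /* Store each variable name as name1, name2, ... and the count as N */
    call symputx(cats('name', _N_), name);
    if last then call symputx('N', _N_);
  run;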
  ods listing close;
  /* One PDF per dataset, written to the folder named in pdfDest */
  ods pdf body="&pdfDest.\&dset._charFreq.pdf"
    style=sasweb
    startpage=yes;
  %do printLoop = 1 %to &N.;
    ods proclabel="&&name&printLoop";
    /* _charfreq_tmp is a scratch name, so OUT= cannot overwrite a dataset
       that happens to share its name with a variable */
    proc freq data=&dset.;
      tables &&name&printLoop / list out=_charfreq_tmp nocum nopercent;
      title "&&name&printLoop (Unique Values)";
    run;
    proc datasets lib=work nolist;
      delete _charfreq_tmp;
    quit;
  %end;
ods pdf close;
ods listing;
title;
%mend charFreq;
*------------------------------------------------------------;
* Execute program;
*------------------------------------------------------------;
%charFreq(dset=LIBNAME.FOO);
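Before running the macro on real data, it can be tried on a small dataset first. The sketch below assumes the macro has already been compiled and that `pdfDest` points to a folder you can write to. `work.class` is just a copy of the `sashelp.class` sample that ships with SAS, not one of the project datasets.

```sas
/* Copy a sample dataset into WORK so the original is left untouched
   (the macro blanks out variable labels on the dataset it processes) */
data work.class;
  set sashelp.class;
run;

/* One-level name, so the dataset is looked up in WORK */
%charFreq(dset=class);
```

If everything is wired up, this should write a PDF whose name ends in class_charFreq.pdf, with one table each for Name and Sex, the two character variables in that dataset.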