collapsing variables using an array

Here’s a macro I wrote that collapses several indicator variables into a single variable using an array:

* ----------------------------;
* Create collapsed variables. ;
%macro collapse (vars= , newvar=, dsetin=, dset=, arraynm= , i=);
data &dset. (keep = key &newvar.); set sasdata.&dsetin.;
array &arraynm. (&i.)
&vars.;
do i = 1 to &i.;
if &arraynm(i) = 1 then &newvar. = i;
output;
end;
run;
proc sort data=&dset. nodupkey;
by key &newvar;
run;
* Generate N using last. to maintain the original number of observations for calculating percentages (OS selections were not mutually exclusive) ;
data sasdata.&dset; set &dset.;
by key;
if last.key then N = 1;
run;
proc freq data=sasdata.&dset.;
tables N &newvar.;
run;
proc print data=sasdata.&dset.;
run;
%mend;

Here’s an invocation of the macro:

* --------------------------------------------------;
* Create collapsed OS variable for graphing purposes;
%collapse(vars= nwinxp  nwin2000 nmacosx nmacos9 nlinux notheros, newvar=os, dsetin=support, dset=os, arraynm=opersys, i=6);
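To see what the array loop inside the macro does, here is a minimal sketch of the same data step and sort run on toy data. The dataset names support_demo and os_demo and the three rows of data are hypothetical, made up for illustration; only the variable names come from the invocation above.

```sas
/* Hypothetical toy data: one row per respondent, 1 = OS selected */
data support_demo;
  input key nwinxp nmacosx nlinux;
  datalines;
1 1 0 0
2 0 1 1
3 0 0 0
;
run;

/* Same loop as in the macro, written out without macro variables */
data os_demo (keep = key os);
  set support_demo;
  array opersys (3) nwinxp nmacosx nlinux;
  do i = 1 to dim(opersys);
    if opersys(i) = 1 then os = i;
    output;                      /* one row per pass through the loop */
  end;
run;

/* Remove the duplicate key/os pairs created by the unconditional output */
proc sort data=os_demo nodupkey;
  by key os;
run;

proc print data=os_demo;
run;
```

With this toy data, respondent 2 (who picked both Mac OS X and Linux) ends up with one row per selection, plus a row where os is missing, because output runs on every pass of the loop, including the passes before the first match. Respondent 3, who selected nothing, keeps a single row with os missing.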

using an array to create new variables

An example of using an array to create new variables, in this case character to numeric:

* ------------------------------------------------------------------;
* Create numeric variables for scale questions for graphing purposes;
array scale (3)
rateinitrestime
rateresolutiontm
rateres;
array nscale (3)
nrateinitrestime
nrateresolutiontm
nrateres;
do i= 1 to 3;
if scale(i) = 'Excellent' then nscale(i) = 1;
if scale(i) = 'Good' then nscale(i) = 2;
if scale(i) = 'Satisfactory' then nscale(i) = 3;
if scale(i) = 'Fair' then nscale(i) = 4;
if scale(i) = 'Poor' then nscale(i) = 5;
if scale(i) = ' ' then nscale(i) = .;
end;
run;
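The snippet above is an excerpt from a longer data step. A self-contained sketch of the same technique might look like the following; the dataset names scale_demo and scale_num and the two rows of data are hypothetical, and I have cut it down to two of the three variables to keep it short.

```sas
/* Hypothetical input: character ratings, no blanks so list input works */
data scale_demo;
  length rateinitrestime rateres $12;
  input rateinitrestime $ rateres $;
  datalines;
Excellent Poor
Good Satisfactory
;
run;

data scale_num;
  set scale_demo;
  /* character array over the existing variables */
  array scale (2) rateinitrestime rateres;
  /* numeric array: these variables are created here */
  array nscale (2) nrateinitrestime nrateres;
  do i = 1 to dim(scale);
    if scale(i) = 'Excellent' then nscale(i) = 1;
    else if scale(i) = 'Good' then nscale(i) = 2;
    else if scale(i) = 'Satisfactory' then nscale(i) = 3;
    else if scale(i) = 'Fair' then nscale(i) = 4;
    else if scale(i) = 'Poor' then nscale(i) = 5;
    /* anything else, including blank, stays missing */
  end;
  drop i;
run;

proc print data=scale_num;
run;
```

Using dim(scale) instead of a hard-coded 3 is just a convenience, so the loop bound follows the array definition if variables are added later.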

picture format notes

An email I just wrote to a colleague about picture formats:
There’s something tricky about picture formats. Whichever digit selector you specify (a 9 or a 0), SAS will display the digits to the right of the decimal place. Unfortunately, it’s the kind of thing where you have to know your data: if we had both really small and really large numbers, I’d have to write more complex arguments for my proc format. It isn’t “data driven” in any way.
A non-zero digit selector such as 9 is simply a placeholder: if the data value has no digit in that position, SAS prints a zero. A zero digit selector, on the other hand, prints a digit only if the value actually has one in that position, so leading zeros are suppressed.
If I had said

picture pctpic
low-high='0,099.9%';

instead of

picture pctpic
low-high='0,009.9%';

I would have seen 04.1% instead of 4.1%.
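A quick way to compare the two pictures side by side is to define both and write one value to the log. This is only a sketch: pctpicz is a hypothetical name I am giving the variant with the extra 9 so that both formats can exist at once, and 4.1 is the value from the example above.

```sas
proc format;
  picture pctpic  low-high = '0,009.9%';  /* the picture actually used   */
  picture pctpicz low-high = '0,099.9%';  /* hypothetical variant name   */
run;

data _null_;
  x = 4.1;
  /* Writes the same value to the log under each picture */
  put x= pctpic. x= pctpicz.;
run;
```

Per the note above, the first should show 4.1% and the second 04.1%, since the extra 9 forces a digit in the tens position.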

character function examples

********************************************************************;
* read in syslog files ;
* ------------------------------------------------------------------;
data mail (drop=id id2 id3 rectype spamscore) spam (keep=squid spamscore); infile r3 pad missover dlm=' ,';
length  mon $3.
day 3
time $8.
host $30.
rectype $6.
id $2.
id2 $2.
id3 $2.
squid $18. /* SQuID = SendMail Queue ID # */
to $40;    /* Destination email address   */
input mon day time host rectype @;
if rectype="sm-mta" then do;
* --------------------------------------------------;
* $char informat reads in entire string, including  ;
* blanks and punctuation.                           ;
* --------------------------------------------------;
input id id2 id3 squid @ 'to=' to $char50.;
* --------------------------------------------------;
* functions on squid effectively remove a trailing  ;
* colon (e.g. k3I400V6014162:)                      ;
* --------------------------------------------------;
squid=substr(squid,1,(indexc(squid,':')-1)) ;
* --------------------------------------------------;
* make 'to' lowercase, remove any blanks, and stop  ;
* reading the line when encountering a comma,       ;
* keeping only the first forwarding destination.    ;
* (e.g. to=First Last ,fl@user.edu) ;
* --------------------------------------------------;
to=lowcase(compress(scan(to,1,',')));
* --------------------------------------------------;
* extract email addresses that are enclosed in      ;
* angle brackets (e.g. to=<...>)                    ;
* --------------------------------------------------;
if index(to,'<') > 0 then
to=scan(substr(to,index(to,'<')+1),1,'>');
* --------------------------------------------------;
* extract the information to the right of the colon ;
* e.g. to=<...>                                     ;
* --------------------------------------------------;
if index(to,':') > 0 then
to=scan(substr(to,index(to,':')+1),1,'>');
output mail; /* legitimate email */
end;
else if rectype="mimede" then do;
input @ 'MDLOG,' squid
@ 'spammish,' spamscore;
output spam;
end;
run;
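To try the character functions without a syslog file, here is a minimal sketch that applies the same calls to hard-coded strings. The variable raw and the address in it are hypothetical, and the bracket step assumes the address is wrapped in < and >; the squid value is the one from the comment above.

```sas
data _null_;
  length squid $18 raw $60 to $40;

  /* Remove the trailing colon from the queue ID */
  squid = 'k3I400V6014162:';
  squid = substr(squid, 1, indexc(squid, ':') - 1);

  /* Hypothetical 'to=' value with two destinations */
  raw = 'Jane Doe <Jane.Doe@Example.edu>,jd@user.edu';

  /* Keep the first destination, strip blanks, lowercase */
  to = lowcase(compress(scan(raw, 1, ',')));

  /* Pull the address out of the angle brackets, if there are any */
  if index(to, '<') > 0 then
    to = scan(substr(to, index(to, '<') + 1), 1, '>');

  put squid= to=;
run;
```

One thing to watch: if squid contains no colon, indexc returns 0 and the substr call gets a length of -1, which SAS reports as an invalid argument. That is fine for these records, where the queue ID always ends in a colon, but it is worth a guard on messier input.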