SAS IF-THEN/ELSE did not recognize Roman numerals as values

I have written the following SAS DATA step to assign grades based on salary:

Not working: this displays only 'I' for all observations, ignoring the else branch, which should assign a Grade of 'II' to any salary below 35000.

DATA Test (Keep = FirstName LastName Salary Grade);
set orion.sales;
if salary > 35000 then grade = 'I';    /* Did not interpret 'I' */
else grade = 'II';    /* Did not interpret 'II' */
run;

proc print data = test;
run;

Working:

DATA Test (Keep = FirstName LastName Salary Grade);
set orion.sales;
if salary > 35000 then grade = 'G1';    /* interprets 'G1' */
else grade = 'G2';    /* interprets 'G2' */
run;

proc print data = test;
run;

So basically, SAS did not interpret the Roman numerals. Does anyone have an idea why this is happening, or a workaround to display Roman numerals?


SAS fixes the type and length of a variable, i.e. it allocates a slot for it in the program data vector, at the first reference to the variable that the compiler sees.

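In your code the first reference is if (condition) then grade = 'I'; so grade is defined as a character variable of length 1, and the later 'II' is truncated to 'I'.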
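A minimal sketch of that compile-time behaviour, assuming only base SAS. The dataset name demo and the variable grade_len are made up for illustration. VLENGTH returns the length the compiler allocated for a variable, so it should report 1 on every row, including the row where 'II' was assigned.

```sas
data demo;
    do salary = 20000, 50000;
        if salary > 35000 then grade = 'I';   /* first reference: length fixed at 1 */
        else grade = 'II';                    /* does not fit, second character is dropped */
        grade_len = vlength(grade);           /* compile-time length of grade */
        output;
    end;
run;

proc print data=demo; run;
```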

Workarounds

  1. Your own solution: make every assigned value the same length.
  2. Make the first reference to the variable a LENGTH statement, e.g. length grade $ 2;
  3. Assign the longest value first.
  4. Make the first reference a statement that never executes but is still seen by the compiler, e.g. if 0 then grade = 'zz';

You can confirm the issue, which also pops up when concatenating datasets, by looking at variable properties with proc contents or proc datasets.
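The concatenation case can be reproduced with a small sketch like the one below. The dataset names one, two, and stacked and the variable code are hypothetical, chosen only for illustration. Because one is listed first on the SET statement, code takes its length from that dataset.

```sas
data one;
    code = 'I';        /* character, length 1 */
run;

data two;
    code = 'III';      /* character, length 3 */
run;

/* one is read first, so code keeps length 1 and 'III' is truncated */
data stacked;
    set one two;
run;

proc contents data=stacked; run;
proc print data=stacked; run;
```

Listing two first on the SET statement, or placing a LENGTH statement before SET, should avoid the truncation. SAS typically also writes a warning about multiple lengths to the log in this situation.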

From the documentation:

Note: When you create character variables, SAS determines the length of the variable from its first occurrence in the DATA step. Therefore, you must allow for the longest possible value in the first statement that mentions the variable. If you do not assign the longest value the first time the variable is assigned, then data can be truncated.

http://support.sas.com/documentation/cdl/en/basess/58133/HTML/default/viewer.htm#a001336069.htm

Sample code:

data a;
    input firstname $ lastname $ salary;
    cards;
Billy Bob 20000
Bob Graham 50000
Graham Alice 30000
Alice Graham 40000
;

proc print data=a; run;

*Did not work;
DATA test;
set a;
if salary < 35000 then grade = 'I';
else grade = 'II';
run; proc print data = test; run;

*Worked;
DATA test;
length grade $ 2;
set a;
if salary<35000 then grade = 'I';
else grade = 'II';
run; proc print data = test; run;

*Worked;
DATA test;
set a;
if salary>=35000 then grade = 'II';
else grade = 'I';
run; proc print data = test; run;

*Worked;
DATA test;
if 0 then grade='ZZ';
set a;
if salary<35000 then grade = 'I';
else grade = 'II';
run; proc print data = test; run;

Also, as I stated, you can run PROC CONTENTS DATA=TEST; RUN; after each data step to confirm the data length of each column.
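As an alternative to PROC CONTENTS, the same check can be sketched with a query against the DICTIONARY.COLUMNS table. This assumes the dataset was created as WORK.TEST, as in the sample code above. Library and member names are stored in uppercase there.

```sas
proc sql;
    /* one row per variable in WORK.TEST, with its type and stored length */
    select name, type, length
    from dictionary.columns
    where libname = 'WORK' and memname = 'TEST';
quit;
```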
