Potential employers frequently ask you questions to gauge your comfort level with SAS when you apply for a job that uses these tools. In this article, we provide the top 20+ SAS interview questions and answers so you can prepare for your next interview.
SAS Interview Questions and Answers for Freshers
1. What is SAS?
SAS (Statistical Analysis System) is a software suite used for advanced analytics, business intelligence, data management, and predictive modeling.
2. What are the components of SAS?
SAS consists of Base SAS (which includes the DATA step and procedures such as PROC SQL), SAS/STAT, SAS/GRAPH, SAS/ETS, SAS/IML, SAS/ACCESS, and more.
3. Differentiate between ‘PROC MEANS’ and ‘PROC SUMMARY’.
Both procedures compute summary statistics such as the mean, median, and standard deviation. By default, ‘PROC MEANS’ prints the statistics to the output window, whereas ‘PROC SUMMARY’ prints nothing unless you add the ‘PRINT’ option and is typically used to write the statistics to an output dataset.
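As a rough sketch of that difference, the two steps below compute the same statistics on the SASHELP.CLASS sample table that ships with SAS. The output dataset name class_stats and the statistic names are hypothetical.

```sas
/* PROC MEANS prints the requested statistics by default */
proc means data=sashelp.class mean median std;
  var height;
run;

/* PROC SUMMARY prints nothing by default; the results go to a dataset */
proc summary data=sashelp.class;
  var height;
  output out=class_stats mean=mean_height std=std_height;
run;
```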
4. What is the difference between ‘IN=’ and ‘WHERE’ in the SAS data step?
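‘IN=’ is a data set option that creates a temporary 0/1 variable showing whether a given input dataset contributed to the current observation. It is mostly used with ‘MERGE’ or ‘SET’ to keep matched or unmatched records. ‘WHERE’ (as a statement or as the ‘WHERE=’ data set option) subsets observations based on a condition before they are read into the data step.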
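A minimal sketch of the two working together in a merge is shown below. The dataset names, variables, and values are made up for illustration.

```sas
data customers;
  input id name $;
  datalines;
1 Asha
2 Ravi
3 Meena
;
run;

data orders;
  input id amount;
  datalines;
1 250
3 75
3 40
;
run;

/* Both inputs are already in id order, so no PROC SORT is needed here */
data matched;
  merge customers (in=in_cust)
        orders    (in=in_ord where=(amount > 50));  /* WHERE= filters rows as they are read */
  by id;
  if in_cust and in_ord;   /* IN= flags: keep only ids found in both datasets */
run;
```

With this sample data, only ids 1 and 3 should survive, because id 2 has no order and the 40 order is filtered out before the merge.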
5. Explain the difference between ‘DROP’ and ‘KEEP’ statements.
The ‘KEEP’ statement writes only the listed variables to the output dataset, whereas the ‘DROP’ statement excludes the listed variables from the output dataset.
6. Why do you use the ‘RETAIN’ statement in SAS?
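The ‘RETAIN’ statement carries a variable’s value over from one data step iteration to the next. Without it, variables created by an assignment or ‘INPUT’ statement are reset to missing at the start of each iteration.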
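A running total is a common use. The sketch below assumes a small made-up dataset named sales.

```sas
data sales;
  input amount;
  datalines;
10
20
30
;
run;

data running;
  set sales;
  retain total 0;          /* total starts at 0 and keeps its value across iterations */
  total = total + amount;  /* 10, 30, 60 for the three rows above */
run;
```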
7. What is the difference between ‘FORMAT’ and ‘INFORMAT’ in SAS?
‘INFORMAT’ tells SAS how to read raw data into variables, whereas ‘FORMAT’ controls how variable values are displayed in reports, graphs, and printed output. A format changes only the appearance of a value, not the value that is stored.
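For illustration, the sketch below reads dates and comma-punctuated amounts with informats and then displays them with formats. The dataset and variable names are hypothetical.

```sas
data visits;
  /* informats (after the colon) describe how the raw text is read */
  input visit_date :date9. cost :comma8.;
  /* formats describe how the stored numbers are displayed */
  format visit_date ddmmyy10. cost dollar10.2;
  datalines;
01JAN2024 1,250
15FEB2024 980
;
run;

proc print data=visits;
run;
```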
8. Explain the concept of the ‘BY’ statement in SAS.
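The ‘BY’ statement tells SAS to process a dataset in groups defined by the values of one or more variables. The data must first be sorted (usually with ‘PROC SORT’) or indexed by those variables. In a data step, the ‘BY’ statement also creates the temporary ‘FIRST.’ and ‘LAST.’ variables, which mark the first and last observation of each group.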
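The sketch below shows BY-group processing with FIRST. and LAST. variables to total points per team. The dataset and variable names are illustrative only.

```sas
data scores;
  input team $ points;
  datalines;
B 4
A 7
A 3
B 6
;
run;

proc sort data=scores;
  by team;
run;

data team_totals;
  set scores;
  by team;
  if first.team then total = 0;   /* reset at the start of each group */
  total + points;                 /* sum statement: total is retained automatically */
  if last.team then output;       /* one row per team */
  keep team total;
run;
```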
9. What is a macro in SAS?
Macros in SAS are named blocks of code that are invoked by name, much like functions in other programming languages. They are used to parameterize programs, automate repetitive tasks, and produce reusable code.
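As a small sketch, the hypothetical macro print_top below takes a dataset name and a row count as parameters and prints the first rows of that dataset.

```sas
%macro print_top(ds=, n=5);
  /* &ds and &n are replaced with the parameter values before the step runs */
  proc print data=&ds (obs=&n);
  run;
%mend print_top;

/* Print the first 3 rows of the SASHELP.CLASS sample table */
%print_top(ds=sashelp.class, n=3)
```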
10. How do you debug SAS code?
SAS offers several debugging techniques: the ‘PUT’ statement for writing variable values to the log; ‘OPTIONS MPRINT’ for displaying the code generated by macros; ‘PROC PRINT’ for examining datasets at different stages; and reading the log for errors, warnings, and notes.
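A quick sketch of two of these techniques, using the SASHELP.CLASS sample table:

```sas
options mprint;   /* show the SAS code generated by macros in the log */

data _null_;
  set sashelp.class (obs=3);
  put 'Checking: ' name= age=;   /* writes name=... age=... to the log for each row */
run;
```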
11. Explain the difference between ‘PROC SQL’ and ‘DATA Step’ in SAS.
The ‘DATA Step’ reads, transforms, and writes data one observation at a time using SAS programming statements, whereas ‘PROC SQL’ lets you query, join, and manipulate datasets using SQL syntax.
12. How do you handle missing values in SAS?
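Functions like ‘COALESCE’, ‘IFN’, and ‘MISSING’ can be used to detect or replace missing values. Most statistical functions and procedures ignore missing values in their calculations. In ‘PROC MEANS’ or ‘PROC SUMMARY’, the ‘MISSING’ option treats missing values of the classification variables as a valid group instead of excluding those observations.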
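A minimal sketch of flagging and replacing missing values follows. The dataset raw and the choice of 0 as the replacement value are assumptions for illustration.

```sas
data raw;
  input id score;
  datalines;
1 80
2 .
3 65
;
run;

data cleaned;
  set raw;
  is_missing   = missing(score);       /* 1 if score is missing, otherwise 0 */
  score_filled = coalesce(score, 0);   /* first non-missing argument */
run;
```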
13. What are the implicit and explicit data conversions in SAS?
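Implicit conversion happens automatically when a value of one type is used where the other type is expected, for example a character variable in an arithmetic expression; SAS then writes a note to the log. Explicit conversion is done by the programmer using functions such as ‘INPUT’ (character to numeric) and ‘PUT’ or ‘PUTN’ (numeric to character).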
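The sketch below contrasts the two kinds of conversion; the variable names are made up.

```sas
data convert;
  char_num = '123';
  num_val  = input(char_num, 8.);   /* explicit: character -> numeric */
  char_val = put(num_val, 8.);      /* explicit: numeric -> character (right-aligned) */
  implicit = char_num * 2;          /* implicit: SAS converts and writes a note to the log */
run;
```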
14. What is the significance of the ‘ODS’ statement in SAS?
The ‘ODS’ (Output Delivery System) statement controls where and in what form SAS output is produced. It routes output to destinations such as HTML, PDF, and Excel.
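As a short sketch, the step below sends a report to an HTML file. The file name class_report.html is hypothetical and is written to the current working directory.

```sas
ods html file='class_report.html';   /* open the HTML destination */

proc print data=sashelp.class (obs=5);
run;

ods html close;                      /* close the destination to finish the file */
```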
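15. How do you create a new variable in SAS?
In the ‘DATA’ step, new variables are usually created with assignment statements, often combined with functions and ‘IF-THEN/ELSE’ logic. Statements such as ‘LENGTH’, ‘INPUT’, and ‘RETAIN’ also create variables. In ‘PROC SQL’, a new column can be computed in the ‘SELECT’ clause, for example with a ‘CASE’ expression.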
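A minimal sketch using the SASHELP.CLASS sample table follows. The new variable names and the age cutoff are illustrative assumptions.

```sas
data class_new;
  set sashelp.class;
  ratio = weight / height;        /* new numeric variable from an assignment */
  length age_group $8;            /* declare the character variable's length first */
  if age < 13 then age_group = 'child';
  else age_group = 'teen';
run;
```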
16. What are SAS formats and informats?
While informats tell SAS how to read data into variables from external sources, SAS formats regulate how data values appear in the output.
17. What is the use of the ‘LENGTH’ statement in SAS?
The ‘LENGTH’ statement specifies the number of bytes SAS uses to store a variable. It is used mainly with character variables, where it prevents values from being truncated.
18. How do you handle duplicate records in SAS?
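‘PROC SORT’ with the ‘NODUPKEY’ option removes observations with duplicate ‘BY’ values, and the ‘NODUPRECS’ option (alias ‘NODUP’) removes adjacent observations that are identical in every variable. In ‘PROC SQL’, the ‘DISTINCT’ keyword returns only unique rows.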
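The two approaches can be sketched as follows, using a small made-up dataset named visits.

```sas
data visits;
  input id visit $;
  datalines;
1 A
1 A
2 B
2 C
;
run;

/* Keep one observation per id (2 rows remain) */
proc sort data=visits out=unique_ids nodupkey;
  by id;
run;

/* Keep rows that are unique across all columns (3 rows remain) */
proc sql;
  create table unique_rows as
  select distinct * from visits;
quit;
```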
19. Explain the concept of indexing in SAS.
In SAS, an index is an auxiliary structure built on one or more variables of a dataset so that SAS can locate observations without reading the whole dataset. It can speed up data access when subsetting with ‘WHERE’, joining, or merging datasets.
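As a sketch, the steps below create a simple index on a copy of the SASHELP.CLASS sample table and then subset by the indexed variable. The dataset names are hypothetical, and SAS decides at run time whether using the index is worthwhile.

```sas
data work.class_copy;
  set sashelp.class;
run;

/* Create a simple index on the variable name */
proc datasets library=work nolist;
  modify class_copy;
  index create name;
quit;

/* A WHERE condition on the indexed variable can use the index */
data one_student;
  set work.class_copy;
  where name = 'Alice';
run;
```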
SAS Interview Questions and Answers for Experienced Professionals
20. How do you include or exclude specific variables in a data set?
Use the DROP and KEEP statements or the DROP= and KEEP= data set options.
DROP, KEEP Statements
The DROP statement lists the variables that you want to exclude from the data set.
data readin1;
set readin;
drop score;
run;
The variables that you wish to keep from the data set are listed in the KEEP statement.
data readin1;
set readin;
keep var1;
run;
DROP=, KEEP= Data Set Options
The main difference is that the DROP and KEEP statements cannot be used in procedures, whereas the DROP= and KEEP= data set options can.
data readin1 (drop=score);
set readin;
run;
data readin1 (keep=var1);
set readin;
run;
21. How can you print observations 5 through 10 of a data set?
The FIRSTOBS= and OBS= data set options tell SAS to print observations 5 through 10 from the data set READIN.
proc print data = readin (firstobs=5 obs=10);
run;
22. Differentiate between INFILE and INPUT
The INFILE statement identifies the external file to read, while the INPUT statement describes the variables to read from it.
/* INFILE reads raw text, so the file is assumed to hold space-delimited text */
FILENAME TEST 'C:\DEEP\File1.xls';
DATA READIN;
INFILE TEST;
LENGTH NAME $25;
INPUT ID NAME $ GENDER;
RUN;
A dollar sign ($) after a variable name identifies it as a character variable. In the example above, NAME is a character variable, and ID and GENDER are numeric variables.
23. Differentiate between MISSOVER and TRUNCOVER in SAS
Missover: When the MISSOVER option is used on the INFILE statement, the INPUT statement does not jump to the next line when it reads a short line. Instead, MISSOVER sets the variables that have no complete value to missing.
Truncover: When a value is shorter than the length the INPUT statement expects, TRUNCOVER still assigns the raw data value to the variable instead of setting it to missing.
Missover – Example: Suppose an external file contains the following data:
1
22
333
4444
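In this DATA step, the INPUT statement reads a single field from each raw data record with the 4. numeric informat and assigns it to the variable ID. Only the last record is four characters long, so MISSOVER sets ID to missing for the first three.
data readin;
infile 'external-file' missover;
input ID 4.;
run;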
proc print data=readin;
run;
Output
Obs ID
1 .
2 .
3 .
4 4444
Truncover – Example
data readin;
infile 'external-file' truncover;
input ID 4.;
run;
proc print data=readin;
run;
Output
Obs ID
1 1
2 22
3 333
4 4444
24. What is the code for generating a data set with 100 observations drawn from a normal distribution with mean 0 and standard deviation 1?
data readin;
do i=1 to 100;
temp=0 + rannor(1) * 1;
output;
end;
run;
proc means data=readin mean stddev;
var temp;
run;
25. How do you label values and use the labels in PROC FREQ?
Define a format with PROC FORMAT, then apply it with a FORMAT statement in PROC FREQ.
proc format;
value score 0 - 100 = '100-'
101 - 200 = '101+'
other = 'others'
;
run;
proc freq data=readin;
tables outdata;
format outdata score.;
run;
26. How can you recode a group of variables using arrays?
Suppose the variables Q1, Q2, Q3, ..., Q20 must all be recoded in the same way: if a variable’s value is 6, set it to SAS missing.
data readin;
set outdata;
array Q(20) Q1-Q20;
do i=1 to 20;
if Q(i)=6 then Q(i)=.;
end;
run;
Conclusion
You may come across the 20+ SAS interview questions and answers listed above in your interviews. Gain expertise with in-depth SAS skills by enrolling in our SAS training in Chennai.