##### QUESTION 1:

A financial services manager wants to assess the probability that certain clients will default on their Home Equity Line of Credit (HELOC).

A former employee left the code listed below.

The training data set is named HELOC, while a similar data set of more recent clients is named RECENT_HELOC.

Which SAS data steps will calculate the predicted probability of default on recent clients? (Choose two.)

A. Option A
B. Option B
C. Option C
D. Option D

##### QUESTION 2:

An analyst generates a model using the LOGISTIC procedure. They are now interested in getting the sensitivity and specificity statistics on a validation data set for a variety of cutoff values.

Which statement and option combination will generate these statistics?

A. Score data=valid1 out=roc;
B. Score data=valid1 outroc=roc;
C. model resp(event='1') = gender region / outroc=roc;
D. model resp(event='1') = gender region / out=roc;

##### QUESTION 3:

The selection criterion used in the forward selection method in the GLMSELECT procedure is:

A. RSQ
B. MSE
C. R-squared
D. AIC

##### QUESTION 4:

When mean imputation is performed on data after the data is partitioned for an honest assessment, what is the most appropriate method for handling the mean imputation?

A. The sample means from the validation data set are applied to the training and test data sets.
B. The sample means from the training data set are applied to the validation and test data sets.
C. The sample means from the test data set are applied to the training and validation data sets.
D. The sample means from each partition of the data are applied to their own partition.
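As an illustration of the mechanics this question is about, here is a minimal sketch of mean imputation across partitions. It assumes the common honest-assessment practice of estimating imputation values on the training partition only and reusing them elsewhere, so that no information from the held-out data leaks into model building. The helper names (`column_means`, `impute`), the `income` column, and all values are hypothetical.

```python
from statistics import mean


def column_means(rows, columns):
    """Mean of each listed column, ignoring missing (None) values."""
    return {c: mean(r[c] for r in rows if r[c] is not None) for c in columns}


def impute(rows, means):
    """Return copies of the rows with None replaced by the supplied means."""
    return [
        {c: (means[c] if v is None and c in means else v) for c, v in r.items()}
        for r in rows
    ]


# Hypothetical partitions with one numeric input.
train = [{"income": 40.0}, {"income": 60.0}, {"income": None}]
valid = [{"income": None}, {"income": 100.0}]

# Imputation values are estimated on the training partition only ...
train_means = column_means(train, ["income"])

# ... and then applied unchanged to every partition.
train_imputed = impute(train, train_means)
valid_imputed = impute(valid, train_means)
```

The validation rows never contribute to `train_means`, which is what keeps the later assessment on that partition honest.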

##### QUESTION 5:

This question will ask you to provide a missing option.

Complete the following syntax to test the homogeneity of variance assumption in the GLM procedure:

means Region / ________=levene;

A. test
C. var
D. hovtest

##### QUESTION 6:

Refer to the exhibit:

Which statement is true, based on the plots above?

A. Approximately twice as many customers with the top ten percent of predicted probabilities are expected to have a positive versus negative event.

B. Approximately ten percent of a randomly selected subset of twenty percent of the customers are expected to have a positive event.

C. Approximately twenty percent of the customers with a predicted score of 3 have a positive predicted class.

D. Approximately ten percent of those customers with the top twenty percent of predicted probabilities are expected to have a positive event.

##### QUESTION 7:

This question will ask you to provide a missing option.

A business analyst is investigating the differences in sales figures across 8 sales regions. The analyst is interested in viewing the regression equation parameter estimates for each of the design variables.

Which option completes the program to produce the regression equation parameter estimates?

A. Solve
B. Estimate
C. Solution
D. Est

##### QUESTION 8:

Refer to the exhibit:

SAS output from the RSQUARE selection method, within the REG procedure, is shown. The top two models in each subset are given.

Based on the AIC statistic, which model is the champion model?

A. Age Weight RunTime RunPulse MaxPulse
B. Age Weight RunTime RunPulse RestPulse MaxPulse
C. RestPulse
D. RunTime

##### QUESTION 9:

Refer to the confusion matrix:

Calculate the sensitivity. (0 – negative outcome, 1 – positive outcome)


A. 25/48
B. 58/102
C. 25/89
D. 58/81
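Because the confusion matrix itself is not reproduced here, the sketch below only shows how sensitivity and specificity are defined in terms of the four cells. The counts are hypothetical and are not taken from the exhibit.

```python
def sensitivity(tp, fn):
    """True positive rate: correctly predicted positives / all actual positives."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """True negative rate: correctly predicted negatives / all actual negatives."""
    return tn / (tn + fp)


# Hypothetical cell counts (1 = positive outcome, 0 = negative outcome).
tp, fn = 30, 10   # actual positives: predicted 1, predicted 0
tn, fp = 50, 10   # actual negatives: predicted 0, predicted 1

print(sensitivity(tp, fn))            # 0.75
print(round(specificity(tn, fp), 3))  # 0.833
```

Note that the denominator of sensitivity is the count of actual positives (a row or column total, depending on how the matrix is laid out), not the count of predicted positives.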

##### QUESTION 10:

An analyst fits a logistic regression model to predict whether or not a client will default on a loan. One of the predictors in the model is the agent, and each agent serves 15-20 clients. The model fails to converge.

The analyst prints the summarized data, showing the number of defaulted loans per agent.

See the partial output below:

What is the most likely reason that the model fails to converge?

A. There is quasi-complete separation in the data.
B. There is collinearity among the predictors.
C. There are missing values in the data.
D. There are too many observations in the data.

##### QUESTION 11:

What is the default method in the LOGISTIC procedure to handle observations with missing data?

A. Missing values are imputed.
B. Parameters are estimated accounting for the missing values.
C. Parameter estimates are made on all available data.
D. Only cases with variables that are fully populated are used.

##### QUESTION 12:

FILL BLANK

Refer to the confusion matrix:

An analyst determines that loan defaults occur at the rate of 3% in the overall population. The above confusion matrix is from an oversampled test set (1 = default).

What is the sensitivity adjusted for the population event probability?

Enter your answer in the space below. Round to three decimals (example: n.nnn).
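One way to reason about this kind of adjustment is sketched below. It assumes the oversampling was done separately within each outcome class, so that the within-class rates (sensitivity and specificity) estimated from the sample carry over to the population and only the class proportions need reweighting. The function name `adjusted_cells` and the counts are hypothetical and are not taken from the exhibit.

```python
def adjusted_cells(tp, fn, tn, fp, prior):
    """Rescale confusion-matrix cells to population proportions.

    prior is the population event probability. Assumes separate sampling,
    so sensitivity and specificity are estimated within each actual class.
    """
    se = tp / (tp + fn)
    sp = tn / (tn + fp)
    return {
        "tp": prior * se,
        "fn": prior * (1 - se),
        "tn": (1 - prior) * sp,
        "fp": (1 - prior) * (1 - sp),
    }


# Hypothetical oversampled counts and a 3% population event rate.
cells = adjusted_cells(tp=30, fn=10, tn=50, fp=10, prior=0.03)

adj_sensitivity = cells["tp"] / (cells["tp"] + cells["fn"])
adj_precision = cells["tp"] / (cells["tp"] + cells["fp"])
```

Under this assumption the prior cancels out of the sensitivity ratio, while statistics that mix the two classes, such as precision or accuracy, do change after the adjustment.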

##### QUESTION 13:

Refer to the exhibit:

On the Gains Chart, what is the correct interpretation of the horizontal reference line?

A. the proportion of cases that cannot be classified
B. the probability of a false negative
C. the probability of a false positive
D. the prior event rate