Quick Question
What is the average age of patients in the training set, ClaimsTrain?
What proportion of people in the training set (ClaimsTrain) had at least one diagnosis code for diabetes?
Explanation
Both answers can be read from the output of summary(ClaimsTrain). The mean age is listed under the age variable. Because diabetes is a binary (0/1) variable, its mean equals the proportion of people with at least one diagnosis code for diabetes.
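Alternatively, you could use the mean, table, and nrow functions. The second command below returns the proportion for each value of diabetes, and the entry under 1 is the answer: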
mean(ClaimsTrain$age)
table(ClaimsTrain$diabetes)/nrow(ClaimsTrain)
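
To see why the mean of a binary variable and the table/nrow calculation agree, here is a minimal sketch on a small made-up data frame. The name toy and its values are hypothetical stand-ins for ClaimsTrain, not data from the course.

```r
# Hypothetical toy data standing in for ClaimsTrain (values are made up).
toy <- data.frame(
  age      = c(65, 72, 80, 58, 90),
  diabetes = c(1, 0, 1, 0, 0)
)

# Average age.
mean_age <- mean(toy$age)

# Proportion with diabetes, computed two ways.
prop_by_mean  <- mean(toy$diabetes)                # mean of a 0/1 variable
prop_by_table <- table(toy$diabetes) / nrow(toy)   # proportions for 0 and for 1

print(mean_age)
print(prop_by_mean)
print(prop_by_table)
```

In the printed table, the entry under 1 should match the value from mean(toy$diabetes), since summing a 0/1 variable counts the 1s and dividing by the number of rows turns that count into a proportion.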