Extract k most important variables in a random forest — important_variables
Get the names of the k variables with the highest sum of rankings across the specified importance measures.
Usage
important_variables(
  importance_frame,
  k = 15,
  measures = names(importance_frame)[2:min(5, ncol(importance_frame))],
  ties_action = "all"
)
Arguments
- importance_frame
The result of applying measure_importance() to a random forest, or a randomForest object
- k
The number of variables to extract
- measures
A character vector specifying the measures of importance to be used
- ties_action
One of "none", "all" or "draw"; specifies which variables to pick when ties occur at the k-th position. With "none" the result may contain fewer than k variables, with "all" it may contain more, and with "draw" it contains exactly k.
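The three tie-handling options can be illustrated with a small base R sketch. This is not the package's implementation: `pick_top()` and the `score` vector are hypothetical, and the sketch assumes that a higher score means a more important variable.

```r
# Hypothetical combined scores; higher means more important in this sketch.
score <- c(a = 10, b = 8, c = 8, d = 8, e = 3)
k <- 2

# Hypothetical helper mimicking the three tie-handling options.
pick_top <- function(score, k, ties_action = c("all", "none", "draw")) {
  ties_action <- match.arg(ties_action)
  ordered <- sort(score, decreasing = TRUE)
  k <- min(k, length(ordered))
  cutoff <- ordered[[k]]                     # score of the k-th variable
  above <- names(ordered)[ordered > cutoff]  # strictly better than the cut-off
  tied <- names(ordered)[ordered == cutoff]  # tied at the cut-off
  if (length(above) + length(tied) == k) {
    return(c(above, tied))                   # no tie crosses the cut-off
  }
  switch(ties_action,
    none = above,                            # drop all tied variables
    all  = c(above, tied),                   # keep all tied variables
    draw = c(above, tied[sample.int(length(tied), k - length(above))])
  )
}

pick_top(score, k, "none")          # "a"
pick_top(score, k, "all")           # "a" "b" "c" "d"
length(pick_top(score, k, "draw"))  # 2
```

Here b, c and d are tied at the second position, so "none" returns one variable, "all" returns four, and "draw" picks one of the tied variables at random to return exactly two.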
Value
A character vector with the names of the k variables with the highest sum of rankings (possibly fewer or more than k when ties occur, depending on ties_action)
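The idea of a sum of rankings can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's code: the data frame `toy` and its columns `measure_a` and `measure_b` are made up, and in this sketch a higher value of every measure is assumed to mean a more important variable.

```r
# Hypothetical importance frame with two made-up measures
# (higher = more important in this sketch).
toy <- data.frame(
  variable  = c("x1", "x2", "x3", "x4"),
  measure_a = c(0.9, 0.4, 0.7, 0.1),
  measure_b = c(40, 10, 30, 20),
  stringsAsFactors = FALSE
)
measures <- c("measure_a", "measure_b")

# Rank within each measure: the largest value receives the largest rank.
ranks <- sapply(toy[measures], rank)

# Sum the rankings across measures and keep the k variables with the highest sum.
rank_sum <- rowSums(ranks)
k <- 2
top_k <- toy$variable[order(rank_sum, decreasing = TRUE)][seq_len(k)]
top_k  # "x1" "x3"
```

The rank sums are 8, 3, 6 and 3 for x1 to x4, so x1 and x3 are selected. Restricting `measures` to a single column would reduce the selection to a ranking by that measure alone.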
Examples
forest <- randomForest::randomForest(Species ~ ., data = iris, localImp = TRUE, ntree = 300)
important_variables(measure_importance(forest), k = 2)
#> [1] "Petal.Width" "Petal.Length"