Importance of variables in a random forest — measure_importance
Get a data frame with various measures of importance of variables in a random forest
measure_importance(forest, mean_sample = "top_trees", measures = NULL)
Arguments
- forest
A random forest produced by the function randomForest with option localImp = TRUE
- mean_sample
The sample of trees on which mean minimal depth is calculated; possible values are "all_trees", "top_trees", "relevant_trees"
- measures
A vector of names of importance measures to be calculated - if equal to NULL then all are calculated; if "p_value" is to be calculated then "no_of_nodes" will be too. Suitable measures for classification forests are: mean_min_depth, accuracy_decrease, gini_decrease, no_of_nodes, times_a_root. For regression forests choose from: mean_min_depth, mse_increase, node_purity_increase, no_of_nodes, times_a_root.
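The arguments described above can be combined to compute only a subset of measures. The following is a minimal sketch, assuming the randomForest and randomForestExplainer packages are installed; the seed value and the particular choice of measures are illustrative assumptions, not recommendations.

```r
library(randomForest)
library(randomForestExplainer)

set.seed(2017)  # hypothetical seed, used only to make the sketch repeatable

# localImp = TRUE is required by measure_importance (see the forest argument)
forest <- randomForest(Species ~ ., data = iris, localImp = TRUE, ntree = 300)

# Request two measures only, with mean minimal depth computed over all trees
importance_subset <- measure_importance(
  forest,
  mean_sample = "all_trees",
  measures = c("mean_min_depth", "times_a_root")
)
print(importance_subset)
```

Restricting measures in this way keeps the returned data frame narrow when only a few measures are of interest.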
Value
A data frame with rows corresponding to variables and columns to various measures of importance of variables
Examples
forest <- randomForest::randomForest(Species ~ ., data = iris, localImp = TRUE, ntree = 300)
measure_importance(forest)
#>       variable mean_min_depth no_of_nodes accuracy_decrease gini_decrease
#> 1 Petal.Length      0.8993289         796       0.332686952     46.623867
#> 2  Petal.Width      1.1048546         721       0.272628994     39.578597
#> 3 Sepal.Length      2.2073714         499       0.039987377     10.525047
#> 4  Sepal.Width      3.2914989         348       0.009315119      2.478699
#>   no_of_trees times_a_root      p_value
#> 1         298          132 2.687474e-21
#> 2         296          107 8.694212e-10
#> 3         251           61 9.999961e-01
#> 4         218            0 1.000000e+00
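Because the result is an ordinary data frame, it can be ranked and filtered with base R. The sketch below rebuilds part of the output shown above by hand so that it runs without fitting a forest; in practice the data frame would come from measure_importance(forest). The 0.05 cutoff is a hypothetical threshold chosen for illustration.

```r
# Values copied from the example output above (a subset of its columns)
importance <- data.frame(
  variable = c("Petal.Length", "Petal.Width", "Sepal.Length", "Sepal.Width"),
  mean_min_depth = c(0.8993289, 1.1048546, 2.2073714, 3.2914989),
  no_of_nodes = c(796, 721, 499, 348),
  times_a_root = c(132, 107, 61, 0),
  p_value = c(2.687474e-21, 8.694212e-10, 9.999961e-01, 1.000000e+00),
  stringsAsFactors = FALSE
)

# Rank variables by mean minimal depth: smaller values mean the variable
# first appears closer to the root, on average
ranked <- importance[order(importance$mean_min_depth), ]

# Keep variables whose p_value falls below the hypothetical 0.05 cutoff
significant <- ranked$variable[ranked$p_value < 0.05]
print(significant)
#> [1] "Petal.Length" "Petal.Width"
```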