@BenHeubl
Created February 13, 2020 16:34
tut14
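# Read in the data from the GitHub repo.
# stringsAsFactors = TRUE is set explicitly: from R 4.0.0 onwards read.csv keeps
# strings as character columns by default, and the is.factor step further down
# relies on them being factors.
income <- read.csv("https://raw.githubusercontent.com/selva86/datasets/master/income.csv",
                   stringsAsFactors = TRUE)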
set.seed(100)
# Shuffle the rows (the seed set above makes the order reproducible):
incomeR <- income[sample(nrow(income)), ]
# Check the column names (see above screenshot)
colnames(incomeR)
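# Here we replace NAs in factor columns with an explicit 'Unknown' level
# (dplyr provides %>% and mutate_if; forcats provides fct_explicit_na)
library(dplyr)
library(forcats)
incomeR <- incomeR %>%
  mutate_if(is.factor, fct_explicit_na, na_level = 'Unknown') %>%
  mutate(INCOME = as.factor(INCOME))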
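# Install the packages only if they are missing, then load them
if (!requireNamespace('randomForest', quietly = TRUE)) install.packages('randomForest')
library(randomForest)
if (!requireNamespace('ggRandomForests', quietly = TRUE)) install.packages('ggRandomForests')
library(ggRandomForests)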
# When running this, bear in mind that it could take a minute or two
model_base <- randomForest(INCOME ~ ., data = incomeR, importance = TRUE)
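# A minimal sketch (not part of the original tutorial) of how a fitted forest
# can be inspected. It uses the built-in iris data so that it runs in seconds;
# the name model_demo is hypothetical, and the same calls should apply to
# model_base once it has finished training.
library(randomForest)
set.seed(100)
model_demo <- randomForest(Species ~ ., data = iris, importance = TRUE)
# Out-of-bag (OOB) error estimate and confusion matrix
print(model_demo)
# Per-variable importance: mean decrease in accuracy and in Gini impurity
importance(model_demo)
# Dot chart of the same importance measures
varImpPlot(model_demo)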