An example of Cubist for prediction in R
July 13, 2015
I was impressed by how quickly Cubist ran and how good the results were.
Below is the code I used, built on the caret and Cubist packages in R.
First, I loaded the caret and Cubist packages:
## Loading required package: caret
## Loading required package: lattice
## Loading required package: ggplot2
## Loading required package: Cubist
Then I read in the data set and checked its dimensions:
predictors <- read.csv("trainPredictors.csv")
predictors <- predictors[, -1]  # drop the first column
outcomes <- read.csv("trainOutcomes.csv")
outcomes <- outcomes[, -1]      # drop the first column
dim(predictors)
## [1] 5000  254
I used caret to split the data into training and test sets, choosing an 80/20 split. I also kept the outcomes separate from the predictors in both sets:
inTrain <- createDataPartition(y = outcomes, p = 0.80)  # returns a list of row indices
inTrain <- unlist(inTrain)                              # flatten to a plain index vector
trainpredictors <- predictors[inTrain, ]
trainoutcomes <- outcomes[inTrain]
testpredictors <- predictors[-inTrain, ]
testoutcomes <- outcomes[-inTrain]
Then I simply ran the model. Notice how quickly it ran:
modelTree <- cubist(x = trainpredictors, y = trainoutcomes)
Next I used that model to predict on the test set.
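The prediction step can be sketched roughly as follows. This is a minimal illustration on made-up data, not the code from the original run: the data frame `x`, the vector `y`, and the names `train_idx`, `model`, and `predictions` are all hypothetical stand-ins, and it assumes the Cubist package is installed.

```r
library(Cubist)  # assumes the Cubist package is installed

# Hypothetical stand-in data; the CSV files used in the post are not available here
set.seed(1)
x <- data.frame(a = runif(200), b = runif(200))
y <- 3 * x$a - 2 * x$b + rnorm(200, sd = 0.1)

# Simple fixed split for illustration: first 160 rows train, last 40 test
train_idx <- 1:160
model <- cubist(x = x[train_idx, ], y = y[train_idx])

# predict() takes the fitted model and the held-out predictors,
# and returns one numeric prediction per test row
predictions <- predict(model, newdata = x[-train_idx, ])
length(predictions)
```

With the names used in this post, the equivalent call would presumably be `predict(modelTree, testpredictors)`.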
Finally, I computed R^2 to see how well it did:
## [1] 0.840342
This is a great result for not much effort!
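For reference, here is one way an R^2 figure like this could be computed in base R. It is a sketch under assumptions: the post does not say which definition of R^2 was used, so both common ones are shown, and the helper names `r_squared_corr` and `r_squared_sse` and the toy vectors are hypothetical.

```r
# Definition 1: squared Pearson correlation between predictions and observations
r_squared_corr <- function(predicted, observed) {
  cor(predicted, observed)^2
}

# Definition 2: one minus residual sum of squares over total sum of squares
r_squared_sse <- function(predicted, observed) {
  1 - sum((observed - predicted)^2) / sum((observed - mean(observed))^2)
}

# Hypothetical toy vectors, only to show the calls
observed  <- c(1.0, 2.0, 3.0, 4.0, 5.0)
predicted <- c(1.1, 1.9, 3.2, 3.8, 5.1)

r_squared_corr(predicted, observed)
r_squared_sse(predicted, observed)
```

The two definitions agree only when the predictions are unbiased and correctly scaled, so it is worth stating which one is being reported. With the names used in this post, the inputs would be the test-set predictions and `testoutcomes`.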