Cubist is a machine learning algorithm for continuous outcomes. It is a rule-based decision tree model that handles missing values automatically. That makes Cubist well suited to baselining the predictive value of your data set: if the data are messy, with a lot of missing values, you do not have to clean them up first. Cubist has become my first-try model for all continuous-outcome data sets.

Cubist was developed by Quinlan, and the R package for Cubist is maintained by Max Kuhn, who also maintains the caret package.

The code for calling a Cubist model is fairly standard for most predictive models in R.

`cubist( x= trainingpredictors, y = trainingoutcomes)`

There are some other elements that help improve the basic Cubist model’s performance, but let’s start with the simple model and go from there. For this example, we are going to use the BostonHousing data set that is contained in the mlbench package. The data comes from a 1978 paper by Harrison and Rubinfeld (“Hedonic Prices and the Demand for Clean Air,” Journal of Environmental Economics and Management, vol. 5, 1978, pp. 81-102). It is a very well-known data set with 506 rows and 14 variables. Let’s look at that data set before we move on to creating and evaluating a predictive model in R.

`require(mlbench)`

`## Loading required package: mlbench`

`require(caret)`

```
## Loading required package: caret
## Loading required package: lattice
## Loading required package: ggplot2
```

`require(Cubist)`

`## Loading required package: Cubist`

```
data(BostonHousing)
dim(BostonHousing)
```

`## [1] 506 14`

`str(BostonHousing)`

```
## 'data.frame': 506 obs. of 14 variables:
## $ crim : num 0.00632 0.02731 0.02729 0.03237 0.06905 ...
## $ zn : num 18 0 0 0 0 0 12.5 12.5 12.5 12.5 ...
## $ indus : num 2.31 7.07 7.07 2.18 2.18 2.18 7.87 7.87 7.87 7.87 ...
## $ chas : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
## $ nox : num 0.538 0.469 0.469 0.458 0.458 0.458 0.524 0.524 0.524 0.524 ...
## $ rm : num 6.58 6.42 7.18 7 7.15 ...
## $ age : num 65.2 78.9 61.1 45.8 54.2 58.7 66.6 96.1 100 85.9 ...
## $ dis : num 4.09 4.97 4.97 6.06 6.06 ...
## $ rad : num 1 2 2 3 3 3 5 5 5 5 ...
## $ tax : num 296 242 242 222 222 222 311 311 311 311 ...
## $ ptratio: num 15.3 17.8 17.8 18.7 18.7 18.7 15.2 15.2 15.2 15.2 ...
## $ b : num 397 397 393 395 397 ...
## $ lstat : num 4.98 9.14 4.03 2.94 5.33 ...
## $ medv : num 24 21.6 34.7 33.4 36.2 28.7 22.9 27.1 16.5 18.9 ...
```

As you can see, it is a data set with 506 rows and 14 columns. All of them are numeric except chas, which is a factor. We are going to try to predict the value of the last column (medv), which is the median value of owner-occupied homes in USD 1000’s. Here is a description of the data in each of the 14 columns:

- crim: crime rate of town
- zn: proportion of residential land zoned for lots over 25,000 sq.ft.
- indus: proportion of non-retail business acres per town
- chas: Charles River dummy variable (= 1 if tract bounds Charles River, = 0 if not)
- nox: nitric oxides concentration in parts per 10 million
- rm: average number of rooms per dwelling
- age: proportion of owner-occupied units built before 1940
- dis: weighted distances to five Boston employment centers
- rad: index of accessibility to radial highways
- tax: full-value property tax per USD 10,000
- ptratio: pupil to teacher ratio per town
- b: 1000(B-0.63)^2 where B is the proportion of African Americans in the town
- lstat: percentage of lower status of the population
- medv: median value of owner-occupied homes in USD 1000’s

Normally when you build a predictive model, you break the data set into two or three data sets: training, test, and holdout. That may differ slightly if you are using cross-validation, but in general I make a training and a test set. Here I will use an 80/20 split. I am also going to make a small modification to the chas variable, converting it from a factor to a numeric 0/1 indicator.

```
BostonHousing$chas <- as.numeric(BostonHousing$chas) - 1
set.seed(1)
inTrain <- sample(1:nrow(BostonHousing), floor(.8*nrow(BostonHousing)))
trainingPredictors <- BostonHousing[ inTrain, -14]
testPredictors <- BostonHousing[-inTrain, -14]
trainingOutcome <- BostonHousing$medv[ inTrain]
testOutcome <- BostonHousing$medv[-inTrain]
```
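The chas conversion above relies on how R stores factors, which is easy to get wrong. The following is a minimal sketch on a made-up toy vector (`chas_factor` and the other names are illustrative, not from the original code) showing why subtracting 1 is needed.

```r
# A factor stores integer level codes (1, 2, ...) behind its labels.
chas_factor <- factor(c("0", "1", "0", "0", "1"))

# as.numeric() on a factor returns the level codes, not the labels: 1 2 1 1 2
codes <- as.numeric(chas_factor)

# Subtracting 1 recovers the 0/1 dummy, as in the chas conversion above.
chas_numeric <- codes - 1

# Alternative that goes through the labels, so it does not depend on level order.
chas_from_labels <- as.numeric(as.character(chas_factor))

identical(chas_numeric, chas_from_labels)
```

Both routes agree here because the levels are "0" and "1" in that order. Going through `as.character()` is the safer habit when the level order is not known in advance.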

Now all we have to do is fit the model, make a prediction, and then evaluate the prediction. Since we are predicting a continuous variable here, we will use root mean squared error (RMSE).

So fit the model:

```
# Fit a single rule-based model on the training split created above
modelTree <- cubist(x = trainingPredictors, y = trainingOutcome)
modelTree
```

```
##
## Call:
## cubist.default(x = trainingPredictors, y = trainingOutcome)
##
## Number of samples: 404
## Number of predictors: 13
##
## Number of committees: 1
## Number of rules: 4
```

Look at the model:

`summary(modelTree)`

```
##
## Call:
## cubist.default(x = trainingPredictors, y = trainingOutcome)
##
##
## Cubist [Release 2.07 GPL Edition] Wed Feb 17 21:19:55 2016
## ---------------------------------
##
## Target attribute `outcome'
##
## Read 404 cases (14 attributes) from undefined.data
##
## Model:
##
## Rule 1: [88 cases, mean 13.81, range 5 to 27.5, est err 2.10]
##
## if
## nox > 0.668
## then
## outcome = 2.07 + 3.14 dis - 0.35 lstat + 18.8 nox + 0.007 b
## - 0.12 ptratio - 0.008 age - 0.02 crim
##
## Rule 2: [153 cases, mean 19.54, range 8.1 to 31, est err 2.16]
##
## if
## nox <= 0.668
## lstat > 9.59
## then
## outcome = 34.81 - 1 dis - 0.72 ptratio - 0.056 age - 0.19 lstat + 1.5 rm
## - 0.11 indus + 0.004 b
##
## Rule 3: [39 cases, mean 24.10, range 11.9 to 50, est err 2.73]
##
## if
## rm <= 6.23
## lstat <= 9.59
## then
## outcome = 11.89 + 3.69 crim - 1.25 lstat + 3.9 rm - 0.0045 tax
## - 0.16 ptratio
##
## Rule 4: [128 cases, mean 31.31, range 16.5 to 50, est err 2.95]
##
## if
## rm > 6.23
## lstat <= 9.59
## then
## outcome = -1.13 + 1.6 crim - 0.93 lstat + 8.6 rm - 0.0141 tax
## - 0.83 ptratio - 0.47 dis - 0.019 age - 1.1 nox
##
##
## Evaluation on training data (404 cases):
##
## Average |error| 2.27
## Relative |error| 0.34
## Correlation coefficient 0.94
##
##
## Attribute usage:
## Conds Model
##
## 78% 100% lstat
## 59% 53% nox
## 41% 78% rm
## 100% ptratio
## 90% age
## 90% dis
## 62% crim
## 59% b
## 41% tax
## 38% indus
##
##
## Time: 0.0 secs
```
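To make the summary easier to read, here is a rough sketch that applies Rule 1 by hand. The function name `rule1_predict` and the example tract are hypothetical. The coefficients are the rounded values printed above, so the result is only approximate, and `predict()` on the fitted model remains the authoritative answer.

```r
# Rule 1 from the summary above, written out by hand.
# Coefficients are the rounded values printed by summary(), so this is approximate.
rule1_predict <- function(row) {
  if (row$nox <= 0.668) return(NA_real_)  # Rule 1 does not cover this case
  2.07 + 3.14 * row$dis - 0.35 * row$lstat + 18.8 * row$nox +
    0.007 * row$b - 0.12 * row$ptratio - 0.008 * row$age - 0.02 * row$crim
}

# Hypothetical tract with made-up values (not a row from BostonHousing).
tract <- data.frame(nox = 0.7, dis = 2, lstat = 20, b = 390,
                    ptratio = 20.2, age = 95, crim = 10)

rule1_pred <- rule1_predict(tract)
rule1_pred
```

For this made-up tract the hand calculation gives roughly 13.86 (in USD 1000’s), which is in line with the mean of 13.81 reported for the cases covered by Rule 1.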

Make a prediction:

`mtPred <- predict(modelTree, testPredictors)`

Get the RMSE:

`sqrt(mean((mtPred - testOutcome)^2))`

`## [1] 3.337924`

That is not bad, but we can do better using committees and neighbors.

Model committees are created by generating a sequence of rule-based models, similar to boosting. The number of committees can range from 1 to 100.

Let’s fit a committee Cubist model with committees set to 100:

```
set.seed(1)
committeeModel <- cubist(x = trainingPredictors, y = trainingOutcome, committees = 100)
## Predict on the test set with the committee model
comPred <- predict(committeeModel, testPredictors)
## RMSE
sqrt(mean((comPred - testOutcome)^2))
```

`## [1] 2.779002`

Now let’s add neighbors to the committees, which adjusts each prediction based on the nearest training-set instances.

```
instancePred <- predict(committeeModel, testPredictors, neighbors = 4)
sqrt(mean((instancePred - testOutcome)^2))
```

`## [1] 2.566348`
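The RMSE expression `sqrt(mean((pred - obs)^2))` is repeated for each model above. A small helper keeps that consistent. This is a minimal sketch: `rmse` is a hypothetical helper name, not part of the original code, and the toy numbers are made up.

```r
# Hypothetical helper (not in the original code): root mean squared error.
rmse <- function(pred, obs) {
  stopifnot(length(pred) == length(obs))
  sqrt(mean((pred - obs)^2))
}

# Toy check with made-up numbers: the errors are 1, -1 and 2,
# so RMSE = sqrt((1 + 1 + 4) / 3) = sqrt(2).
toy_pred <- c(21, 19, 32)
toy_obs  <- c(20, 20, 30)
toy_rmse <- rmse(toy_pred, toy_obs)
toy_rmse
```

With this helper, a call such as `rmse(instancePred, testOutcome)` would replace the longer expression used above.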

So now the question is, what combination of committees and neighbors yields the best prediction? We can answer that by creating a vector of possible committees and a vector of possible neighbors, then using cross-validation to see where the RMSE is best.

```
set.seed(1)
# Tune committees and neighbors with 10-fold cross-validation via caret
cTune <- train(x = trainingPredictors, y = trainingOutcome, method = "cubist",
               tuneGrid = expand.grid(committees = c(1, 10, 50, 100),
                                      neighbors = c(0, 1, 5, 9)),
               trControl = trainControl(method = "cv"))
cTune
```

```
## Cubist
##
## 404 samples
## 13 predictor
##
## No pre-processing
## Resampling: Cross-Validated (10 fold)
## Summary of sample sizes: 363, 363, 363, 363, 362, 365, ...
## Resampling results across tuning parameters:
##
## committees neighbors RMSE Rsquared RMSE SD Rsquared SD
## 1 0 4.081800 0.7916640 1.3007653 0.15005686
## 1 1 4.111955 0.7950087 1.2540113 0.13995896
## 1 5 3.943515 0.8054412 1.2727587 0.14070680
## 1 9 3.959522 0.8022459 1.3305391 0.14884521
## 10 0 3.371765 0.8597818 0.9354412 0.08111265
## 10 1 3.370218 0.8681521 0.8462733 0.07253983
## 10 5 3.168392 0.8767757 0.9409569 0.07777561
## 10 9 3.207153 0.8725973 0.9499315 0.07980860
## 50 0 3.238911 0.8704658 0.9819922 0.08369843
## 50 1 3.257555 0.8741483 0.9284914 0.08006349
## 50 5 3.035711 0.8845178 1.0167411 0.08284853
## 50 9 3.071004 0.8810091 1.0233749 0.08444221
## 100 0 3.211165 0.8713608 1.0185290 0.08500905
## 100 1 3.254918 0.8739276 0.9853192 0.08458200
## 100 5 3.005851 0.8855715 1.0492541 0.08529563
## 100 9 3.044205 0.8820627 1.0572761 0.08671512
##
## RMSE was used to select the optimal model using the smallest value.
## The final values used for the model were committees = 100 and neighbors
## = 5.
```
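The selection rule in the last lines of that output is simply "take the row with the smallest RMSE". As a sketch, the table above can be copied into a plain data frame (the name `cv_results` is made up for illustration) and the winner picked by hand:

```r
# RMSE values copied from the cross-validation table above.
cv_results <- data.frame(
  committees = rep(c(1, 10, 50, 100), each = 4),
  neighbors  = rep(c(0, 1, 5, 9), times = 4),
  RMSE = c(4.081800, 4.111955, 3.943515, 3.959522,
           3.371765, 3.370218, 3.168392, 3.207153,
           3.238911, 3.257555, 3.035711, 3.071004,
           3.211165, 3.254918, 3.005851, 3.044205)
)

# Pick the parameter combination with the smallest cross-validated RMSE.
best <- cv_results[which.min(cv_results$RMSE), ]
best
```

On the fitted object itself, caret stores the same information in `cTune$bestTune`, so the manual lookup is only meant to show what the selection does.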

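As you can see, Cubist does really well as a predictive model. On the test set, the RMSE fell from 3.34 for the single model to 2.78 with 100 committees and to 2.57 once 4 neighbors were added, and cross-validation selected committees = 100 and neighbors = 5.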