Big Computing: Cubist package for R on CRAN

Wednesday, April 27, 2011

Cubist package for R on CRAN

Today an R port of Cubist was released to CRAN. It is another powerful addition to the collection of R packages. Below is an excerpt from the vignette:

Cubist is a rule-based model that is an extension of Quinlan's M5 model tree. A tree is grown where
the terminal leaves contain linear regression models. These models are based on the predictors used
in previous splits. Also, there are intermediate linear models at each step of the tree. A prediction
is made using the linear regression model at the terminal node of the tree, but is "smoothed" by
taking into account the prediction from the linear model in the previous node of the tree (which
also occurs recursively up the tree). The tree is reduced to a set of rules, which initially are paths
from the top of the tree to the bottom. Rules are eliminated via pruning and/or combined for
simplification.

This is explained better in Quinlan (1992). Wang and Witten (1997) attempted to recreate this
model using a "rational reconstruction" of Quinlan (1992) that is the basis for the M5P model in
Weka (and the R package RWeka).
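
To make the "smoothing" step in the excerpt more concrete, here is a minimal Python sketch. It is not taken from the Cubist package. It assumes the M5-style smoothing formula as restated by Wang and Witten (1997), p' = (n*p + k*q) / (n + k), where p is the prediction passed up from below, q is the prediction of the linear model at the current node, n is the number of training cases at the node below, and k is a constant (15 is the default I recall from that paper). Cubist itself may blend the predictions differently. The function name, the path layout, and the numbers are all made up for illustration.

```python
# Illustrative sketch only: M5-style smoothing, not Cubist's actual code.

def smooth_prediction(path, k=15.0):
    """Smooth a leaf prediction back up to the root.

    `path` lists the nodes from the leaf up to the root. Each entry is a
    (prediction, n_cases) pair: the prediction of that node's linear model
    for the sample, and the number of training cases that reached the node.
    """
    # Start with the raw prediction of the leaf's linear model.
    smoothed, _ = path[0]
    # Walk upward: blend the value from below with each ancestor's model.
    for child, parent in zip(path, path[1:]):
        n_below = child[1]       # training cases at the node below
        parent_pred = parent[0]  # ancestor model's prediction
        smoothed = (n_below * smoothed + k * parent_pred) / (n_below + k)
    return smoothed


# Hypothetical path: leaf, its parent, then the root.
example_path = [(10.0, 5), (12.0, 20), (15.0, 100)]
print(smooth_prediction(example_path))  # prints 13.0
```

With these made-up numbers the leaf model predicts 10.0, but the smoothed prediction is pulled toward the ancestor models because the leaf was fit on only 5 cases. Setting k to 0 switches the smoothing off and returns the leaf prediction unchanged.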

Here is a good example of Cubist being used on Visual Data. Cubist was created by RuleQuest, and their GPL version of Cubist is available for download there.