Big Computing: What software is the best at Predictive Analytics Competitions? R, Salford Systems, SAS, SPSS or other?

Monday, July 18, 2011

What software is the best at Predictive Analytics Competitions? R, Salford Systems, SAS, SPSS or other?

I was reading my copy of Amstat News this morning, and I came across an ad for Salford Systems' SPM. In the ad they make the claim that "Salford Systems' tools have dominated the fiercely contested field of data mining competitions for nearly a decade. Since 2000, no other vendor has come close to our record of consistent out-performance." I thought that was a fairly bold claim, and it did not mesh with my take on which platforms were being used and experiencing success in these competitions. So I decided to dig a little.

Kaggle.com, which is probably the most dominant site for these competitions, posted its data on what software its contestants use. The most used platform is R, and Salford Systems is not even on the list. This is hardly surprising given the number of users of open source R versus the number of users of the commercial Salford Systems tools. My understanding, based on conversations with the people at Kaggle.com, is that the majority of their contests are won with R. In fact, according to Revolution Analytics, which supports open source R, R has won 50% of all Kaggle.com competitions.

So the software vendors are all claiming that they are the best. That is not a surprise. That took me back to Kaggle.com's breakdown of what its competitors use: methods like regression and SVM. I do not think a regression on one analytics platform should yield different results from that same computation on another platform. In fact, if it did, I would be concerned. Maybe the credit for winning these competitions should go to the competitors who come up with the winning solutions rather than to the tools they choose to use.
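To make the point about regression concrete, here is a minimal sketch in Python. It is my own illustration, not anything taken from Kaggle or a vendor. It fits the same simple least-squares line two different ways, standing in for "two platforms", and checks that the coefficients agree. The data and the function names are made up for the example.

```python
import math

# Made-up illustrative data (an assumption, not from any competition).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]


def fit_centered(xs, ys):
    """Least-squares line via slope = cov(x, y) / var(x)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return my - slope * mx, slope  # (intercept, slope)


def fit_normal_equations(xs, ys):
    """Same least-squares line, solving the 2x2 normal equations by Cramer's rule."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    det = n * sxx - sx * sx
    intercept = (sy * sxx - sx * sxy) / det
    slope = (n * sxy - sx * sy) / det
    return intercept, slope


fit_a = fit_centered(xs, ys)
fit_b = fit_normal_equations(xs, ys)

# The two routes differ only by floating-point rounding.
fits_agree = all(
    math.isclose(a, b, rel_tol=1e-9, abs_tol=1e-9) for a, b in zip(fit_a, fit_b)
)
print(fit_a, fit_b, fits_agree)
```

Both routines minimize the same sum of squared errors, so any difference between them should be rounding noise. If two tools disagreed by more than that on the same model and the same data, I would suspect different defaults or a bug before I credited either tool with being "better".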
