Big Computing: R Benchmarks on a big data logistic regression

Friday, February 10, 2012

R Benchmarks on a big data logistic regression

We ran some benchmarks of a logistic regression for a customer on a data set with a million rows. I know that is not a "BIG DATA" problem for many people, but "slightly big data" sounds stupid. We ran the benchmarks on the following setup:

CPU: Intel Xeon X5570 2.93GHz
CPU cores: 8
Network: 10-gigabit ethernet

We used standard open source R 2.14 with glm, Revolution R with glm and Revolution R with "rxlogit". The results were as follows:
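For readers who want to try something similar, here is a minimal sketch of how such a timing could be taken in R. It is not the customer's model or data: the simulated data set, the column names (y, x1, x2) and the coefficients are all assumptions made up for illustration, and only the row count of one million comes from the post. The Revolution R call is left as a comment because it needs the proprietary RevoScaleR package, where the function is spelled rxLogit.

```r
# Minimal timing sketch on simulated data (NOT the customer's data set).
set.seed(42)
n <- 1e6                      # one million rows, as in the post
x1 <- rnorm(n)                # hypothetical predictor
x2 <- rnorm(n)                # hypothetical predictor

# Hypothetical coefficients, used only to simulate a binary response.
p <- plogis(-0.5 + 1.2 * x1 - 0.8 * x2)
dat <- data.frame(y = rbinom(n, size = 1, prob = p), x1 = x1, x2 = x2)

# Time the standard open source fit.
glm_time <- system.time(
  fit <- glm(y ~ x1 + x2, data = dat, family = binomial())
)
print(glm_time["elapsed"])

# With RevoScaleR installed, the comparable call would look like this
# (commented out because it requires that package):
# rx_time <- system.time(rx_fit <- rxLogit(y ~ x1 + x2, data = dat))
```

Timings from a sketch like this will depend heavily on the hardware and on the number of predictors, so they should not be expected to match the figures reported here.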

R 2.14 with glm: 56 sec
RevoR with glm: 54 sec
RevoR with rxlogit: 7 sec
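To put those numbers side by side, the relative speedup can be worked out with a few lines of R. This is just arithmetic on the three timings listed above; the vector names are labels chosen for this sketch.

```r
# Reported timings in seconds, taken from the results above.
timings <- c("R 2.14 glm" = 56, "RevoR glm" = 54, "RevoR rxlogit" = 7)

# Speedup of each configuration relative to open source R with glm.
speedup <- timings[["R 2.14 glm"]] / timings
print(round(speedup, 2))
```

On these figures, rxlogit comes out 8 times faster than glm in open source R (56 / 7) and about 7.7 times faster than glm in Revolution R (54 / 7), while the two glm runs are within about 4% of each other.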

I cannot really provide more information than that, but I think the results make a compelling case that, for problems of a certain size, Revolution R with rxlogit can be far faster: here it was roughly eight times faster than glm in either R distribution.