Big Computing: R Benchmarks on a big data logistic regression

Friday, February 10, 2012

R Benchmarks on a big data logistic regression

We ran some benchmarks of a logistic regression for a customer on a data set with one million rows. I know this is not a "BIG DATA" problem for many people, but "slightly big" sounds stupid. The benchmarks ran on the following setup:


CPU: Intel Xeon X5570 2.93GHz
CPU cores: 8
RAM: 23GB
Network: 10-gigabit ethernet



We used standard open source R 2.14 with glm, Revolution R with glm, and Revolution R with "rxLogit". The results were as follows:


R 2.14 with glm: 56 sec
RevoR with glm: 54 sec
RevoR with rxLogit: 7 sec
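
For anyone who wants to try a similar timing, here is a minimal sketch of how the glm side of such a benchmark could be run. The customer data and model formula cannot be shared, so the simulated data, the three predictors, and the coefficient values below are all assumptions made for illustration. Timings will differ by machine and by model.

```r
# Simulated stand-in for the customer data (assumption: the real data and
# model formula were not published, so everything here is illustrative).
set.seed(42)
n  <- 1e6                                # one million rows, as in the post
x1 <- rnorm(n)
x2 <- rnorm(n)
x3 <- rnorm(n)
true_beta <- c(-0.5, 0.8, -0.3, 0.1)     # hypothetical coefficients
eta <- true_beta[1] + true_beta[2] * x1 + true_beta[3] * x2 + true_beta[4] * x3
d <- data.frame(y = rbinom(n, 1, plogis(eta)), x1, x2, x3)

# Time the fit; the "elapsed" entry is wall-clock seconds.
timing <- system.time(
  fit <- glm(y ~ x1 + x2 + x3, data = d, family = binomial())
)
print(timing[["elapsed"]])
print(coef(fit))

# With the RevoScaleR package available (Revolution R only), the comparable
# call would look like this; it is left commented out so the sketch runs in
# open source R:
# timing_rx <- system.time(fit_rx <- rxLogit(y ~ x1 + x2 + x3, data = d))
```

Running each variant several times and comparing the elapsed values gives a fairer picture than a single run, since the first run often includes one-off costs such as loading packages.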


I cannot really provide more information than that, but I think the results make a compelling case that, for problems of a certain size, Revolution R with rxLogit is far superior in terms of speed: 7 sec versus 54 to 56 sec with glm, or roughly eight times faster.

3 comments:

  1. fantastic logistic regression. Thanks for share..
