Big Computing: What did I learn from R/Finance 2011

Monday, May 2, 2011

What did I learn from R/Finance 2011

The R/Finance 2011 meeting was a huge success! All the talks were great. I do not have time to go through each talk one by one, but I do feel a couple of themes ran through the entire conference. The opening speaker, Mebane Faber, and the keynote speaker, John Bollinger, touched on two topics near to my heart. The first is that in many cases a simplified model does nearly as well as a more complex one, and in some cases with fewer pitfalls. The second is that models are our attempt to describe reality, but they are not reality. There is therefore always the possibility that a model is a bad fit for the reality it is trying to describe, or that reality deviates from the model. Both phenomena can be exploited for advantage. Never become blindly enamored with a model, and approach things with an open mind. These ideas carried pretty consistently throughout the conference.

Parallel and high-performance computing for R is becoming a more and more important factor in analytic computing. I am not sure whether that is because of the continued growth of data in general, the entrance of HPC into general awareness through the "cloud", or because the really cool problems seem to exist at the edge of our current capability. With more users being exposed to HPC tools for R, I believe it is time to update the various pros and cons of each approach and to benchmark them against each other on a set of typical data sets and models. I do wonder whether the recent problems on Amazon's EC2 will slow down the growth of cloud computing. Lost time is one issue, but the users who lost their data could be much more reluctant to take that risk in the future.
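As a minimal sketch of the kind of benchmark I have in mind, the snippet below compares a serial `lapply` with a forked `mclapply` on a toy workload. The function name, task size, and core count are illustrative assumptions, not a recommendation; a real comparison would cover the full range of HPC packages for R.

```r
# Compare a serial apply with a forked parallel apply on a toy workload.
library(parallel)

slow_square <- function(i) {
  Sys.sleep(0.01)  # stand-in for real per-task work
  i^2
}

# mclapply() uses fork(), so mc.cores > 1 is not supported on Windows.
cores <- if (.Platform$OS.type == "windows") 1L else 2L

serial_time   <- system.time(serial_res   <- lapply(1:40, slow_square))
parallel_time <- system.time(parallel_res <- mclapply(1:40, slow_square,
                                                      mc.cores = cores))

# Both approaches should return identical results;
# only the wall-clock ("elapsed") time should differ.
identical(unlist(serial_res), unlist(parallel_res))
```

The same harness can be pointed at snow clusters or cloud back ends, which is exactly why a shared set of reference data sets and models would make the comparisons meaningful.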

I was also amazed at the traction that RStudio had among this group of experienced R users. I have always held the belief that experienced users of any software package shy away from IDEs and GUIs and prefer the simple interaction of command-line coding; I felt IDEs were the tool for new or mid-level users. In this case, I was wrong. RStudio appears to provide enough benefit to the very experienced R user that they are willing to change what they are currently doing and learn this new tool.

I thought JD Long's Dr. Seuss-inspired talk was the most entertaining of the conference. It takes some talent to do that, and even more to do it well. His Segue package for R is pretty cool too. Flash talks are a great format, and I wish they were used more often.

1 comment:

  1. After I wrote this post, Hadley Wickham pointed me to a paper by Dirk Eddelbuettel et al. called "State-of-the-Art Parallel Computing with R", written in 2009. I think it is a good starting point, but at the least it needs to be updated and expanded.

    dirk.eddelbuettel.com/papers/parallelR_techRep.pdf
