Sunday, May 31, 2015

Recently Data Camp has really expanded their offering of R tutorials, so I thought the time had come to take another look at the Data Science courses offered on Coursera compared to those offered on Data Camp. This review is based solely on my own experience.

I took the courses in the Coursera Data Science series last year, usually enrolling in one or two at a time. Each course in the nine-course series was a month long and consisted of lectures, some quizzes, and one or two projects in which the student worked on a data set and submitted the results for evaluation, either by automated grading or by peer review. The courses started with setting up R, GitHub and RStudio, then went on to cover data visualization, data manipulation, regression and some machine learning. I found the courses to be a good overview of the basic skills and tools needed to work in R as a data scientist. My greatest concern is that the hardest parts are the first few classes, which set everything up. After that I found the classes to be pretty easy. In fact, I am concerned that many of the students who fail to finish this series do so because they cannot even get started. The forums are a good source of information and help, and I found them really important when I got stuck. The other downside was the peer review of projects. I found that few reviewers spent much time or effort on this part of the class, and their reviews were in many cases unhelpful or just plain wrong. In one case I did a project incorrectly, yet all my reviewers gave me full credit. In another, I did a project correctly but differently from many of the other students, and received poor reviews because my work did not look like my reviewers' projects.

The Data Camp courses are different from the Coursera classes in that you run R in the Data Camp environment. This is a benefit because you do not have to go through the work of setting up everything you need, but the downside is that you are really only learning how to do this work on the Data Camp site and not in the real world. I did enjoy the interactive, step-by-step method of learning by example that is the core of the Data Camp method. I did not like that the interface requires your work to be in the exact format that the teacher used. This could be very vexing at times.

At this stage I would more strongly recommend the Coursera classes because they really get you ready to do real work. However, if you are frustrated or getting stuck with the Coursera series, do some modules on Data Camp. They will function as a remedial trainer and build your skill and confidence to take on more challenging and more independent tasks.


  1. What I find difficult to ascertain is what existing knowledge you should have before doing these courses, particularly on the maths side. It's not clear whether the courses are an introduction to R for people who have some familiarity with the maths, or whether they act as an introduction to both.

    I'm coming at this area as a computer programmer, so I basically have high school maths and a small amount of (largely forgotten) statistics from doing a Chemical Engineering degree in the distant past.

    I imagine I could pick up the R parts reasonably quickly (even though it's obviously rather esoteric and different compared to general programming languages). I have good 'domain knowledge' in the area I work in. I wonder whether these courses would arm me with enough knowledge on the statistics side to do enough analysis to produce something practically useful for my work (and justify learning more about statistics, machine learning etc).

  2. If you have a computer science background I would go right to the Coursera Data Science courses. Their Machine Learning and Regression classes are both very good starters. The Intro to Statistics class on Coursera is also excellent for getting you up to speed on classical statistics.

  3. Nice review, thanks!

    The Data Camp courses look nice. I didn't know all the exercises were done in the browser. I've just given it a try. I can see it's a good idea and could be very helpful but for me it would be a real bind.

    (Having said that I can see myself sneaking back and doing another couple of exercises every now and then. Do you know how well it works on an iPad?)

    I'm a programmer with some maths background but little familiarity with R (though I like the feel of the language). I'm going to be working with R (and R programmers) a lot more, so I'm looking for a solid foundation.

    Best wishes


    1. Actually, having had more of a look round, Data Camp is in the lead (I don't want to sit watching videos). That and swirl.


