I’m a huge fan of online classes. I like the self-paced nature, the flexibility, and the autonomy. With the right instructor, the right materials, and the right kind of student, I believe an online course can be far superior to the traditional in-person class. So with that attitude I signed up for Computing for Data Analysis on Coursera.
If you’re not familiar with Coursera, it’s a site that offers free online classes taught by big-time college professors. I’m talking MIT, Princeton, Stanford, etc. As far as I can tell, the classes being taught are actually designed for Coursera. In other words, this isn’t just videos of classes taught back in 2009. My class was taught by Roger Peng from Johns Hopkins.
Computing for Data Analysis was described as follows: “In this course you will learn how to program in R and how to use R for effective data analysis”. Sold. My background is in statistics, not programming. I’ve always known just enough about R to sometimes (eventually, frustratingly) get what I need. This seemed like the kind of class I needed to build a strong foundation in R and fill in the gaps in my knowledge.
The course ran for four weeks. Each week a set of lecture videos was released with accompanying PDF slides. There was also a 10-question quiz each week that you could attempt up to 3 times. In addition there were two programming assignments, one due at the end of week 2 and the other at the end of the course. No materials were required. You just needed to download and install R, which is free. According to Dr. Peng, over 40,000 people signed up for the class.
So how did it go? Pretty good! On a scale of 1 to 10 I would give it an 8. The lectures were perfectly accessible. If you knew nothing about R or statistics or programming and were determined, you could easily follow along. The quizzes were fair and useful. The programming assignments were tough but doable. If you had questions, there was a lively discussion forum for help. I didn’t use it myself but I lurked to see what kinds of questions people had. It appeared that people who needed assistance were getting it. For the price (free!), this was an awesome class. It did what I hoped it would: fill in some gaps and show me better ways of doing things in R.
Minor points that kept me from giving the class a 10:
- The videos would frequently hang and stop if I tried to rewind a portion. Fortunately you could download them and watch them locally. But if you did that you lost the interactive mini-quizzes to test your comprehension.
- The programming assignments centered around writing R functions, which is useful I guess, but seemed to detract from the actual data analysis. I felt like I spent too much time debugging functions and not actually doing data analysis.
- Some of the lectures were a little too esoteric in my opinion. One particularly long one covered the distinction between S3 and S4 classes. No doubt an important topic for someone planning for a career in R programming, but probably over the head of most people in this introductory class.
Major points that I loved about the class:
- The wonderfully lucid explanation of the *apply functions. That alone made the class worth it for me.
- The fantastic introduction to regular expressions. This is something I’ve been wanting for a long time: a clear, gentle intro to regular expressions.
- The extensive overview of debugging techniques. I have quite a few R books, including R in a Nutshell, R Cookbook, and Statistical Analyses Using R. None of them seriously touch on debugging.
For posterity, I combined all the course slides into one PDF file: Computing for Data Analysis course slides
Download it and browse through it. If you’re interested in doing data analysis with R, this is a must-have.
Thank you very much for compiling the slides. Very useful.
Found this post looking for the combined course slides, thanks! I just took the course, and this is a good overall review. One point re: your critical point #2: this class is explicitly designed to focus on the computing, rather than the data analysis. It acts as a sort of lead-in to Jeff Leek’s Coursera course, “Data Analysis,” which focuses primarily on the statistical and analytical aspects. It seems a little unfair to criticize it on those grounds when Roger Peng states that analysis is not the focus of the course and recommends the data analysis course.
I found Jeff Leek (Data Analysis) and his staff annoying and arrogant. On the other hand, Computing for Data Analysis was fun and I enjoyed it a lot.