I said once that R was like being in love. I don’t remember where I said that, but I guess in that moment it made sense… or probably, as it’s happening now, it didn’t… maybe it’s just that I enjoyed learning R so much that it became part of absurd conversations in which this statistical programming language was the subject of many jokes.
If you’re not a geek (I like to call it freak in Spanish), the jokes about R (no matter how good they are… some were great) will definitely seem like bad ones to you. But this post is not about humour (I know, you might have noticed); it is about learning. I have wanted to write this for a while because people generally believe that if you are not a geek or don’t have a technical background, learning R is very difficult. Well, it’s not. Wanting to learn is more important than whatever your background is.
I studied Advertising at university, which is not technical at all. I think I chose that degree because back then I enjoyed writing and journalism had high unemployment. I won’t go into detail about my Advertising degree because I’m trying to sound positive here, but my point is that my background wasn’t scientific at the beginning and it has become more so as I’ve spent more years working in data analysis. From my point of view, being a good analyst doesn’t rely on being more scientific or artistic; it depends on having an analytical mindset.
Learning R is much easier if you have some programming knowledge, and as I didn’t have any, I was a bit scared of starting the Data Science specialization that I’ve been studying for the last 4 months. The specialization is offered by Johns Hopkins University through Coursera. It has 9 courses plus a final project, and now I can say (after 4 months and with no programming knowledge at the beginning) that I’ve almost finished all the courses (I’m currently doing the last one).
If you want to learn R, I totally recommend this specialization. Also, as it’s through Coursera, you can do it for free. The only difference between the free and paid version is that you don’t get a certificate at the end with the free version.
In a different post I’ll explain each of the courses in more detail, but there are some things they all have in common. They are all 4 weeks long, and each week there are videos that you’ll have to watch to do the quizzes and assignments. Each week’s videos can be watched in 1 or 2 hours, and the ideal approach is to do the quiz on them right after you finish watching. Sometimes, after going through so many videos, the brain is pretty dead and it’s better to wait until the next day. Unless you enjoy those moments when you read and reread the same thing over and over without having any idea of what it says because your concentration decided to go somewhere else.
Apart from the quizzes (usually 4, one per week), each course has assignments or small projects. Most courses have only one assignment, which is due by the end of the third week. Most of them are also graded by the other students, so it’s very important to read the questions that need to be answered in each assignment (you can find them on the page where the assignment is submitted). The other students are usually very nice when grading, but if any of the answers are missing they obviously can’t give you all the points. You will also have to grade 4 of the other students’ assignments during the 4th week. It’s a great way to learn because you get to see different ways to solve the same problem.
I’ve done 2 courses each month, so I’ll be done in less than 5 months, and after that I’ll have to wait until the final project of the specialization (the Data Capstone) is available. You can’t start the project until the 9 courses are completed. Doing 2 courses each month requires time but is totally feasible, and it helps that 1 of the courses is always easier than the other.
The specialization has 3 teachers: Roger D. Peng, Jeff Leek, and Brian Caffo. Roger D. Peng teaches in the first 5 courses, which are specifically about R programming and the functions used to clean the data, organise it, do easy calculations, make plots, etc… I know I’m summarising a lot, but roughly that’s what they are about.
Brian Caffo teaches the courses about statistics (Statistical Inference and Regression Models) and the last course (Developing Data Products). I’ve read some criticism online about the way he teaches, and I have to agree that sometimes he makes the concepts more difficult than they really are, but statistics is not an easy subject to teach. He focuses too much on the mathematical explanation behind the formulas used to calculate the statistical concepts. That explanation is important, but it is often unnecessary given the nature of the course. Also, the mathematical part is optional in the Regression Models course, where Brian tells students to skip the video if they are not interested in it. Despite all that, I wouldn’t say he’s a bad teacher at all, and the Developing Data Products course proves it.
Jeff Leek is the teacher responsible for the second-to-last course, Practical Machine Learning, which shows how to build predictive models in R in a more practical way than the courses about statistics do.
I personally liked the 3 teachers. The 3 of them show passion for what they do, and it’s always a pleasure to listen to someone talking about something that they are passionate about.
R Programming, Regression Models and Practical Machine Learning were my favourite courses. I did the last 2 at the same time during Christmas, and thanks to being on holiday, I could spend more time on them. Otherwise my advice is not to do them together because, along with Statistical Inference, they take longer than the other courses.
R was an important part of the last months of my 2015 and I hope it keeps being that way during my 2016. It’s been a long time since I last enjoyed learning something new so much. R is not like being in love, but it leaves you with that great sensation that feeling passionate about something does.