What do you think of Data Science certifications?

I've now seen two data science certification programs: the Johns Hopkins one available on Coursera and the Cloudera one.

I'm sure there are others out there.

The Johns Hopkins set of classes is focused on R as a toolset, but covers a range of topics:

  • R Programming
  • Getting and Cleaning Data
  • Data Analysis
  • Reproducible Research
  • Statistical Inference
  • Regression Models
  • Machine Learning
  • Developing Data Products
  • And what looks to be a project-based completion task similar to Cloudera's Data Science Challenge

The Cloudera program looks thin on the surface, but it seems to answer the two important questions: "Do you know the tools?" and "Can you apply the tools in the real world?" Their program consists of:

  • Introduction to Data Science
  • Data Science Essentials Exam
  • Data Science Challenge (a real-world data science project scenario)

I am not looking for a recommendation on a program or a quality comparison.

I am curious about other certifications out there, the topics they cover, and how seriously DS certifications are viewed at this point by the community.

EDIT: These are all great answers. I'm choosing the correct answer by votes.



Here are some resources on edX for data science courses from Harvard, MIT, Microsoft and more that might be of interest to this group.

For example, we have a professional certificate program from Harvard consisting of 8 courses and a capstone exam here.

For more advanced studies, we have a MicroMasters program from MIT here, as well as one from UC San Diego here. For a great overview of Data Science, we have a program from Microsoft. You can check out all of our programs here.

Hope this helps,

Josh from edX


Value to the student: a mixed bag. Paying several hundred dollars for a program or a hundred a pop for a course is a motivator.

I've completed one series, from MITx. It's a graduate survey course of methods and tools, aimed at those who need to "know about" them in some detail. It's enough grounding that I've felt comfortable applying what I've learned.

A stand-alone HarvardX course on the directed acyclic graph method was more like a graduate seminar in statistics on the Judea Pearl method. It would have been a long time before I heard about it, otherwise.

The HarvardX series is a graduate-level boot camp aimed at orienting the new student to the R toolset and applications.

The BerkeleyX series is an undergraduate survey course using a purpose-built Python class that's almost a domain-specific language.

As to the value of the certificates, I can only report that my only related educational experience was a master's in geophysics, and I had about a year of paid experience outside my job description (senior bank lawyer).

Perhaps as a result of the certificates, I've been turned down as "overqualified" for at least two jobs I know about. So my advice is that if you have a certificate, don't mention it if the word "Excel" appears in the job posting.


It really depends on the credibility of the institution granting the certificate. For example, a Data Science Certification from a Harvard-based company is recognized by many industry partners and may make a good choice. You did not say what kind of certificate you are looking for.


The best way to be successful at getting the job that you want is to show that you can do it.

The MOOCs that you mention will give you a good grounding in the basics and should be enough to get you started solving your own machine learning/data science problems. Try a Kaggle competition or two; that is a great way to improve your skills, and a decent ranking there will be of interest to a potential employer. Publish your results on GitHub using something like an IPython Notebook, which will allow your work to be easily seen and judged.

Try an analysis on other public data sets, like the UCI Bike Sharing Dataset or the UCI Diabetes Treatment Dataset. Those are lots of fun to try, and they show that you are keen and willing to develop your skills.


I am almost done with the Johns Hopkins Data Science Specialization on Coursera (a course and a capstone left to graduate). I will just give you the pros and cons of it, trying to keep it as objective as possible:

Pros:

  • Structure around the learning process
  • You'll build a portfolio over time

Cons:

  • Different backgrounds needed for different courses. The first few courses don't assume previous knowledge. Then the material suddenly gets hard to follow in the conceptual courses (Statistical Inference, Regression Analysis).
  • Taught by 3 professors. I think they are not on the same page about their potential audience and its abilities/needs/interests.

I think the effect of a Coursera certification depends on the individual as well as the classes. The requirement says a minimum of 3-5 hours a week; if you put in more, and the material does open up for a lot more than 3-5 hours, then these classes and certifications can be equivalent to a strong knowledge base and experience in the field. Science comes to those who request it.


@OP: Choosing answers by votes is the WORST idea.

Your question becomes a popularity contest. You should seek the right answer. I doubt you know what you are asking or what you are looking for.

To answer your question:
Q: How seriously are DS certifications viewed at this point by the community?

A: What is your goal in taking these courses? For work, for school, for self-improvement, etc.? Coursera classes are very applied; you will not learn much theory, which is intentionally reserved for the classroom setting.

Nonetheless, Coursera classes are very useful. I'd say they are equivalent to one year of graduate statistics classes out of a two-year Master's program.

I am not sure of their industry recognition yet, because of the question of how you actually took the course. How much time did you spend? It's a lot easier to get A's in these courses than in a classroom paper-and-pencil exam. So there is huge quality variation from person to person.


As a former analytics manager and a current lead data scientist, I am very leery of the need for data science certificates. The term data scientist is pretty vague and the field of data science is in its infancy. A certificate implies some sort of uniform standard, which is just lacking in data science; it is still very much the wild west.

While a certificate is probably not going to hurt you, I think your time would be better spent developing the experience to know when to use a certain approach, and the depth of understanding to be able to explain that approach to a non-technical audience.


I lead data science teams for a major Internet company and I have screened hundreds of profiles and interviewed dozens for our teams around the world. Many candidates have passed the aforementioned courses and programs or bring similar credentials. Personally, I have also taken the courses. Some are good, others are disappointing, but none of them makes you a "data scientist".

In general, I agree with the others here. A certificate from Coursera or Cloudera just signals an interest, but it does not move the needle. There is a lot more to consider, and you can have a bigger impact by providing a comprehensive repository of your work (a GitHub profile, for example) and by networking with other data scientists. Anyone hiring for a data science profile will always prefer to see your previous work and coding style/abilities.


There are multiple certifications out there, but they have different focus areas and styles of teaching.

I prefer The Analytics Edge on edX a lot more than the Johns Hopkins specialization, as it is more intensive and hands-on. The expectation in the Johns Hopkins specialization is to put in 3-4 hours a week vs. 11-12 hours a week on The Analytics Edge.

From an industry perspective, I take these certifications as a sign of interest and not of the level of knowledge a person possesses. There are too many dropouts in these MOOCs. I value other experience (like participating in Kaggle competitions) a lot more than completing XYZ certification on a MOOC.


Not sure about the Cloudera one, but one of my friends joined the Johns Hopkins one and in his words it's "brilliant to get you started". It has also been recommended by a lot of people. I am planning to join it in a few weeks. As far as seriousness is concerned, I don't think these certifications are gonna help you land a job, but they sure will help you learn.


I did the first 2 courses and I'm planning to do all the others too. If you don't know R, it's a really good program. There are assignments and quizzes every week. Many people find some courses very difficult. You are going to have a hard time if you don't have any programming experience (even if they say it's not required).

Just remember... being able to drive a car doesn't make you an F1 pilot ;)


The certification programs you mentioned are really entry-level courses. Personally, I think these certificates show only a person's persistence, and they can only be useful to those who are applying for internships, not real data science jobs.
