Can you recommend a machine learning challenge that is suitable for novices?

I am looking for a challenge that is suitable for a group of novices who want to learn the basics of data science and machine learning. The challenge should match the following criteria:

  • is based on a real application or is at least realistic
  • has a clearly defined goal and partial progress is measurable
  • includes a machine learning component, but also other aspects of data science
  • is doable within 3 to 6 weeks
  • is suitable for novices
  • is an actual challenge, in the sense that you cannot simply look up near-optimal solutions on the internet

Topic kaggle education machine-learning

Category Data Science


Let me share my list of ML training resources:

  1. http://www.crowdanalytix.com/listContests
  2. http://datahack.analyticsvidhya.com/contest/all/
  3. http://www.chalearn.org/challenges.html (some links may be dead)

And I am sure you already read this subreddit: http://www.reddit.com/r/MachineLearning/


I would highly recommend this platform: https://datahack.analyticsvidhya.com/contest/all/

AnalyticsVidhya is an amazing community for data science. Besides contests, they also publish technical articles on their blog (https://www.analyticsvidhya.com). I used to blog for them, but my opinion is not biased by that fact: I had followed them for a couple of years before that, and I felt lucky to get a chance to contribute to the community.

In terms of interesting problems, I would recommend the following:

  1. The Smart Recruits

  2. The Creative Analyst

  3. Date Your Data

Sorry, I'm new here and not allowed to post more than 2 links. You'll find these competitions on their website. The datasets are available, and you can make submissions to benchmark yourself. I would also suggest signing up for their emails. They run short 2-3 day hackathons, which offer great learning experiences.

Hope this helps!


You've already answered your own question by tagging it kaggle.

Let me share the two competitions that I have every new hire on my team go through, and then have them keep improving their solution over the next six months. Together they make a very well curated set of problems for a budding data scientist!

  1. Titanic: Machine Learning from Disaster (https://www.kaggle.com/c/titanic): This one helps bridge the gap between an analyst and a data scientist, and eases you from Excel into the world of code.

  2. Digit Recognizer (https://www.kaggle.com/c/digit-recognizer): The MNIST dataset has prepared almost every data scientist for the real world of dirty data, in a sweet, cushy manner. It drives home the fact that we will not always have structured and organized data, and that this should not deter us from gaining insight from it!

Happy Coding!
