How do you avoid 'analysis paralysis' when choosing a method to implement?
When you have multiple methods to accomplish a task, how do you choose which one to implement?
Topic project-planning management
Category Data Science
Adding to Dirk’s answer, a structured approach is to create what is known as an Algorithm/Model Test Harness.
See here: https://machinelearningmastery.com/create-algorithm-test-harness-scratch-python/
This allows you to pass in a variety of learning algorithms and quickly evaluate them against your set of quality metrics. These metrics can be determined upfront and based on both statistical and domain-specific notions of quality.
Ideally, you create a JSON structure containing the learning algorithms, initial hyper-parameters, constraints, the quality metrics mentioned above, etc., and pass it into your test harness. Everything gets evaluated within the harness, and the output is a set of accuracies (and other metrics).
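As a rough illustration of the JSON-driven harness described above, here is a minimal sketch in plain Python. Everything in it is an assumption made for the example: the config keys (`candidates`, `min_accuracy`), the two toy algorithms, and the tiny dataset are all hypothetical, and a real harness would plug in actual learners and cross-validation.

```python
import json
import statistics

# Hypothetical config: keys, names, and values are illustrative only.
CONFIG = json.loads("""
{
  "min_accuracy": 0.6,
  "candidates": [
    {"name": "majority_class", "params": {}},
    {"name": "threshold", "params": {"cutoff": 0.5}},
    {"name": "threshold", "params": {"cutoff": 2.0}}
  ]
}
""")


def majority_class(train, params):
    """Toy baseline: always predict the most common training label."""
    label = statistics.mode([y for _, y in train])
    return lambda x: label


def threshold(train, params):
    """Toy rule: predict 1 when the single feature exceeds a fixed cutoff."""
    cutoff = params["cutoff"]
    return lambda x: 1 if x > cutoff else 0


# Registry mapping names used in the JSON config to fitting functions.
ALGORITHMS = {"majority_class": majority_class, "threshold": threshold}


def accuracy(model, rows):
    """Fraction of rows whose predicted label matches the true label."""
    return sum(model(x) == y for x, y in rows) / len(rows)


def run_harness(config, train, test):
    """Fit every candidate in the config and report its score, best first."""
    results = []
    for cand in config["candidates"]:
        model = ALGORITHMS[cand["name"]](train, cand["params"])
        score = accuracy(model, test)
        results.append({
            "name": cand["name"],
            "params": cand["params"],
            "accuracy": score,
            "passes": score >= config["min_accuracy"],
        })
    return sorted(results, key=lambda r: r["accuracy"], reverse=True)


# Made-up (feature, label) pairs, purely for demonstration.
TRAIN = [(0.1, 0), (0.4, 0), (0.7, 1), (0.9, 1), (0.2, 0), (0.8, 1), (0.3, 0)]
TEST = [(0.3, 0), (0.6, 1), (0.95, 1), (0.05, 0)]

results = run_harness(CONFIG, TRAIN, TEST)
for row in results:
    print(row)
```

Adding another candidate then only means registering one more function and one more entry in the JSON, which keeps the comparison itself unchanged.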
Keep runtime in mind if your test harness evaluates many combinations of the options in your JSON structure. The idea is to balance your upfront ideas about what is best with the more data-driven approach that a test harness provides.
So start with smaller samples of your data and construct a model test harness that evaluates a decent number of combinations. This will help you narrow your choices, at which point you can start thinking more deeply, with fewer upfront assumptions.
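To make the runtime trade-off concrete, the sketch below counts how many combinations a search space produces and draws a small random sample of the data to evaluate them on. The search space, the 5% sample fraction, and the per-fit cost are all hypothetical numbers chosen for illustration.

```python
import itertools
import random

# Hypothetical search space; names and values are placeholders.
SEARCH_SPACE = {
    "algorithm": ["model_a", "model_b", "model_c"],
    "regularization": [0.01, 0.1],
    "n_features": [5, 10, 20],
}


def expand_grid(space):
    """Return one dict per combination of the listed option values."""
    keys = list(space)
    value_lists = [space[k] for k in keys]
    return [dict(zip(keys, values)) for values in itertools.product(*value_lists)]


def subsample(rows, fraction, seed=0):
    """Draw a reproducible random subset containing `fraction` of the rows."""
    rng = random.Random(seed)
    k = max(1, int(len(rows) * fraction))
    return rng.sample(rows, k)


combos = expand_grid(SEARCH_SPACE)

data = list(range(10_000))      # stand-in for the full dataset
sample = subsample(data, 0.05)  # evaluate on 5% of it first

SECONDS_PER_FIT = 2.0           # assumed cost of one fit on the sample
estimated_seconds = len(combos) * SECONDS_PER_FIT

print(f"{len(combos)} combinations on {len(sample)} rows, "
      f"roughly {estimated_seconds:.0f} s")
```

Because the number of combinations is the product of the option counts, adding one more hyper-parameter with a few values multiplies the budget, which is a reason to prune candidates on the small sample before scaling up.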
This is a very broad question, but in general you can define some criteria that you want the method to meet (e.g. low error, efficiency, scalability), score each method against those criteria, and pick the one that solves your problem best.
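One simple way to do this scoring is a weighted decision matrix. The sketch below is only an illustration: the weights, the method names, and the 1 to 5 ratings are invented, and in practice they would come from your own requirements and measurements.

```python
# Hypothetical weights for each criterion (chosen to sum to 1).
CRITERIA_WEIGHTS = {"low_error": 0.5, "efficiency": 0.3, "scalability": 0.2}

# Hypothetical ratings on a 1 (poor) to 5 (excellent) scale.
RATINGS = {
    "method_a": {"low_error": 4, "efficiency": 2, "scalability": 3},
    "method_b": {"low_error": 3, "efficiency": 5, "scalability": 4},
    "method_c": {"low_error": 5, "efficiency": 1, "scalability": 2},
}


def weighted_score(ratings, weights):
    """Sum of each criterion's rating multiplied by its weight."""
    return sum(weights[name] * ratings[name] for name in weights)


scores = {
    method: weighted_score(ratings, CRITERIA_WEIGHTS)
    for method, ratings in RATINGS.items()
}

# Rank methods from highest to lowest total score.
ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
for method, score in ranked:
    print(f"{method}: {score:.2f}")
```

Writing the weights down before looking at the ratings helps keep the choice from drifting toward whichever method you already preferred.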