How do you avoid 'analysis paralysis' when choosing a method to implement?

Question

DBinJP

January 19, 2018 14:21

When you have multiple methods to accomplish a task, how do you choose which one to implement?

Topic project-planning management

Category Data Science

Cybernetic · Accepted Answer · January 19, 2018 14:21

Adding to Dirk’s answer, a structured approach is to create what is known as an Algorithm/Model Test Harness.

See here: https://machinelearningmastery.com/create-algorithm-test-harness-scratch-python/

This allows you to pass in a variety of learning algorithms and quickly evaluate them against your set of quality metrics. These metrics can be determined up front, based on both statistical and domain-specific notions of quality.

Ideally, you create a JSON structure containing the learning algorithms, initial hyper-parameters, constraints, the above-mentioned qualifiers, etc., and pass it into your test harness. Everything gets evaluated within the harness, and the output is a set of accuracies (and other qualifiers).
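To make the idea concrete, here is a minimal sketch of such a harness in plain Python. Everything in it is an assumption for illustration: the JSON field names, the two toy algorithms (`majority_class`, `knn`), the single `accuracy` metric, and the synthetic dataset. A real harness would plug in your own learners and metrics, and would probably use cross-validation rather than one train/test split.

```python
import json
import random

# Hypothetical configuration; field names and values are illustrative only.
CONFIG_JSON = """
{
  "metrics": ["accuracy"],
  "train_fraction": 0.7,
  "seed": 1,
  "algorithms": [
    {"name": "majority_class", "params": {}},
    {"name": "knn", "params": {"k": 1}},
    {"name": "knn", "params": {"k": 5}}
  ]
}
"""

def majority_class(train, test_rows):
    """Baseline: always predict the most common training label."""
    labels = [label for _, label in train]
    most_common = max(set(labels), key=labels.count)
    return [most_common for _ in test_rows]

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def knn(train, test_rows, k=1):
    """k-nearest neighbours with a majority vote over the k closest rows."""
    predictions = []
    for row in test_rows:
        nearest = sorted(train, key=lambda item: squared_distance(item[0], row))[:k]
        votes = [label for _, label in nearest]
        predictions.append(max(set(votes), key=votes.count))
    return predictions

def accuracy(actual, predicted):
    return sum(a == p for a, p in zip(actual, predicted)) / len(actual)

# Registries map the names used in the JSON to callables.
ALGORITHMS = {"majority_class": majority_class, "knn": knn}
METRICS = {"accuracy": accuracy}

def run_harness(dataset, config):
    """Evaluate every configured algorithm on one shared train/test split."""
    rows = dataset[:]
    random.Random(config["seed"]).shuffle(rows)
    cut = int(len(rows) * config["train_fraction"])
    train, test = rows[:cut], rows[cut:]
    test_features = [features for features, _ in test]
    actual = [label for _, label in test]
    results = []
    for spec in config["algorithms"]:
        predicted = ALGORITHMS[spec["name"]](train, test_features, **spec["params"])
        scores = {name: METRICS[name](actual, predicted) for name in config["metrics"]}
        results.append({"name": spec["name"], "params": spec["params"], "scores": scores})
    return results

def make_toy_dataset(n=200, seed=0):
    """Synthetic two-class data: two Gaussian blobs in two dimensions."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        label = rng.choice([0, 1])
        center = 0.0 if label == 0 else 3.0
        data.append(([rng.gauss(center, 1.0), rng.gauss(center, 1.0)], label))
    return data

for result in run_harness(make_toy_dataset(), json.loads(CONFIG_JSON)):
    print(result)
```

With this layout, trying another candidate means adding one entry to the JSON and, if the algorithm is new, one function to the registry.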

Obviously, you need to consider runtimes if your test harness evaluates many combinations of the JSON contents. But the idea is to balance your upfront ideas about what is best against the more data-driven approach that a test harness provides.

So start with smaller samples of your data and construct a model test harness that evaluates a decent number of combinations. This will help you narrow your choices, at which point you can start thinking more deeply with fewer upfront assumptions.
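As a rough sketch of that "small sample, many combinations" step: the snippet below draws a reproducible pilot sample and expands a hyper-parameter grid into the full list of candidate configurations, so you can see how many runs you are committing to before starting. The search space, the 5% fraction, and the helper names `subsample` and `expand_grid` are all hypothetical.

```python
import itertools
import random

def subsample(rows, fraction, seed=0):
    """Return a reproducible random subset of rows, drawn without replacement."""
    size = max(1, round(len(rows) * fraction))
    return random.Random(seed).sample(rows, size)

def expand_grid(grid):
    """Turn {"param": [values, ...]} into a list with every combination."""
    names = sorted(grid)
    value_lists = [grid[name] for name in names]
    return [dict(zip(names, values)) for values in itertools.product(*value_lists)]

# Hypothetical search space; the algorithm and parameter names are placeholders.
search_space = {
    "knn": {"k": [1, 3, 5, 7], "weights": ["uniform", "distance"]},
    "decision_tree": {"max_depth": [2, 4, 8], "min_samples_leaf": [1, 5]},
}

rows = list(range(100_000))            # stand-in for the full dataset
pilot = subsample(rows, fraction=0.05)  # evaluate on 5% first
candidates = [
    (name, params)
    for name, grid in search_space.items()
    for params in expand_grid(grid)
]
print(len(pilot), "pilot rows,", len(candidates), "candidate configurations")
```

Multiplying the number of candidates by a measured per-run time on the pilot sample gives a quick runtime estimate before scaling up.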

Dirk Nachbar · Accepted Answer · January 19, 2018 13:42

This is a very broad question, but in general you can define some criteria that you want the method to meet (e.g. low error, efficiency, scalability) and then score each method against them to see which one solves your problem best.
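One simple way to do that scoring is a weighted decision matrix. The sketch below is only an illustration: the weights, the 1-5 scores, and the method names are made up, and you would replace them with your own criteria and judgments.

```python
# Hypothetical weights (summing to 1) for the criteria named above.
weights = {"low_error": 0.5, "efficiency": 0.2, "scalability": 0.3}

# Hypothetical 1-5 scores for three placeholder methods.
scores = {
    "method_a": {"low_error": 4, "efficiency": 2, "scalability": 3},
    "method_b": {"low_error": 3, "efficiency": 5, "scalability": 4},
    "method_c": {"low_error": 5, "efficiency": 1, "scalability": 3},
}

def weighted_score(method_scores, weights):
    """Sum of each criterion's score multiplied by its weight."""
    return sum(weights[criterion] * method_scores[criterion] for criterion in weights)

# Rank methods from best to worst total.
ranking = sorted(scores, key=lambda m: weighted_score(scores[m], weights), reverse=True)

for method in ranking:
    print(method, round(weighted_score(scores[method], weights), 2))
```

Writing the weights down before scoring forces the trade-offs into the open, which is often enough to break the deadlock.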
