What is the most appropriate machine learning approach for this scenario?

The scenario is pretty simple, and I'm sure it's been done a million times. The problem is I don't know the terminology to find the correct resources on the web.

Scenario: I have an environment that can be described in terms of 5 parameters, including an input value A and an output variable B. There is a dataset containing 100 rows with values for each parameter.

The output B depends on A as well as the remaining environmental variables.

The goal is to find the best value for input A such that output B is minimized.

What does the solution for this problem look like? Is it Machine learning, neural networks, a mathematical optimization problem? How is this best approached?

Extension: if I didn't have a dataset in practice, how would I train a system to suggest different values for A until a minimum is reached? Can neural networks be applied here? Or are we talking about a looping procedure that does some maths in each iteration until the output doesn't change much anymore?

I thought the generalization would make it more difficult to answer. What I am describing is a number of temperature/humidity measurements for both inside my house and outside. The input that I can control is the fan speed setting on my evaporative aircon and the output is the lounge temperature, which I want to minimize.

During sample gathering, I don't care much about the output. The set of 100 values was arbitrary and more (and diverse) samples can be obtained.

The range of A is a fan speed with discrete values 1-6. Humidity is a percentage and temperature is 20-45 degrees.



This seems like a task for active learning where given a small set of sample data, the ML approach iteratively recommends new experiments to perform that are maximally informative until you reach a specific goal, in this case, minimization of B.

You could do this by training a model on the current data, then searching for the value of A that minimizes the model's predicted B under the current conditions. You would then run that experiment and add the result to the dataset, repeating until you observe convergent behaviour.
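As a rough illustration, the fit / suggest / measure / repeat loop can be sketched in Python with NumPy. Everything specific here is an assumption made for the example: the surrogate model is a least-squares fit that is quadratic in fan speed, `measure_lounge_temp` is a hypothetical noise-free stand-in for actually running the aircon and reading the thermometer, and the starting observations and outside conditions are made-up numbers. Because A only takes the values 1-6, the search step is simply a comparison of the six predictions.

```python
import numpy as np

FAN_SPEEDS = np.arange(1, 7)  # discrete fan speed settings 1-6


def features(speed, out_temp, out_hum):
    """Feature row for a surrogate that is quadratic in fan speed (an assumption)."""
    return [1.0, speed, speed ** 2, out_temp, out_hum]


def measure_lounge_temp(speed, out_temp, out_hum):
    """Hypothetical, noise-free stand-in for running the real experiment."""
    return out_temp - 0.08 * (100 - out_hum) - 0.5 * (4 * speed - 0.5 * speed ** 2)


def fit(rows, targets):
    """Least-squares fit of the surrogate model to all data collected so far."""
    weights, *_ = np.linalg.lstsq(np.array(rows), np.array(targets), rcond=None)
    return weights


def suggest_speed(weights, out_temp, out_hum):
    """Return the fan speed with the lowest predicted lounge temperature."""
    candidates = np.array([features(s, out_temp, out_hum) for s in FAN_SPEEDS])
    return int(FAN_SPEEDS[np.argmin(candidates @ weights)])


# Made-up starting observations: (fan speed, outside temp, outside humidity).
initial = [(1, 30, 20), (2, 35, 40), (3, 28, 30), (5, 40, 15), (6, 33, 50), (2, 38, 25)]
rows = [features(*obs) for obs in initial]
targets = [measure_lounge_temp(*obs) for obs in initial]

# Made-up outside conditions for the following rounds.
conditions = [(36, 22), (31, 45), (42, 18), (29, 35), (34, 28)]
history = []
for out_temp, out_hum in conditions:
    weights = fit(rows, targets)                      # train on current data
    speed = suggest_speed(weights, out_temp, out_hum)  # minimize predicted B
    rows.append(features(speed, out_temp, out_hum))    # run the experiment...
    targets.append(measure_lounge_temp(speed, out_temp, out_hum))  # ...and record it
    history.append(speed)
    if len(history) >= 3 and len(set(history[-3:])) == 1:
        break  # the recommendation has stopped changing

print(history)
```

In this simulation the loop settles on fan speed 4, which is the minimizer of the hypothetical function. Note that the sketch is purely greedy: it always tries the predicted best setting. An active learning scheme in the sense described above would also weigh how informative an experiment is, for example by occasionally trying settings the model is unsure about.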


This certainly looks like a basic optimization problem. However, looking from the Machine Learning perspective, what you describe can be framed as a multilabel classification. In that case, you treat the values of A and B as your input, and try to predict the values of those five parameters, which you mentioned:

A, B -> x1, x2, x3, x4, x5

The only pitfall is that 100 records can be insufficient to train a proper classification model, but if you can obtain more training data, this approach seems solid to me.
