How to build a model where multiple data points contribute to a result

I’m trying to figure out how to massage data and model the following scenario:

Customers at a restaurant rate the quality of the service between 1-10.

I have data on individual interactions between the servers and customers. Say - length of interaction, type of interaction (refilling beverage, ordering, cleaning, etc).

Hypothesis here is each interaction contributes to the final score. I want to build a model that tells me given an interaction, how does it move the score.

My intuition is if I arranged the data as individual interactions, with output of final score, that’ll give me what I want. Is that true?

Topic machine-learning-model dataset machine-learning

Category Data Science


Seems like you want to classify interactions as bad or good. bad moves the score down, good moves the score up. If you can arrange the data to have individual interactions and their labels ( good or bad ) , you can predict future interactions' labels.


A few remarks on your framework. Do you want to predict the scores (1) or do you want to evaluate to what extent each input variable (independent variables) contributes to estimate the score (2)? Please clarify.

(1) Prediction: Random Forests and Neural Networks may help you to predict scores (please keep in mind every statistical model which is used for estimation (2) can also be used for predictions)

(2) Estimation/Variable Selection: Principal Component Analysis (PCA), LASSO/Ridge Regression (performs variable selection), Multiple Linear Regression (p-Values of your estimates are not necessarily a good indicator for variable selection)


You can evaluate the importance of each feature, if you have a model, predicting the score. In this model feature-vector would consist of all possible interactions, and the target would be the score. You should consider each interaction in context of others. Therefore, you can't predict score when your input is just one interaction and there are not another information. If it is so (you have just one interaction consider to), you can just say, how it's value changes will affect the final score in comparison with other interactions.

The other task could be to predict the value of each interaction (would be the target), having score and other interactions as a feature-vector. Then you should have a separate model for each interaction. However, would it be useful for your applied purpose?


My intuition is if I arranged the data as individual interactions, with output of final score, that’ll give me what I want. Is that true?

If I understand correctly, your goal is to predict the final score that a customer gives at the end of their dinner based on all the interactions they had during the dinner, right? If yes then I don't think you can keep individual interactions as instances, because then (1) the model would only be able to use one interaction to predict the score and (2) the model would predict a different score for each interaction.

So each instance should represent the full meal and somehow contain as features all the possible information extracted from the interactions. For example an instance could contain the number of interactions, total length, number of refills, etc.

(side note: I wouldn't like eating in a restaurant where customers and staff are monitored so closely, but maybe that's just me)

About

Geeks Mental is a community that publishes articles and tutorials about Web, Android, Data Science, new techniques and Linux security.