What is the quantity sold for a specific fruit & country combination?

What algorithm picks, from a set of potential quantities, the ones that meet the given criteria?

Essentially, there are a number of potential quantities for each fruit and country combination, e.g.:

    Country+Fruit   Potential Quantity
1   India+Apple     25
2   India+Apple     27
3   India+Banana    35
4   India+Banana    37
5   France+Apple    130
6   France+Apple    132
7   France+Banana   11
8   France+Banana   13
9   France+Banana   15
10  France+Cherry   88

For the complete dataset, click here.

Each country's total across all fruits sold has to be as close as possible to the following values:

   Country                 Total Fruits Sold
1. Total India Fruits        1403
2. Total China Fruits        1370
3. Total England Fruits      1115
4. Total France Fruits       1169
5. Total Germany Fruits      1470

And the total quantity of each fruit across all countries has to be as close as possible to the following values:

   Fruits           Total Fruits Sold Across All Countries:
1. Total Apples        508
2. Total Bananas       253
3. Total Cherries      982
4. Total Guavas        389
5. Total Kiwis         681
6. Total Oranges       489
7. Total Mangos        608
8. Total Strawberries   1060

An example combination that comes close to meeting the above criteria (having 5 countries and 8 fruits) is:

    Country Fruits      Matched Combination
1.  India Mango         120
2.  India Apple         40
3.  Germany Apple       20
4.  Germany Mango       80
.
.
40. France Mango        186

What algorithm selects one quantity per combination from the potential quantities so that these criteria are met?

Problem 1: do we have to brute-force all possible combinations, or is there a more efficient way?

Problem 2: how do we define "closeness"? An exact match is best, but if there is no exact match, what is the next best option?
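To make both problems concrete on a toy scale, the sketch below enumerates every combination for four of the pairs from the table above and scores each one by the sum of absolute gaps to the targets. The targets, the names `deviation` and `brute_force`, and the choice of absolute gaps as the closeness measure are all assumptions made for illustration; they are not part of the dataset.

```python
from itertools import product

# Candidate quantities copied from the first rows of the table above.
candidates = {
    ("India", "Apple"): [25, 27],
    ("India", "Banana"): [35, 37],
    ("France", "Apple"): [130, 132],
    ("France", "Banana"): [11, 13, 15],
}

# Hypothetical targets for this toy subset (NOT the real totals).
country_targets = {"India": 62, "France": 145}
fruit_targets = {"Apple": 157, "Banana": 50}


def deviation(selection):
    """Sum of absolute gaps between achieved totals and target totals."""
    country_totals = dict.fromkeys(country_targets, 0)
    fruit_totals = dict.fromkeys(fruit_targets, 0)
    for (country, fruit), qty in selection.items():
        country_totals[country] += qty
        fruit_totals[fruit] += qty
    gap = sum(abs(country_totals[c] - t) for c, t in country_targets.items())
    gap += sum(abs(fruit_totals[f] - t) for f, t in fruit_targets.items())
    return gap


def brute_force():
    """Try every way of picking one candidate per country+fruit pair."""
    keys = list(candidates)
    best = None
    for combo in product(*(candidates[k] for k in keys)):
        selection = dict(zip(keys, combo))
        score = deviation(selection)
        if best is None or score < best[0]:
            best = (score, selection)
    return best


best_score, best_selection = brute_force()
print(best_score, best_selection)
```

A score of 0 is an exact match; when none exists, the lowest score is the next best option under this particular measure. The number of combinations is the product of the candidate counts (24 here), so with 40 pairs full enumeration quickly becomes impractical.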

Topic probabilistic-programming statistics data-mining machine-learning

Category Data Science


If I understand your question correctly, you want to solve an optimization problem. Using two countries and two fruits as an example, the problem can be written as a linear program:

Minimize

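$$\Delta = |\delta_{Apple}| + |\delta_{Mango}| + |\delta_{India}| + |\delta_{China}|$$

subject to the constraints below, where each $\delta$ is the signed deviation of a total from its target (the absolute values keep overshoots and undershoots from cancelling each other):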

$$X_{India, Apple} + X_{China, Apple} = 508 + \delta_{Apple}$$

$$X_{India, Mango} + X_{China, Mango} = 608 + \delta_{Mango}$$

$$X_{India, Apple} + X_{India, Mango} = 1403 + \delta_{India}$$

$$X_{China, Apple} + X_{China, Mango} = 1370 + \delta_{China}$$

A small optimization problem like this can be solved with many software packages, including Excel (via its Solver add-in). Note that the solution is likely not unique.
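As a rough illustration, the two-country, two-fruit program could be set up with SciPy's `linprog` along the following lines. This is a sketch under several assumptions: SciPy is installed, the quantities are treated as continuous and non-negative (the restriction to a list of candidate quantities is not modelled here and would need a mixed-integer formulation), and each deviation is split into non-negative parts, $\delta = \delta^{+} - \delta^{-}$, so that minimizing $\delta^{+} + \delta^{-}$ minimizes its absolute value. The variable ordering and the `membership` name are my own.

```python
import numpy as np
from scipy.optimize import linprog

# Variable order (an arbitrary choice for this sketch):
# x = [X_IA, X_CA, X_IM, X_CM, then a (plus, minus) deviation pair per target]
targets = [508, 608, 1403, 1370]  # Apple, Mango, India, China

# Which quantity variables appear in each constraint.
membership = [
    [1, 1, 0, 0],  # Apple: India + China
    [0, 0, 1, 1],  # Mango: India + China
    [1, 0, 1, 0],  # India: Apple + Mango
    [0, 1, 0, 1],  # China: Apple + Mango
]

n_x, n_t = 4, len(targets)
A_eq = np.zeros((n_t, n_x + 2 * n_t))
for i, row in enumerate(membership):
    A_eq[i, :n_x] = row
    A_eq[i, n_x + 2 * i] = -1.0     # subtract the overshoot part
    A_eq[i, n_x + 2 * i + 1] = 1.0  # add the undershoot part

# Objective: total absolute deviation; the quantities themselves cost nothing.
c = np.concatenate([np.zeros(n_x), np.ones(2 * n_t)])

# bounds=(0, None) keeps every variable non-negative.
res = linprog(c, A_eq=A_eq, b_eq=targets, bounds=(0, None), method="highs")
print(res.fun)     # total absolute deviation at the optimum
print(res.x[:n_x])  # one optimal set of quantities
```

With these four targets the best achievable total deviation is 1657, because the two fruit targets sum to 1116 while the two country targets sum to 2773, and the same four quantities have to serve both. Many different quantity vectors reach that value, which is the non-uniqueness mentioned above.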
