How to groupby and sum values of only one column based on value of another column

I have a dataset that has the following columns: Category, Product, Launch_Year, and columns named 2010, 2011 and 2012. These 13 columns contain sales of the product in that year. The goal is to create another column Launch_Sum that calculates the sum of the Category (not the Product) for each Launch_Year:

test = pd.DataFrame({
    'Category':['A','A','A','B','B','B'],
    'Product':['item1','item2','item3','item4','item5','item6'],
    'Launch_Year':[2010,2012,2010,2012,2010,2011],
    '2010':[25,0,27,0,10,0],
    '2011':[50,0,5,0,20,39],
    '2012':[30,40,44,20,30,42]
)}
Category Product Launch_Year 2010 2011 2012 Launch_Sum (to be created)
A item1 2010 25 50 30 52
A item2 2012 0 0 40 114
A item3 2010 27 5 44 52
B item4 2012 0 0 20 92
B item5 2010 10 20 30 10
B item6 2011 0 39 42 59

I did a groupby and sum, which does have the result I want (shown in bold in the table below), but I don't know how to pull out the specific result and assign it to a new column in the original dataset:

df.groupby(['Category']).sum()
Launch_Year 2010 2011 2012
Category
A 6032 52 55 114
B 6032 10 59 92

Sorry, I don't know how to show a groupby object in markdown, hope the table above still makes sense. The Launch_Year sums are irrelevant. The required sums are all there under the year columns. How do I get these sum values and assign them to the rows with corresponding Category and Launch_Year? I need to be able to do this effectively for a large (5000) number of categories and the year sales columns from 2010 to 2022.

Topic groupby pandas

Category Data Science


I figured a way to do it, but it doesn't look efficient at all. Regardless, the following code got me the results I needed.

test = pd.DataFrame({
    'Category':['A','A','A','B','B','B'],
    'Product':['item1','item2','item3','item4','item5','item6'],
    'Launch_Year':[2010,2012,2010,2012,2010,2011],
    '2010':[25,0,27,0,10,0],
    '2011':[50,0,5,0,20,39],
    '2012':[30,40,44,20,30,42]
)}

test_grouped = test.groupby(['Category']).sum()
test_grouped.reset_index(inplace=True)

for cat in test_grouped.Category:
    for year in test.Launch_Year.unique():
        test.loc[((test.Category == cat) & (test.Launch_Year == year)), 'Annual_Sum'] = test_grouped[test_grouped.Category == cat][str(year)].values[0]

About

Geeks Mental is a community that publishes articles and tutorials about Web, Android, Data Science, new techniques and Linux security.