Using Pandas.groupby.agg with multiple columns and functions

Question

RobW

March 10, 2022, 20:56

I have a data frame that contains duplicate rows, which I'd like to combine based on one column (name). For half of the other columns I'd like to keep a single value (they should all be the same within a group), and for the rest I'd like to sum the values.

I've tried the following code based on an answer I found here: Pandas merge column duplicate and sum value

df2 = df.groupby(['name']).agg({'address': 'first', 'cost': 'sum'})

The only issue is that I have 100 columns, so I would rather not list them all out. Is there a way to pass a tuple or list in place of 'address' and 'cost' above? Something along the lines of:

column_list = df.columns.values.tolist()
columns_first = tuple(column_list[0:68])
columns_sum = tuple(column_list[68:104])

Topic groupby dataframe pandas

Category Data Science

namiyousef · Accepted Answer · March 10, 2022, 11:37

You could perhaps generate the dictionary with a dict comprehension that picks the function from each column's position, e.g.

df2 = df.groupby(['name']).agg({col: 'first' if i < 68 else 'sum' for i, col in enumerate(df.columns)})
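
If you would rather start from two explicit lists of column names, as in the question, the same dictionary can be built from them. This is only a minimal sketch: the toy data frame and the contents of columns_first and columns_sum are made up for illustration, and the grouping column (name) is deliberately left out of both lists since it ends up as the index of the result.

```python
import pandas as pd

# Hypothetical toy data standing in for the 100-column frame in the question.
df = pd.DataFrame({
    "name": ["a", "a", "b"],
    "address": ["x", "x", "y"],
    "city": ["p", "p", "q"],
    "cost": [1, 2, 3],
    "tax": [10, 20, 30],
})

# Hypothetical column lists; in the question these would be slices of
# df.columns, with the grouping column 'name' excluded.
columns_first = ["address", "city"]
columns_sum = ["cost", "tax"]

# Map every column in each list to its aggregation function.
agg_map = {
    **dict.fromkeys(columns_first, "first"),
    **dict.fromkeys(columns_sum, "sum"),
}

df2 = df.groupby("name").agg(agg_map)
print(df2)
```

Working from names rather than positions means the mapping does not silently change if the column order of df changes.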
