Using Pandas.groupby.agg with multiple columns and functions

I have a data frame that contains duplicates, which I'd like to combine based on one column (name). In half of the other columns I'd like to keep one value (as they should all be the same), whereas in the rest I'd like to sum the values.

I've tried the following code based on an answer I found here: Pandas merge column duplicate and sum value

df2 = df.groupby(['name']).agg({'address': 'first', 'cost': 'sum'})

The only issue is that I have 100 columns, so I would rather not list them all out. Is there a way to pass a tuple or list in place of 'address' and 'cost' above? Something along the lines of:

column_list = df.columns.values.tolist()
columns_first = tuple(column_list[0:68])
columns_sum = tuple(column_list[68:104])



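You could generate the dictionary with a dict comprehension. Note that i counts every column in df.columns, including name, so adjust the cut-off (68 here) to match your column layout. E.g.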

df2 = df.groupby(['name']).agg({col: 'first' if i<68 else 'sum' for i, col in enumerate(df.columns)})
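If you would rather work from two explicit lists, as in the question, one possible sketch is to build the mapping from those lists and leave the grouping column out of it. The toy data frame, the n_first cut-off, and the names key, value_cols and agg_map below are assumptions made up for illustration; only columns_first and columns_sum come from the question.

```python
import pandas as pd

# Hypothetical toy data standing in for the real 100-column frame.
df = pd.DataFrame({
    "name": ["a", "a", "b"],
    "address": ["x", "x", "y"],
    "city": ["p", "p", "q"],
    "cost": [1, 2, 3],
    "qty": [10, 20, 30],
})

key = "name"
# Every column except the grouping key, in the original order.
value_cols = [c for c in df.columns if c != key]

# Assumption: the first n_first value columns keep one value, the rest are summed.
n_first = 2
columns_first = value_cols[:n_first]
columns_sum = value_cols[n_first:]

# Map each column name to the aggregation it should receive.
agg_map = {
    **dict.fromkeys(columns_first, "first"),
    **dict.fromkeys(columns_sum, "sum"),
}

df2 = df.groupby(key).agg(agg_map)
print(df2)
```

Splitting by name rather than by position in df.columns means the result does not change if the grouping column is moved or the columns are reordered, as long as the two lists are kept up to date.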
