Export pandas to dictionary by combining multiple row values

I have a pandas dataframe df that looks like this

name    value1     value2
A       123         1
B       345         5
C       712         4
B       768         2
A       318         9
C       178         6
A       321         3

I want to convert this into a dictionary with name as a key and list of dictionaries (value1 key and value2 value) for all values that are in name

So, the output would look like this

{
 'A': [{'123':1}, {'318':9}, {'321':3}],
 'B': [{'345':5}, {'768':2}],
 'C': [{'712':4}, {'178':6}]
}

So, far I have managed to get a dictionary with name as key and list of only one of the values as a list by doing

df.set_index('name').transpose().to_dict(orient='list')

How do I get my desired output? Is there a way to aggregate all the values for the same name column and get them in the form I want?

Topic data-wrangling pandas python

Category Data Science


The to_dict() method sets the column names as dictionary keys so you'll need to reshape your DataFrame slightly. Setting the 'ID' column as the index and then transposing the DataFrame is one way to achieve this.

The same can be done with the following line:

>>> df.set_index('ID').T.to_dict('list')
{'p': [1, 3, 2], 'q': [4, 3, 2], 'r': [4, 0, 9]}

Better to use the groupby,

df.groupby('name')[['value1','value2']].apply(lambda g: g.values.tolist()).to_dict()

Building on @nemo's answer (above) which will be faster than the accepted solution, this will give the same output that you want:

def formatRecords(g):
    keys = ['value1', 'value2']
    result = []
    for item in g.values.tolist():
        item = dict(zip(keys, item))
        result.append(item)
    return result

df_dict = df.groupby('name').apply(lambda g: formatRecords(g)).to_dict()

df.groupby('name')[['value1','value2']].apply(lambda g: g.values.tolist()).to_dict()

if you need a list of tuples explicitly:

df.groupby('name')[['value1','value2']].apply(lambda g: list(map(tuple, g.values.tolist()))).to_dict()

Does this do what you want it to?

from pandas import DataFrame

df = DataFrame([['A', 123, 1], ['B', 345, 5], ['C', 712, 4], ['B', 768, 2], ['A', 318, 9], ['C', 178, 6], ['A', 321, 3]], columns=['name', 'value1', 'value2'])

d = {}
for i in df['name'].unique():
    d[i] = [{df['value1'][j]: df['value2'][j]} for j in df[df['name']==i].index]

This returns

  Out[89]: 
{'A': [{123: 1}, {318: 9}, {321: 3}],
 'B': [{345: 5}, {768: 2}],
 'C': [{712: 4}, {178: 6}]}

About

Geeks Mental is a community that publishes articles and tutorials about Web, Android, Data Science, new techniques and Linux security.