12

I have a pandas dataframe df that looks like this

name    value1     value2
A       123         1
B       345         5
C       712         4
B       768         2
A       318         9
C       178         6
A       321         3

I want to convert this into a dictionary with name as a key and list of dictionaries (value1 key and value2 value) for all values that are in name

So, the output would look like this

{
 'A': [{'123':1}, {'318':9}, {'321':3}],
 'B': [{'345':5}, {'768':2}],
 'C': [{'712':4}, {'178':6}]
}

So, far I have managed to get a dictionary with name as key and list of only one of the values as a list by doing

df.set_index('name').transpose().to_dict(orient='list')

How do I get my desired output? Is there a way to aggregate all the values for the same name column and get them in the form I want?

sfactor
  • 223
  • 1
  • 2
  • 6

4 Answers4

10

Does this do what you want it to?

from pandas import DataFrame

df = DataFrame([['A', 123, 1], ['B', 345, 5], ['C', 712, 4], ['B', 768, 2], ['A', 318, 9], ['C', 178, 6], ['A', 321, 3]], columns=['name', 'value1', 'value2'])

d = {}
for i in df['name'].unique():
    d[i] = [{df['value1'][j]: df['value2'][j]} for j in df[df['name']==i].index]

This returns

  Out[89]: 
{'A': [{123: 1}, {318: 9}, {321: 3}],
 'B': [{345: 5}, {768: 2}],
 'C': [{712: 4}, {178: 6}]}
ignoring_gravity
  • 793
  • 4
  • 15
4

The to_dict() method sets the column names as dictionary keys so you'll need to reshape your DataFrame slightly. Setting the 'ID' column as the index and then transposing the DataFrame is one way to achieve this.

The same can be done with the following line:

>>> df.set_index('ID').T.to_dict('list')
{'p': [1, 3, 2], 'q': [4, 3, 2], 'r': [4, 0, 9]}

Better to use the groupby,

df.groupby('name')[['value1','value2']].apply(lambda g: g.values.tolist()).to_dict()
Aditya
  • 2,520
  • 2
  • 17
  • 35
3
df.groupby('name')[['value1','value2']].apply(lambda g: g.values.tolist()).to_dict()

if you need a list of tuples explicitly:

df.groupby('name')[['value1','value2']].apply(lambda g: list(map(tuple, g.values.tolist()))).to_dict()
Stephen Rauch
  • 1,831
  • 11
  • 23
  • 34
nemo
  • 31
  • 1
0

Building on @nemo's answer (above) which will be faster than the accepted solution, this will give the same output that you want:

def formatRecords(g):
    keys = ['value1', 'value2']
    result = []
    for item in g.values.tolist():
        item = dict(zip(keys, item))
        result.append(item)
    return result

df_dict = df.groupby('name').apply(lambda g: formatRecords(g)).to_dict()
cyram
  • 101
  • 1