Create new rows based on a value in a column

My dataset is generated like this example:

df = {'event':['A','B','C','D'],
     'budget':['123','433','1000','1299'],
     'duration_days':['6','3','4','2']}

I need to create rows for each event based on the column 'duration_days': if an event has duration_days = 6, it should appear in 6 rows:

event budget duration_days
A 123 6
A 123 6
A 123 6
A 123 6
A 123 6
A 123 6
B 433 3
B 433 3
B 433 3



First of all, I think your dataset has a problem: if you wrap a list in single quotes, like '['1','2','3','4']', as the value for a key, Python will raise a syntax error.

So I will take the data as below:

df = {'event':['A','B','C','D'],'budget':['123','433','1000','1299'],'duration_days':['6','3','4','2']}

Then expand it into a data frame as you require, repeating each row once per day of its duration.

import pandas as pd

data = []

# Repeat each event's row once per day of its duration
for i in range(len(df['event'])):
    for _ in range(int(df['duration_days'][i])):  # values are strings, so cast to int
        data.append([df['event'][i], df['budget'][i], df['duration_days'][i]])

data_df = pd.DataFrame(data, columns=['event','budget','duration_days'])
data_df
  event budget duration_days
0 A 123 6
1 A 123 6
2 A 123 6
3 A 123 6
4 A 123 6
5 A 123 6
6 B 433 3
7 B 433 3
8 B 433 3
... (the rows for C and D follow in the same way)
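The loop above has to call int() because the question stores every number as a string. As a minimal sketch, assuming the data really does arrive as strings, the columns could be cast once up front with `DataFrame.astype`; the names `raw` and `typed_df` are only illustrative and do not come from the question.

```python
import pandas as pd

# Same data as in the question, with the numbers stored as strings
raw = {
    "event": ["A", "B", "C", "D"],
    "budget": ["123", "433", "1000", "1299"],
    "duration_days": ["6", "3", "4", "2"],
}

# Cast the numeric columns to integers once, so that later code
# can use duration_days directly as a repeat count
typed_df = pd.DataFrame(raw).astype({"budget": int, "duration_days": int})
print(typed_df.dtypes)
```

With integer columns, the inner loop no longer needs the int() conversion.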

The easiest way of doing this is probably to first convert the dataframe back to a list of rows, then use a plain Python list comprehension to repeat each row n times, and then convert that back to a dataframe:

import pandas as pd

df = pd.DataFrame({
    "event": ["A","B","C","D"],
    "budget": [123, 433, 1000, 1299],
    "duration_days": [6, 3, 4, 2]
})

pd.DataFrame([
    row # select the full row
    for row in df.to_dict(orient="records") # for each row in the dataframe
    for _ in range(row["duration_days"]) # and repeat the row row["duration_days"] times
])

Which gives the following dataframe:

event budget duration_days
A 123 6
A 123 6
A 123 6
A 123 6
A 123 6
A 123 6
B 433 3
B 433 3
B 433 3
C 1000 4
C 1000 4
C 1000 4
C 1000 4
D 1299 2
D 1299 2
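An alternative that stays inside pandas, offered as a sketch rather than as the method either answer uses, is to repeat the index labels with `Index.repeat` and select them with `.loc`. It assumes `duration_days` holds non-negative integers; the name `expanded` is illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "event": ["A", "B", "C", "D"],
    "budget": [123, 433, 1000, 1299],
    "duration_days": [6, 3, 4, 2],
})

# Repeat each index label duration_days times, select those labels,
# then renumber the rows from 0
expanded = df.loc[df.index.repeat(df["duration_days"])].reset_index(drop=True)
print(expanded)
```

This avoids the round trip through a list of dicts, which may matter on larger dataframes.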
