How to find duplicate rows in a column then find out if two cells in another column sum up to a third cell in an Excel tab in Python?
I need to find all duplicate rows (string values) in Name column and then find out if two numerical values in Amount column sum up to a third value also in the Amount column in an Excel tab in Pandas (Python)? There are two tabs in this worksheet. I'm referring to the second tab called Table2.
For example, in the table below, I have several duplicates in the Name column. But for Richard Madden duplicates, corresponding values in Amount table (-4000) + (-6000) equals (-10000). I need to delete the entire rows for -4000 and -6000 and leave the row for -10000.
Here’s the Excel table: https://i.stack.imgur.com/3n2vZ.png
Here's my code so far:
import pandas as pd
excel = pd.ExcelFile('/Users/user/Downloads/DSR-Table.xlsx')
df1 = pd.read_excel(excel, 'Table2')
dfObj = pd.DataFrame(df1, columns=['Name'])
duplicateRowsDF = dfObj[dfObj.duplicated()]
Category Data Science