I have two data frames, df1 and df2. For my analysis, I need to remove the rows from df1 whose Email value also appears in df2. How can I do this?

>>df1
   First  Last  Email
0 Adam   Smith  email@email.com
1 John   Brown  email2@email.com
2 Joe    Max    email3@email.com
3 Will   Bill   email4@email.com

>>df2
   First  Last  Email
0 Adam   Smith  email@email.com
1 John   Brown  email2@email.com

user3503711
a_a_a

2 Answers

You can build a boolean mask with isin() on the Email column and drop the matching rows:

# True for each row of df1 whose Email also occurs in df2
cond = df1['Email'].isin(df2['Email'])
# Drop those rows from df1 in place
df1.drop(df1[cond].index, inplace=True)

>>df1
    First   Last    Email
2   Joe     Max     email3@email.com
3   Will    Bill    email4@email.com
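
For anyone who wants to run this end to end, here is a minimal, self-contained sketch that rebuilds the two frames from the question. It uses boolean indexing with the negated mask instead of drop(), which gives the same rows here but leaves df1 untouched; the name `filtered` is just an illustrative choice.

```python
import pandas as pd

# Sample data copied from the question
df1 = pd.DataFrame({
    "First": ["Adam", "John", "Joe", "Will"],
    "Last": ["Smith", "Brown", "Max", "Bill"],
    "Email": ["email@email.com", "email2@email.com",
              "email3@email.com", "email4@email.com"],
})
df2 = pd.DataFrame({
    "First": ["Adam", "John"],
    "Last": ["Smith", "Brown"],
    "Email": ["email@email.com", "email2@email.com"],
})

# True for each row of df1 whose Email also occurs in df2
cond = df1["Email"].isin(df2["Email"])

# Keep only the non-matching rows; df1 itself is not modified
filtered = df1[~cond]
print(filtered)
```

If you need a fresh 0-based index afterwards, chaining `.reset_index(drop=True)` onto the result should do it.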
Mohit Motwani

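A shorter option is isin() with dropna(). Note that when isin() is given a DataFrame, it matches on index and column labels as well as values, so this only works when the rows to remove sit at the same index in both frames (as they do in the example). dropna() will also discard any row that already contains a NaN.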

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.isin.html

df1[~df1.isin(df2)].dropna()
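
A small sketch of how much this one-liner depends on label alignment, using the data from the question. The reordered frame `df2_shuffled` is a hypothetical variant introduced only for illustration; it holds the same two rows as df2 but in reverse order.

```python
import pandas as pd

# Sample data copied from the question
df1 = pd.DataFrame({
    "First": ["Adam", "John", "Joe", "Will"],
    "Last": ["Smith", "Brown", "Max", "Bill"],
    "Email": ["email@email.com", "email2@email.com",
              "email3@email.com", "email4@email.com"],
})
df2 = pd.DataFrame({
    "First": ["Adam", "John"],
    "Last": ["Smith", "Brown"],
    "Email": ["email@email.com", "email2@email.com"],
})

# Rows 0 and 1 match df2 at the same index labels, so they become NaN
# and are then removed by dropna()
aligned = df1[~df1.isin(df2)].dropna()

# Hypothetical variant: same rows as df2, reversed, with a fresh 0..1 index
df2_shuffled = df2.iloc[::-1].reset_index(drop=True)

# Index 0 now holds John and index 1 holds Adam, so nothing lines up
# with df1 and no row is removed
misaligned = df1[~df1.isin(df2_shuffled)].dropna()

# Comparing only the Email values ignores row order in df2
by_email = df1[~df1["Email"].isin(df2_shuffled["Email"])]

print(len(aligned), len(misaligned), len(by_email))
```

When df2 is not guaranteed to share df1's index, comparing on the Email column alone is the safer choice.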
Idodo