dataframe.columns.difference() use

Question

Parth S.

March 2, 2019 01:31

I am trying to understand how dataframe.columns.difference() works, but I couldn't find a satisfactory explanation. Can anyone explain this method in detail?

Topic dataframe difference pandas

Category Data Science

bkshi · Accepted Answer · March 1, 2019 04:55

dataframe.columns.difference() returns the column labels that are not in the list you pass as the argument, i.e. the set difference between the DataFrame's columns and that list. It can be used to create a new DataFrame from an existing one while excluding some columns. Let us look at an example:

In [2]: import pandas as pd

In [3]: import numpy as np

In [4]: df = pd.DataFrame(np.random.randn(5, 4), columns=list('ABCD'))

In [5]: df
Out[5]: 
          A         B         C         D
0 -1.023134 -0.130241 -0.675639 -0.985182
1  0.270465 -1.099458 -1.114871  3.203371
2 -0.340572  0.913594 -0.387428  0.867702
3 -0.487784  0.465429 -1.344002  1.216967
4  1.433862 -0.172795 -1.656147  0.061359

In [6]: df_new = df[df.columns.difference(['B', 'D'])]

In [7]: df_new
Out[7]: 
          A         C
0 -1.023134 -0.675639
1  0.270465 -1.114871
2 -0.340572 -0.387428
3 -0.487784 -1.344002
4  1.433862 -1.656147

The method returns a new Index (not a plain list) containing the existing column labels minus the ones given as the argument, sorted by default. The original DataFrame is not modified. You can check this directly:

In [8]: df.columns.difference(['B', 'D'])
Out[8]: Index(['A', 'C'], dtype='object')
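One thing the example above does not show, because its columns are already in alphabetical order: difference() sorts its result by default, so the column order of the new DataFrame can differ from the original. Here is a minimal sketch of that, assuming a pandas version recent enough to accept the sort argument (0.24 or later, as far as I know). The frame and its column order are made up for illustration.

```python
import pandas as pd

# Hypothetical frame whose columns are deliberately NOT in alphabetical order.
df = pd.DataFrame([[1, 2, 3, 4]], columns=["D", "A", "C", "B"])

# Default behaviour: the remaining labels come back sorted.
sorted_cols = df.columns.difference(["B"])

# sort=False keeps the remaining labels in their original order.
ordered_cols = df.columns.difference(["B"], sort=False)

# Labels that are not in the frame (here "Z") are simply ignored.
ignored_cols = df.columns.difference(["B", "Z"], sort=False)

# DataFrame.drop is an alternative that also keeps the original order,
# but it raises KeyError if a label does not exist.
dropped = df.drop(columns=["B"])

print(list(sorted_cols))
print(list(ordered_cols))
print(list(ignored_cols))
print(list(dropped.columns))
```

If the original column order matters to you, sort=False or df.drop(columns=[...]) is probably the safer choice. If you would rather silently skip labels that may not exist, difference() is the more forgiving one.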

I suggest taking a look at the official pandas documentation for Index.difference.
