Interpreting DataFrame.where() documentation

From examples I had seen outside the documentation, I thought I understood the .where() method. Basically, it seemed to be another way to filter a DataFrame.

However, when I checked the documentation itself, the example it gives for .where() struck me as counterintuitive.

The documentation provides this example:

import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3],
                   'B': [4, 5, 6],
                   'C': [7, 8, 9]})

df.where(lambda x: x > 4, lambda x: x + 10)

[output]: 
    A   B  C
0  11  14  7
1  12   5  8
2  13   6  9

It seems to me that this code should filter for all values greater than four. According to my logic,

df.where(lambda x: x > 4, lambda x: x + 10)

should add 10 to all values greater than 4, changing the output to

    A   B   C
0   1   4  17
1   2  15  18
2   3  16  19

Could someone please explain to me the error in my logic?



I assumed "where" meant "where the condition is true, replace the value with X", but it actually means "where the condition is true, KEEP the value, and replace it everywhere else". This seemed odd to me, but I found a different documentation page that explained the .where() method more clearly.
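
To make the "keep where true" behaviour concrete, here is a minimal sketch using the same DataFrame as the documentation example. The names cond and result are only illustrative, and I pass the condition and the replacement as plain DataFrames rather than lambdas, which should behave the same way for this case.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3],
                   'B': [4, 5, 6],
                   'C': [7, 8, 9]})

# Boolean DataFrame: True wherever the value is greater than 4
cond = df > 4

# where() keeps the entries where cond is True and takes the
# replacement (df + 10) wherever cond is False
result = df.where(cond, df + 10)

print(cond)
print(result)
```

Printing cond first shows which cells are False (1, 2, 3 and 4), and those are exactly the cells that get 10 added in the documentation's output.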

As it turns out, the .mask() method would return what I expected from .where(). Specifically:

df.mask(lambda x: x > 4, lambda x: x + 10) 
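
As a quick check, the sketch below runs .mask() on the same DataFrame and compares it with .where() called on the inverted condition. The variable names masked and inverted are my own, and treating the two calls as interchangeable is based on my reading of the documentation (mask is described as the inverse of where).

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3],
                   'B': [4, 5, 6],
                   'C': [7, 8, 9]})

# mask() replaces the entries where the condition is True
masked = df.mask(df > 4, df + 10)

# The same result via where(), using the negated condition
inverted = df.where(~(df > 4), df + 10)

print(masked)
print(masked.equals(inverted))
```

Here the values greater than 4 are the ones that get 10 added, which is the result I originally expected from .where().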
