How many rows does pandas' interpolate consider?

How does pandas' DataFrame.interpolate() work with respect to the number of rows it considers?

  1. Is it just the row before the NaNs and the row right after?
  2. Or is it the whole DataFrame (and how does that work at 1 million rows)?
  3. Or does it work some other way (please explain)?

Each of the methods listed in the documentation is relevant:

‘linear’: Ignore the index and treat the values as equally spaced. This is the only method supported on MultiIndexes.

‘time’: Works on daily and higher resolution data to interpolate given length of interval.

‘index’, ‘values’: Use the actual numerical values of the index.

‘pad’: Fill in NaNs using existing values.

‘nearest’, ‘zero’, ‘slinear’, ‘quadratic’, ‘cubic’, ‘spline’, ‘barycentric’, ‘polynomial’: Passed to scipy.interpolate.interp1d. These methods use the numerical values of the index. Both ‘polynomial’ and ‘spline’ require that you also specify an order (int), e.g. df.interpolate(method='polynomial', order=5).

‘krogh’, ‘piecewise_polynomial’, ‘spline’, ‘pchip’, ‘akima’: Wrappers around the SciPy interpolation methods of similar names. See Notes.

‘from_derivatives’: Refers to scipy.interpolate.BPoly.from_derivatives which replaces ‘piecewise_polynomial’ interpolation method in scipy 0.18.



DataFrame.interpolate: Fill NaN values using an interpolation method.

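In the example below, which uses method='pad', each NaN is filled from the last valid element before it, and limit=2 caps the fill at two consecutive NaNs per gap. Recent pandas releases deprecate method='pad' in interpolate in favor of s.ffill(limit=2), so the call may emit a warning there.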

import numpy as np
import pandas as pd

s = pd.Series([np.nan, "single_one", np.nan,
               "fill_two_more", np.nan, np.nan, np.nan,
               4.71, np.nan])
s
0              NaN
1       single_one
2              NaN
3    fill_two_more
4              NaN
5              NaN
6              NaN
7             4.71
8              NaN
dtype: object
s.interpolate(method='pad', limit=2)
0              NaN
1       single_one
2       single_one
3    fill_two_more
4    fill_two_more
5    fill_two_more
6              NaN
7             4.71
8             4.71
dtype: object
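
To address the original question for the default method, here is a minimal sketch suggesting that method='linear' fills each gap from the valid values on either side of it, not from the whole column. The series a and b and their values are made up for illustration: they share the two values bracketing the gap but differ everywhere else. Methods handed to SciPy, such as 'spline' or 'cubic', may draw on more points; that is not shown here.

```python
import numpy as np
import pandas as pd

# Two illustrative series: identical around the gap (10.0 ... 40.0),
# but with very different values at the far ends.
a = pd.Series([0.0, 10.0, np.nan, np.nan, 40.0, 100.0])
b = pd.Series([999.0, 10.0, np.nan, np.nan, 40.0, -5.0])

filled_a = a.interpolate(method="linear")
filled_b = b.interpolate(method="linear")

# Positions 2 and 3 are filled on the straight line between 10.0 and 40.0
# in both cases; the distant values do not affect the result.
print(filled_a.iloc[2:4].tolist())
print(filled_b.iloc[2:4].tolist())
```

If both printed lists match, the fill for this gap depended only on its two bracketing values, which is why linear interpolation stays cheap even on very long columns.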

You can also restrict the direction of filling with the limit_direction parameter ('forward', 'backward', or 'both').
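
As a small sketch of how limit and limit_direction interact, the snippet below applies linear interpolation to a made-up series with a three-element gap and limit=1. The variable names are illustrative only.

```python
import numpy as np
import pandas as pd

# Illustrative series with a gap of three NaNs between 1.0 and 5.0.
s = pd.Series([1.0, np.nan, np.nan, np.nan, 5.0])

# With limit=1, only one NaN per gap is filled from each permitted side.
forward = s.interpolate(method="linear", limit=1, limit_direction="forward")
backward = s.interpolate(method="linear", limit=1, limit_direction="backward")
both = s.interpolate(method="linear", limit=1, limit_direction="both")

print(forward.tolist())   # fills only the NaN right after 1.0
print(backward.tolist())  # fills only the NaN right before 5.0
print(both.tolist())      # fills one NaN from each side, leaving the middle
```

The filled values still lie on the line between 1.0 and 5.0; limit only controls how many of them are written back.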
