Can a custom Transformer be used to transform X and y?

I am working with time series in sklearn. My goal is to have a Pipeline step that replaces each row with a window centered on that row (think convolution).
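
To make the windowing concrete, here is a minimal sketch of the kind of helper I have in mind. The window argument and the edge padding (repeating the first and last row) are assumptions made for illustration only; the real unroll may handle the borders differently.

```python
import numpy as np


def unroll(X, window=3):
    """Replace each row of X with the flattened window of rows centered on it.

    Illustrative sketch: `window` must be a positive odd integer, and the
    borders are padded by repeating the first/last row (an assumption).
    """
    X = np.asarray(X, dtype=float)
    if X.ndim != 2:
        raise ValueError("X must be 2-dimensional (n_rows, n_features)")
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")

    half = window // 2
    # Pad only along the row axis so that edge rows also get a full window.
    padded = np.pad(X, ((half, half), (0, 0)), mode="edge")

    # idx[i] holds the padded row indices that make up the window for row i.
    idx = np.arange(X.shape[0])[:, None] + np.arange(window)[None, :]

    # padded[idx] has shape (n_rows, window, n_features); flatten each window.
    return padded[idx].reshape(X.shape[0], -1)


X_demo = np.arange(8).reshape(4, 2)
X_windows = unroll(X_demo, window=3)
print(X_windows.shape)
```

The output has the same number of rows as the input, which is why every row (labeled or not) is needed to build the windows.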

My problem here is that I need all rows (even unlabeled ones) in order to create the windows, but during fitting I want to drop all unlabeled rows. This requires access to both X and y in the transform process.

Can this be done with a custom Transformer? From what I gathered from the docs, transform only accepts X:

def transform(self, X):
    print('transform called')

    # create windows from X and flatten them into rows
    # (unroll and parameters are my own helper and settings)
    X_ = unroll(X, parameters)

    # drop all rows that have no label
    # (self.y_ would have to be stored during fit)
    X_ = X_[self.y_.labeled]

    # done
    return X_

This solves half of the problem, but the unlabeled rows have to be dropped from y too. It seems this is not possible in transform.

What would be the correct way to achieve this?

Edit: I should add that the reason I want this to be a pipeline step (and not preprocessing) is that this would allow me to treat the window size as a hyperparameter that could be optimized.
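
To illustrate what I mean by tuning the window size, here is a rough sketch under a simplifying assumption: every row is labeled, so the row-dropping problem above does not arise. The class name WindowUnroller, its window parameter, the synthetic data, and the choice of Ridge and TimeSeriesSplit are all placeholders I made up for the example.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit
from sklearn.pipeline import Pipeline


class WindowUnroller(BaseEstimator, TransformerMixin):
    """Hypothetical step: replace each row with a flattened, centered window."""

    def __init__(self, window=3):
        # Stored unchanged so that get_params/set_params and clone work.
        self.window = window

    def fit(self, X, y=None):
        # Nothing to learn; the step is stateless.
        return self

    def transform(self, X):
        X = np.asarray(X, dtype=float)
        half = self.window // 2
        # Edge padding is an assumption made for this sketch.
        padded = np.pad(X, ((half, half), (0, 0)), mode="edge")
        idx = np.arange(X.shape[0])[:, None] + np.arange(self.window)[None, :]
        return padded[idx].reshape(X.shape[0], -1)


# Synthetic, fully labeled data (illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 2))
y = X[:, 0] + 0.1 * rng.normal(size=60)

pipe = Pipeline([("window", WindowUnroller()), ("model", Ridge())])

# The window size is addressed as <step name>__<parameter name>.
search = GridSearchCV(
    pipe,
    param_grid={"window__window": [1, 3, 5]},
    cv=TimeSeriesSplit(n_splits=3),
)
search.fit(X, y)
print(search.best_params_)
```

This only works because transform returns as many rows as it receives. As soon as the step has to drop unlabeled rows, y would have to shrink as well, which is exactly the part I cannot express in transform.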

