Can a custom Transformer be used to transform X and y?
I am working with time series in sklearn. My goal is to have a Pipeline step that replaces each row with a window centered on that row (think convolution).
My problem here is that I need all rows (even unlabeled ones) in order to create the windows, but during fitting I want to drop all unlabeled rows. This requires access to both X and y in the transform process.
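To make the setup concrete, here is a minimal sketch of the windowing and the row-dropping described above. The body of `unroll`, its `window` argument, the edge padding, and the use of NaN in y to mark unlabeled rows are all assumptions made for illustration; the question does not specify any of them.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def unroll(X, window):
    """Replace each row with a flattened window centered on that row.

    Hypothetical helper: edge rows are padded by repeating the first/last
    row, and `window` is assumed to be odd.
    """
    X = np.asarray(X, dtype=float)
    half = window // 2
    padded = np.pad(X, ((half, half), (0, 0)), mode="edge")
    # Result has shape (n_rows, n_features, window).
    windows = sliding_window_view(padded, window_shape=window, axis=0)
    # Put the time axis before the feature axis, then flatten each window.
    return windows.transpose(0, 2, 1).reshape(len(X), -1)


X = np.arange(10, dtype=float).reshape(5, 2)
y = np.array([0.0, np.nan, 1.0, np.nan, 1.0])  # NaN = unlabeled (assumption)

X_windows = unroll(X, window=3)  # windows are built from ALL rows
labeled = ~np.isnan(y)           # mask of rows that have a label
X_fit, y_fit = X_windows[labeled], y[labeled]  # both must be filtered
```

The last line is the crux: the same mask has to be applied to X and to y, which is exactly what a transform that only receives X cannot do.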
Can this be done with a custom Transformer? From what I gathered from the docs, transform only accepts X:
def transform(self, X):
    print('transform called')
    # create windows from X and flatten them into rows
    X_ = unroll(X, parameters)
    # drop all rows that have no label
    X_ = X_[self.y_.labeled]  # --- I could store y during fitting
    # done
    return X_
This solves half of the problem, but the unlabeled rows have to be dropped from y too. It seems this is not possible in transform.
What would be the correct way to achieve this?
Edit: I should add that the reason I want this to be a pipeline step (and not preprocessing) is that this would allow me to treat the window size as a hyperparameter that could be optimized.
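As an illustration of what "window size as a hyperparameter" could look like, here is a rough sketch that moves the windowing into a wrapper estimator instead of a transformer, so that y is available during fit. Everything here is an assumption for illustration: the `WindowedEstimator` class, the `unroll` stand-in, NaN as the marker for unlabeled rows, and `LogisticRegression` as the wrapped model. It is one possible pattern, not a confirmed answer.

```python
import numpy as np
from sklearn.base import BaseEstimator, clone
from sklearn.linear_model import LogisticRegression


def unroll(X, window):
    # Hypothetical stand-in for the question's unroll(): centered windows,
    # edges padded by repeating the first/last row, flattened into rows.
    X = np.asarray(X, dtype=float)
    half = window // 2
    padded = np.pad(X, ((half, half), (0, 0)), mode="edge")
    return np.stack([padded[i:i + window].ravel() for i in range(len(X))])


class WindowedEstimator(BaseEstimator):
    """Hypothetical wrapper: windows X, drops unlabeled rows only in fit."""

    def __init__(self, estimator, window=3):
        self.estimator = estimator
        self.window = window

    def fit(self, X, y):
        y = np.asarray(y, dtype=float)
        labeled = ~np.isnan(y)              # assumption: NaN = unlabeled
        X_windows = unroll(X, self.window)  # built from ALL rows
        # Fit a fresh copy of the wrapped model on the labeled rows only.
        self.estimator_ = clone(self.estimator).fit(
            X_windows[labeled], y[labeled]
        )
        return self

    def predict(self, X):
        # No rows are dropped at prediction time.
        return self.estimator_.predict(unroll(X, self.window))


# Toy data: 40 rows, every fourth row unlabeled.
X = np.column_stack([np.linspace(-1.0, 1.0, 40), np.zeros(40)])
y = (X[:, 0] > 0).astype(float)
y[::4] = np.nan

model = WindowedEstimator(LogisticRegression(), window=5).fit(X, y)
predictions = model.predict(X)  # one prediction per input row
```

Because `window` is a constructor argument, it is exposed through `get_params` / `set_params`, which is the mechanism sklearn's search tools rely on. How cross-validation splits and scoring should handle windows and NaN labels is left open in this sketch.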