What should I do with the NaN values on this stock quote data?

I concatenated 3 stock quote data-frames all with date-time indexes. However, they differ in starting dates so the resulting data-frame contains NaN values for the stock quotes with more recent starting dates.

Should I just drop the rows with NaN and start the new data frame with the row where all have values or is there a way to fill them up? I'm planning on using the data to train a neural network that predicts future stock quotes.

Topic data-wrangling data time-series

Category Data Science


Filling in the NaN values from the quotes that had already started is possible with time series models, but it won't add any new information for the model to learn from. At best it will obscure what the model learns from that period.

Then there is the more general question of performance: does the model perform better with or without that period of reconstructed data? Only you can answer that by trying it. But I am not sure that is the question you want answered. The way to go, in my opinion, is to just drop that period and see whether the performance fits your needs. If it doesn't, then you should consider ways to add more data.

In this case I'd suggest either just dropping the NaN rows if you want to work with neural nets, or working with a method that accepts NaN as input (XGBoost, for example).
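As a rough illustration of the "just drop it" option, here is a minimal pandas sketch. The tickers (`AAA`, `BBB`, `CCC`), dates, and prices are made up for the example, and it assumes the three series were combined column-wise with `pd.concat(..., axis=1)`; adapt it to however your frames were actually concatenated.

```python
import numpy as np
import pandas as pd

# Hypothetical quotes for three tickers that start on different dates.
idx_a = pd.date_range("2020-01-01", periods=6, freq="D")
idx_b = pd.date_range("2020-01-03", periods=4, freq="D")
idx_c = pd.date_range("2020-01-02", periods=5, freq="D")

a = pd.Series(np.arange(6, dtype=float) + 100.0, index=idx_a, name="AAA")
b = pd.Series(np.arange(4, dtype=float) + 50.0, index=idx_b, name="BBB")
c = pd.Series(np.arange(5, dtype=float) + 10.0, index=idx_c, name="CCC")

# Column-wise concat aligns on the date index; the later starters get NaN
# for the dates before their first quote.
quotes = pd.concat([a, b, c], axis=1)

# Option 1: cut everything before the date on which the last ticker starts.
start = max(quotes[col].first_valid_index() for col in quotes.columns)
trimmed = quotes.loc[start:]

# Option 2: drop every row that has at least one NaN.
dropped = quotes.dropna()

print(trimmed)
```

The two options give the same result here because the only missing values are at the start. If your data also has gaps in the middle, `quotes.loc[start:]` keeps those rows (still containing NaN), while `dropna()` removes them, so pick whichever matches what you want the network to see.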
