Proper iteration over time series data for LSTM neural network

I’m using supervised learning with an LSTM network to predict forex prices. I’m using the deeplearning4j library, but I have doubts about several points of my implementation.

I turned off the mini-batch feature and derived a number of trading indicators from the forex data. The idea is to feed the network a random chunk of the data on every epoch and to ensure that the network state is cleared after each epoch.

To achieve this I created a dataset iterator which iterates over the time series data from index N to N + EPOCH_SIZE. At the start of every epoch a random starting index is generated between 0 and the indicator data length. The iterator then iterates from that starting index, and on every iteration it returns a DataSet containing a single input window.

Another option would be to return a single DataSet containing a sequence of inputs of length EPOCH_SIZE.

My concern is that, at the start of each epoch, state left over from the previous epoch interferes with the output of the current prediction. Do you have any ideas how to solve this problem?
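For what it's worth, in deeplearning4j the recurrent state of a MultiLayerNetwork can be cleared explicitly with rnnClearPreviousState(). The interference effect itself can be illustrated with a toy, plain-Java sketch (this is not DL4J code; the decay constant 0.9 and the class name are made up) where a simple exponential moving average plays the role of the LSTM's hidden state:

```java
// Toy illustration of stale recurrent state: if the "state" is not
// reset between epochs, the first outputs of a new chunk still depend
// on the previous chunk's data.
public class StateCarryOverDemo {
    private double state = 0.0;

    // One step of a stateful recurrence: output depends on all past inputs.
    double step(double input) {
        state = 0.9 * state + 0.1 * input;
        return state;
    }

    // Analogous to clearing the LSTM state before a new epoch.
    void reset() {
        state = 0.0;
    }

    public static void main(String[] args) {
        StateCarryOverDemo net = new StateCarryOverDemo();

        // "Epoch 1": feed a chunk of large values; the state drifts upward.
        for (int i = 0; i < 100; i++) net.step(100.0);

        // "Epoch 2" without reset: the first output is contaminated by epoch 1.
        double contaminated = net.step(1.0);

        // "Epoch 2" with a clean state: the output depends only on the new chunk.
        net.reset();
        double clean = net.step(1.0);

        System.out.println(contaminated > clean); // prints "true"
    }
}
```

The same reasoning suggests calling the state-clearing method at every epoch boundary, i.e. whenever the iterator jumps to a new random starting index.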

Neural net configuration

public static MultiLayerNetwork buildNetwork(int nIn, int nOut, int windowSize) {

        MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
                .seed(System.currentTimeMillis())
                .optimizationAlgo(OptimizationAlgorithm.STOCHASTIC_GRADIENT_DESCENT)
                .weightInit(WeightInit.XAVIER)
                .updater(Updater.RMSPROP)
                .miniBatch(false)
                .l2(25e-4)
                .list()
                .layer(0, new LSTM.Builder()
                        .nIn(nIn)
                        .nOut(256)
                        .activation(Activation.TANH)
                        .gateActivationFunction(Activation.HARDSIGMOID)
                        .dropOut(0.2)
                        .build())
                .layer(1, new LSTM.Builder()
                        .nIn(256)
                        .nOut(256)
                        .activation(Activation.TANH)
                        .gateActivationFunction(Activation.HARDSIGMOID)
                        .dropOut(0.2)
                        .build())
                .layer(2, new DenseLayer.Builder()
                        .nIn(256)
                        .nOut(32)
                        .activation(Activation.RELU)
                        .build())
                .layer(3, new RnnOutputLayer.Builder()
                        .nIn(32)
                        .nOut(nOut)
                        .activation(Activation.IDENTITY)
                        .lossFunction(LossFunctions.LossFunction.MSE)
                        .build())
                .backpropType(BackpropType.TruncatedBPTT)
                .tBPTTForwardLength(windowSize)
                .tBPTTBackwardLength(windowSize)
                .build();


        MultiLayerNetwork network = new MultiLayerNetwork(conf);
        network.init();
        return network;
    }

Dataset Iterator

    @Override
    public DataSet next() {

        INDArray observationArray = Nd4j.create(new int[]{1, this.featureSize, this.windowSize}, 'f');
        INDArray labelArray = Nd4j.create(new int[]{1, PREDICTION_VALUES_SIZE, this.windowSize}, 'f');

        int windowStartOffset = this.seriesIndex;
        int windowEndOffset = windowStartOffset + this.windowSize;

        for (int windowOffset = windowStartOffset; windowOffset < windowEndOffset; windowOffset++) {

            int windowIndex = windowOffset - windowStartOffset;

            for (int featureIndex = ZERO_INDEX; featureIndex < this.featureSize; featureIndex++) {

                observationArray.putScalar(
                        new int[]{ZERO_INDEX, featureIndex, windowIndex},
                        this.dataProvider.data(windowOffset, featureIndex)
                );
            }
            labelArray.putScalar(new int[]{ZERO_INDEX, ZERO_INDEX, windowIndex},
                    this.dataProvider.pip(windowOffset + this.predictionStep)
            );
        }
        seriesIndex++;
        return new DataSet(observationArray, labelArray);
    }
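The random per-epoch starting index described above can be sketched with plain-Java bookkeeping (a minimal sketch; the class name WindowCursor and its fields are hypothetical, not part of the actual iterator). The key detail is bounding the start so the whole chunk, the last window, and the prediction horizon all stay inside the series:

```java
import java.util.Random;

// Minimal sketch of the iterator bookkeeping: reset() draws a fresh
// random start so each epoch sees a different contiguous chunk, and
// hasNext() stops after epochSize steps.
class WindowCursor {
    private final Random rng = new Random();
    private final int seriesLength;
    private final int epochSize;
    private final int windowSize;
    private final int predictionStep;
    private int epochStart;
    private int seriesIndex;

    WindowCursor(int seriesLength, int epochSize, int windowSize, int predictionStep) {
        this.seriesLength = seriesLength;
        this.epochSize = epochSize;
        this.windowSize = windowSize;
        this.predictionStep = predictionStep;
        reset();
    }

    // Re-randomize the starting index; the upper bound keeps the last
    // window plus the label lookahead within the series.
    void reset() {
        int maxStart = seriesLength - epochSize - windowSize - predictionStep;
        epochStart = rng.nextInt(maxStart + 1);
        seriesIndex = epochStart;
    }

    boolean hasNext() {
        return seriesIndex < epochStart + epochSize;
    }

    // Returns the window start offset for the current step, then advances.
    int next() {
        return seriesIndex++;
    }
}
```

With this bound, reset() never produces a chunk whose labels (windowOffset + predictionStep) read past the end of the data, which would otherwise be an easy off-by-one to hit.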

Training

    public static final int EPOCHS = 500;
    public static final int EPOCH_SIZE = 300;
    public static final int WINDOW_SIZE = 20;
    public static final int PREDICTION_STEP = 1;

    public static void prepare(String network, String dataset) throws IOException {

        TradingDataProvider provider = new TradingDataProvider(CommonFileTools.loadSeries(dataset));

        TradingDataIterator dataIterator = new TradingDataIterator(provider, EPOCH_SIZE, WINDOW_SIZE, PREDICTION_STEP);
        MultiLayerNetwork net = LSTMNetworkFactory.buildNetwork(dataIterator.inputColumns(), dataIterator.totalOutcomes(), WINDOW_SIZE);

        long start;
        for (int i = 0; i < EPOCHS; i++) {
            start = System.currentTimeMillis();
            net.fit(dataIterator);

            logger.info("Epoch: {}, Score: {}, Duration: {} ms", i + 1, net.score(),
                    System.currentTimeMillis() - start
            );
        }

        File locationToSave = new File(network);
        ModelSerializer.writeModel(net, locationToSave, true);
        logger.info("Model saved");
        System.exit(0);
    }

