RNN to model DNA sequence classification

I have a DNA sequence dataset where each sequence is mapped to a certain class, e.g.

TCAGCCGAGAGCTCATCGATCGTACGT 2
ATGCAGTGCATCGATCGATCGTAGAAC 3

where the number after the sequence specifies the type of protein the sequence belongs to. So my question is: can I use k-mers and one-hot encoding to classify these sequences with a biLSTM? Or is this not a workable approach? I would appreciate your feedback and suggestions on this task, as I am new to deep learning. Thank you.
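For concreteness, a minimal sketch of the per-base one-hot + biLSTM variant in Keras (the unit count and number of classes are assumptions; a k-mer variant would instead tokenize overlapping k-mers and learn an Embedding over them):

import numpy as np
import tensorflow as tf

BASES = 'ACGT'

def one_hot(seq):
    # each base becomes a 4-dimensional one-hot vector
    return np.eye(len(BASES))[[BASES.index(b) for b in seq]]

x = np.stack([one_hot('TCAGCCGAGAGCTCATCGATCGTACGT'),
              one_hot('ATGCAGTGCATCGATCGATCGTAGAAC')])   # shape (2, 27, 4)
y = np.array([2, 3])                                     # class labels

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(27, 4)),
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64)),  # assumed 64 units
    tf.keras.layers.Dense(4, activation='softmax'),           # assumed 4 classes
])
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam')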
Category: Data Science

Impact of varying unit size in an ensemble GRU model

I am using an ensemble GRU for my project and keeping different cell sizes for the different models. For example, the first GRU model has 16 units, the second 8, and the third 4. The model runs well, but I don't see any difference in the results between keeping the unit sizes the same or varying them. Can anyone explain the impact of varying unit size in an ensemble GRU? It would be great if the answer is given with …
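For concreteness, a minimal sketch of the described ensemble (the input shape and output head are assumptions):

import tensorflow as tf

def make_member(units):
    # one ensemble member; the (steps, features) input shape is assumed
    m = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(None, 8)),
        tf.keras.layers.GRU(units),
        tf.keras.layers.Dense(1),
    ])
    m.compile(loss='mse', optimizer='adam')
    return m

# three members with different cell sizes, as described
members = [make_member(u) for u in (16, 8, 4)]
# after training each member, average their predictions:
# y_pred = sum(m.predict(x) for m in members) / len(members)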
Topic: gru lstm
Category: Data Science

How to add a Decoder & Attention Layer to a Bidirectional Encoder with TensorFlow 2.0

I am a beginner in machine learning and I'm trying to create a spelling correction model that spell-checks a small vocabulary (approximately 1,000 phrases). Currently, I am referring to the TensorFlow 2.0 tutorials for 1. NMT with Attention, and 2. Text Generation. I have completed up to the encoding layer, but currently I am having some issues matching up the shapes of the following layers (decoder and attention) with the previous one (encoder). The encoder in the …
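For orientation, a hedged sketch of the shape bookkeeping, in the style of the NMT-with-attention tutorial; the attention class is standard additive (Bahdanau) attention, and the names and sizes are assumptions:

import tensorflow as tf

class BahdanauAttention(tf.keras.layers.Layer):
    # additive attention, as used in the TF2 NMT tutorial
    def __init__(self, units):
        super().__init__()
        self.W1 = tf.keras.layers.Dense(units)
        self.W2 = tf.keras.layers.Dense(units)
        self.V = tf.keras.layers.Dense(1)

    def call(self, query, values):
        # query: decoder state (batch, dec_units); values: encoder output (batch, src_len, enc_units)
        score = self.V(tf.nn.tanh(self.W1(tf.expand_dims(query, 1)) + self.W2(values)))
        weights = tf.nn.softmax(score, axis=1)
        context = tf.reduce_sum(weights * values, axis=1)   # (batch, enc_units)
        return context, weights

# With a bidirectional GRU encoder, the forward and backward states must be merged
# so the decoder's initial state has a single known width, e.g.:
#   enc_out, fwd, bwd = tf.keras.layers.Bidirectional(
#       tf.keras.layers.GRU(enc_units, return_sequences=True, return_state=True))(src_emb)
#   dec_init = tf.concat([fwd, bwd], axis=-1)   # (batch, 2 * enc_units)
# so the decoder GRU needs 2 * enc_units units (or a Dense projection first).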
Category: Data Science

Why is the variance of my model predictions much smaller than the training data?

I trained a GRU model on some data and then created a bunch of predictions on a test set. The predictions are really bad, as indicated by a near-zero R2 score. I notice that the variance of the model predictions is much smaller than that of the actual training data, i.e., it seems like the model has collapsed to the mean. But why is this? I made sure to stop training and choose hyperparameters such that the model was not overfitting, so why are …
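One common explanation is regression to the mean: under a squared-error loss, a model that finds little predictive signal in its inputs does best by predicting close to the target mean, so its outputs have much smaller variance than the data. A tiny illustration:

import numpy as np

# with no usable signal, the MSE-minimizing *constant* prediction
# is the mean of the targets, which has zero variance
rng = np.random.default_rng(0)
y = rng.normal(size=1000)
candidates = np.linspace(-2, 2, 401)
mse = ((y[None, :] - candidates[:, None]) ** 2).mean(axis=1)
print(candidates[mse.argmin()], y.mean())   # the best constant is ~ the mean of y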
Category: Data Science

How to best choose a model and hyperparameters for an unbalanced small dataset

I'm working on a deep learning problem where I was told to use an LSTM or GRU to predict whether a patient will die within the next 4 hours. In the dataset, each patient has several measurements taken at different times. Please refer to the image attached to this post. The real measurements were discretized into values like very low (-3, -2), low (-1), no measurement/normal (0), and so on. The dataset is very unbalanced. I only have 12% …
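One common first step for this kind of imbalance is class weighting; a hedged sketch (x_train, y_train, and n_features are assumed, with y_train a binary died/survived label):

import numpy as np
import tensorflow as tf

# weight the rare positive class (~12%) more heavily in the loss
neg, pos = np.bincount(y_train.astype(int))
class_weight = {0: len(y_train) / (2.0 * neg),
                1: len(y_train) / (2.0 * pos)}

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(None, n_features)),   # (time steps, measurements)
    tf.keras.layers.GRU(32),
    tf.keras.layers.Dense(1, activation='sigmoid'),
])
model.compile(loss='binary_crossentropy', optimizer='adam',
              metrics=[tf.keras.metrics.AUC(name='auc')])  # accuracy misleads at 12% prevalence
model.fit(x_train, y_train, class_weight=class_weight, epochs=20)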
Category: Data Science

When training on a paragraph containing a large number of words, does a GRU end up predicting repeated outputs?

Is it correct that if we train GRUs on paragraphs containing a large number of words (say 10,000), the GRU will end up predicting repeated outputs, or in the worst case, the predicted output will not have much variance? Apart from the points mentioned above, what other problems might a GRU suffer from when training on documents containing a large number of words?
Topic: gru
Category: Data Science

Keras RNN (batch_size)

I created an RNN model for text classification with an LSTM layer, but when I pass batch_size to the fit method, my model seems to train on the whole dataset at once instead of in mini-batches of that size. This also happens when I use GRU and Bidirectional layers instead of LSTM. What could be wrong?

def create_rnn_lstm():
    input_layer = layers.Input((70, ))
    embedding_layer = layers.Embedding(len(word_index) + 1, 300, weights=[embedding_matrix], trainable=False)(input_layer)
    embedding_layer = layers.SpatialDropout1D(0.3)(embedding_layer)
    lstm_layer = layers.LSTM(100)(embedding_layer)
    output_layer1 = layers.Dense(70, activation="relu")(lstm_layer)
    output_layer1 = layers.Dropout(0.25)(output_layer1)
    output_layer2 …
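For what it's worth, a minimal check of the batching, assuming TF2-style Keras:

# model, x_train, y_train as defined in the question
history = model.fit(x_train, y_train, batch_size=64, epochs=1)
# TF2 Keras: the progress bar counts *steps*, i.e. ceil(len(x_train) / 64)
# older standalone Keras: the bar counted *samples* (e.g. 6838/6838), which can
# look like whole-dataset training even though mini-batches of 64 are being used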
Topic: gru lstm keras rnn
Category: Data Science

GRU and LSTM do not "take risks" when predicting

I tested LSTM and GRU models to predict the exchange rate between currencies. I do not use the raw price but the delta from the previous day, so the data is stationary around zero. My problem is that my model always predicts values really close to zero, as if it were minimizing risk and did not want to guess wrong. It may be because it underfits, but I wanted to be sure that it is not a common issue that I …
Category: Data Science

Converting a speech recognition model from CNNs to GRUs

I am trying to convert the simple audio recognition example from TensorFlow to use GRUs instead of CNNs. The idea is to classify an audio clip into a set of 8 labels: ['go', 'down', 'up', 'stop', 'yes', 'left', 'right', 'no']. The original code builds the model as follows:

norm_layer = preprocessing.Normalization()
norm_layer.adapt(spectrogram_ds.map(lambda x, _: x))
model = models.Sequential([
    layers.Input(shape=input_shape),
    preprocessing.Resizing(32, 32),
    norm_layer,
    layers.Conv2D(32, 3, activation='relu'),
    layers.Conv2D(64, 3, activation='relu'),
    layers.MaxPooling2D(),
    layers.Dropout(0.25),
    layers.Flatten(),
    layers.Dense(128, activation='relu'),
    layers.Dropout(0.5),
    layers.Dense(num_labels),
])

The input shape is …
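A hedged sketch of one GRU-based alternative: treat the spectrogram's time axis as the sequence and its frequency bins as per-step features (input_shape and num_labels as in the question, assumed to be (time, freq, 1); layer widths are assumptions):

from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=input_shape),
    norm_layer,                                        # reuse the adapted normalizer
    layers.Reshape((input_shape[0], input_shape[1])),  # (time, freq, 1) -> (time, freq)
    layers.GRU(128, return_sequences=True),
    layers.GRU(64),
    layers.Dropout(0.5),
    layers.Dense(num_labels),
])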
Topic: gru tensorflow
Category: Data Science

LSTM / GRU weights during test time

I am working on a historical time series dataset using RNN, LSTM, and GRU models, and I haven't found an answer to whether, at test time, the hidden state h (or h and c) should be zeros for each batch. If it should not be zeros, what should it be? The last updated state from training? Thanks
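A PyTorch-style sketch of the usual convention (the learned weights are frozen after training; the *state* is re-initialized per independent sequence; model and test_loader are assumed):

import torch

model.eval()                       # inference mode; weights stay as trained
with torch.no_grad():
    for x_batch, _ in test_loader:
        # passing no initial state makes nn.LSTM / nn.GRU start each batch
        # from zeros, the same initialization typically used during training
        y_pred, state = model(x_batch)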
Category: Data Science

Custom GRU With 3D Spatial Convolution Layer In Keras

I am trying to implement the custom GRU model described in the 3D-R2N2 paper (the GRU pipeline is shown in a figure there). The original implementation is Theano-based, and I am trying to port the model to TF2/Keras. I have tried to create a custom GRU cell from the Keras recurrent layer. The input to the GRU model has shape (batch size, sequence, 1024) and the output has shape (batch size, 4, 4, 4, 128). I have issues implementing the convolution layer present in …
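For reference, a heavily hedged sketch of one way to express such a cell in TF2/Keras: the gate wiring follows the standard GRU equations with the hidden-to-hidden products replaced by 3-D convolutions over the voxel grid, i.e. the general mechanism rather than the paper's exact formulation; all names and sizes are assumptions.

import tensorflow as tf

class ConvGRU3DCell(tf.keras.layers.Layer):
    # hypothetical minimal 3-D convolutional GRU cell:
    # 1024-d input per step, hidden state of shape (4, 4, 4, 128)
    def __init__(self, grid=4, filters=128, **kwargs):
        super().__init__(**kwargs)
        self.grid, self.filters = grid, filters
        self.state_size = tf.TensorShape([grid, grid, grid, filters])
        self.output_size = tf.TensorShape([grid, grid, grid, filters])

    def build(self, input_shape):
        units = self.grid ** 3 * self.filters
        # input-to-hidden: dense projections of the flat feature, one per gate
        self.w_z = tf.keras.layers.Dense(units)
        self.w_r = tf.keras.layers.Dense(units)
        self.w_h = tf.keras.layers.Dense(units)
        # hidden-to-hidden: 3x3x3 convolutions over the voxel grid, one per gate
        self.u_z = tf.keras.layers.Conv3D(self.filters, 3, padding='same')
        self.u_r = tf.keras.layers.Conv3D(self.filters, 3, padding='same')
        self.u_h = tf.keras.layers.Conv3D(self.filters, 3, padding='same')

    def call(self, inputs, states):
        h = states[0]
        grid_shape = [-1, self.grid, self.grid, self.grid, self.filters]
        x_z = tf.reshape(self.w_z(inputs), grid_shape)
        x_r = tf.reshape(self.w_r(inputs), grid_shape)
        x_h = tf.reshape(self.w_h(inputs), grid_shape)
        z = tf.sigmoid(x_z + self.u_z(h))                          # update gate
        r = tf.sigmoid(x_r + self.u_r(h))                          # reset gate
        h_new = (1 - z) * h + z * tf.tanh(x_h + self.u_h(r * h))
        return h_new, [h_new]

# (batch, sequence, 1024) in, (batch, 4, 4, 4, 128) out
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(None, 1024)),
    tf.keras.layers.RNN(ConvGRU3DCell()),
])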
Category: Data Science

When to use GRU over LSTM?

The key difference between a GRU and an LSTM is that a GRU has two gates (reset and update) whereas an LSTM has three (input, output, and forget). Why do we use a GRU when we clearly have more control over the network with an LSTM (since we have three gates)? In which scenarios is a GRU preferred over an LSTM?
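One practical consequence of the missing gate is a smaller parameter count, e.g.:

import tensorflow as tf

# arbitrary assumed dimensions, just to compare sizes
inp = tf.keras.layers.Input(shape=(None, 64))
gru_model = tf.keras.Model(inp, tf.keras.layers.GRU(128)(inp))
lstm_model = tf.keras.Model(inp, tf.keras.layers.LSTM(128)(inp))

print(gru_model.count_params())   # 3 weight blocks (2 gates + candidate state)
print(lstm_model.count_params())  # 4 weight blocks (3 gates + cell input), ~1/3 more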
Category: Data Science

LSTM / GRU prediction with hidden state?

I am trying to predict a value from a time series of 24 periods (i.e., predicting the 25th period). While training, I have a validation set with which I babysit the training (RMSE), and each epoch I evaluate on the validation set. I get errors such as:

Train RMSE: 3.02
Validation RMSE: 5.65
Validation r_squared: 0.75

However, when I evaluate on the test set (the same size as the validation set) with:

model.eval()
x_test = x_test.cuda()
y_test_pred, hidden = model(x_test)

I receive very bad results:

Test RMSE: 8.18 …
Category: Data Science

Using GRU with FeedForward layers in Python

I'm trying to reproduce the code in this paper for a multi-label problem (11 classes), which uses 1. an embedding layer, 2. a GRU, 3. two feed-forward layers with the ReLU activation function, and 4. a sigmoid unit. I've tried to run the code, but it shows the following error:

ValueError: Error when checking target: expected dense_5 to have 3 dimensions, but got array with shape (6838, 11)

Edit: The error is fixed. I changed "return_sequences" to False, and …
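For reference, a minimal sketch of the described stack (vocabulary size, embedding dimension, and unit counts are assumptions):

from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Embedding(input_dim=20000, output_dim=128),  # 1. embedding layer
    layers.GRU(64, return_sequences=False),             # 2. GRU; False gives a 2-D output
    layers.Dense(64, activation='relu'),                # 3. two feed-forward ReLU layers
    layers.Dense(64, activation='relu'),
    layers.Dense(11, activation='sigmoid'),             # 4. one sigmoid unit per label
])
model.compile(loss='binary_crossentropy', optimizer='adam')

With return_sequences=False the GRU emits (batch, units) instead of (batch, steps, units), which is why the final dense layer then matches a (6838, 11) target.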
Category: Data Science

TensorFlow / Keras: What is stateful = True in LSTM layers?

Could you elaborate on this argument? I found the brief explanation in the docs unsatisfying:

stateful: Boolean (default False). If True, the last state for each sample at index i in a batch will be used as initial state for the sample of index i in the following batch.

Also, when should stateful = True be chosen? What are practical cases of its use?
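A minimal sketch of the mechanics (x and y are assumed; for state carry-over to make sense, sample i of batch k+1 must be the continuation of sample i of batch k):

import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(10, 8), batch_size=32),  # stateful layers need a fixed batch size
    tf.keras.layers.LSTM(64, stateful=True),
    tf.keras.layers.Dense(1),
])
model.compile(loss='mse', optimizer='adam')

for epoch in range(5):
    # shuffle=False preserves the batch ordering the carried state relies on
    model.fit(x, y, batch_size=32, epochs=1, shuffle=False)
    model.reset_states()   # clear the carried state between passes over the data

The typical use case is a sequence too long to process in one window, split into consecutive chunks fed in successive batches (e.g. long time series or character-level language models).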
Category: Data Science

Wiggle in the initial part of an LSTM prediction

I am working on using LSTMs and GRUs to make time series predictions. For the most part the predictions are pretty good. However, there seems to be a wiggle (an initial up-then-down) before the prediction settles out, similar to the left side of this figure from another question. In my case, it is also causing a slight offset. Does anyone have any idea why this might be the case? Below are the shapes of the training and test sets, as well …
Category: Data Science

GRU learns small-scale features, but misses large scales

Playing around with weather data, I have set up a simple RNN with one layer of GRUs. It is trained to predict the temperature of the next day, given weather data from the last 5 days at 1-hour intervals. What I find peculiar is that after training for several epochs, the result has a lot of the small-scale features of the data but lacks the large-scale structure. Frequently, there seems to be just an offset …
Topic: gru rnn
Category: Data Science
