I have a group of non-zero sequences with different lengths and I am using a Keras LSTM to model these sequences. I use the Keras Tokenizer to tokenize (tokens start from 1). In order to make the sequences the same length, I use padding. An example of padding:

    # [0,0,0,0,0,10,3]
    # [0,0,0,0,10,3,4]
    # [0,0,0,10,3,4,5]
    # [10,3,4,5,6,9,8]

In order to evaluate whether the model is able to generalize, I use a validation set with a 70/30 split. At the end of each epoch …
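In case it helps, a minimal sketch of the tokenization and pre-padding described above (the texts and variable names are placeholders, not my actual data):

    from tensorflow.keras.preprocessing.text import Tokenizer
    from tensorflow.keras.preprocessing.sequence import pad_sequences

    texts = ["the cat sat", "the cat sat down", "cats sit"]   # placeholder corpus

    tokenizer = Tokenizer()              # token indices start from 1, so 0 stays free for padding
    tokenizer.fit_on_texts(texts)
    sequences = tokenizer.texts_to_sequences(texts)

    # pre-padding with 0 so all sequences share the length of the longest one
    padded = pad_sequences(sequences, padding='pre', value=0)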
I have a DNA sequence dataset in which each sequence is mapped to a certain class, e.g.

    TCAGCCGAGAGCTCATCGATCGTACGT 2
    ATGCAGTGCATCGATCGATCGTAGAAC 3

where the number after the sequence specifies the type of protein the sequence belongs to. So my question is: can I use k-mers and one-hot encoding to classify these sequences with a biLSTM, or is this not a feasible approach? I would appreciate your feedback and suggestions on this task, as I am new to Deep Learning. Thank you.
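To clarify what I mean, here is a minimal sketch of the k-mer plus one-hot preprocessing I have in mind (k = 3 and the helper names are just placeholders):

    import numpy as np

    def kmers(seq, k=3):
        """Split a DNA string into overlapping k-mers."""
        return [seq[i:i + k] for i in range(len(seq) - k + 1)]

    # build a vocabulary of k-mers over the dataset (placeholder sequences)
    sequences = ["TCAGCCGAGAGCTCATCGATCGTACGT", "ATGCAGTGCATCGATCGATCGTAGAAC"]
    vocab = sorted({km for s in sequences for km in kmers(s)})
    index = {km: i for i, km in enumerate(vocab)}

    def one_hot(seq, k=3):
        """One-hot encode each k-mer of a sequence: shape (n_kmers, vocab_size)."""
        mat = np.zeros((len(seq) - k + 1, len(vocab)), dtype=np.float32)
        for pos, km in enumerate(kmers(seq, k)):
            mat[pos, index[km]] = 1.0
        return mat

    x = one_hot(sequences[0])   # a batch of these would go into a Bidirectional(LSTM(...))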
I see two different ways of applying attention in seq2seq: (a) the context vector (the weighted sum of encoder hidden states) is fed into the output softmax, as shown in the diagram below. The diagram is from here. (b) the context vector is fed into the decoder input, as shown in the diagram below. The diagram is from here. What are the pros and cons of the two approaches? Is there any paper comparing the two?
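To make the distinction concrete, here is a rough single-step sketch of the two wirings as I understand them (dummy eager tensors, not code from either source):

    import tensorflow as tf
    from tensorflow.keras.layers import Dense, LSTMCell

    vocab_size, units, batch = 1000, 256, 2

    # dummy single-step tensors standing in for real values
    context = tf.random.normal((batch, units))         # attention context c_t
    decoder_state = tf.random.normal((batch, units))   # decoder hidden state h_t
    cell_state = tf.random.normal((batch, units))
    embedded_token = tf.random.normal((batch, units))  # embedded previous target token x_t

    # (a) context combined with the decoder state just before the output softmax
    probs_a = Dense(vocab_size, activation='softmax')(
        tf.concat([decoder_state, context], axis=-1))

    # (b) context concatenated with the embedded input token and fed into the decoder cell
    cell = LSTMCell(units)
    step_out, _ = cell(tf.concat([embedded_token, context], axis=-1),
                       states=[decoder_state, cell_state])
    probs_b = Dense(vocab_size, activation='softmax')(step_out)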
I have one single, very long time series. I want to train an LSTM to distinguish between two behaviours (A or B) at every timestep (sequence-to-sequence). Because the time series is very long, I plan to extract shorter, partially-overlapping subsequences and use each of them as one training input for the LSTM. In my train/validation/test split, do I have to use older subsequences for training and newer for validation and test? Or can I treat them as if they were …
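For context, this is roughly how I extract the overlapping subsequences and what a purely chronological split would look like (window and stride values are arbitrary):

    import numpy as np

    series = np.random.rand(100_000)            # stand-in for the long time series
    labels = np.random.randint(0, 2, 100_000)   # behaviour A/B at every timestep

    window, stride = 200, 50                    # partially overlapping subsequences
    starts = range(0, len(series) - window + 1, stride)
    X = np.stack([series[s:s + window] for s in starts])[..., None]  # (n, window, 1)
    Y = np.stack([labels[s:s + window] for s in starts])             # (n, window)

    # chronological split: older windows for training, newer ones for val/test
    n = len(X)
    train, val = int(0.7 * n), int(0.85 * n)
    X_train, Y_train = X[:train], Y[:train]
    X_val, Y_val = X[train:val], Y[train:val]
    X_test, Y_test = X[val:], Y[val:]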
I'm trying to build an encoder-decoder network in Keras to generate a sentence of a particular style. As my problem is unsupervised, i.e. I don't have ground truths for the generated sentences, I use a classifier to help during training. I pass the decoder's output into the classifier to tell me what style the decoded sentence is. The decoder outputs a softmax distribution, which I was intending to feed straight into the classifier, but I realised that it has …
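One idea I'm considering instead of taking a hard argmax (just a sketch with placeholder shapes; I don't know if it is sound) is to multiply the softmax distribution with the classifier's embedding matrix so that gradients can still flow:

    import tensorflow as tf

    vocab_size, emb_dim, seq_len, batch = 5000, 128, 20, 4

    # stand-in for the decoder output: a softmax distribution over the vocab per timestep
    decoder_probs = tf.nn.softmax(tf.random.normal((batch, seq_len, vocab_size)), axis=-1)

    # embedding matrix of the classifier (would come from the trained classifier)
    emb_matrix = tf.Variable(tf.random.normal((vocab_size, emb_dim)))

    # "soft" embeddings: the expected embedding under the decoder's distribution,
    # instead of embedding a hard argmax token (which would block the gradient)
    soft_embeddings = tf.einsum('btv,ve->bte', decoder_probs, emb_matrix)  # (batch, seq_len, emb_dim)
    # these soft embeddings could then be fed into the classifier's layers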
I am a machine learning newbie and I am working on a project where I'm given a sequence of integers, all of which are in the range 0 to 70. My goal is to predict the next integer in the sequence given the previous 5 integers in the same sequence. There isn't much more information about the sequence of integers itself (for example, how the sequence was obtained, etc.). The following are the things I tried. The first thing that …
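For reference, the kind of setup I have in mind treats this as a 71-class classification problem over a window of the 5 previous integers (a sketch; the data and hyper-parameters here are placeholders):

    import numpy as np
    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import Embedding, LSTM, Dense

    seq = np.random.randint(0, 71, size=10_000)   # stand-in for the integer sequence

    # build (previous 5 integers) -> (next integer) training pairs
    X = np.stack([seq[i:i + 5] for i in range(len(seq) - 5)])
    y = seq[5:]

    model = Sequential([
        Embedding(input_dim=71, output_dim=32, input_length=5),
        LSTM(64),
        Dense(71, activation='softmax'),   # one class per possible integer 0..70
    ])
    model.compile(loss='sparse_categorical_crossentropy', optimizer='adam',
                  metrics=['accuracy'])
    model.fit(X, y, epochs=5, batch_size=64)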
I'm working on an NMT model in which the input and target sentences are from the same language (but the grammar differs). I'm planning to pre-train and use BERT, since I'm working with a small dataset and a low-resource language. So is it possible to feed BERT into the seq2seq encoder/decoder?
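Concretely, I was wondering whether something along the lines of Hugging Face's EncoderDecoderModel is the way to do this (a sketch assuming the transformers library and a generic multilingual BERT checkpoint, not my actual low-resource model):

    from transformers import BertTokenizer, EncoderDecoderModel

    # tie a pre-trained BERT encoder to a BERT-initialised decoder (cross-attention is added)
    model = EncoderDecoderModel.from_encoder_decoder_pretrained(
        "bert-base-multilingual-cased", "bert-base-multilingual-cased")
    tokenizer = BertTokenizer.from_pretrained("bert-base-multilingual-cased")

    model.config.decoder_start_token_id = tokenizer.cls_token_id
    model.config.pad_token_id = tokenizer.pad_token_id

    src = tokenizer("source sentence goes here", return_tensors="pt")
    tgt = tokenizer("target sentence goes here", return_tensors="pt")

    # fine-tune with the usual seq2seq cross-entropy loss
    outputs = model(input_ids=src.input_ids,
                    attention_mask=src.attention_mask,
                    labels=tgt.input_ids)
    loss = outputs.loss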
I have this problem scenario: given a set of tokens, string them (or a subset of them) together using stop words into a sequence. I am clear that I can have potentially infinite pre-training data for this problem. For example, given the set of tokens {cat, jump, mouse}, possible outputs might be: a. the cat jumped on a mouse, b. the cat and the mouse jumped, c. cats jump, and so on... I am not sure if …
Based on this blog entry, I have written a sequence to sequence deep learning model in Keras:

    model = Sequential()
    model.add(LSTM(hidden_nodes, input_shape=(n_timesteps, n_features)))
    model.add(RepeatVector(n_timesteps))
    model.add(LSTM(hidden_nodes, return_sequences=True))
    model.add(TimeDistributed(Dense(n_features, activation='softmax')))
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    model.fit(X_train, Y_train, epochs=30, batch_size=32)

It works reasonably well, but I intend to improve it by applying an attention mechanism. The aforementioned blog post includes a variation of the architecture with attention by relying on custom attention code, but it doesn't work with my present TensorFlow/Keras versions, and anyway, to my …
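In case it clarifies what I'm after, here is a sketch of how I imagine wiring the stock tf.keras.layers.Attention into the same encoder-decoder shape with the functional API (this is my attempt, not the blog's code, and I'm not sure the query/value choice is right):

    from tensorflow.keras.layers import (Input, LSTM, RepeatVector, Attention,
                                         Concatenate, TimeDistributed, Dense)
    from tensorflow.keras.models import Model

    n_timesteps, n_features, hidden_nodes = 10, 20, 64   # placeholders

    inputs = Input(shape=(n_timesteps, n_features))
    # keep the full encoder sequence so the decoder can attend over it
    encoder_seq, state_h, state_c = LSTM(hidden_nodes, return_sequences=True,
                                         return_state=True)(inputs)
    decoder_in = RepeatVector(n_timesteps)(state_h)
    decoder_seq = LSTM(hidden_nodes, return_sequences=True)(
        decoder_in, initial_state=[state_h, state_c])

    # dot-product attention: decoder states as queries, encoder states as values/keys
    context = Attention()([decoder_seq, encoder_seq])
    merged = Concatenate()([decoder_seq, context])
    outputs = TimeDistributed(Dense(n_features, activation='softmax'))(merged)

    model = Model(inputs, outputs)
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])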
I have several sequences of univariate real-valued time-series data. The sequences are of different lengths, and right now I cannot batch them and feed them to a network. What is the correct procedure to pad these sequences? Is it even possible in this case, since I can't use any number as a special symbol?

UPDATE 1: I'm working with arbitrary univariate time-series data (not related to one specific domain, unbounded range). To give an example of one such series, consider …
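One option I'm considering (a sketch, with an arbitrary sentinel value) is padding with a constant and masking it out, though I'm unsure whether that is sound when any real number can occur in the data:

    import numpy as np
    from tensorflow.keras.preprocessing.sequence import pad_sequences
    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import Masking, LSTM, Dense

    # stand-in for variable-length univariate series
    series = [np.random.randn(n) for n in (50, 80, 120)]

    PAD = 0.0   # sentinel; risky because 0.0 can also be a genuine observation
    X = pad_sequences(series, padding='post', dtype='float32', value=PAD)
    X = X[..., None]   # (batch, max_len, 1)

    model = Sequential([
        Masking(mask_value=PAD, input_shape=(X.shape[1], 1)),  # padded steps are skipped
        LSTM(32),
        Dense(1),
    ])
    model.compile(loss='mse', optimizer='adam')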
I'm currently working on an extractive summary model based on Facebook's BART model. Consistent absolute-length output would be highly desirable. The problem is that the input length may vary wildly. That is to say, when creating the training data, the instructions look like this: take the input text (a news article) and start (recursively) deleting examples, excess details, unnecessary background information, quotes, etc. Once your summary has fewer than 90 words, stop deleting. Fix up the text format to match …
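At inference time, the only knob I know of is generate()'s min/max length arguments (a sketch using the stock facebook/bart-large-cnn checkpoint rather than my fine-tuned model), but these count tokens rather than words:

    from transformers import BartForConditionalGeneration, BartTokenizer

    tokenizer = BartTokenizer.from_pretrained("facebook/bart-large-cnn")
    model = BartForConditionalGeneration.from_pretrained("facebook/bart-large-cnn")

    article = "..."   # the news article text

    inputs = tokenizer(article, truncation=True, max_length=1024, return_tensors="pt")
    # these bounds are in tokens, not words, so ~90 words needs some slack
    summary_ids = model.generate(inputs.input_ids,
                                 min_length=80, max_length=120,
                                 num_beams=4, early_stopping=True)
    print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))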
I'm implementing a sequence-to-sequence model with an RNN-VAE architecture, and I use an attention mechanism. I have a problem in the decoder part. I'm struggling with this error:

    IndexError: list index out of range

when I run this code:

    decoder_inputs = Input(shape=(len_target,))
    decoder_emb = Embedding(input_dim=vocab_out_size, output_dim=embedding_dim)
    decoder_lstm = LSTM(units=units, return_sequences=True, return_state=True)
    decoder_lstm_out, _, _ = decoder_lstm(decoder_emb(decoder_inputs), initial_state=encoder_states)

    print("enc_outputs", encoder_outputs.shape)        # ==> (?, 256)
    print("decoder_lstm_out", decoder_lstm_out.shape)  # ==> (?, 12, 256)
    print("zzzzzz", z.shape)                           # ==> (?, 256)

    attn_layer = AttentionLayer(name='attention_layer')
    attn_out, attn_states = attn_layer([z, z], decoder_lstm_out)

The error is …
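For reference, here is a tiny standalone example of the input shapes that the built-in tf.keras.layers.Attention accepts without error (both arguments 3-D); I'm wondering whether passing the 2-D z twice is what trips up the custom AttentionLayer:

    import tensorflow as tf
    from tensorflow.keras.layers import Attention

    batch, enc_len, dec_len, units = 2, 15, 12, 256

    encoder_seq = tf.random.normal((batch, enc_len, units))   # 3-D: (batch, timesteps, units)
    decoder_seq = tf.random.normal((batch, dec_len, units))   # 3-D as well

    # query = decoder sequence, value/key = encoder sequence
    context = Attention()([decoder_seq, encoder_seq])
    print(context.shape)   # (2, 12, 256)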
I have a problem statement where I need to find all the tasks that the server has to perform given a complex task. For example, in a 3D modeling scenario, if the model is queried with a complex task such as "rotate", then the response should be something like:

    Select the object
    Rotate the object

Can we make this model learn from data that is manually prepared, and then tune the model such that it can predict more complex tasks?
I was reading the paper neural_approach_conversational_ai.pdf, and in the section Seq2Seq for Text Generation there is a formula that I feel is a bit wrong [1]: https://i.stack.imgur.com/sX0it.png Can someone help me confirm this formula?
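For reference, the factorization I would have expected in that section (which may or may not be exactly what the screenshot shows) is the standard conditional likelihood of the target sequence given the source:

    P(Y \mid X) = \prod_{t=1}^{T} P(y_t \mid y_1, \ldots, y_{t-1}, X),
    \qquad
    \log P(Y \mid X) = \sum_{t=1}^{T} \log P(y_t \mid y_{<t}, X)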
I am attempting to use a Seq2Seq model to make forecasts of factory production data using an Encoder-Decoder model augmented with Attention. I have become a little stuck, as the output of the model seems to be constant and has the same sequence length as the input, whereas I would like to be able to specify that, say, I want to forecast 3 (or any number of) months into the future. Here are 2 diagrams of …
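To make the question concrete, here is a stripped-down sketch (without the attention part) of the only way I know to fix the decoder length to an arbitrary horizon such as 3 months, namely RepeatVector(n_future); the numbers are placeholders:

    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import LSTM, RepeatVector, TimeDistributed, Dense

    n_past, n_future, n_features = 24, 3, 5   # e.g. 24 months in, 3 months out

    model = Sequential([
        LSTM(64, input_shape=(n_past, n_features)),   # encoder summarises the input window
        RepeatVector(n_future),                       # decoder length = forecast horizon
        LSTM(64, return_sequences=True),
        TimeDistributed(Dense(1)),                    # one forecast value per future step
    ])
    model.compile(loss='mse', optimizer='adam')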
(I am working in a Jupyter notebook with Python version 3.6.12, running TensorFlow 2.4.0.) I have a dataset that consists of 5 input features and 3 output features (that need to be predicted). My features are string values of integers and look as follows:

Input (training) features:

            A      B      C      D      E
    57      00101  01000  01001  01000  00110
    203     00111  01001  01000  01000  00110
    559     00010  01001  01001  01000  00110
    247     00101  01001  01001  01000  00110
    1111    00111  01001  …
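For completeness, the two representations I'm considering for the string features are bit vectors and plain integers, roughly like this (a sketch over two of the rows above; I'm unsure which representation suits the network better):

    import numpy as np
    import pandas as pd

    # two rows shaped like the data above
    df = pd.DataFrame({'A': ['00101', '00111'],
                       'B': ['01000', '01001'],
                       'C': ['01001', '01000'],
                       'D': ['01000', '01000'],
                       'E': ['00110', '00110']})

    # option 1: each 5-character string becomes a vector of 5 bits
    bits = np.stack([np.array([[int(ch) for ch in s] for s in df[c]]) for c in df.columns],
                    axis=1)   # shape (n_rows, 5 features, 5 bits)

    # option 2: each string becomes a single integer (base 2)
    ints = df.applymap(lambda s: int(s, 2)).to_numpy()   # shape (n_rows, 5)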
Assume a simple LSTM followed by an attention layer, or a full transformer architecture. The attention weights are learnt during training and get multiplied with the keys, queries and values. Please correct me if my understanding above, or the question below, is wrong. The question is: when do the weights of the attention layer get changed, and when not? Do attention layer weights change for each input in the sequence? (I assume not, but please confirm.) Do attention layer weights get frozen during prediction (inference)? Or these …
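To pin down what I mean by "attention weights", here is a tiny sketch of the first setup (an LSTM followed by an attention layer, using MultiHeadAttention as a stand-in); my question is about the layer's trainable parameters (the learned projection matrices), as opposed to the per-input attention scores:

    from tensorflow.keras.layers import Input, LSTM, MultiHeadAttention
    from tensorflow.keras.models import Model

    inputs = Input(shape=(10, 16))
    seq = LSTM(32, return_sequences=True)(inputs)
    attn = MultiHeadAttention(num_heads=2, key_dim=16)
    out = attn(query=seq, value=seq, key=seq)   # self-attention over the LSTM outputs
    model = Model(inputs, out)

    # the layer's learned parameters (projection matrices); these are what training updates
    print([w.shape for w in attn.trainable_weights])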
I am working on a problem in Named Entity Recognition. Given a text, my model detects the named entities and extracts that information for the end-user. Now the requirement is that the end-user needs a confidence score along with each extracted entity. For example, the given text is: "XYZ Bank India Limited is a good place to invest your money". Our model detects XYZ Bank as an Org, but India as a Location (which is wrong - the whole …
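For the score itself, the only idea I have so far (a sketch with made-up probabilities, and I'm not sure it is the right approach) is to aggregate the per-token softmax probabilities over the predicted span:

    import numpy as np

    # per-token softmax probabilities of the predicted tag, e.g. for
    # ["XYZ", "Bank", "India", "Limited"] tagged as a single ORG span
    token_probs = np.array([0.97, 0.93, 0.61, 0.88])   # made-up values

    span_confidence_mean = token_probs.mean()   # simple average over the span
    span_confidence_min = token_probs.min()     # pessimistic: weakest token in the span

    print(round(float(span_confidence_mean), 3), round(float(span_confidence_min), 3))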
I am trying to implement early stopping in my model, where I am performing Machine Translation using Seq2Seq with attention. I am mostly used to writing my own models in steps, something like this:

    for activation in activations:
        for layer1 in layers1:
            for optimizer in optimizers:
                # define model
                model_vanilla_lstm = Sequential()
                model_vanilla_lstm.add(LSTM(layer1, activation=activation, input_shape=(n_step, n_features)))
                model_vanilla_lstm.add(Dense(1))
                # compile model
                model_vanilla_lstm.compile(optimizer=optimizer, loss='mse')
                # Early Stopping
                earlyStop = EarlyStopping(monitor="val_loss", mode='min', patience=5)
                # fit model
                history = model_vanilla_lstm.fit(X, y, epochs=epoch, validation_data=(X_test, dataset_test['Close']), verbose=1, callbacks=[earlyStop])
                # Summary of the model …
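Since the Seq2Seq-with-attention model is trained with a custom loop rather than model.fit, the part I'm unsure about is whether I should simply re-implement the callback logic by hand, along these lines (a sketch; train_step and eval_step are placeholders standing in for my own training and validation passes):

    import numpy as np

    rng = np.random.default_rng(0)

    def train_step():
        """Placeholder for one epoch of my actual teacher-forced training."""
        return rng.random()

    def eval_step():
        """Placeholder for computing loss on the validation set."""
        return rng.random()

    patience, best_val, wait = 5, np.inf, 0
    for epoch in range(100):
        train_loss = train_step()
        val_loss = eval_step()
        if val_loss < best_val:           # improvement: reset the patience counter
            best_val, wait = val_loss, 0  # (and save the best weights here)
        else:
            wait += 1
            if wait >= patience:          # stop after `patience` epochs without improvement
                print(f"early stopping at epoch {epoch}")
                break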
Given a standard transformer architecture with an encoder and a decoder, what happens when the input to the encoder is shorter than the expected output from the decoder? The decoder expects to receive key and value tensors from the encoder, whose size depends on the number of input tokens. I could solve this problem during training by padding inputs and outputs to the same size. But what about inference, when I don't know the size of the output? Should I make …
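To make the inference part concrete, the decoding loop I have in mind is the usual greedy autoregressive one, where the output length is bounded only by an end-of-sequence token or a hard maximum (decode_step below is a placeholder standing in for a real transformer forward pass):

    import numpy as np

    BOS, EOS, VOCAB, MAX_LEN = 1, 2, 100, 50
    rng = np.random.default_rng(0)

    def decode_step(src_ids, tgt_ids):
        """Placeholder for a real forward pass: logits for the next token given the
        source and the target tokens generated so far."""
        return rng.normal(size=VOCAB)

    src_ids = [5, 17, 9]            # short encoder input; its length only fixes the
                                    # size of the encoder key/value tensors
    tgt_ids = [BOS]
    while len(tgt_ids) < MAX_LEN:   # decoder length is independent of the input length
        next_id = int(np.argmax(decode_step(src_ids, tgt_ids)))
        tgt_ids.append(next_id)
        if next_id == EOS:          # stop when the model emits end-of-sequence
            break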