My native language is a regional language spoken by few people. I have some assignments in a machine learning course and I was thinking about doing some natural language processing on my native language, but I don't know where to start since there is almost no research on this language (no corpus, no research papers, ...) and I'm new to machine learning. I want to start doing everything from the bottom and I want to do …
I'm working on an NMT model in which the input and the target sentences are from the same language (but the grammar differs). I'm planning to pre-train and use BERT since I'm working on a small dataset and a low-resource language. So is it possible to feed BERT into the seq2seq encoder/decoder?
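One way to try this, assuming the Hugging Face transformers library is available, is to warm-start both sides of a seq2seq model from a pre-trained BERT checkpoint via EncoderDecoderModel; the checkpoint name below is only a placeholder for whatever BERT you pre-train:

    # A minimal sketch: warm-start a seq2seq model from pre-trained BERT weights.
    # The checkpoint name is a placeholder; swap in your own pre-trained BERT.
    from transformers import BertTokenizer, EncoderDecoderModel

    checkpoint = "bert-base-multilingual-cased"
    tokenizer = BertTokenizer.from_pretrained(checkpoint)

    # Both encoder and decoder are initialized from BERT; the decoder additionally
    # gets cross-attention layers and is trained autoregressively.
    model = EncoderDecoderModel.from_encoder_decoder_pretrained(checkpoint, checkpoint)
    model.config.decoder_start_token_id = tokenizer.cls_token_id
    model.config.pad_token_id = tokenizer.pad_token_id

    src = tokenizer("source sentence in one grammar", return_tensors="pt")
    tgt = tokenizer("target sentence in the other grammar", return_tensors="pt")

    # Standard seq2seq training signal: cross-entropy on the target tokens.
    loss = model(input_ids=src.input_ids,
                 attention_mask=src.attention_mask,
                 labels=tgt.input_ids).loss
    print(loss)

The decoder copy of BERT gets cross-attention layers added and is fine-tuned autoregressively, so the pre-trained weights are reused on both the encoder and the decoder side.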
I was wondering how useful the encoder's hidden state is for an attention network. When I looked into the structure of an attention model, this is what a model generally looks like: x: the input; h: the encoder's hidden state, which feeds forward into the next encoder hidden state; s: the decoder's hidden state, which takes a weighted sum of all the encoder hidden states as input and feeds forward into the next decoder hidden state; y: the output. With a process …
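For reference, a minimal NumPy sketch of the weighted sum the question describes, using Bahdanau-style additive scoring as an illustrative choice (the dimensions are made up):

    # Minimal additive-attention sketch in NumPy (illustrative, not a full model).
    import numpy as np

    T, d = 6, 8                       # number of encoder steps, hidden size
    h = np.random.randn(T, d)         # encoder hidden states h_1 .. h_T
    s = np.random.randn(d)            # current decoder hidden state s_{t-1}

    W_h = np.random.randn(d, d)       # learned projection of encoder states
    W_s = np.random.randn(d, d)       # learned projection of decoder state
    v = np.random.randn(d)            # learned scoring vector

    # e_i = v^T tanh(W_h h_i + W_s s): one relevance score per encoder step
    e = np.tanh(h @ W_h + s @ W_s) @ v

    # alpha = softmax(e): attention weights that sum to 1 over encoder steps
    alpha = np.exp(e - e.max())
    alpha /= alpha.sum()

    # context = sum_i alpha_i * h_i: the weighted sum fed into the decoder
    context = alpha @ h
    print(alpha.shape, context.shape)   # (6,), (8,)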
I am trying to implement early stopping for my model, where I am performing machine translation using seq2seq with attention. I am mostly used to writing my own models in steps, something like this:

    for activation in activations:
        for layer1 in layers1:
            for optimizer in optimizers:
                # define model
                model_vanilla_lstm = Sequential()
                model_vanilla_lstm.add(LSTM(layer1, activation=activation, input_shape=(n_step, n_features)))
                model_vanilla_lstm.add(Dense(1))
                # compile model
                model_vanilla_lstm.compile(optimizer=optimizer, loss='mse')
                # Early Stopping
                earlyStop = EarlyStopping(monitor="val_loss", mode='min', patience=5)
                # fit model
                history = model_vanilla_lstm.fit(X, y, epochs=epoch, validation_data=(X_test, dataset_test['Close']), verbose=1, callbacks=[earlyStop])
                # Summary of the model …
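If the seq2seq-with-attention model is trained with a hand-written loop instead of model.fit, the same patience logic can be spelled out explicitly. A minimal sketch, assuming a PyTorch-style model and that train_one_epoch and evaluate are the user's own functions returning the training and validation loss:

    # Hand-rolled early stopping for a custom training loop (sketch only).
    # `train_one_epoch`, `evaluate`, `model`, the data loaders and `max_epochs`
    # are assumed to exist already.
    import math

    patience = 5
    best_val_loss = math.inf
    epochs_without_improvement = 0
    best_state = None

    for epoch in range(max_epochs):
        train_loss = train_one_epoch(model, train_loader, optimizer)
        val_loss = evaluate(model, val_loader)
        print(f"epoch {epoch}: train={train_loss:.4f} val={val_loss:.4f}")

        if val_loss < best_val_loss:
            best_val_loss = val_loss
            epochs_without_improvement = 0
            best_state = {k: v.clone() for k, v in model.state_dict().items()}  # keep best weights
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:   # no improvement for `patience` epochs
                print("early stopping triggered")
                break

    model.load_state_dict(best_state)  # restore the best checkpoint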
What is the bits-per-character (bpc) metric that is used to measure model quality on the text8 and enwik8 datasets? I encountered the term bpc in the Transformer-XL paper here. How does it differ from perplexity as a metric?
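For context, both metrics are monotone transforms of the same average cross-entropy; a small worked example with a made-up loss value (note that bpc is measured per character on text8/enwik8, whereas perplexity is usually reported per word or per token):

    # bpc and perplexity are both derived from average cross-entropy (made-up number).
    import math

    nll_nats_per_char = 0.83                      # average negative log-likelihood per character, in nats

    bpc = nll_nats_per_char / math.log(2)         # bits per character = cross-entropy in base 2
    ppl_per_char = math.exp(nll_nats_per_char)    # character-level perplexity

    print(f"bpc        = {bpc:.3f}")              # ~1.198
    print(f"perplexity = {ppl_per_char:.3f}")     # ~2.293
    print(math.isclose(ppl_per_char, 2 ** bpc))   # True: perplexity = 2 ** bpc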
The original seq2seq paper reversed the input sequence and cited multiple reasons for doing so. See: Why does LSTM performs better when the source target is reversed? (Seq2seq) But when using attention, is there still any benefit to doing this? I imagine since the decoder has access to the encoder hidden states at each time step, it can learn what to attend to and the input can be fed in the original order.
Each year, the Workshop on Statistical Machine Translation (WMT) holds a conference that focuses on new tasks, papers, and findings in the field of machine translation. Let's say we are talking about the parallel News Commentary dataset. There is a News Commentary set in WMT14, WMT15, WMT16 and so on. How much does the dataset differ between conference editions? Is this documented somewhere?
I am trying to compare A: the Transformer-based architecture for neural machine translation (NMT) from the Attention Is All You Need paper, with B: an architecture based on bidirectional LSTMs in the encoder coupled with a unidirectional LSTM in the decoder, which attends to all the encoder hidden states, creates a weighted combination, and uses this along with the (unidirectional) decoder LSTM output to produce the final output word. My question is what might be the advantages of architecture A …
I am trying to build a translation model in PyTorch. Following this post on PyTorch, I downloaded the Multi30k dataset and the spaCy models for English and German.

    python -m spacy download en
    python -m spacy download de

    import torchtext
    import torch
    from torchtext.data.utils import get_tokenizer
    from collections import Counter
    from torchtext.vocab import Vocab, build_vocab_from_iterator
    from torchtext.utils import download_from_url, extract_archive
    import io

    url_base = 'https://raw.githubusercontent.com/multi30k/dataset/master/data/task1/raw/'
    train_urls = ('train.de.gz', 'train.en.gz')
    val_urls = ('val.de.gz', 'val.en.gz')
    test_urls = ('test_2016_flickr.de.gz', 'test_2016_flickr.en.gz')
    train_filepaths = [extract_archive(download_from_url(url_base + …
I am using the Bahdanau attention layer in TensorFlow for time-series prediction, although conceptually it is similar to NLP applications. This is what the minimal example code for a single layer looks like:

    import tensorflow as tf

    dim = 7
    Tq = 5   # Number of future time steps to predict
    Tv = 13  # Number of historic lag timesteps to consider
    batch_size = 2**4

    query = tf.random.uniform(shape=(batch_size, Tq, dim))
    value = tf.random.uniform(shape=(batch_size, Tv, dim))
    key = tf.random.uniform(shape=value.shape)

    layer = tf.keras.layers.AdditiveAttention(use_scale=True, causal=True)
    output, score = layer(inputs=[query, value, key], return_attention_scores=True)

The score obtained in the last line seems to be …
What's the general tradeoff between choosing BPE vs WordPiece Tokenization? When is one preferable to the other? Are there any differences in model performance between the two? I'm looking for a general overall answer, backed up with specific examples.
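As a concrete illustration, both schemes can be trained side by side with the Hugging Face tokenizers library; the toy corpus and vocabulary size below are made up:

    # Train a BPE and a WordPiece tokenizer on the same toy corpus (sketch;
    # corpus and vocab size are made up) to compare their outputs.
    from tokenizers import Tokenizer
    from tokenizers.models import BPE, WordPiece
    from tokenizers.trainers import BpeTrainer, WordPieceTrainer
    from tokenizers.pre_tokenizers import Whitespace

    corpus = ["low lower lowest", "new newer newest", "wide wider widest"]

    bpe_tok = Tokenizer(BPE(unk_token="[UNK]"))
    bpe_tok.pre_tokenizer = Whitespace()
    bpe_tok.train_from_iterator(corpus, BpeTrainer(vocab_size=60, special_tokens=["[UNK]"]))

    wp_tok = Tokenizer(WordPiece(unk_token="[UNK]"))
    wp_tok.pre_tokenizer = Whitespace()
    wp_tok.train_from_iterator(corpus, WordPieceTrainer(vocab_size=60, special_tokens=["[UNK]"]))

    print(bpe_tok.encode("newest widest").tokens)
    print(wp_tok.encode("newest widest").tokens)   # WordPiece marks word-internal pieces with '##'

The visible difference is mostly in how subword pieces are selected and marked; whether one yields better downstream model performance is exactly what the question asks about.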
I'm currently analysing the paper Fast Lexically Constrained Decoding with Dynamic Beam Allocation for Neural Machine Translation (Post and Vilar, 2018): https://arxiv.org/abs/1804.06609. I have trouble understanding how the data is processed. For example, the paper talks about beams, banks and hypotheses, and I have no idea what these terms mean. How would you describe these terms, and are there any tutorial sources you would recommend for understanding dynamic beam allocation?
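As a rough anchor for the terminology: a hypothesis is a partial translation with a running score, and the beam is the fixed-size set of best hypotheses kept at each decoding step. A toy sketch of plain beam search (not the paper's dynamic beam allocation; next_token_logprobs is a hypothetical stand-in for one step of the NMT decoder):

    # Plain beam search sketch to make "beam" and "hypothesis" concrete.
    # This is NOT the paper's dynamic beam allocation; `next_token_logprobs`
    # is a hypothetical stand-in for one step of the NMT decoder.
    import math

    def beam_search(next_token_logprobs, beam_size=4, max_len=20, eos="</s>"):
        # A hypothesis is a partial translation plus its running log-probability;
        # the beam is the set of the `beam_size` best hypotheses kept at each step.
        beam = [([], 0.0)]
        for _ in range(max_len):
            candidates = []
            for tokens, score in beam:
                if tokens and tokens[-1] == eos:
                    candidates.append((tokens, score))            # finished hypotheses carry over
                    continue
                for tok, logp in next_token_logprobs(tokens):
                    candidates.append((tokens + [tok], score + logp))
            beam = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_size]
        return beam

    # toy "model": always proposes the same three continuations
    toy = lambda tokens: [("the", math.log(0.5)), ("cat", math.log(0.3)), ("</s>", math.log(0.2))]
    for tokens, score in beam_search(toy, beam_size=2, max_len=3):
        print(round(score, 3), tokens)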
After reading the paper Attention Is All You Need, I have two questions: 1. What is the need for a multi-head attention mechanism? The paper says: "Multi-head attention allows the model to jointly attend to information from different representation subspaces at different positions." My understanding is that it helps with anaphora resolution. For example: "The animal didn't cross the street because it was too ..... (tired/wide)". Here "it" can refer to the animal or the street depending on the last word. …
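For reference, a toy NumPy sketch of multi-head self-attention with made-up dimensions; each head's separate query/key/value projections are what the quoted sentence means by "different representation subspaces":

    # Minimal multi-head self-attention sketch in NumPy (illustrative shapes only).
    import numpy as np

    def softmax(x, axis=-1):
        e = np.exp(x - x.max(axis=axis, keepdims=True))
        return e / e.sum(axis=axis, keepdims=True)

    T, d_model, n_heads = 5, 16, 4          # sequence length, model dim, number of heads
    d_k = d_model // n_heads                # each head works in its own 4-dim subspace

    x = np.random.randn(T, d_model)         # token representations
    heads = []
    for _ in range(n_heads):
        # each head has its own learned projections, i.e. its own "representation subspace"
        W_q, W_k, W_v = (np.random.randn(d_model, d_k) for _ in range(3))
        Q, K, V = x @ W_q, x @ W_k, x @ W_v
        attn = softmax(Q @ K.T / np.sqrt(d_k))      # (T, T) attention pattern for this head
        heads.append(attn @ V)                      # (T, d_k)

    W_o = np.random.randn(d_model, d_model)
    out = np.concatenate(heads, axis=-1) @ W_o      # heads are concatenated and mixed
    print(out.shape)                                # (5, 16)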
I am working on a project on neural machine translation in the English-Irish domain. I am not an expert and have researched entirely on my own for a technology exhibition, so apologies if my question is simple. I am trying to parse all of my English corpus into constituency trees. Of course, the format of a sentence when using the Stanford Parser is something like: (ROOT (S (NP (VBG cohabiting) (NNS partners)) (VP (MD can) (VP (VB make) (NP (NP …
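One possible way to batch-parse the corpus into that bracketed format is via NLTK's CoreNLPParser client, assuming a Stanford CoreNLP server is already running locally (the sentences below are stand-ins for the corpus):

    # Sketch: parse English sentences into Stanford-style constituency trees.
    # Assumes a Stanford CoreNLP server is already running on localhost:9000.
    from nltk.parse.corenlp import CoreNLPParser

    parser = CoreNLPParser(url='http://localhost:9000')

    sentences = ["Cohabiting partners can make a will.",
                 "The exhibition opens next week."]        # stand-ins for the real corpus

    for sent in sentences:
        tree = next(parser.raw_parse(sent))
        # one bracketed tree per line, in the (ROOT (S ...)) format shown above
        print(tree.pformat(margin=10**6))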
I am trying to train an NMT model where the source side is romanized text of Asian languages from social media and the target side is English. Note that since Roman script is not native to these languages, the romanizations people use to type on the Internet are very personal and hence a bit noisy, but easily intelligible to native speakers. The following is an example of writing a Hindi sentence in different ways: Vaise bhi mere paas jo bhi hai …
I am wondering, do we really need <unk> tokens? Why do we limit our vocabulary? Is it for speed? Accuracy? If we disable all limitations, what do you predict happens?
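To make the question concrete, here is a toy sketch of what a frequency cutoff does (the corpus and threshold are made up): every word type below the cutoff collapses into <unk>, which is what keeps the embedding and softmax matrices small; without the limit there is one row per observed type, including typos and rare names.

    # Sketch of how a vocabulary cutoff produces <unk> tokens (toy corpus, made-up cutoff).
    from collections import Counter

    corpus = "the cat sat on the mat the dog sat on the rug".split()
    min_freq = 2                                     # words seen fewer times become <unk>

    counts = Counter(corpus)
    vocab = {w for w, c in counts.items() if c >= min_freq}

    encoded = [w if w in vocab else "<unk>" for w in corpus]
    print(vocab)      # {'the', 'sat', 'on'}
    print(encoded)    # rare words ('cat', 'mat', 'dog', 'rug') collapse to <unk>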
I am trying to make a model that is capable of translating a sentence into a new and better form. I would like the model to change the tone and also give it some character. I am using this in my web app UI, simply letting users see a new description each time they refresh the page. For example, "You are logged out" -> "Looks like you have logged out". Something of that sort; any ideas on this?
I have been working on a project where we are trying to convert a PSD (Adobe Photoshop) file to HTML for web applications as well as to a layout XML for Android. We worked our way to generating a basic skeletal HTML/XML but hit a wall for complex scenarios such as identifying separate divs and components. Our initial approach was to standardize the PSD and get metadata about each component from the PSD, but due to its limitations we could only add …
I can't see how BERT makes predictions without using a decoder unit, which was part of all models before it, including Transformers and standard RNNs. How are output predictions made in the BERT architecture without a decoder? How does it do away with decoders completely?
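A small sketch of where BERT's predictions come from, assuming the Hugging Face transformers API: the encoder's final hidden states are passed through a masked-language-model head (a projection onto the vocabulary) at each position, so no autoregressive decoder is needed:

    # Sketch: BERT's masked-LM head is a projection from encoder outputs to the
    # vocabulary, so predictions need no autoregressive decoder
    # (assumes the Hugging Face transformers library).
    import torch
    from transformers import BertTokenizer, BertForMaskedLM

    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
    model = BertForMaskedLM.from_pretrained("bert-base-uncased")

    inputs = tokenizer("The capital of France is [MASK].", return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits                       # (batch, seq_len, vocab_size)

    mask_pos = (inputs.input_ids == tokenizer.mask_token_id).nonzero()[0, 1]
    predicted_id = logits[0, mask_pos].argmax(-1).item()
    print(tokenizer.convert_ids_to_tokens(predicted_id))      # encoder output -> vocab logits -> token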