How to pass input to deep learning models for a multiple-choice question answering problem?

I'm currently working on a multiple-choice question answering system. The training set consists of a question, an answer, and 4 options, and I need to predict the correct answer among the 4 options. Sometimes there is an accompanying paragraph too. For example: "Which among the following is measured using a Vernier Caliper? [A] Dimensions [B] Time [C] Sound [D] Temperature. Answer: A [Dimensions]. Chapter text: [book chapter related to dimensions, time, sound and temperature]". How do I feed this input to …
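
A minimal sketch of one common encoding, assuming a Hugging Face multiple-choice head (the model name and data below are illustrative, not from the question): pair the question, optionally prefixed with the chapter text, with each of the four options and let the model score the pairs jointly.

    from transformers import BertTokenizer, BertForMultipleChoice

    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
    model = BertForMultipleChoice.from_pretrained("bert-base-uncased")

    question = "Which among the following is measured using a Vernier Caliper?"
    options = ["Dimensions", "Time", "Sound", "Temperature"]

    # One (question, option) pair per choice; the chapter text, if present,
    # can be prepended to the question, subject to the 512-token limit.
    enc = tokenizer([question] * len(options), options,
                    padding=True, return_tensors="pt")
    # The multiple-choice head expects shape (batch, num_choices, seq_len).
    inputs = {k: v.unsqueeze(0) for k, v in enc.items()}
    logits = model(**inputs).logits            # shape (1, 4)
    predicted = logits.argmax(-1).item()       # index of the chosen option
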
Category: Data Science

NLP: Checking that answers to a question are correct

Question answering is a common topic within NLP, but my problem is a little different: rather than answering a question, I have a question and an (open-ended) answer, and what I want to check is whether that answer is correct. For instance, if I have the question "Have you done X?", I would like to be able to say that "Yes, I have done X." is correct, and "Yes, I have done Y." is incorrect. Going a step further, this should …
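
A hedged sketch of one possible framing (the model name and its label order are assumptions, not from the question): cast verification as natural language inference between a reference answer and the candidate, so that "done Y" contradicts rather than merely resembles "done X".

    from sentence_transformers import CrossEncoder

    nli = CrossEncoder("cross-encoder/nli-deberta-v3-base")  # illustrative choice
    reference = "Yes, I have done X."
    candidates = ["Yes, I have done X.", "Yes, I have done Y."]
    scores = nli.predict([(reference, c) for c in candidates])
    # For this model family each row holds logits over
    # (contradiction, entailment, neutral); a high entailment score
    # suggests the candidate answer is consistent with the reference.
    verdicts = scores.argmax(axis=1)

Plain embedding similarity tends to fail exactly here, since "done X" and "done Y" differ by a single token yet have opposite truth values.
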
Category: Data Science

If BERT can handle only 512 tokens, why can you provide such long contexts in the QA pipeline?

For example, I use the pipeline from Hugging Face Transformers with a QA model card like this:

    qa_pipeline = pipeline(
        "question-answering",
        model=model,
        tokenizer="wicharnkeisei/thai-bert-multi-cased-finetuned-xquadv1-finetuned-squad")

and perform inference like this:

    qa_pipeline(question='What does the fox say?', context=open('context01.txt', 'r').read())

The "context" parameter is passed a string from the loaded context file, which is very long (more than 1000 words). How can the model handle this? And I wonder if there is a way to train this model with longer inputs?
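
Under the hood the pipeline splits an over-long context into overlapping windows that each fit the 512-token limit, scores every window, and returns the best span, which is why long contexts work at inference time. A minimal sketch of steering that behaviour at call time (the numbers are illustrative, not required values):

    result = qa_pipeline(
        question='What does the fox say?',
        context=open('context01.txt', 'r').read(),
        max_seq_len=384,   # tokens per window fed to the model
        doc_stride=128)    # overlap between consecutive windows

Training on genuinely longer inputs, by contrast, needs an architecture whose position embeddings extend past 512, such as a Longformer-style model.
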
Category: Data Science

Global contrast normalization implementation

I'm trying to understand figure 12.1 in Goodfellow, available here. I'm not able to reproduce figure 12.1, and I'm wondering what I'm missing. The denominator of equation 12.3 is a constant, and thus equation 12.3 reduces to a subtraction and a scaling. I find it hard to believe that it will map the points to a sphere/circle as shown in figure 12.1; I'd expect something non-linear in order to do that. What am I missing? My code is: …
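
A minimal numpy sketch of global contrast normalization in the form of equation 12.3 (variable names are mine). The denominator is recomputed for every example rather than fixed once for the dataset, so the map is non-linear: on already-centered data it sends x to s·x/‖x‖, a radial projection onto a sphere. One caveat for the 2-D toy setting: subtracting the per-example mean of a 2-vector confines it to the line y = -x, so reproducing the circle in figure 12.1 presupposes the rescaling acts on points that are centered as a dataset, not per example.

    import numpy as np

    def gcn(X, s=1.0, lam=0.0, eps=1e-8):
        """Global contrast normalization; one example per row of X."""
        X = X - X.mean(axis=1, keepdims=True)          # per-example mean removal
        contrast = np.sqrt(lam + (X ** 2).mean(axis=1, keepdims=True))
        return s * X / np.maximum(contrast, eps)       # per-example rescaling

    # Toy check mirroring figure 12.1: normalize already-centered 2-D points
    # by their own norm alone (skip the per-example mean for 2-D toys).
    pts = np.random.randn(1000, 2)
    on_circle = pts / np.linalg.norm(pts, axis=1, keepdims=True)
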
Category: Data Science

How did the AILabs team get such performance on the SuperGLUE benchmark?

If we look at the SuperGLUE benchmark leaderboard (https://super.gluebenchmark.com/leaderboard), it may seem that AILabs does not perform especially well. But if we look at the model card (https://super.gluebenchmark.com/submission/PM5gv5ownvaXy1JhZNCi7nvkgDR2/-MVAj4j69EIbw0Y1XqY4 - broadcoverege), it appears that this model performs surprisingly well! How is this possible? Where can I find a description of their method?
Category: Data Science

Closed Domain Question Answering which doesn't answer Questions

I've been exploring closed-domain question answering implementations that have been trained on the SQuAD 2.0 dataset. Ideally, such a system should not answer questions the context text corpus doesn't contain answers to. But while implementing such models using the Haystack repo or the FARM repo, I'm finding that it always answers these questions, even when it shouldn't. Is there any implementation available that takes into account the fact that it shouldn't answer a question when it doesn't find a suitable answer? References: …
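
Outside Haystack and FARM, the plain Transformers pipeline exposes abstention directly when the checkpoint was trained on SQuAD 2.0. A hedged sketch (the model name is one illustrative choice):

    from transformers import pipeline

    qa = pipeline("question-answering", model="deepset/roberta-base-squad2")
    result = qa(question="What is the capital of Mars?",
                context="Paris is the capital of France.",
                handle_impossible_answer=True)
    # result["answer"] is "" when the no-answer score wins; result["score"]
    # can additionally be thresholded to suppress low-confidence answers.
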
Category: Data Science

Learn information from text and solve problems using transformers

Let's imagine that we have some question, like this: "x multiplied by x equals 9. What is x?" For this easy question the answer is ±3. I want to make an AI model answer questions like that. To train the model we have only a corpus, e.g.: "If some variable to the power of one is multiplied by itself and equals some number, then we have to take the square root of this number in order to find x." The model has to learn some data from …
Category: Data Science

Question answering bot: EM>F1, does it make sense?

I am fine-tuning a question answering bot starting from a pre-trained model from the Hugging Face repo. The dataset I am using for the fine-tuning has a lot of empty answers. So, after the fine-tuning, when I evaluate on the dataset using the model just created, I find that the EM score is (much) higher than the F1 score. (I know that I must not use the same dataset for training and evaluation; it was just a quick test to see …
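
For reference, a minimal sketch of the standard SQuAD-style metrics. On any single example F1 ≥ EM, since an exact match forces F1 = 1, so an aggregate EM above the aggregate F1 usually signals that the evaluation code is handling the empty answers differently between the two metrics.

    from collections import Counter

    def exact_match(pred, gold):
        return float(pred.strip() == gold.strip())

    def f1(pred, gold):
        pred_toks, gold_toks = pred.split(), gold.split()
        if not pred_toks or not gold_toks:
            # SQuAD convention: an empty answer scores 1 only if both
            # prediction and gold are empty, for EM and F1 alike.
            return float(pred_toks == gold_toks)
        overlap = sum((Counter(pred_toks) & Counter(gold_toks)).values())
        if overlap == 0:
            return 0.0
        p, r = overlap / len(pred_toks), overlap / len(gold_toks)
        return 2 * p * r / (p + r)
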
Category: Data Science

How do models pretrained on the SQuAD dataset work on any other dataset?

I see that in some Kaggle contests people have used models pretrained on the SQuAD dataset for building QA systems for the dataset given in the contest. How does this work? How can a model pretrained on a completely different dataset be used to build QA systems for any other dataset? Pretrained models make sense to me when used for image classification, because the same objects in different images may have the same features. Similarly, it also makes sense to be used for …
Category: Data Science

How to generate a WH question for a given answer?

There is limited info on question generation compared to question answering. I have a bunch of notes and highlights gathered from books and webpages which I want to memorize, and I also want to build an exam generation system for my students. Some GitHub projects exist online that generate yes/no and multiple-choice questions, but I can't find any project that generates WH questions, as follows. Answer: The Turkish alphabet consists of 29 letters. Question: How many letters are there in the Turkish alphabet? Answer: Data …
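
One community approach is answer-aware generation with a fine-tuned T5. A hedged sketch (the model name and its "<hl>" highlight format are assumptions taken from that model's conventions; check the model card before relying on them):

    from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

    tokenizer = AutoTokenizer.from_pretrained("valhalla/t5-base-qg-hl")
    model = AutoModelForSeq2SeqLM.from_pretrained("valhalla/t5-base-qg-hl")

    # Highlight the answer span inside the sentence with <hl> markers.
    text = "generate question: The Turkish alphabet consists of <hl> 29 <hl> letters."
    inputs = tokenizer(text, return_tensors="pt")
    output_ids = model.generate(**inputs, max_length=32)
    question = tokenizer.decode(output_ids[0], skip_special_tokens=True)
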
Category: Data Science

Pretrained models for Propositional logic

Are there any pretrained models that understand propositional logic? For example, the T5 model can do question answering. Given a context such as "Alice is Bob's mother. Bob is Charlie's father", T5 can answer the question "Who is Charlie's father?" correctly, but it cannot answer "Who is Charlie's grandmother?". Is there any model that has been, or can be, trained to do this kind of deduction and answer the question?
Category: Data Science

Addressing polysemy in NLP tasks

I am looking for modern algorithms using neural language model implementations that address polysemy in NLP tasks, including text classification, question answering and topic modeling. Transfer/zero-shot learning methods are the most interesting to find. Are there any working solutions with BERT and the Hugging Face Transformers library?
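
The baseline mechanism these libraries offer against polysemy is contextual embeddings: the same surface word gets a different vector in each context. A minimal sketch (sentences and the word choice are illustrative):

    import torch
    from transformers import AutoTokenizer, AutoModel

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModel.from_pretrained("bert-base-uncased")

    def word_vector(sentence, word):
        enc = tokenizer(sentence, return_tensors="pt")
        with torch.no_grad():
            hidden = model(**enc).last_hidden_state[0]
        idx = enc.input_ids[0].tolist().index(tokenizer.convert_tokens_to_ids(word))
        return hidden[idx]

    v1 = word_vector("he sat on the river bank", "bank")
    v2 = word_vector("she deposited cash at the bank", "bank")
    similarity = torch.cosine_similarity(v1, v2, dim=0)  # clearly below 1.0
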
Category: Data Science

Best way to suggest answers given historical question-answer pairs

Many question-answering implementations focus on extracting information from large documents/corpora of text such as Wikipedia. I have access to a full chat log from the customer service of a large electronics company, and I'm wondering if I can use the large amount of historical question-answer pairs to come up with useful answer suggestions for the customer care agents. These could either be the result …
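
A minimal sketch of one common baseline under these assumptions (model name and data are illustrative): embed the historical questions, retrieve the nearest past question for each incoming one, and surface its stored answer as a suggestion.

    from sentence_transformers import SentenceTransformer, util

    model = SentenceTransformer("all-MiniLM-L6-v2")
    history = [
        ("How do I reset my router?", "Hold the reset button for 10 seconds."),
        ("Where can I find my invoice?", "Invoices are under Account > Billing."),
    ]
    q_emb = model.encode([q for q, _ in history], convert_to_tensor=True)

    new_question = "My router stopped working, how can I reset it?"
    new_emb = model.encode(new_question, convert_to_tensor=True)
    best = util.cos_sim(new_emb, q_emb).argmax().item()
    suggestion = history[best][1]  # shown to the agent, not sent automatically
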
Category: Data Science

Answer to Question

I am looking for a system that can generate questions given answers. Most systems and blog posts on the internet cover question-to-answer, not answer-to-question, paraphrasing, or keyword-to-question. I tried Seq2Seq, and even after training for many hours the results did not make sense. Rule-based and template-based systems (adding "what", "who", "where", etc. to keywords) have many pitfalls. But any known system giving decent outputs may also work. Kindly …
Category: Data Science

How to process list-type questions in a question answering task

How do you generate question-answer-context triplets for questions with multiple answer strings, and how do you measure performance on them? For a question with one single answer, we generate one question-answer-context triplet and calculate the EM/F1 score, then take the average score over the whole training set as the overall performance. For a list-type question, is it correct to generate multiple triplets, one per candidate answer string, as separate records in the training set, even though they would share the same context and question? When …
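
One convention used for list questions (an assumption here, not the only option) is to score the predicted answer set against the gold answer set with set-level precision/recall/F1 rather than per-string EM; a minimal sketch:

    def list_f1(pred_answers, gold_answers):
        pred, gold = set(pred_answers), set(gold_answers)
        if not pred or not gold:
            return float(pred == gold)   # both empty counts as a match
        tp = len(pred & gold)
        if tp == 0:
            return 0.0
        precision, recall = tp / len(pred), tp / len(gold)
        return 2 * precision * recall / (precision + recall)
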
Category: Data Science

BERT for question answering: input exceeds 512 tokens

I'm training BERT on question answering (in Spanish) and I have a large context: the context alone exceeds 512 tokens, and the question plus context together run to about 10k. I found that Longformer is a BERT-like model for long documents, but there is no Spanish pretrained version, so is there any way to get around BERT's limit? What I tried is:

    from transformers import BertConfig
    config = BertConfig.from_pretrained(BERT_MODEL_PATH)
    config.max_length = 4000
    config.max_position_embeddings = 4000
    config.output_hidden_states = True
    model = MyBertModel(config)

but it still gives me a mismatch error: RuntimeError: Error(s) in loading state_dict for BertModel: size …
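
The size mismatch is expected: enlarging max_position_embeddings creates weights that no longer match the pretrained checkpoint. A hedged sketch of the usual workaround (variable names like question and long_context stand in for your data): split the context into overlapping 512-token windows and run the model on each.

    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(BERT_MODEL_PATH)
    enc = tokenizer(
        question,
        long_context,
        max_length=512,
        stride=128,                     # overlap so answers at window edges survive
        truncation="only_second",       # truncate the context, never the question
        return_overflowing_tokens=True,
        padding="max_length",
        return_tensors="pt")
    # enc["input_ids"] has shape (n_windows, 512); score each window with the
    # QA model and keep the span with the best combined start/end logits.
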
Category: Data Science

Select best answer from several existing ones for a question

After analyzing questions on a forum, a human support team has created a set of general answers that can be used to provide basic answers on the forum. I am trying to build a system that selects the best answer from this set for a given question (how to do this?) and estimates the acceptability of such an answer (which metrics to use?). Using document embeddings, such as doc2vec, to find the similarity between question and answer does not solve the problem, …
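
A hedged sketch of one alternative (the model name is an assumption): a cross-encoder trained on passage ranking scores each question-answer pair jointly, which typically captures relevance better than comparing independent doc2vec vectors, and its raw score can be thresholded as a rough acceptability estimate.

    from sentence_transformers import CrossEncoder

    reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")
    question = "How do I reset my password?"
    answers = [
        "Use the 'Forgot password' link on the login page.",
        "Our office hours are 9 to 5.",
    ]
    scores = reranker.predict([(question, a) for a in answers])
    best_answer = answers[scores.argmax()]
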
Category: Data Science

Options to find the most similar question in a dataset of question-answer pairs?

I am building a chatbot that will only handle FAQs, but these FAQs are very specific to an organisation, so I cannot use any existing off-the-shelf solutions, or connect to question-answering APIs. I have a dataset which consists of questions, intents, and answers. Let's say there are 100 intents, which basically group questions into general categories (e.g. fee_payment). Each intent has 50 different specific answers (e.g. 'Fees are paid through the online portal' or 'Fees are due on the 1st …
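
Given that data layout, one simple baseline (a sketch under the stated assumptions, with toy data) is to treat intent detection as plain text classification over the questions and then pick an answer from the predicted intent's pool.

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    questions = ["How do I pay my fees?", "When are fees due?", "Reset my password"]
    intents   = ["fee_payment", "fee_payment", "account_access"]

    clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
    clf.fit(questions, intents)
    predicted_intent = clf.predict(["How can I pay tuition?"])[0]
    # Map predicted_intent to its stored answers and rank those candidates.
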
Category: Data Science

Measuring quality of answers from QnA systems

I have a question answering system which uses a Seq2Seq-like architecture; in fact it is a transformer architecture. When a question is asked, it gives the start position and end position of the answer along with their logits. The answer is formed by choosing the best logit span, and the final probability is calculated by summing the start and end logits. Now the problem is, I have multiple answers, and many times the good answer is in 2nd or 3rd place (after sorting …
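
A minimal sketch of the span scoring described above (names are mine), extended to return the top-k candidates: keeping several spans makes it possible to re-rank them or to normalize their scores, e.g. with a softmax over the k summed logits, instead of trusting only the single best sum.

    import torch

    def top_k_spans(start_logits, end_logits, k=3, max_len=30):
        # scores[i, j] = start_logits[i] + end_logits[j] for every pair
        scores = start_logits[:, None] + end_logits[None, :]
        ones = torch.ones_like(scores)
        valid = torch.triu(ones) - torch.triu(ones, diagonal=max_len)
        scores = scores.masked_fill(valid == 0, float("-inf"))
        best = scores.flatten().topk(k)
        starts = best.indices // scores.size(1)
        ends = best.indices % scores.size(1)
        return list(zip(starts.tolist(), ends.tolist(), best.values.tolist()))
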
Category: Data Science
