I hope it's allowed to ask here, but I am looking for a dataset (the format is not that important) that is similar to SQuAD but also contains false answers to the questions. I want to use it to fine-tune GPT-3, and all I can find is either multiple-choice questions based on a text but with no distractors, or classic quizzes that have no context before each question. I have code that generates distractors, and I can just plug …
I'm fine-tuning pre-trained GPT-2 for text summarization. The dataset contains a 'text' and a 'reference summary' for each example, so my question is how to add special tokens to get the right input format. Currently I'm thinking of doing it like this: example 1: <BOS> text <SEP> reference summary <EOS>, example 2: <BOS> text <SEP> reference summary <EOS>, and so on. Is this correct? If so, a follow-up question would be whether the max token length (i.e. 1024 for GPT-2) also applies to the concatenated length of the text and the reference summary. Any comment …
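For concreteness, here is a minimal sketch of the formatting I have in mind, assuming the HuggingFace transformers tokenizer (the token names <BOS>/<SEP>/<EOS> are placeholders I chose myself):

```python
from transformers import GPT2Tokenizer, GPT2LMHeadModel

# Load the pre-trained tokenizer and model
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

# Register my placeholder special tokens (names are my own choice)
special_tokens = {"bos_token": "<BOS>", "eos_token": "<EOS>", "sep_token": "<SEP>"}
tokenizer.add_special_tokens(special_tokens)
model.resize_token_embeddings(len(tokenizer))  # account for the newly added tokens

def format_example(text, summary):
    # One training example: <BOS> text <SEP> reference summary <EOS>
    return f"{tokenizer.bos_token} {text} {tokenizer.sep_token} {summary} {tokenizer.eos_token}"

example = format_example("some long article ...", "a short reference summary")
ids = tokenizer(example, truncation=True, max_length=1024)["input_ids"]
print(len(ids))  # my assumption: the 1024 limit applies to this concatenated sequence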
I'm currently working on a project with the goal of producing AI content in the space of content generation, like blog writing, Instagram caption generation, etc. I found the in-context few-shot learning capabilities of GPT-3 quite useful, but I'm unable to generate creative content consistently. It becomes boring and repetitive after a few iterations. I came across the concept of knowledge probing of language models and have come to the understanding that writing better prompts can actually …
I've been experimenting with Huggingface models and I've set up a chatbot with DialoGPT. It works pretty well, but after a while it stops answering and just returns empty strings. Before this it starts to give shorter and shorter answers. Any idea what can cause such behavior? I'm using the medium-sized model with a max_length of 2000 and added repetition_penalty=1.3, but other than that I didn't change any other parameters. I also add the previous message back …
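For reference, my setup follows the standard DialoGPT example pretty closely; roughly this (simplified, variable names are my own):

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("microsoft/DialoGPT-medium")
model = AutoModelForCausalLM.from_pretrained("microsoft/DialoGPT-medium")

chat_history_ids = None
for _ in range(10):  # a short interactive session
    user_input = input(">> ")
    new_ids = tokenizer.encode(user_input + tokenizer.eos_token, return_tensors="pt")

    # append the new user message to the running conversation history
    bot_input_ids = new_ids if chat_history_ids is None else torch.cat([chat_history_ids, new_ids], dim=-1)

    chat_history_ids = model.generate(
        bot_input_ids,
        max_length=2000,
        repetition_penalty=1.3,
        pad_token_id=tokenizer.eos_token_id,
    )
    reply = tokenizer.decode(chat_history_ids[:, bot_input_ids.shape[-1]:][0], skip_special_tokens=True)
    print(reply)  # after a while this comes back as an empty string
```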
I want to extract data from documents (native PDFs in English) using GPT-J, but without using its API. I have searched all the documentation regarding GPT-J but haven't come across anything related to this. This article mentions that searching data is possible using GPT-J, but that's all it mentions. Basically, I want to extract text from documents using GPT-J without using the API. Any help/links/articles/videos would be appreciated! Thanks for your time and help!
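To show the direction I have in mind: pull the raw text out of the PDF with an ordinary PDF library, then run GPT-J locally through HuggingFace transformers so no API is involved. This is only a sketch of my assumption, not something I have working (pdfplumber, the EleutherAI/gpt-j-6B checkpoint, and the "invoice date" prompt are all just my choices for illustration; the 6B model also needs a lot of RAM/VRAM):

```python
import pdfplumber
from transformers import AutoTokenizer, AutoModelForCausalLM

# 1) Get plain text out of the native PDF
with pdfplumber.open("document.pdf") as pdf:
    text = "\n".join(page.extract_text() or "" for page in pdf.pages)

# 2) Load GPT-J locally (no API involved) -- this downloads the full model weights
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-j-6B")
model = AutoModelForCausalLM.from_pretrained("EleutherAI/gpt-j-6B")

# 3) Ask the model to pull out a field I care about (prompt wording is just a guess)
prompt = text[:2000] + "\n\nQuestion: What is the invoice date?\nAnswer:"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=30)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```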
I'm building an application for the API, but I would like to be able to count the number of tokens my prompt will use before I submit an API call. Currently I often submit prompts that yield a 'too many tokens' error. The closest I got to an answer was this post, which still doesn't say which tokenizer the API uses. If I knew which tokenizer the API used, I could count how many tokens are in my prompt before I submit …
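My current workaround is to assume the API uses the same byte-pair encoding as GPT-2 and count with the HuggingFace GPT-2 tokenizer; whether that assumption is exactly right is part of what I'm asking:

```python
from transformers import GPT2TokenizerFast

# Assumption: the API's tokenizer matches GPT-2's BPE (this is what I want to confirm)
tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")

def count_tokens(prompt: str) -> int:
    return len(tokenizer.encode(prompt))

prompt = "Translate the following English text to French: 'Hello, world!'"
print(count_tokens(prompt))  # compare against the model's context limit before calling the API
```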
When exploring text generation with various large language models, I frequently come across generated text that presents facts that are just plain wrong. I am not talking about fake news or bias; rather, I am talking about dated pieces of information that were once correct but are no longer correct. When looking around for pros and cons of language models, I don't really see complaints about this as one of the greatest cons. When we finetune models, and with the …
In the BERT paper, I learnt that BERT is an encoder-only model, that is, it involves only transformer encoder blocks. In the GPT paper, I learnt that GPT is a decoder-only model, that is, it involves only transformer decoder blocks. I was wondering what the difference is. I know the following difference between encoder and decoder blocks: the GPT decoder looks only at previously generated tokens and learns from them, not from tokens to its right, whereas the BERT encoder attends to tokens on both sides. …
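To make sure I'm picturing the masking difference correctly, here is a tiny PyTorch illustration of my understanding (my own sketch, not code from either paper):

```python
import torch

seq_len = 5

# Decoder-style (GPT): a causal mask, so position i can only attend to positions <= i
causal_mask = torch.tril(torch.ones(seq_len, seq_len))

# Encoder-style (BERT): no causal mask, every position attends to every other position
bidirectional_mask = torch.ones(seq_len, seq_len)

print(causal_mask)
# tensor([[1., 0., 0., 0., 0.],
#         [1., 1., 0., 0., 0.],
#         [1., 1., 1., 0., 0.],
#         [1., 1., 1., 1., 0.],
#         [1., 1., 1., 1., 1.]])
```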
It seems like a lot of noteworthy AI tools are being trained on datasets generated by web crawlers rather than on human-edited, human-compiled corpora (Facebook Translate, GPT-3). In general, it seems more practical to have an automatic and universal way of generating a dataset. Is there any widely used web crawler which does basically the same thing as Common Crawl but has a parameter for "language sought"? In other words, one that can generate a web-crawled dataset in language X? (Background: I'd like to create …
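If no such crawler exists, my fallback would be to filter Common Crawl output by language myself; a rough sketch of that idea using the langdetect package ("de" is just a placeholder language code):

```python
from langdetect import detect

def keep_language(documents, lang_code="de"):
    """Filter crawled documents down to a single language (e.g. 'de' for German)."""
    kept = []
    for doc in documents:
        try:
            if detect(doc) == lang_code:
                kept.append(doc)
        except Exception:
            pass  # too short / undetectable, skip it
    return kept

docs = ["This is English text.", "Dies ist ein deutscher Satz."]
print(keep_language(docs, "de"))
```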
Is there any pre-written library or function which can receive a few examples of data values being classified and then extend that classification to new data values it receives?
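To illustrate the behaviour I'm after, something along the lines of this scikit-learn sketch would already qualify, if nothing more purpose-built exists (the example values and labels are made up):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline

# A few labelled example values (made-up data)
examples = ["12/05/2021", "2021-06-03", "john.doe@example.com", "jane@foo.org"]
labels = ["date", "date", "email", "email"]

clf = make_pipeline(TfidfVectorizer(analyzer="char", ngram_range=(1, 3)),
                    KNeighborsClassifier(n_neighbors=1))
clf.fit(examples, labels)

# Extend the classification to new, unseen values
print(clf.predict(["05/11/2020", "bob@bar.com"]))  # should come out as a date and an email
```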
I would like to make use of a software function that can provide the definition of a word or phrase. These words and phrases are in the realm of common knowledge: objects like "DVD player", or specific places like the "Canary Islands". I am pretty sure GPT-3 can do this because it's trained on the internet in general and on Wikipedia, and it produces generally fluent language. However, I was curious whether someone has already written this function and provided it …
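If nothing ready-made exists, I assume it would boil down to a thin wrapper around a completion call, something like this sketch (the prompt wording and engine choice are just my guesses):

```python
import openai

openai.api_key = "YOUR_API_KEY"

def define(term: str) -> str:
    """Ask GPT-3 for a short, common-knowledge definition of a word or phrase."""
    response = openai.Completion.create(
        engine="davinci",
        prompt=f"Define the following term in one sentence.\n\nTerm: {term}\nDefinition:",
        max_tokens=60,
        temperature=0.2,
        stop=["\n"],
    )
    return response.choices[0].text.strip()

print(define("DVD player"))
print(define("Canary Islands"))
```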
Can someone share the derivation of the Evidence Lower Bound in this paper? Zero-Shot Text-to-Image Generation: "The overall procedure can be viewed as maximizing the evidence lower bound (ELB) (Kingma & Welling, 2013; Rezende et al., 2014) on the joint likelihood of the model distribution over images x, captions y, and the tokens z for the encoded RGB image. We model this distribution using the factorization $p_{\theta,\psi}(x, y, z) = p_\theta(x \mid y, z)\, p_\psi(y, z)$, which yields the lower bound: …"
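For context, this is the generic ELBO derivation (Jensen's inequality plus the factorization above) that I am trying to map onto the paper's bound; $q_\phi(z \mid x)$ is my own notation for the variational distribution over the image tokens, which I assume is the role played by the dVAE encoder:

$$
\begin{aligned}
\ln p_{\theta,\psi}(x, y)
  &= \ln \sum_{z} p_{\theta,\psi}(x, y, z)
   = \ln \mathbb{E}_{z \sim q_\phi(z \mid x)}\!\left[ \frac{p_{\theta,\psi}(x, y, z)}{q_\phi(z \mid x)} \right] \\
  &\ge \mathbb{E}_{z \sim q_\phi(z \mid x)}\!\left[ \ln p_{\theta,\psi}(x, y, z) - \ln q_\phi(z \mid x) \right]
   \quad \text{(Jensen's inequality)} \\
  &= \mathbb{E}_{z \sim q_\phi(z \mid x)}\!\left[ \ln p_\theta(x \mid y, z) + \ln p_\psi(y, z) - \ln q_\phi(z \mid x) \right]
   \quad \text{(plugging in the factorization)}
\end{aligned}
$$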
On page 34 of OpenAI's GPT-3 paper, there is a sentence describing a limitation of the objective function: "Our current objective weights every token equally and lacks a notion of what is most important to predict and what is less important." I am not sure I understand this correctly. In my understanding, the objective is to maximize the log-likelihood of the token to predict given the current context, i.e., $\max L \sim \sum_{i} \log P(x_{i} \mid x_{<i})$. Although we aim …
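To spell out where the "equal weighting" shows up in my reading: writing the averaged objective explicitly, every position $i$ contributes with the same coefficient $1/N$, no matter how informative the token is:

$$ L = \frac{1}{N} \sum_{i=1}^{N} \log P(x_{i} \mid x_{<i}) $$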
I am working with a dataset that contains questions on various events conducted by a college and the corresponding answers to those queries. I am using this dataset to train a GPT-2 355M model to create a chatbot where users can get their queries answered. But I am not getting good results, and I feel that's because the questions in the dataset are prefixed with the event name, in an "event - query" format. For example, Ques: "Cicada3302 - Do I need to have any …
I've got a use case where I need to generate sentences based on a set of user-supplied keywords. Here is an example of what I need:

User input:
End-User: Data Scientists
Region: Middle East
Country: UAE
Solution: BigPanda
Application: machine learning
Benefits: lower costs and runtime

Output (curly brackets are just there to highlight): Learn how {data scientists} in the {Middle East} such as in the {UAE} are using {BigPanda} to streamline their {machine learning} processes to {lower costs and …
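What I've tried so far is turning those fields into a templated prompt and letting GPT-3 complete it; a rough sketch of that approach (field values copied from my example above, prompt wording is just a first attempt):

```python
import openai

openai.api_key = "YOUR_API_KEY"

fields = {
    "End-User": "Data Scientists",
    "Region": "Middle East",
    "Country": "UAE",
    "Solution": "BigPanda",
    "Application": "machine learning",
    "Benefits": "lower costs and runtime",
}

def build_prompt(fields):
    # List the keywords, then ask for a single sentence that uses all of them
    lines = [f"{key}: {value}" for key, value in fields.items()]
    return ("Write one sentence that uses all of the following keywords.\n\n"
            + "\n".join(lines) + "\n\nSentence:")

response = openai.Completion.create(
    engine="davinci",
    prompt=build_prompt(fields),
    max_tokens=60,
    temperature=0.7,
)
print(response.choices[0].text.strip())
```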
Given a list of sentences like this:

4 to 5 hours over a period of 16 weeks
1st session: 2.0-2.5 hours & 2nd session: 1.5-2.0 hours
Approximately 5-6 visits over the course of 5 months. Visit 1, 3, 5: about 1.5 hours. Visit 2, 4: short
15 visits over a period of approximately 74 weeks.
You will come to the organization about 12 times, over a period of a little more than three years. Each visit will take from 3-6 …
I am interested in accessing the NLP models mentioned in scientific papers, to replicate some results and experiment. But I only see waiting lists https://openai.com/blog/openai-api/ and licenses granted in large commercial deals https://www.theverge.com/2020/9/22/21451283/microsoft-openai-gpt-3-exclusive-license-ai-language-research . How can a researcher not affiliated with a university or a (large) tech company obtain access in order to replicate the experiments of scientific papers? Which alternatives would you suggest for leveraging pre-trained models?
I have read a couple of documents that explain in detail the edge that GPT-3 (Generative Pre-trained Transformer 3) has over BERT (Bidirectional Encoder Representations from Transformers). So I am curious to know whether BERT scores better than GPT-3 in any particular area of NLP. It's quite interesting to note that OpenAI's GPT-3 is not open-sourced, whereas tech behemoth Google's BERT is open-sourced. I feel OpenAI's stance and the hefty price tag for the GPT-3 API are in stark contrast to its mission …