We fine-tuned a BERT model (bert-base-uncased) on the CoLA dataset for a sentence classification task. The dataset is a mix of sentences with and without grammatical errors. The fine-tuned model is then used to identify whether a sentence contains errors. Are there any other approaches we could use with BERT, other than building a classifier?
There are two closely related techniques in genetic programming. One of them is grammar-based genetic programming (GBGP), which uses a context-free grammar to generate a derivation tree that represents the program. The other is grammatical evolution, which uses genomes made up of codons that are then mapped to a phenotype, a realization. The part where I get confused is that the phenotype realization can also be represented as a derivation tree. The codon to rule mapping is done through a …
The problem is best explained using an example, so please consider the sentence below: "Made of airy cotton with a touch of stretch, the Pinafore Dress features a modern square neckline." Here, cotton (FABRIC) and square neckline (NECKLINE) are two important attributes in the sentence. What I need to do is capture the word "airy", which is a detail of the fabric. FABRIC and NECKLINE can be successfully captured using NER, but NER is not working well when it …
I have a set of sentences in English. I'm exploring ways to programmatically create a dataset of sentences with grammatical errors. The following options have been tried out randomly: identifying verbs, prepositions, etc. by POS tagging and changing their tense or removing them; changing the order of two or more words; removing commas, colons, semicolons, etc. These are not always foolproof. Are there any proven ways to approach this problem?
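To make the word-order and punctuation options above concrete, here is a minimal Python sketch of what such a corruption step might look like. The function names (`swap_adjacent_words`, `remove_punctuation`, `corrupt`) are hypothetical, and the POS-based tense change is left out because it would need a tagger; this is only an illustration of the idea, not a proven recipe.

```python
import random
import re


def swap_adjacent_words(sentence, rng):
    """Swap one randomly chosen pair of neighbouring words."""
    words = sentence.split()
    if len(words) < 2:
        return sentence  # nothing to swap
    i = rng.randrange(len(words) - 1)
    words[i], words[i + 1] = words[i + 1], words[i]
    return " ".join(words)


def remove_punctuation(sentence):
    """Drop commas, colons and semicolons (other characters are kept)."""
    return re.sub(r"[,:;]", "", sentence)


def corrupt(sentence, rng):
    """Apply one randomly chosen corruption and report which one was used."""
    name = rng.choice(["swap", "punctuation"])
    if name == "swap":
        return swap_adjacent_words(sentence, rng), name
    return remove_punctuation(sentence), name


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    print(corrupt("After lunch, we walked home; it was late.", rng))
```

Note that a corruption is not guaranteed to produce an ungrammatical sentence (removing an optional comma, for example, may leave it acceptable), which matches the "not always foolproof" observation.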
So I currently have a text pattern detection challenge to solve at work. I am trying to build an outlier detection algorithm for the string columns of a database. For example, let's say I have the following list of strings: ["abc123", "jkj577", "lkj123", "uio324", "123123"] I want to develop an algorithm that would detect common patterns in the list of strings, and then indicate which strings are not in this format. For example, in the example above, I would like this …
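One possible way to frame the idea of a "common pattern", sketched here under my own assumptions, is to reduce each string to a character-class shape (letters become `a`, digits become `9`) and flag the strings whose shape differs from the most frequent one. The helper names `shape` and `find_outliers` are hypothetical, and treating the single most frequent shape as the expected format is an assumption that may not hold for real columns.

```python
from collections import Counter


def shape(s):
    """Map each character to a class: letter -> 'a', digit -> '9', else itself."""
    out = []
    for ch in s:
        if ch.isdigit():
            out.append("9")
        elif ch.isalpha():
            out.append("a")
        else:
            out.append(ch)
    return "".join(out)


def find_outliers(strings):
    """Return the strings whose shape is not the most common shape."""
    shapes = [shape(s) for s in strings]
    if not shapes:
        return []
    most_common_shape, _ = Counter(shapes).most_common(1)[0]
    return [s for s, sh in zip(strings, shapes) if sh != most_common_shape]


if __name__ == "__main__":
    values = ["abc123", "jkj577", "lkj123", "uio324", "123123"]
    print(find_outliers(values))
```

On the list from the question, four strings share the shape `aaa999`, so only `123123` (shape `999999`) is reported.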