Keyword extraction for business rule text classification

I would like to classify texts without using any ML model. My idea is to find a list of keywords to assign to each class. Then, when I need to classify a new text, I compare it with my lists of keywords and count how many keywords from each class appear in the text; the class with the most matching keywords is my final prediction. Example of classification for this list of keywords: green : A …
Category: Data Science
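The counting scheme described above can be sketched in a few lines of Python. The class names and keyword lists below are made-up examples, not from the question:

```python
# Minimal sketch of the keyword-counting classifier described above.
# The class labels and keyword sets are invented for illustration.
import re

KEYWORDS = {
    "green":   {"eco", "recycle", "solar", "organic"},
    "finance": {"loan", "interest", "invoice", "payment"},
}

def classify(text):
    """Return the class whose keywords occur most often in `text`."""
    tokens = re.findall(r"[a-z]+", text.lower())
    scores = {label: sum(t in kw for t in tokens)
              for label, kw in KEYWORDS.items()}
    # max by score; ties resolve arbitrarily, which a real system should handle
    return max(scores, key=scores.get)

print(classify("The invoice lists the interest on the loan"))  # finance
```

Note the tie-breaking caveat: with keyword lists of different lengths, longer lists get more chances to match, so normalizing the count by list size is worth considering.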

How to create a table for reporting ANOVA results

I would like to export tables for the following result of a repeated-measures ANOVA. Here is the function in which the ANOVA test is implemented:

    fAddANOVA = function(data) data %>%
      ezANOVA(dv = .(value), wid = .(ID), within = .(COND)) %>%
      as_tibble()

And here are the commands to explore the ANOVA statistics:

    aov_stats <- df_join %>%
      group_by(signals) %>%
      mutate(ANOVA = map(data, ~fAddANOVA(.x))) %>%
      dplyr::select(., -data) %>%
      unnest(ANOVA)

    > aov_stats
    # A tibble: 12 x 4
    # Groups: signals [12]
      signals ANOVA$Effect $DFn $DFd $F …
Category: Data Science

What model should I use to extract relations between words

I want to create an ML model that gives a score from 0 to 1 signifying the relation between two words. I know about Relation Extraction (RE), but that is more concerned with sentence-based relations. Instead, I want to input two words and get the relation between them as output, with the input dataset being a large set of sentences.
Category: Data Science
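A common way to get such a pairwise score is to train word embeddings (e.g. word2vec) on the sentence corpus and use cosine similarity, rescaled to [0, 1]. A minimal sketch with tiny hand-made vectors standing in for real trained embeddings:

```python
# Relatedness score for a word pair via cosine similarity of embeddings.
# The 3-dimensional vectors below are invented for illustration; in
# practice they would come from word2vec/GloVe trained on the sentences.
import math

vectors = {
    "doctor": [0.9, 0.8, 0.1],
    "nurse":  [0.85, 0.75, 0.2],
    "banana": [0.1, 0.05, 0.9],
}

def relation_score(w1, w2):
    a, b = vectors[w1], vectors[w2]
    cos = sum(x * y for x, y in zip(a, b)) / (
        math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))
    return (cos + 1) / 2  # map cosine's [-1, 1] range into [0, 1]

print(round(relation_score("doctor", "nurse"), 3))
print(round(relation_score("doctor", "banana"), 3))
```

This measures distributional similarity, not a typed relation; if the goal is a specific relation (e.g. synonymy vs. hypernymy), a labeled pair dataset would still be needed.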

Searching for a dataset that targets difficult words

I am trying to find a dataset that targets difficult words. I understand there would be different levels of difficulty for each individual, but considering an average individual, I want to detect the difficult words present in a sentence. Example: Yes, may be today's Britains are not responsible for some of these reparations but the same speakers have pointed with pride to their foreign aid - you are not responsible …
Topic: word dataset nlp
Category: Data Science
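Lacking a ready-made "difficult words" dataset, one common proxy is corpus frequency: rare words tend to be harder for an average reader. The tiny frequency table below is invented; in practice the counts would come from a large reference corpus or a frequency lexicon:

```python
# Flag "difficult" words as those below a frequency threshold in a
# reference corpus. The FREQ table is a made-up toy example.
import re

FREQ = {"yes": 900000, "may": 400000, "be": 800000, "today": 300000,
        "for": 700000, "responsible": 20000, "reparations": 300,
        "speakers": 15000, "pride": 12000, "foreign": 50000, "aid": 60000}

def difficult_words(sentence, threshold=1000):
    tokens = re.findall(r"[a-z]+", sentence.lower())
    # unseen words get frequency 0, i.e. they count as maximally rare
    return sorted({t for t in tokens if FREQ.get(t, 0) < threshold})

print(difficult_words("Yes today speakers may be responsible for reparations"))
# ['reparations']
```

Existing resources in this direction include word-frequency lexicons and the data from the SemEval Complex Word Identification shared tasks, which label words by perceived difficulty.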

Machine learning algorithms for forming Homophones from input dataset word

https://www.google.com/search?sxsrf=ALeKk01_SgA8G4UfNm4rOqku4yJBFvKhLw%3A1600154854621&source=hp&ei=5mxgX8ztI6KZ4-EPq-mL8Ak&q=homophones+example&oq=Homophones&gs_lcp=ChFtb2JpbGUtZ3dzLXdpei1ocBABGAEyBQgAELEDMgUIABCxAzICCAAyCAgAELEDEIMBMgUIABCxAzICCAAyAggAMgUIABCxAzoHCCMQ6gIQJzoECCMQJzoFCAAQkQI6CAguELEDEIMBOgUILhCxA1DkKliKSGDuUGgBcAB4AIAB6wGIAe8NkgEFMC44LjKYAQCgAQGwAQ8&sclient=mobile-gws-wiz-hp Are there machine learning algorithms for forming homophones from an input dataset word? Homophone examples: accessary, accessory; ad, add; air, heir; all, awl; allowed, aloud; alms, arms. Input: ad Output: ad, add. Are there such algorithms for Indian regional languages, viz. Hindi, Gujarati, Bengali, etc., and other languages, viz. French, German, Italian, Spanish, Dutch, etc.?
Category: Data Science
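Homophone grouping does not strictly need ML: a phonetic key such as Soundex maps similar-sounding English words to the same code, and words sharing a code are candidate homophones. This is a rough sketch only; Soundex is English-specific and misses pairs like "air"/"heir" (different first letters), so Hindi, Gujarati, French, etc. would need language-specific grapheme-to-phoneme models instead:

```python
# Group candidate homophones by Soundex code (standard Soundex algorithm).
def soundex(word):
    word = word.upper()
    mapping = {}
    for letters, digit in [("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                           ("L", "4"), ("MN", "5"), ("R", "6")]:
        for ch in letters:
            mapping[ch] = digit
    first = word[0]
    code = ""
    prev = mapping.get(first, "")
    for ch in word[1:]:
        digit = mapping.get(ch, "")
        if digit and digit != prev:
            code += digit
        if ch not in "HW":  # H and W do not break runs of the same digit
            prev = digit
    return (first + code + "000")[:4]  # pad/truncate to 4 characters

def homophone_candidates(query, vocabulary):
    key = soundex(query)
    return [w for w in vocabulary if soundex(w) == key]

print(homophone_candidates("ad", ["ad", "add", "air", "heir", "awl"]))
# ['ad', 'add']
```

A learned alternative is a sequence-to-sequence grapheme-to-phoneme model, which is how multilingual pronunciation is usually handled.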

Word Embedding for Item Names (integer, one-hot encoding)

I am looking for a way to get the similarity between two item names using integer encoding or one-hot encoding, for example "lane connector" vs. "a truck crane". I have 100,000 item names consisting of 2–3 words, as above. Items also have a size (36mm, 12M, 2400*1200, ...) and a unit (ea, m2, m3, hr, ...). I want to turn (item name, size, unit) into a vector. To do this, I need to convert the text to numbers in some way. All I have found is word2vec …
Category: Data Science
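One-hot (bag-of-words) encoding plus cosine similarity can be sketched directly; the three item names below are toy stand-ins for the 100,000 real ones, and size/unit could be appended as extra vector components:

```python
# One-hot encoding of item names over a shared vocabulary, compared
# with cosine similarity. Names here are toy examples.
import math

names = ["lane connector", "a truck crane", "truck connector"]

vocab = sorted({w for name in names for w in name.split()})

def one_hot(name):
    words = set(name.split())
    return [1 if w in words else 0 for w in vocab]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    # entries are 0/1, so sum(a) equals the squared vector norm
    return dot / (math.sqrt(sum(a)) * math.sqrt(sum(b)))

v1, v2 = one_hot("lane connector"), one_hot("truck connector")
print(round(cosine(v1, v2), 3))  # 0.5
```

The limitation, and the reason word2vec keeps coming up, is that one-hot vectors only capture exact word overlap: "crane" and "connector" are equally unrelated to each other under this encoding, no matter how semantically close they are.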

Replacing words by numbers in multiple columns of a data frame in R

I want to replace the values in a data set (sample in the picture) with numbers instead of words, e.g. 1 instead of D, -1 instead of R, and 0 for all other values. How can I do it with a loop? I know it can be done this way instead (where d is the data frame name):

    d[d$Response == "R",]$Response = -1
    d[d$Response == "D",]$Response = 1
    ... (code the other values and assign) = 0
Category: Data Science
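The loop-free idea behind this recode is "map the known values, default everything else to 0". Sketched in Python for illustration; in R the same thing can be expressed with nested ifelse() or something like dplyr::recode(Response, R = -1, D = 1, .default = 0):

```python
# Recode categorical values to numbers via a lookup with a default.
# The `responses` list is a made-up stand-in for the data frame column.
mapping = {"D": 1, "R": -1}

responses = ["D", "R", "X", "D", ""]
coded = [mapping.get(v, 0) for v in responses]  # unknown values become 0
print(coded)  # [1, -1, 0, 1, 0]
```

A mapping like this also scales to many columns: apply the same lookup to each column rather than writing one conditional assignment per value.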

Use pretrained word vectors over custom trained word2vecs

Currently I'm working on a sentiment analysis research project using LSTM networks. As the input, I convert sentences into sets of vectors using word2vec, and there are some well-known pretrained word vectors, like Google's word2vec. My question is: are there any advantages of a custom-trained word2vec (trained on a dataset related to our domain, such as user reviews of electronic items) over a pretrained one? What's the best option: use a pretrained word2vec, or train our own word2vec using a …
Category: Data Science
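One concrete way to inform this decision is to measure how much of the domain vocabulary the pretrained model actually covers; low coverage (lots of out-of-vocabulary domain terms) argues for custom or fine-tuned vectors. The sets below are toy stand-ins for a real pretrained vocabulary and a real review corpus:

```python
# Vocabulary-coverage check: what fraction of domain tokens does the
# pretrained embedding vocabulary contain? All data here is invented.
pretrained_vocab = {"good", "bad", "battery", "screen", "the", "is"}
domain_tokens = ["battery", "amoled", "good", "hdr10", "screen", "soc"]

covered = sum(t in pretrained_vocab for t in domain_tokens)
coverage = covered / len(domain_tokens)
print(f"coverage: {coverage:.0%}")  # low coverage favors custom training
```

The usual trade-off: pretrained vectors bring broad general-language knowledge from huge corpora, while custom training captures domain-specific senses ("charge", "cell") but needs enough in-domain text to train well; comparing both on the downstream LSTM's validation accuracy is the most reliable test.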

How can I get semantic word embeddings for compound terms?

I need to build a semantic word-embedding representation of compound terms like "electronic engineer" or "microsoft excel". One approach would be to use a standard pretrained model and average the word vectors, but since I have a corpus from my domain, is there a possible better approach? To be more precise: the data I have is a corpus of millions of documents. Each document is about half a page and contains these compound terms. However, there may be compound terms not …
Category: Data Science
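The averaging baseline mentioned above looks like this (the vectors are made up for illustration). Since a domain corpus is available, a stronger option is to pre-join frequent compounds into single tokens ("electronic_engineer") before training, e.g. with a phrase-detection pass, so each compound gets its own learned vector; averaging then remains the fallback for compounds unseen in the corpus:

```python
# Baseline compound-term embedding: average the component word vectors.
# The vectors below are toy values standing in for trained embeddings.
vectors = {
    "electronic": [0.2, 0.9, 0.1],
    "engineer":   [0.4, 0.7, 0.3],
}

def compound_embedding(term):
    vecs = [vectors[w] for w in term.split()]
    # element-wise mean across the component vectors
    return [sum(dim) / len(vecs) for dim in zip(*vecs)]

print([round(x, 3) for x in compound_embedding("electronic engineer")])
# [0.3, 0.8, 0.2]
```

Averaging loses word order and compositional meaning ("excel" the verb vs. the product), which is exactly what corpus-trained compound tokens recover.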

About

Geeks Mental is a community that publishes articles and tutorials about Web, Android, Data Science, new techniques and Linux security.