I have a structured dataset with rows as different samples and columns as different attributes of the samples. Interestingly, the attributes are highly inter-correlated (i.e. a complex system). I want to understand the system by training many classifier models, each taking one column as the target and all the other columns as the features (I call this kind of modeling "all-to-all"). Because the attributes and targets are highly correlated, many of these models should reach reasonable accuracy. Before actually …
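A minimal sketch of the "all-to-all" setup described here, assuming scikit-learn, a pandas DataFrame `df` of discrete/categorical attributes, and an arbitrary choice of estimator (the function name and encoding are illustrative, not prescriptive):

```python
# Hypothetical sketch: train one classifier per column ("all-to-all"),
# using all remaining columns as features. Assumes `df` is a pandas DataFrame
# whose columns are categorical/discrete; model choice is illustrative.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

def all_to_all_scores(df: pd.DataFrame, cv: int = 5) -> dict:
    scores = {}
    for target in df.columns:
        X = pd.get_dummies(df.drop(columns=[target]))   # naive one-hot encoding of features
        y = df[target]
        clf = RandomForestClassifier(n_estimators=100, random_state=0)
        scores[target] = cross_val_score(clf, X, y, cv=cv).mean()
    return scores

# Columns with high cross-validated scores are the ones well explained
# by the other attributes:
# scores = all_to_all_scores(df)
```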
Association rule mining is considered an old AI technique; rules are mined based on statistical support. How can deep learning be applied to this? What are approaches for structured data (in a graph-like format such as XML)? XML documents are structured by tags. My goal is to extract a rule saying that tag x is often combined with tags y and z. Then I later want to apply these rules, and if a tag y and z is …
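For reference, a sketch of the classical support-based baseline on XML tag sets, assuming `mlxtend` and the standard library's `xml.etree`; the file paths and thresholds are illustrative:

```python
# Mine "tag y and z often occur with tag x" rules from XML documents,
# treating each document as a transaction of its distinct tags.
import xml.etree.ElementTree as ET
import pandas as pd
from mlxtend.preprocessing import TransactionEncoder
from mlxtend.frequent_patterns import apriori, association_rules

def tag_set(path: str) -> list:
    return sorted({elem.tag for elem in ET.parse(path).getroot().iter()})

docs = ["doc1.xml", "doc2.xml"]              # hypothetical document paths
transactions = [tag_set(p) for p in docs]

te = TransactionEncoder()
onehot = pd.DataFrame(te.fit(transactions).transform(transactions), columns=te.columns_)
frequent = apriori(onehot, min_support=0.3, use_colnames=True)
rules = association_rules(frequent, metric="confidence", min_threshold=0.8)
# A rule {y, z} -> {x} means: documents containing tags y and z usually also contain x.
```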
I want to do an RL project in which the agent will learn to drop duplicates in tabular data. But I couldn't find any examples of RL being used that way - I checked whether RL-based recommendation systems use a user-item interaction matrix as in collaborative filtering. I am wondering if it's really possible and how to define the problem (e.g. whether it's episodic; does an episode terminate when the agent is done iterating over all data samples, etc.). Can …
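One possible (purely hypothetical) episodic formulation, sketched as a Gymnasium environment: the agent scans rows one at a time, chooses keep or drop, and the episode terminates after the last row. The reward shaping here assumes duplicates are known at training time and is only an illustration:

```python
import numpy as np
import gymnasium as gym
from gymnasium import spaces

class DedupEnv(gym.Env):
    """Agent sees one row at a time; action 0 = keep, 1 = drop."""

    def __init__(self, data: np.ndarray):
        self.data = data
        self.observation_space = spaces.Box(-np.inf, np.inf,
                                            shape=(data.shape[1],), dtype=np.float32)
        self.action_space = spaces.Discrete(2)

    def reset(self, *, seed=None, options=None):
        super().reset(seed=seed)
        self.i, self.kept = 0, []
        return self.data[self.i].astype(np.float32), {}

    def step(self, action):
        row = self.data[self.i]
        is_dup = any(np.array_equal(row, k) for k in self.kept)
        # +1 for the correct keep/drop decision, -1 otherwise (illustrative reward).
        reward = 1.0 if (action == 1) == is_dup else -1.0
        if action == 0:
            self.kept.append(row)
        self.i += 1
        terminated = self.i >= len(self.data)      # episode ends after the last row
        obs = self.data[min(self.i, len(self.data) - 1)].astype(np.float32)
        return obs, reward, terminated, False, {}
```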
Convert natural language text to structured data. I'm developing a bot to assist users in identifying apparel. The problem is to convert natural language text to structured data (a list of apparel items) and query the store's inventory to find the closest match for each item. For example, consider the following user input to the bot: "I would like to order regular fit blue jeans with hip size 32 inches", and the desired output would be the following: [ { "quantity": …
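As a starting point, a minimal rule-based sketch of the extraction step (the field names and vocabularies below are my own assumptions, since the desired schema in the question is truncated; a learned NER/slot-filling model would replace the lookups and regex):

```python
import re

# Hypothetical vocabularies; in practice these would come from the inventory.
FITS = ["regular fit", "slim fit", "loose fit"]
COLORS = ["blue", "black", "white", "red"]
ITEMS = ["jeans", "shirt", "jacket"]

def parse_order(text: str) -> dict:
    t = text.lower()
    item = {
        "item": next((i for i in ITEMS if i in t), None),
        "fit": next((f for f in FITS if f in t), None),
        "color": next((c for c in COLORS if c in t), None),
    }
    size = re.search(r"(hip|waist)\s+size\s+(\d+)\s*inches?", t)
    if size:
        item["size"] = {"type": size.group(1), "value": int(size.group(2)), "unit": "inches"}
    return item

print(parse_order("I would like to order regular fit blue jeans with hip size 32 inches"))
# {'item': 'jeans', 'fit': 'regular fit', 'color': 'blue',
#  'size': {'type': 'hip', 'value': 32, 'unit': 'inches'}}
```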
I wonder whether there are significant differences that ought to be known when preprocessing nominal vs ordinal vs interval vs ratio data. Intuitively, it seems like nominal values should be encoded using one-hot encoding so as not to introduce ordering assumptions artificially, and ordinal data (bad, better, best) using ordinal encoding (1, 2, 3) to preserve the order (although it does introduce a scale, effectively turning ordinal data into interval data, it appears). Also, scaling the data seems problematic - if I were to encode labels …
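A short sketch of the two encodings under discussion, using scikit-learn (assuming a recent version with `sparse_output`); the column names and category ordering are illustrative:

```python
import pandas as pd
from sklearn.preprocessing import OneHotEncoder, OrdinalEncoder

df = pd.DataFrame({
    "color": ["red", "blue", "red"],        # nominal: no inherent order
    "quality": ["bad", "best", "better"],   # ordinal: has an order
})

# One column per colour, no implied ordering.
onehot = OneHotEncoder(sparse_output=False).fit_transform(df[["color"]])

# Preserves bad < better < best, but imposes equal spacing (an interval-like scale).
ordinal = OrdinalEncoder(categories=[["bad", "better", "best"]]).fit_transform(df[["quality"]])
# ordinal -> [[0.], [2.], [1.]]
```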
I am analysing tweets and have collected them in an unstructured format. What is the best way to structure this data so I can begin the data mining process? Somebody suggested using Python packages such as spaCy, but I'm not sure how to go about using this.
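A minimal sketch of what "structuring" the tweets with spaCy might look like, assuming the `en_core_web_sm` model is installed; the choice of columns is just an example:

```python
import pandas as pd
import spacy

nlp = spacy.load("en_core_web_sm")

tweets = ["Loving the new phone from Apple!", "Traffic in London is terrible today"]

rows = []
for doc in nlp.pipe(tweets):
    rows.append({
        "text": doc.text,
        "tokens": [t.text for t in doc if not t.is_punct],
        "lemmas": [t.lemma_ for t in doc if not t.is_stop and not t.is_punct],
        "entities": [(ent.text, ent.label_) for ent in doc.ents],
    })

df = pd.DataFrame(rows)   # one row per tweet, ready for counting / mining
```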
I have a body of PDF documents of differing vintage. Our group had exported the documents as text to feed them into a natural-language parser (I think) to pull out subject-verb-predicate triples. This hasn't performed as well as hoped, so I exported the documents as XML using Acrobat Pro, hoping to capture the semantic document structure in order to pass it in as a hint to the text parser. One document looked pretty good (something like this): <TaggedPDF-doc> <bookmark-tree>...</bookmark-tree> <Sect>...</Sect> …
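One way to turn that exported XML into parser hints, sketched with the standard library's `xml.etree`: walk the tree and record, for each text run, the chain of structural tags above it (tag names beyond the snippet above are assumptions about Acrobat's output):

```python
import xml.etree.ElementTree as ET

def text_with_context(path: str):
    """Return each text run together with its ancestor tag chain, e.g. Sect > H1."""
    root = ET.parse(path).getroot()
    out = []

    def walk(elem, ancestors):
        if elem.text and elem.text.strip():
            out.append({"tags": ancestors + [elem.tag], "text": elem.text.strip()})
        for child in elem:
            walk(child, ancestors + [elem.tag])

    walk(root, [])
    return out

# for rec in text_with_context("document.xml"):
#     print(" > ".join(rec["tags"]), "|", rec["text"][:60])
```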
What are some systematic ways to categorise variables as categorical or numeric? I believe relying only on intuition in such scenarios can often lead to major irreversible errors. What are the best strategies when categorising variables? For example, the dataframe I'm working with has several categorical variables such as is_holiday, which has labels for several holidays. However, certain variables like visibility_in_miles suggest that they too may need to be treated as categorical. Part of the reason is that while most variables …
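A rough, illustrative heuristic (the thresholds are arbitrary): treat a column as categorical if it is non-numeric, or if it is numeric but takes only a handful of distinct values relative to the number of rows.

```python
import pandas as pd

def infer_kind(s: pd.Series, max_unique: int = 20, max_ratio: float = 0.05) -> str:
    if not pd.api.types.is_numeric_dtype(s):
        return "categorical"
    n_unique = s.nunique(dropna=True)
    if n_unique <= max_unique or n_unique / max(len(s), 1) <= max_ratio:
        return "categorical"   # e.g. an integer-coded is_holiday flag
    return "numeric"           # e.g. visibility_in_miles with many distinct values

# kinds = {col: infer_kind(df[col]) for col in df.columns}
```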