I'm reading the really interesting paper https://arxiv.org/pdf/1602.04938.pdf on local interpretable model explanations (LIME). On page 3, in section 3.3 "Sampling for Local Exploration", the authors mention obtaining perturbed samples $z' \in \{0,1\}^{d'}$; the text then says "we recover the sample in the original representation $z \in \mathbb{R}^{d}$ and obtain $f(z)$" with no indication of how this is done. Surely the map is not injective? If not, how would you know you recovered the correct sample? To this end, I am wondering how …
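For concreteness, here is how I currently picture the text case (my own sketch of what I think the mapping does, not code from the paper; the function name and sampling scheme are my assumptions):

```python
import numpy as np

# Minimal sketch of the text case as I understand it: the instance x is a
# list of words, z' is a binary mask over those words, and the "recovered"
# z is the original text with the masked-out words removed before calling f.
def perturb_text(x_words, rng, n_samples=10):
    d_prime = len(x_words)
    samples = []
    for _ in range(n_samples):
        z_prime = rng.integers(0, 2, size=d_prime)                  # z' in {0,1}^{d'}
        z_text = [w for w, keep in zip(x_words, z_prime) if keep]   # "recovered" z
        samples.append((z_prime, " ".join(z_text)))
    return samples

samples = perturb_text("the movie was really great".split(), np.random.default_rng(0))
# Given the fixed instance x being explained, each z' determines exactly one z,
# even though the map from R^d to {0,1}^{d'} is not injective in general.
```

In other words, my current reading is that the recovery is anchored to the specific instance being explained, so injectivity of the global map may not actually matter; I'd like confirmation of that.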
Background: It's well known that PyTorch and TensorFlow are currently the most used frameworks for Deep Learning (DL) research. As far as I know, most researchers (applied or theoretical) who contribute to the field of DL perform their experiments with PyTorch. Specifically, the level of abstraction is just right for trying custom architectures or models without having to build everything from scratch. Question: What about research in another popular field of machine learning, tree-based methods and ensembles? I am thinking …
My research question is to examine the effect of "receiving attention" from other members of an online community on "sustained participation" on the website. I decided to measure the "sustained participation" of each user by calculating the average time difference between the user's submissions. I calculated it in the following way: and I measured "attention" by calculating the total number of comments each user received across all the submissions he/she has posted. I also want to consider the total number of votes …
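Roughly, the first measure looks like the following (a minimal pandas sketch; the dataframe, `user_id`, and `submission_time` column names are placeholders for my actual data):

```python
import pandas as pd

# Hedged sketch of the "sustained participation" measure described above:
# for each user, the mean gap (in days) between consecutive submissions.
subs = pd.DataFrame({
    "user_id": [1, 1, 1, 2, 2],
    "submission_time": pd.to_datetime([
        "2021-01-01", "2021-01-05", "2021-01-12",
        "2021-02-01", "2021-02-03",
    ]),
})

avg_gap = (
    subs.sort_values("submission_time")
        .groupby("user_id")["submission_time"]
        .apply(lambda s: s.diff().dt.days.mean())   # average gap per user
)
print(avg_gap)
```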
I was exploring the field of learning analytics. A bunch of research papers focus on predicting course scores or grades. However, I was searching for work on predicting which concepts/topics students find difficult to learn or to score well on, and I did not find any research papers or datasets on this. Can someone point me to some research on "topic/concept difficulty prediction"?
I’m implementing the ‘Ternary Weights Network’ paper by Fengfu Li and Bo Zhang (arXiv link - https://arxiv.org/abs/1605.04711). I’m training a simple ConvNet with linear layers on the MNIST dataset. Without ternarization, the exact same model converges with high accuracy, but after ternarization of the linear layers, the model does not perform well at all. It either gets stuck in a local optimum (in which it predicts all the classes with equal probability of 0.1), or gets up …
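For reference, this is the threshold-based ternarization as I understand it from the paper (a sketch in PyTorch; treat the 0.7 factor and the scaling details as my reading of the paper, not verified against the authors' code):

```python
import torch

# Sketch of threshold-based ternarization as I understand it:
# delta = 0.7 * E[|W|], alpha = mean of |W| over weights above delta,
# ternary weight = alpha * sign(W) where |W| > delta, else 0.
def ternarize(w: torch.Tensor) -> torch.Tensor:
    delta = 0.7 * w.abs().mean()
    mask = (w.abs() > delta).float()
    alpha = (w.abs() * mask).sum() / mask.sum().clamp(min=1.0)
    return alpha * torch.sign(w) * mask

w = torch.randn(256, 784)
w_t = ternarize(w)
print(w_t.unique().numel())  # at most 3 distinct values: -alpha, 0, +alpha
```

One thing I'm unsure about: if the full-precision weights are overwritten by their ternary values (rather than ternarizing only in the forward pass and updating a full-precision copy, straight-through style), training tends to collapse, which might explain the equal-probability predictions.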
Reading the literature around deep learning adversarial attacks, it appears to be wholly concentrated on attacks against image classification models. Are there papers that describe attacks on non-image data? Searching arXiv for "deep learning adversarial attacks" seems to return results that are only related to the image classification field.
First of all, I am not sure whether this is the correct Stack Exchange site for this question. If not, please let me know :) My question: I am looking for a dataset pertaining to IoT activities. Ideally, it should include network traces pertaining to the activity that a device was performing at a given time. Here 'activity' means whatever the device was doing on behalf of a user (e.g., (Device, Activity): (smart camera, live streaming) | (smart switch, ON/OFF activity)). …
Currently, I am writing my research design. However, I am still undecided about which statistical tool I should use for the data analysis. I looked it up on the internet and found disparate answers to my question. I have noticed that R (programming language) and IBM SPSS (Statistical Package for the Social Sciences) are two of the recurring tools mentioned in answer to this question. So, which is better? I need some insights so I can settle …
I am reading a research paper (NLP) and came across the phrase "seed lexicon". Could someone please explain it in detail? Edit: a sample paper is "Leveraging Affective Bidirectional Transformers for Offensive Language Detection"; see the 3rd page, right column, 5th line.
I'm trying to set up a fair benchmark between various RNN models, where each of them is trained until convergence with a fixed random seed. Because the task is very costly, I am only able to run each model once and then compare their performance. By reshuffling the training set, I would change the loss surface every epoch; the result is that the models converge to a more generalized minimum. But assuming that my random seed is fixed and the training …
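To be clear about what I mean by a fixed seed, here is roughly my setup (a sketch assuming PyTorch; the generator and dataset details are placeholders, not a universal reproducibility recipe):

```python
import random
import numpy as np
import torch
from torch.utils.data import DataLoader, TensorDataset

# Pin down the sources of randomness I control.
SEED = 42
random.seed(SEED)
np.random.seed(SEED)
torch.manual_seed(SEED)

dataset = TensorDataset(torch.randn(100, 8), torch.randint(0, 2, (100,)))

# A seeded generator makes the epoch-to-epoch shuffling reproducible,
# while shuffle=False would remove reshuffling entirely.
g = torch.Generator()
g.manual_seed(SEED)
loader = DataLoader(dataset, batch_size=16, shuffle=True, generator=g)
```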
While reading some works on meta-learning, I had this doubt: can we consider the meta-features of a dataset as its embedding, given that the meta-features are a lower-dimensional representation which also tries to retain properties of the dataset? Embeddings are essentially low-dimensional representations of some high-dimensional concept. Is it fair to use "embeddings" instead of "meta-features"? Or can we use "representation" instead of "meta-features"?
I am currently conducting research on the application of deep learning to sensor signal recognition. I spent about a year and a half sifting through the literature and discovered some research patterns. To begin, I noted the emergence of Convolutional Neural Networks (CNNs): individuals applied CNNs to their problems and reported state-of-the-art outcomes. Then the LSTM was proposed; it was quickly adopted and declared state-of-the-art. Then the trend shifted; people began to use hybrid architectures and reported cutting-edge results. The current trend is to …
I am working on an NLP program to extract and analyze the topics of research papers based only on their abstracts. I would like to have a plot like this one: But when I click on a line, I would like a new page to open with a list of the top N relevant research papers, the topic-prevalence percentage next to each title, and the abstract text. All of this in Python. Do you know a way to achieve this? …
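A minimal sketch of the kind of interaction I have in mind, assuming Plotly Dash (the dataframe, column names, and topic labels below are placeholders for my real topic-model output):

```python
import pandas as pd
import plotly.express as px
from dash import Dash, dcc, html, Input, Output

# Placeholder topic-prevalence data; in my case this would come from the
# topic model run on the abstracts.
df = pd.DataFrame({
    "year": [2018, 2019, 2020] * 2,
    "prevalence": [0.10, 0.15, 0.22, 0.30, 0.25, 0.18],
    "topic": ["topic A"] * 3 + ["topic B"] * 3,
})

app = Dash(__name__)
app.layout = html.Div([
    dcc.Graph(id="topic-trend",
              figure=px.line(df, x="year", y="prevalence", color="topic")),
    html.Div(id="paper-list"),  # stand-in for the "new page" with the top-N papers
])

@app.callback(Output("paper-list", "children"), Input("topic-trend", "clickData"))
def show_papers(click_data):
    if not click_data:
        return "Click a point on a line to list its papers."
    year = click_data["points"][0]["x"]
    # Here one would look up the top-N papers for that topic/year and render
    # title, prevalence percentage, and abstract text.
    return f"Top-N papers for year {year} would be listed here."

if __name__ == "__main__":
    app.run(debug=True)
```

I don't know whether Dash is the best fit for opening a genuinely separate page per click, or whether something like Streamlit would be simpler, which is part of my question.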
IMAGINE you're a research intern and an undergraduate student. You have some experience in data science and are new to research. Your task is to conduct research on vision transformers. Oh, and by the way, you're new to transformer concepts. You first learn transformers and make your fundamentals strong. You implement and check the original vision transformer code. All fine! You even check and run current SOTA papers in vision transformers. Now you have gained enough knowledge to conduct your own …
When using or building a system for entity linking, is there a well-known measure for the "ambiguity degree" of an entity? Is there some approach to compare named entities with respect to how difficult they are to disambiguate?
I’m reading the paper Spectral Networks and Deep Locally Connected Networks on Graphs and I’m having a hard time understanding the notation shown in the picture below (the scribbles are mine): Specifically, I don’t understand the notation for the matrix F. Why does it include an i and a j?
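For context, the construction I'm referring to looks roughly like the following (rewritten from memory, so the indices and symbols may not match the paper exactly):

```latex
% Rough reconstruction of the spectral propagation rule as I recall it:
% i runs over the f_{k-1} input feature maps and j over the f_k output
% feature maps, so each pair (i, j) has its own diagonal matrix of
% learnable spectral multipliers F_{k,i,j}.
\[
x_{k+1,j} \;=\; h\!\left( V \sum_{i=1}^{f_{k-1}} F_{k,i,j}\, V^{\top} x_{k,i} \right),
\qquad j = 1,\dots,f_k
\]
```

My confusion is about why F needs both indices when the filtering itself happens in the spectral domain defined by V.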
I have a machine learning / anomaly detection problem. I have a variable Y and several other variables X. The purpose is to quantify the degree of abnormality of the data on Y, but I have to take into account the values of the other variables (the relationship between Y and X). Normally, an anomaly detection algorithm would find anomalies on the whole data (Y + X), but in my case I want to zoom in on Y because …
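To make the "zoom in on Y" idea concrete, here is one possible approach I have considered (my own illustration on synthetic data, not necessarily the right method for my case): model Y as a function of X and score abnormality by the residual.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic example: Y depends on X, plus a few points that are abnormal
# in Y *given* X (their X values are perfectly ordinary).
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
y = 2.0 * X[:, 0] - X[:, 1] + rng.normal(scale=0.3, size=500)
y[::50] += 5.0  # inject anomalies on Y conditional on X

model = LinearRegression().fit(X, y)
residuals = y - model.predict(X)
scores = np.abs(residuals - residuals.mean()) / residuals.std()  # z-score of residuals
print(np.where(scores > 3)[0])  # indices flagged as anomalous on Y given X
```

In practice I suppose the residuals should be computed out-of-sample (e.g., cross-validated predictions) so the model doesn't absorb the anomalies, but is there a more principled way to do this?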
I am trying to reproduce a research paper on a classification problem, and the authors have introduced a custom loss function that I am unable to understand. I think I have to implement equation (8) or equation (7), and I am using the TensorFlow framework, but I am not able to understand equation (8): we have to input both the actual and predicted features in the custom loss, but their answer is $k\log(\lambda+1)$. Similarly, in equation (7), they …
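For what it's worth, this is the skeleton I am working from (my own placeholder code, showing only how a custom Keras loss is plugged in; the paper-specific term from equations (7)/(8) is left as a stub because I cannot reproduce it):

```python
import tensorflow as tf

def custom_loss(y_true, y_pred):
    """Skeleton only: the paper-specific k*log(lambda + 1)-style term is a stub."""
    base = tf.keras.losses.categorical_crossentropy(y_true, y_pred)
    extra_term = 0.0  # TODO: replace with the term from Eq. (7)/(8)
    return base + extra_term

# Hypothetical tiny model just to show where the loss goes.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(32, activation="relu", input_shape=(16,)),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam", loss=custom_loss)
```

My actual question is what `y_true` and `y_pred` should contain so that the "actual and predicted features" from the paper both reach the loss.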
Suppose a classification task A, for which there exist several methods $M_1, M_2, M_3$. The task $A$ is measured by a consistent measure; for instance, if task A is binary classification, the F-score or the ROC curve can be used. I did a survey of a research area and found that $M_1$ is evaluated on dataset $D_1$ (open) using pre-processing $P_1$ only (it seems to be the seminal work), while $M_2$ is evaluated on datasets $D_1$ (open) and $D_2$ (private) and compared …
I'm in my $4^{th}$ year now and searching for a topic for my diploma in the mathematics & computer science specialization. Are there interesting problems or fields of research that I could explore? I would probably like to work in fields connected to art, e.g. music recommendation (but we have Spotify, sure). I was advised to take the topic "Authorship attribution in classical music", but I think this topic is already rather well explored.