I'm confused about the difference between "ethics" and "bias" when those concepts are discussed in the context of Machine Learning (ML). In my understanding, an ethical issue in ML is pretty much the same thing as "bias": say, the model discriminates against people of color, and that is the same as saying the model is biased. In short, "an ethical issue is always a bias, but a bias is not necessarily an ethical issue". Is this …
In the following hand-made charts I show some values over the years. In the first chart I've evenly spaced each year. In the second chart I've spaced them relative to their actual year value in time (i.e. 2016 is closer to 2017 than to 2010). Is there a terminology for the spacing of the second chart? Imagine building software with a toggle control to switch the view from A to B. What would you call it?
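The two axis treatments have reasonably standard names: the first chart uses a categorical (ordinal) axis, the second a linear (continuous) time axis, so a toggle might be labeled something like "categorical vs. linear time axis". A minimal sketch of how the x-positions differ between the two modes, using hypothetical year values:

```python
# Hypothetical year values; compute x-positions for both axis modes.
years = [2010, 2016, 2017, 2020]

# Chart A: categorical/ordinal axis - every year one step apart.
even_x = list(range(len(years)))

# Chart B: linear/continuous time axis - positions proportional to the year.
span = years[-1] - years[0]
linear_x = [(y - years[0]) / span for y in years]

print(even_x)    # [0, 1, 2, 3]
print(linear_x)  # [0.0, 0.6, 0.7, 1.0] -> 2016 lands next to 2017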
In Wikipedia, under the Recall section, it is stated that "In binary classification, recall is called sensitivity". Do the two terms differ in the case of multi-class classification?
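In binary classification, sensitivity is exactly the recall of the positive class. In a multi-class problem there is no single "positive" class, so recall is computed per class (and optionally averaged, e.g. macro or micro). A minimal pure-Python sketch with made-up labels:

```python
def per_class_recall(y_true, y_pred):
    # recall for class c = TP_c / (TP_c + FN_c)
    #                    = correct predictions of c / actual occurrences of c
    out = {}
    for c in sorted(set(y_true)):
        actual = sum(1 for t in y_true if t == c)
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        out[c] = tp / actual
    return out

y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
print(per_class_recall(y_true, y_pred))  # {0: 0.5, 1: 1.0, 2: 0.5}
```

The binary "sensitivity" is then just `per_class_recall(...)[1]` for a positive class labeled 1.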
Here's an excerpt from a paper explaining Logistic Regression. What does 'Closed Form' mean in this context? Please explain in simple terms. The definitions online are confusing. Gradient of Log Likelihood: Now that we have a function for log-likelihood, we simply need to choose the values of theta that maximize it. Unfortunately, if we try just setting the derivative equal to zero, we’ll quickly get frustrated: there’s no closed form for the maximum. However, we can find the best values …
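"No closed form" means there is no formula that returns the maximizing theta directly (unlike linear least squares, where setting the derivative to zero yields the normal equations). Instead, the maximum is found iteratively, e.g. by gradient ascent on the log-likelihood. A toy one-dimensional sketch with made-up, non-separable data:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Made-up 1-D data: (feature, label) pairs.
data = [(-2.0, 0), (-1.0, 1), (1.0, 0), (2.0, 1)]

def log_likelihood(theta):
    return sum(y * math.log(sigmoid(theta * x)) +
               (1 - y) * math.log(1 - sigmoid(theta * x))
               for x, y in data)

theta, lr = 0.0, 0.1
ll_start = log_likelihood(theta)
for _ in range(200):
    # gradient of the log-likelihood: sum of (y - sigmoid(theta*x)) * x
    grad = sum((y - sigmoid(theta * x)) * x for x, y in data)
    theta += lr * grad  # ascent step; no formula gives theta in one shot

print(log_likelihood(theta) > ll_start)  # True: the objective improved
```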
My problem is that I want to systematically handle document-internal meta-content in NLP processing, but I don't know how to find relevant resources. By meta-content I'm referring to content that exists within the documents in the corpus, but that describes/refers to other content in the document. For example, in the standard memo header:

From: Bob
To: Alice
Date: 3 March 2022
Re: Quarterly Sales Figures
<body>

the strings "From", "To", "Date", "Re" are meta-content that specify the role/function/meaning …
I am trying to classify different kinds of time series, but find myself missing good vocabulary. I don't mean that, for a given time series, I try to classify its datapoints into clusters or the like. Instead, I am looking for categories into which I can sort different time series. Two time series belonging to the same category should be apt for similar time series analysis. For instance, in some measured processes you find that essentially only two values are …
If you take the following sentence from an article on deep neural networks: "to regularize the classifier layer by estimating the marginalized effect of label-dropout during training", what does "label-dropout" mean?
People use these terms "input space", "feature space", "sample space", "hypothesis space", "parameter space" in machine learning. Could anyone explain these terms with a concrete example, such as the sklearn MNIST dataset, which has 1797 samples, 10 classes, 8*8 dimensionality, and 17 feature values? Please do NOT talk in general terms. For example, in this particular case, is the feature space a set of 17 elements {0, 1, ..., 16}?
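On the concrete question: with 8×8 images whose pixels take values in {0, …, 16}, the feature space is not a set of 17 elements. Each of the 64 pixel features takes values in {0, …, 16}, so the feature (input) space is the product set {0, …, 16}^64, and the 1797 samples are a finite subset of it. A tiny pure-Python analogue (made-up 2×2 images with 3 possible pixel values) makes the distinction concrete:

```python
from itertools import product

# Tiny analogue of the digits data: 2x2 images, pixel values in {0, 1, 2}.
pixel_values = range(3)   # the 3 possible values of ONE feature
n_pixels = 2 * 2          # dimensionality of the feature space

# The feature space is the product set {0,1,2}^4, NOT the set {0,1,2}:
feature_space = list(product(pixel_values, repeat=n_pixels))
print(len(feature_space))       # 3**4 = 81 points, each a 4-tuple

# A "sample" is one point of that space; the dataset is a finite subset.
sample = (0, 2, 1, 0)
print(sample in feature_space)  # True
```

Scaled up, the digits feature space has 17**64 points, of which the dataset shows only 1797.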
Are these two terms, "prototype" and "centroid", interchangeable? I know prototypes can be calculated using the mean of the features. Is it the same for a centroid?
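Partly, it depends on the method. A centroid is specifically the componentwise mean of the points in a cluster, so in k-means the prototype is a centroid; but "prototype" is the broader term — in k-medoids, for instance, the prototype is an actual data point (a medoid), not a mean. A minimal sketch of a centroid computation with made-up points:

```python
# Centroid = componentwise mean of the feature vectors in the cluster.
points = [(1.0, 2.0), (3.0, 4.0), (5.0, 0.0)]
centroid = tuple(sum(coord) / len(points) for coord in zip(*points))
print(centroid)  # (3.0, 2.0) - need not coincide with any actual point
```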
I would like to know the difference in terms of applications (e.g., which one is credit card fraud detection?) and in terms of the techniques used. Example papers which define the tasks would be welcome.
So I understand what "layers" are. If you have 5 layers in your model, your data basically gets transformed 5 times via 5 activation functions. The number of "neurons" within a layer dictates how many outputs a layer creates. So what are "cells"? I never understood where "cells" come into play. Are they a collection of layers? Per Wiki: https://en.wikipedia.org/wiki/Long_short-term_memory If the orange blocks are layers, then I would imagine each has a bunch of neurons. So a cell is a …
I have a few thousand grayscale images, and I would like to generate a universal representation of the patterns within - a semantic/ordered composition of all features, so to speak. For instance, take 10000 images of a dog and draw the archetypical dog. Does this task have a technical name, and is there a method out there specifically for such purposes? I guess this is similar to what happens during the training of a neural network. I just don't necessarily need …
Concept drift is when the relation between the input data and the target variable changes over time, e.g. changes in the conditional distribution. Is novelty an outlier? What should I think of it as? What is the difference between concept drift, novelty, and anomaly? Is concept drift considered a type of novelty? How exactly? Can you please explain?
Scikit-learn GradientBoostingRegressor: I was looking at the scikit-learn documentation for GradientBoostingRegressor. It says that we can use 'ls' as a loss function, which is least squares regression. But I am confused, since least squares regression is a method to minimize the SSE loss function. So shouldn't they mention SSE here?
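One way to see why 'ls' (least squares, i.e. the squared-error loss) versus "SSE" is not a real conflict: SSE and MSE differ only by the constant factor 1/n, so they are minimized by the same predictions, and the fitted model is identical either way. A tiny pure-Python check with made-up numbers:

```python
# Scaling a loss by a positive constant does not move its argmin.
target = [1.0, 1.2, 0.8]
candidates = [0.0, 0.5, 1.0, 1.5, 2.0]  # hypothetical constant predictions

def sse(c):
    return sum((t - c) ** 2 for t in target)

def mse(c):
    return sse(c) / len(target)

best_by_sse = min(candidates, key=sse)
best_by_mse = min(candidates, key=mse)
print(best_by_sse, best_by_mse)  # 1.0 1.0 - same minimizer either way
```

(Note also that in recent scikit-learn versions the same loss is spelled `loss='squared_error'`.)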
I always get lost when dealing with these terms, especially when asked about relationships such as underfitting-high bias (low variance) or overfitting-high variance (low bias). Here is my argument. From wiki: In statistics, overfitting is "the production of an analysis that corresponds too closely or exactly to a particular set of data, and may therefore fail to fit additional data or predict future observations reliably". An overfitted model is a statistical model that contains more parameters than …
Let's say we have a time series $\{{\bf x}_i\}$ of features and are trying to learn to predict a time series $\{t_i\}$ using a neural network. Our goal is to be able to predict the time series value $t$ for both tomorrow and the day after tomorrow given the features ${\bf x}$ observed today. My initial thoughts are: If our goal is to be able to predict $t$ tomorrow given features ${\bf x}$ observed today, we would make our (training, …
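One common setup, whatever model sits on top, is to build supervised pairs by shifting the target series by the forecast horizon: pairs $({\bf x}_i, t_{i+1})$ for one-day-ahead and $({\bf x}_i, t_{i+2})$ for two-days-ahead (trained either as two models or one model with two outputs). A minimal sketch with made-up series:

```python
# Made-up feature and target series; index i = day i.
x = [10, 11, 12, 13, 14]        # features observed on day i
t = [0.1, 0.2, 0.3, 0.4, 0.5]   # target value on day i

# horizon-1 pairs: (x_i, t_{i+1}); horizon-2 pairs: (x_i, t_{i+2}).
pairs_h1 = [(x[i], t[i + 1]) for i in range(len(t) - 1)]
pairs_h2 = [(x[i], t[i + 2]) for i in range(len(t) - 2)]

print(pairs_h1[0])  # (10, 0.2): today's features -> tomorrow's target
print(pairs_h2[0])  # (10, 0.3): today's features -> day-after's target
```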
I am looking for a word for the quantity of "association" between a variable x value and a classification class. For example, let's imagine we have two classes $A$ and $B$ and two inputs $\mathbf{x}_A$ and $\mathbf{x}_B$ from each class respectively. The probability of membership in each class varies roughly linearly between these two inputs. I want a word to say the "association" of the input $\mathbf{x}_C = a \mathbf{x}_A + b \mathbf{x}_B$ with $A$ is determined by $a$ …
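Under the stated linearity assumption (with $a + b = 1$), the "association" of the blend with class $A$ is simply the mixing coefficient $a$; common words for such a quantity are "class score", "membership score/degree", or, if calibrated, "class probability". A minimal sketch with made-up exemplars:

```python
# Hypothetical exemplars of the two classes.
xA = (0.0, 1.0)   # a point from class A
xB = (4.0, -1.0)  # a point from class B

def blend(a):
    # xC = a*xA + (1-a)*xB; under the linearity assumption, the
    # membership score of xC for class A is just a.
    b = 1.0 - a
    return tuple(a * ca + b * cb for ca, cb in zip(xA, xB))

xC = blend(0.75)
print(xC)  # (1.0, 0.5), with class-A score 0.75
```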
I am interested in one of the master's programs in Data Science. In the application process I need to submit an essay of 1,000 words about one of the following topics:

- Drawbacks of the probabilistic method
- A mathematical theory for Occam's razor
- What drives mathematics in the 21st century?

I have a background in numerical analysis and I do not have any idea about how to tackle these topics. I would appreciate it if someone familiar with one of these …
Precision is the fraction of retrieved instances that are relevant, while recall (also known as sensitivity) is the fraction of relevant instances that are retrieved. I know their meaning, but I don't know why it is called "recall". I am not a native speaker of English. I know "recall" means "remember", but I don't see the relevance of that meaning to this concept! Maybe "coverage" would have been better, because it shows how many instances were covered... or some other term. Moreover, sensitivity is …
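For reference, the two quantities in the retrieval phrasing above, computed on made-up sets of document IDs:

```python
# Made-up retrieval outcome.
relevant  = {"d1", "d2", "d3", "d4"}   # what SHOULD have been found
retrieved = {"d2", "d3", "d5"}         # what the system actually returned

hits = relevant & retrieved
precision = len(hits) / len(retrieved)  # fraction of retrieved that are relevant
recall    = len(hits) / len(relevant)   # fraction of relevant that are retrieved
print(precision, recall)  # 0.666... and 0.5
```

The name makes a little more sense in this setting: recall measures how much of the relevant material the system managed to "call back" from the collection.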