I have a regression problem I'm trying to build a model for: predicting sales per person (>= 0) depending on some variables. I'm running different model types and gave deep neural networks a try. The loss functions I'm using are mean squared error and mean absolute error (or sometimes a mix). I often run into this issue, though: despite MSE and MAE being optimized, I end up with a very strong bias in the prediction, e.g. sum(training_all_predictions) / …
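A minimal sketch of how one might quantify that kind of bias, with hypothetical y_true / y_pred arrays standing in for the training targets and predictions:

import numpy as np

# Hypothetical stand-ins for the training-set targets and predictions.
y_true = np.array([120.0, 0.0, 340.0, 80.0, 510.0])
y_pred = np.array([150.0, 20.0, 360.0, 110.0, 540.0])

# A sum ratio far from 1 and a mean signed error far from 0 both point
# to a systematic over- or under-prediction rather than random noise.
sum_ratio = y_pred.sum() / y_true.sum()
mean_signed_error = (y_pred - y_true).mean()
print(f"sum ratio: {sum_ratio:.3f}, mean signed error: {mean_signed_error:.2f}")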
I'm confused about the difference between "ethics" and "bias" when those concepts are discussed in the context of Machine Learning (ML). In my understanding, an ethical issue in ML is pretty much exactly the same thing as "bias": say, the model discriminates against people of color, and this is the same as saying that the model is biased. In short, "an ethical issue is always a bias, but it is not necessarily true that a bias is always an ethical issue". Is this …
I have this simple model, which tries to predict the constant $[1, 1, \dots, 1, 0, \dots, 0]$ vector regardless of input. I found that the model predicts it successfully if trained on input in the $[0,10]$ range; however, the model's predictions are always $[0, \dots, 0]$ vectors if it is trained on input in the $[750, 770]$ range. I was thinking the model should converge to high bias weights and still be able to predict the constant vector even for larger training inputs. Maybe anyone can advise what …
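One likely culprit, sketched below on the assumption that the inputs are fed in raw, is the input scale itself; standardizing makes the $[750, 770]$ range look like the $[0,10]$ case to the network:

import numpy as np

# Hypothetical inputs in the problematic range.
X_raw = np.random.uniform(750, 770, size=(256, 4))

# Standardize per feature: zero mean, unit variance. With raw inputs
# near 760, first-layer pre-activations start huge and gradients can
# vanish or explode; after this rescaling they start near zero.
X = (X_raw - X_raw.mean(axis=0)) / X_raw.std(axis=0)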
It is given that: $$\text{MSE} = \text{Bias}^2 + \text{Variance}$$ I can see the mathematical relationship between MSE, bias, and variance. However, how do we understand the mathematical intuition of bias and variance for classification problems (we can't use MSE for classification tasks)? I would like some help with the intuition and with understanding the mathematical basis for bias and variance in classification problems. Any formula or derivation would be helpful.
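For 0-1 loss there is a standard analogue (following Domingos, 2000). With $y^*(x)$ the optimal prediction and $\bar y(x)$ the main prediction (the majority vote of $\hat y_D(x)$ over training sets $D$), one can define $$\mathrm{Bias}(x) = \mathbb{1}\big[\bar y(x) \neq y^*(x)\big], \qquad \mathrm{Var}(x) = P_D\big(\hat y_D(x) \neq \bar y(x)\big),$$ and for a noise-free two-class problem the expected 0-1 loss decomposes as $$\mathbb{E}_D\big[\mathbb{1}[\hat y_D(x) \neq y^*(x)]\big] = \mathrm{Bias}(x) + c\,\mathrm{Var}(x),$$ with $c = 1$ on unbiased points and $c = -1$ on biased points, so variance can even reduce the loss where the model is already biased.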
I have trained an XGBClassifier to classify text issues to the rightful assignee (a simple 50-way classification). The source from which I am fetching the data also provides a datetime object giving the timestamp at which the issue was created. Logically, a person who has recently worked on an issue (say 2 weeks ago) should be a better suggestion than (another) person who worked on a similar issue 2 years ago. That is, if there are two examples from …
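One way to encode that recency preference, sketched here with hypothetical toy data and an assumed half-life, is to pass per-example time-decay weights to XGBClassifier.fit via sample_weight:

import numpy as np
from xgboost import XGBClassifier

# Hypothetical toy features/labels standing in for the text issues.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(4, 8))
y_train = np.array([0, 1, 0, 1])

# Days since each training issue was created (2 weeks ... 2 years).
issue_age_days = np.array([14, 730, 90, 365])
half_life = 180.0  # assumption: an example's weight halves every ~6 months

# Exponential decay: recent issues weigh close to 1, old ones much less.
sample_weight = 0.5 ** (issue_age_days / half_life)

model = XGBClassifier(n_estimators=20)
model.fit(X_train, y_train, sample_weight=sample_weight)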
I wonder how to check whether the protected variables in a fairness setting are encoded in the other (non-protected) features, or whether they are not sufficiently correlated with the target variable, so that adding them does not improve prediction (classification) performance. If there is a Python tutorial showing that, it would be useful. Regards,
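A common check for the first question, sketched here on synthetic data, is to see how well the protected attribute itself can be predicted from the non-protected features; an AUC well above 0.5 suggests it is encoded in them:

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic stand-ins: X_other are the non-protected features and
# `protected` is a binary protected attribute leaked through feature 0.
rng = np.random.default_rng(0)
X_other = rng.normal(size=(500, 6))
protected = (X_other[:, 0] + 0.5 * rng.normal(size=500) > 0).astype(int)

auc = cross_val_score(LogisticRegression(), X_other, protected,
                      cv=5, scoring='roc_auc').mean()
print(f"proxy-check AUC: {auc:.3f}  (about 0.5 means not recoverable)")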
I read this and have an ambiguity. I am trying to understand how to calculate the derivative of the loss w.r.t. the bias. In that question, we have this definition: np.sum(dz2, axis=0, keepdims=True) Then in Casper's comment, he said that the derivative of $L$ (loss) w.r.t. $b$ is the sum of the rows: $$ \frac{\partial L}{\partial Z} \times \mathbf{1} = \begin{bmatrix} . &. &. \\ . &. &. \end{bmatrix} \begin{bmatrix} 1\\ 1\\ 1\\ \end{bmatrix} $$ But actually, using axis=0, is it not …
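A quick NumPy check of what axis=0 actually sums, with an illustrative 2x3 matrix: summing over axis=0 collapses the sample (row) axis, which in matrix form is $\mathbf{1}^T \frac{\partial L}{\partial Z}$ rather than $\frac{\partial L}{\partial Z}\,\mathbf{1}$:

import numpy as np

# Illustrative dL/dZ with shape (n_samples, n_units) = (2, 3).
dZ = np.array([[1., 2., 3.],
               [4., 5., 6.]])

# axis=0 sums down the columns, one result per bias unit: [[5., 7., 9.]].
db_axis0 = np.sum(dZ, axis=0, keepdims=True)

# Matrix form matching axis=0: the ones vector hits the sample axis.
db_matmul = np.ones((1, 2)) @ dZ

assert np.allclose(db_axis0, db_matmul)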
I have this code:

X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.3, random_state=1)
model = LinearRegression().fit(X_train, y_train)
from mlxtend.evaluate import bias_variance_decomp
print(y_train.min(), y_train.max(), y_test.min(), y_test.max())
# for your understanding of the data: 7283 517924 11510 450000
avg_expected_loss, avg_bias, avg_var = bias_variance_decomp(
    model, X_train, y_train.ravel(), X_test, y_test.ravel(),
    loss='mse', random_seed=1)
print('Average expected loss: %.3f' % avg_expected_loss)
print('Average bias: %.3f' % avg_bias)
print('Average variance: %.3f' % avg_var)

The result is:

Average expected loss: 542162695.679
Average bias: 529311955.129
Average variance: 12850740.550

To me, these values …
Suppose I have an input matrix $\mathbf X\in \mathbb R^{(D+1)\times N}$, where $N$ is the number of samples, $D$ is the dimension of an input vector $x$, and the extra dimension is for the bias, where all bias entries are $1$. If I want to normalize all inputs by subtracting the mean and dividing by the standard deviation, how should I handle the bias entries? Should they stay the same as $1$?
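One common convention, sketched below on hypothetical data, is to standardize only the $D$ feature rows and leave the bias row untouched; a constant row has zero standard deviation, so it cannot be standardized anyway:

import numpy as np

# Hypothetical (D+1) x N design matrix: D=3 feature rows plus a ones row.
D, N = 3, 5
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc=10.0, scale=4.0, size=(D, N)),
               np.ones((1, N))])

# Standardize each feature row across samples; skip the bias row.
mu = X[:D].mean(axis=1, keepdims=True)
sigma = X[:D].std(axis=1, keepdims=True)
X[:D] = (X[:D] - mu) / sigma

print(X[-1])  # still [1. 1. 1. 1. 1.]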
I was wondering if I can visualize, with an example, the fact that for all points $x$ on the separating hyperplane, the following equation holds: $$w^T x+w_0=0\quad\quad\quad \text{... equation (1)}$$ Here, $w$ is a weight vector and $w_0$ is a bias term (related to the perpendicular distance $|w_0|/\lVert w\rVert$ of the separating hyperplane from the origin) defining the separating hyperplane. I was trying to visualize this in 2D space. In 2D, the separating hyperplane is nothing but the decision boundary. So, I took the following example: $w=[1\quad 2], …
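A small plotting sketch of equation (1) in 2D, using $w = [1\ \ 2]$ from the example and a hypothetical $w_0 = -4$ (the actual value is cut off above):

import numpy as np
import matplotlib.pyplot as plt

w = np.array([1.0, 2.0])
w0 = -4.0  # hypothetical bias term; the original value is truncated

# Points on the boundary satisfy w @ x + w0 = 0, i.e.
# x2 = -(w[0] * x1 + w0) / w[1].
x1 = np.linspace(-2.0, 6.0, 100)
x2 = -(w[0] * x1 + w0) / w[1]

plt.plot(x1, x2, label=r'$w^T x + w_0 = 0$')
# w is normal to the boundary; draw it from the on-line point (2, 1).
plt.quiver(2.0, 1.0, w[0], w[1], angles='xy', scale_units='xy', scale=1)
plt.axis('equal')
plt.legend()
plt.show()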
I am reading the paper "Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings" (here is the pdf). On page 6, we read: "Step 1: Identify gender subspace. Inputs: word sets W, defining sets D_1, ..., D_m." However, the paper, both before and after this statement, does not say what these defining sets are. Can anyone give me a definition or description of these sets? Thank you.
I'm currently working on a binary classification problem. My training dataset is rather small, with only 1000 elements. (I don't know if it is relevant: my problem is similar to the "spam filtering" problem, where an item can also be "likely" to be categorized as spam, but I simplified it into a black-or-white issue and use the probability given by the models to assign a likelihood score.) Among those 1000 elements, 70% are from class 1 …
Based on the Deep Learning book: $$\mathrm{MSE} = \mathbb{E}\big[(\hat\theta_m - \theta)^2\big] = \mathrm{Bias}(\hat\theta_m)^2 + \mathrm{Var}(\hat\theta_m)$$ where $m$ is the number of samples in the training set, $\theta$ is the actual parameter, and $\hat\theta_m$ is the estimated parameter. I can't get to the second equation. Further, I don't understand how the first expression is obtained. Note: $\mathrm{Bias}(\hat\theta_m) = \mathbb{E}[\hat\theta_m] - \theta$. Also, how are bias and variance evaluated in classification?
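For the missing step: add and subtract $\mathbb{E}[\hat\theta_m]$ inside the square and expand; the cross term vanishes because $\mathbb{E}\big[\hat\theta_m - \mathbb{E}[\hat\theta_m]\big] = 0$: $$\mathbb{E}\big[(\hat\theta_m - \theta)^2\big] = \mathbb{E}\big[(\hat\theta_m - \mathbb{E}[\hat\theta_m])^2\big] + 2\big(\mathbb{E}[\hat\theta_m] - \theta\big)\,\mathbb{E}\big[\hat\theta_m - \mathbb{E}[\hat\theta_m]\big] + \big(\mathbb{E}[\hat\theta_m] - \theta\big)^2 = \mathrm{Var}(\hat\theta_m) + \mathrm{Bias}(\hat\theta_m)^2.$$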
I often read that the amount of data needed to train a deep learning algorithm and obtain a model that generalizes is much higher than for, e.g., a support vector machine. This makes sense, because of the huge number of parameters in a deep learning approach, which potentially leads to overfitting. However: are there any systematic studies on this? Do deep learning approaches really need more data? Best regards, Gesetzt
I have a regression model with a train MAPE of 6% and a test MAPE of 15%. This appears to me to be a clear case of overfitting. But can I still use this model, assuming a 15% error is not a bad number after all? Is there a flaw in this thinking?
This topic confuses me. In the literature and articles, when talking about bias and variance in machine learning, specifically in cross-validation, do they refer to high bias (underfitting) and high variance (overfitting) in the model? Or do they refer to the bias and variance of the predictions obtained across the iterations of cross-validation? How should each case be handled?
I am building some ML models (RF, RNN, MLP) to predict a time series value 'y' based on features 'X', not on the time series 'y' itself. My question concerns the bias I might be introducing, since I am doing a simple random train-test split for the fit and evaluation process; I am therefore using data from different days (past and future) rather than splitting by time. Is it valid for this prediction process, or even that I am not …
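A time-ordered alternative to the random split, sketched with scikit-learn's TimeSeriesSplit on toy data, trains only on the past and validates on the future, avoiding the look-ahead leakage described here:

import numpy as np
from sklearn.model_selection import TimeSeriesSplit

# Hypothetical feature matrix ordered by time (oldest rows first).
X = np.arange(20).reshape(10, 2)
y = np.arange(10)

tscv = TimeSeriesSplit(n_splits=3)
for train_idx, test_idx in tscv.split(X):
    # Every test fold lies strictly after its training fold in time.
    print("train:", train_idx, "test:", test_idx)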
My goal is to calculate backpropagation (especially the backpropagation of the bias). For example, X, W and B are Python NumPy arrays, such as [[0,0],[0,1]], [[5,5,5],[10,10,10]] and [1,2,3] respectively. And suppose dL/dY is [[1,2,3],[4,5,6]]. How do I calculate dL/dB? The answer should be [5, 7, 9]. Why is it calculated that way?
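A worked check with the exact values from the question: since the same B is broadcast onto every row of X @ W, each of the N rows of dL/dY contributes to dL/dB, so the gradient sums over the batch axis:

import numpy as np

X = np.array([[0, 0], [0, 1]])
W = np.array([[5, 5, 5], [10, 10, 10]])
B = np.array([1, 2, 3])

Y = X @ W + B          # forward pass: B is added to every sample (row)
dY = np.array([[1, 2, 3], [4, 5, 6]])  # given dL/dY

# B appears once per sample, so its gradient accumulates across rows.
dB = np.sum(dY, axis=0)
print(dB)  # [5 7 9]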
I am trying to understand this learning curve of a classification problem, but I am not sure what to infer. I believe that I have overfitting, but I cannot be sure. There is a very low training loss that is very slightly increasing upon adding training examples, and a gradually decreasing validation loss (without flattening) upon adding training examples. However, I do not see any gap at the end of the lines, something that can usually be found in an overfitting model. On the other hand, …
I am trying to predict the next 10 days by looking at the last 60 days, so I tried to implement an LSTM layer. Before jumping into the question, I want to clarify a few points. Firstly, this is a Multiple Parallel Input and Multi-Step Output problem, as described in the link. I collected the data of the last 5 years of all funds available in my country from this address. I refined my data as much as possible. Of …
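A minimal Keras sketch of a Multiple Parallel Input, Multi-Step Output LSTM under assumed shapes (60 days in, 10 days out, 5 hypothetical parallel fund series; the real feature count depends on the collected data):

import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

n_in, n_out, n_feat = 60, 10, 5         # assumed window sizes / series count
X = np.random.rand(100, n_in, n_feat)   # dummy input windows
y = np.random.rand(100, n_out, n_feat)  # dummy multi-step targets

model = keras.Sequential([
    layers.LSTM(64, input_shape=(n_in, n_feat)),
    layers.Dense(n_out * n_feat),
    layers.Reshape((n_out, n_feat)),    # one 10-step forecast per series
])
model.compile(optimizer='adam', loss='mse')
model.fit(X, y, epochs=2, verbose=0)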