I am working on a data science project on an industrial machine. This machine has two heating infrastructures (fuel and electricity). It uses both heating systems at the same time, and I am trying to estimate the temperature reading at the thermocouple that results from this heating. However, the heating takes effect with some delay/lag. In other words, a one-unit change in fuel or electrical heating is reflected in the thermocouple only hours later. …
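One common way to capture such a delay, assuming hourly data in a pandas DataFrame with hypothetical column names, is to add lagged copies of the heating inputs as extra regressors. A minimal sketch:

```python
import pandas as pd

# Hypothetical hourly data: fuel and electric heating inputs plus the
# thermocouple reading we want to predict.
df = pd.DataFrame({
    "fuel": [10, 12, 11, 13, 14, 15],
    "electric": [5, 5, 6, 7, 7, 8],
    "temp": [200, 202, 205, 207, 210, 213],
})

# Add lagged versions of the inputs so a model can learn the delayed effect,
# e.g. the value of each input 1, 2 and 3 hours ago.
for lag in (1, 2, 3):
    df[f"fuel_lag{lag}"] = df["fuel"].shift(lag)
    df[f"electric_lag{lag}"] = df["electric"].shift(lag)

# The first rows have no history, so drop them before fitting a model.
df = df.dropna()
print(df)
```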
What is the easiest and most easily explained feature-importance calculation for linear regression? I know I can use SHAP to compute feature importance, but I find it difficult to explain to stakeholders, and the raw coefficient is not a good measure of feature importance since it depends on the scale of the feature. Some suggest (standard deviation of feature) × (feature coefficient) as a good measure of feature importance.
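A minimal sketch of that "coefficient × standard deviation" idea with scikit-learn, on synthetic data (purely illustrative):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3)) * [1.0, 10.0, 0.1]   # features on very different scales
y = X @ np.array([2.0, 0.3, 20.0]) + rng.normal(size=200)

model = LinearRegression().fit(X, y)

# Raw coefficients depend on the feature scale ...
print("coefficients:", model.coef_)

# ... while coef * std(feature) measures the change in y for a "typical"
# (one standard deviation) change in the feature, which is easier to compare.
importance = model.coef_ * X.std(axis=0)
print("coef * std  :", importance)
```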
(How) can absolute or relative contributions be calculated for a multiplicative (log-log) model? Relative contributions from a linear (additive) model E.g., there are 3 contributors to $y$ (given by the three additive terms): $$y = \beta_1 x_{1} + \beta_2 x_{2} + \alpha$$ In this case, I would interpret the absolute contribution of $x_1$ to $y$ to be $\beta_1 x_{1}$, and the relative contribution of $x_1$ to $y$ to be: $$\frac{\beta_1 x_{1}}{y}$$ (assuming everything is positive) Relative contributions from a log-log …
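As a made-up numerical illustration of the additive case (numbers chosen only for the arithmetic): with $\beta_1 = 2$, $x_1 = 3$, $\beta_2 = 1$, $x_2 = 4$, $\alpha = 0$,

$$y = 2\cdot 3 + 1\cdot 4 = 10, \qquad \frac{\beta_1 x_1}{y} = \frac{6}{10} = 0.6, \qquad \frac{\beta_2 x_2}{y} = \frac{4}{10} = 0.4,$$

so the relative contributions sum to 1 when the intercept is zero and everything is positive.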
What are the SHAP values for a linear model? It is given as below in the documentation: Assuming features are independent leads to interventional SHAP values which for a linear model are coef[i] * (x[i] - X.mean(0)[i]) for the ith feature. Can someone explain to me how this is derived, or direct me to a resource explaining it?
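A small numeric check of that formula, assuming a plain scikit-learn linear regression (the shap library itself is not needed to reproduce the values): the per-feature contributions coef[i] * (x[i] - X.mean(0)[i]) sum to the difference between the prediction for x and the average prediction over X.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + 3.0 + rng.normal(scale=0.1, size=100)

model = LinearRegression().fit(X, y)
x = X[0]

# Interventional SHAP values for a linear model (per the shap documentation).
phi = model.coef_ * (x - X.mean(axis=0))

# Local accuracy: contributions sum to prediction minus the mean prediction.
print(phi.sum())
print(model.predict(x.reshape(1, -1))[0] - model.predict(X).mean())
```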
I have 2 variables that represent 2 different tests. I would like to multiply by 0.2 with conditions. If test1 is available and test2 is not (the dataset would show NA), use test1. Same condition for test2: if test2 is available and test1 is not, use test2. If both test1 and test2 are available, then use the minimum of the two values. Below is my formula, is there a more accurate one? $$ \text{total score}_i = 0.2 \times \begin{cases} \text{test1} & \text{test1} > 0 \\ \text{test2} & \text{test2} > 0 \\ \min(\text{test1}, \text{test2}) & \text{test1 and test2} > 0 \end{cases} $$
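A minimal pandas/numpy sketch of that rule (hypothetical column names, using the "both available → minimum" logic described above):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "test1": [80, np.nan, 60, np.nan],
    "test2": [70, 90, np.nan, np.nan],
})

# If both tests are present take the minimum, otherwise take whichever one
# exists (row-wise min with skipna=True does exactly this); rows with
# neither test stay NaN.
chosen = df[["test1", "test2"]].min(axis=1, skipna=True)

df["total_score"] = 0.2 * chosen
print(df)
```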
Is it possible to understand why a Lasso model eliminated specific coefficients? During modelling, many of the highly correlated features in the data are being eliminated by the Lasso regression. Is it possible to say precisely why these features are being eliminated from the model (is it the presence of other features, multicollinearity, etc.)? I want to explain the Lasso model's behaviour. Your help is highly appreciated.
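One way to see how the elimination happens, sketched with scikit-learn's lasso_path on made-up correlated features: as the penalty grows, one of two nearly identical columns typically keeps a nonzero coefficient while the other is driven to exactly zero.

```python
import numpy as np
from sklearn.linear_model import lasso_path

rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.05, size=n)   # almost a copy of x1
x3 = rng.normal(size=n)
X = np.column_stack([x1, x2, x3])
y = 2 * x1 + 0.5 * x3 + rng.normal(scale=0.5, size=n)

# Coefficients along a grid of regularization strengths.
alphas, coefs, _ = lasso_path(X, y, n_alphas=20)

for alpha, coef in zip(alphas, coefs.T):
    print(f"alpha={alpha:.3f}  coefs={np.round(coef, 2)}")
```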
I have a model to predict energy consumption in a food processing plant. Different food products are produced in the plant. My model is given as Energy consumption (kWh) = alpha0 + alpha1(Food Item A produced in kg) + alpha2(Food Item B produced in kg) + alpha3(Food Item C produced in kg) + ... + other variables. Since different product categories have different energy intensities, I would like to add that detail to the model. Can I derive the feature importance of the total production on the …
I want to create a model for a food processing plant where my dependent variable is electricity consumption (kWh) per kg. The plant produces different food items with varying electricity consumption. I'm interested in knowing the impact of the proportion of each food item on consumption per kg, so my model is: consumption per kg produced (kWh/kg) = alpha0 + alpha1(Food Item A / Total Production) + alpha2(Food Item B / Total Production) + ... + other variables. Is it correct to frame the question like this? I have Total …
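A minimal sketch of constructing those proportion features, assuming hypothetical columns for per-item production and total energy (two items only, so the numbers are purely illustrative):

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

df = pd.DataFrame({
    "item_A_kg": [100, 200, 150],
    "item_B_kg": [300, 100, 250],
    "energy_kwh": [500, 400, 550],
})

# Dependent variable: energy per kg produced.
df["total_kg"] = df["item_A_kg"] + df["item_B_kg"]
df["kwh_per_kg"] = df["energy_kwh"] / df["total_kg"]

# Regressors: proportion of each item in total production.
df["share_A"] = df["item_A_kg"] / df["total_kg"]
df["share_B"] = df["item_B_kg"] / df["total_kg"]

# The shares sum to 1, so one of them must be dropped to avoid
# perfect collinearity with the intercept.
model = LinearRegression().fit(df[["share_A"]], df["kwh_per_kg"])
print(model.intercept_, model.coef_)
```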
This is the first time I have posted here. I am looking for some feedback or perspective on this question. To keep it simple, let's just talk about linear models. We know that the maximum-likelihood solution with an $\ell_1$ penalty on the parameters is the same as the Bayesian MAP estimate with a Laplace prior on each parameter. I'll show it here for convenience. For a vector $Y$ with $n$ observations, matrix $X$, parameters $\beta$, and noise $\epsilon$, $$Y = X\beta + \epsilon,$$ the …
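For reference, a sketch of the standard argument, under the usual assumptions of Gaussian noise $\epsilon \sim N(0, \sigma^2 I)$ and independent Laplace priors $p(\beta_j) \propto e^{-|\beta_j|/b}$:

$$\hat\beta_{\text{MAP}} = \arg\max_\beta \,\big[\log p(Y \mid X, \beta) + \log p(\beta)\big] = \arg\min_\beta \,\frac{1}{2\sigma^2}\|Y - X\beta\|_2^2 + \frac{1}{b}\sum_j |\beta_j|,$$

which, after rescaling, is the $\ell_1$-penalized least-squares objective with $\lambda = \sigma^2 / b$.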
I am looking for ideas not only to solve the least-squares problem, but also to force the errors to be roughly similar in size. One idea I had is to add the variance of the errors to the classical ordinary least squares objective. My criterion with respect to the matrix $A$, with $x$ and $y$ being vectors, would be as follows: $$ J(A) = \mu_e + \lambda\sigma_e $$ where $$ \mu_e = \|Ax - y\|^2 = \sum_i e_i = \sum_i \|Ax_i - y_i\|^2 $$ and $$ \sigma_e = \sum_i (e_i - \mu_e)^2. $$ A …
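A rough numerical sketch of minimizing such a criterion, under one reading of the notation above ($e_i$ as the squared residual of observation $i$, $\mu_e$ as their mean, $\sigma_e$ as their variance), using scipy.optimize.minimize:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(size=100)

lam = 5.0  # weight on the spread-of-errors term

def criterion(w):
    e = (X @ w - y) ** 2             # squared residual of each observation
    return e.mean() + lam * e.var()  # mean error plus penalty on its variance

w0 = np.zeros(X.shape[1])
res = minimize(criterion, w0)

print("plain least squares:", np.linalg.lstsq(X, y, rcond=None)[0])
print("variance-penalized :", res.x)
```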
I have a dataset with multiple classes (< 20) which I want to classify with reference to one of the classes. The final goal is to extract the variables of importance that are useful to distinguish each of the classes from the reference. If it helps to frame the question, an example would be to classify different cancer types against a single healthy tissue and determine which features are important for the classification of each tumour. My first naive approach would …
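One simple baseline matching that goal, sketched with scikit-learn on synthetic data: fit a separate penalized logistic regression of each class versus the reference class and inspect its coefficients as rough importances.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n_per_class, n_features = 50, 10
classes = ["healthy", "tumour_A", "tumour_B"]

# Synthetic data: each class gets its own overall mean shift (purely illustrative).
X = np.vstack([rng.normal(loc=i, size=(n_per_class, n_features)) for i in range(3)])
labels = np.repeat(classes, n_per_class)

reference = "healthy"
for cls in classes:
    if cls == reference:
        continue
    # Keep only the reference class and this class, then fit a binary model.
    mask = np.isin(labels, [reference, cls])
    model = LogisticRegression(penalty="l1", solver="liblinear", C=1.0)
    model.fit(X[mask], labels[mask] == cls)
    print(cls, "top features:", np.argsort(-np.abs(model.coef_[0]))[:3])
```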
Is there any tool to visualize the feasible region of a given set of linear constraints (equalities and inequalities)? If not, can anyone suggest a way to visualize it? If I am going to do it myself in Python, which libraries should I use? I have found sympy, but I couldn't get it to draw inequalities or to draw only the intersections. I have also found Wolfram, but I could only see pre-built visualizations and not visualize my own system. Can …
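In case it helps, a minimal 2D sketch using only numpy and matplotlib: evaluate the inequalities on a grid and shade the points where all of them hold (the constraints here are made up).

```python
import numpy as np
import matplotlib.pyplot as plt

# Grid over the plotting window.
x, y = np.meshgrid(np.linspace(-1, 6, 400), np.linspace(-1, 6, 400))

# Example system: x >= 0, y >= 0, x + y <= 5, 2x + y <= 8.
feasible = (x >= 0) & (y >= 0) & (x + y <= 5) & (2 * x + y <= 8)

# Shade the feasible region and draw the boundary lines of the constraints.
plt.imshow(feasible, extent=(-1, 6, -1, 6), origin="lower",
           cmap="Greys", alpha=0.3, aspect="auto")
plt.plot([0, 5], [5, 0], label="x + y = 5")
plt.plot([0, 4], [8, 0], label="2x + y = 8")
plt.xlim(-1, 6); plt.ylim(-1, 6)
plt.legend()
plt.show()
```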
Does the applicability of R-squared to non-linear models depend on how we calculate it? $R^2 = \frac{SS_{exp}}{SS_{tot}}$ is going to be an inadequate measure for non-linear models, since an increase in $SS_{exp}$ doesn't necessarily mean that the unexplained variance is decreasing, but if we calculate it as $R^2 = 1 - \frac{SS_{res}}{SS_{tot}}$, then it is as meaningful for non-linear models as it is for linear ones. I asked a similar question here where I showed that R-squared is no worse for …
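A quick numeric illustration of the two formulas on a non-linear fit (an exponential curve fitted with scipy's curve_fit; everything synthetic): for such a model the two versions need not agree, because $SS_{exp} + SS_{res} \ne SS_{tot}$ in general.

```python
import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(0)
x = np.linspace(0, 3, 50)
y = 2.0 * np.exp(0.8 * x) + rng.normal(scale=0.5, size=x.size)

def f(x, a, b):
    return a * np.exp(b * x)

params, _ = curve_fit(f, x, y, p0=(1.0, 1.0))
y_hat = f(x, *params)

ss_tot = np.sum((y - y.mean()) ** 2)
ss_res = np.sum((y - y_hat) ** 2)
ss_exp = np.sum((y_hat - y.mean()) ** 2)

print("SS_exp / SS_tot     :", ss_exp / ss_tot)
print("1 - SS_res / SS_tot :", 1 - ss_res / ss_tot)
```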
For example, we have training data $X$, labels $y$, and weights $w$. Our margin is $M_i = y_i \langle w, x_i \rangle$. If $M_i > 0$ the classifier's prediction is correct, and otherwise, if $M_i < 0$, the prediction is wrong. How does it work? If $y_i$ and $\langle w, x_i \rangle$ have the same sign, then their product is always positive, because plus * plus = plus and minus * minus = plus. Otherwise …
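A tiny numpy illustration of that sign argument (labels in {-1, +1}, arbitrary weights chosen for the example):

```python
import numpy as np

w = np.array([2.0, -1.0])
X = np.array([[1.0, 0.5], [0.2, 1.0], [-1.0, 0.3]])
y = np.array([1, -1, 1])            # true labels in {-1, +1}

scores = X @ w                      # <w, x_i>; its sign is the predicted label
margins = y * scores                # positive exactly when the signs agree

print("scores :", scores)
print("margins:", margins)
print("correct:", margins > 0)      # last example is misclassified
```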
I have developed several SVR models for my case study using the linear kernel, and those models were optimized using the RMSE as the criterion. Now I'm searching for additional evaluation metrics, and it turns out most publications use R-squared to compare model performance during the training and validation phases. It is generally suggested to avoid using R-squared to assess a model that uses a non-linear kernel such as polynomial or radial basis function. And this refers to the fact that …
This question is a continuation of a similar question for linear models instead of tree-based models. Given that linear models (e.g. lasso, ridge, linear regression, elastic net, etc.) can't handle missing (NaN) values and are sensitive to feature scale, what are appropriate approaches to encode or impute missing-not-at-random values in independent features? For example, suppose I have the following two independent features in my model: CAR_OWNER: binary feature (TRUE/FALSE or 0/1) without missing values; CAR_COLOR: BLUE, GREEN, …
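One standard approach, sketched with scikit-learn using the column names from the example above: fill the missing colours with an explicit "MISSING" category before one-hot encoding, so the missingness itself becomes a column a linear model can use.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

df = pd.DataFrame({
    "CAR_OWNER": [1, 0, 1, 1],
    "CAR_COLOR": ["BLUE", "GREEN", np.nan, "BLUE"],
    "TARGET": [1, 0, 0, 1],          # hypothetical dependent variable
})

# Replace missing colours with an explicit category, then one-hot encode;
# the resulting "MISSING" indicator column carries the missingness signal.
color_pipe = Pipeline([
    ("impute", SimpleImputer(strategy="constant", fill_value="MISSING")),
    ("onehot", OneHotEncoder(handle_unknown="ignore")),
])

pre = ColumnTransformer([
    ("color", color_pipe, ["CAR_COLOR"]),
    ("owner", "passthrough", ["CAR_OWNER"]),
])

model = Pipeline([("pre", pre), ("clf", LogisticRegression())])
model.fit(df[["CAR_OWNER", "CAR_COLOR"]], df["TARGET"])
print(model.named_steps["pre"].get_feature_names_out())
```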
I'm working on an unassessed homework problem from unpublished course notes of a statistics module from a second year university mathematics course. I'm trying to plot a 2-parameter full linear model and a 1-parameter reduced linear model for the same data set. I can't figure out how to plot the 1-parameter model; all attempts so far have either given errors or a non-constant slope. xs <- c(0,1,2) ys <- c(0,1,1) Data <- data.frame(x = xs, y = ys) mf <- …
In an interview, I was asked for the intuition behind the weight vector. I answered that the weight vector is the vector that we try to minimize towards a local minimum with the help of a regularizer so that we don't overfit, and that the weights tell us the influence of each feature on the model. Although I am not sure if my intuition is correct. Is the weight vector $W$ always normal to the plane? Say we have 5 features and after training, say, logistic regression, we …
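On the "normal to the plane" part, a one-line argument (standard, not specific to logistic regression): for any two points $x_1, x_2$ on the decision boundary $\{x : \langle w, x\rangle + b = 0\}$,

$$\langle w, x_1\rangle + b = \langle w, x_2\rangle + b = 0 \;\Rightarrow\; \langle w, x_1 - x_2\rangle = 0,$$

so $w$ is orthogonal to every direction lying inside the boundary, i.e. normal to the separating hyperplane.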
I have a small dataset (1,500 rows), and to predict the imbalanced target I am running two linear models (linear regression and lasso) and one nonlinear model (a neural network) on it. I am using the Area Under the Precision-Recall Curve (AUPRC) to compare the three models. The baseline in the curve is 10%, the AUPRC for linear regression is 11%, for lasso 11.2%, and for the NN 11.35%. Can I say that the learning models have improved the random …
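For context, a short sketch of how those numbers are typically computed with scikit-learn (average_precision_score as the AUPRC and the positive-class prevalence as the baseline); the data here is random, so the printed values are meaningless:

```python
import numpy as np
from sklearn.metrics import average_precision_score

rng = np.random.default_rng(0)
y_true = (rng.random(1500) < 0.10).astype(int)   # ~10% positives, like the baseline
y_score = rng.random(1500)                        # stand-in for model scores

print("baseline (prevalence):", y_true.mean())
print("AUPRC                :", average_precision_score(y_true, y_score))
```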