linear-algebra

Neural Network for solving these linear algebra problems

user135222

May 1, 2022 15:23

Intro There are several questions on this site about whether or not machine learning can solve specific problems. The answer (in my words) seems to be: "Yes, trivially, if you choose a model to learn your specific problem, but you sometimes may choose a model that can't represent/approximate the correct hypothesis." I would like to choose a neural network model where, a priori, all I know is that the input is a "linear algebra" kind of function. The Problem I …

Topic: linear-algebra machine-learning-model model-selection neural-network

Category: Data Science

Hypothesis vs Hyperplane in Machine Learning

AnonymousMe

April 15, 2022 19:01

I am finding it hard to understand the difference between a hypothesis and a hyperplane. I know that a hypothesis is a candidate model that maps inputs to outputs after training, and that a hyperplane is the decision boundary in a classification algorithm. But I can't seem to understand how the two are differentiated in equations. Can someone help me understand their differences in equations with some visualizations?

Topic: linear-algebra machine-learning-model training statistics machine-learning

Category: Data Science

How to incorporate the uncertainty of the model coefficients in the prediction interval of a multiple linear regression

DannyVanpoucke

April 3, 2022 14:04

I'm dealing with modeling small experimental data sets. As most experimental work does not generate thousands of samples, but rather a handful, I need to be inventive about how to deal with this small number of data sets (say 10-20). I've been building a nice framework to do just this, and at this point, I am interested in generating error bars with the predicted values. In a rough outline, this is what happens in the framework (e.g. when applying a …

Topic: linear-algebra prediction statistics predictive-modeling

Category: Data Science

Can I use regression to solve a multiple equation problem

HHH

April 3, 2022 12:04

I'm working on a problem that involves a system of multiple equations. I have a group of people, and each person in the group works on different tasks (e.g. n tasks in total). Each person in this group works on multiple tasks and completes them. I'd like to estimate the time each type of task takes. I have equations like the one below: # of days person i worked = time(task1) * # of tasks of type 1 completed + time(task2) * …

Topic: linear-algebra linear-regression regression

Category: Data Science

Dot product and linear regression

leoperassoli

March 21, 2022 06:04

I'm studying PCA and my professor said something about finding the linear regression by doing the dot product of both axes. Could someone explain to me why? The dot product returns a number. What's the relationship between that number and the linear regression? In my example, I have two vectors $stat\_grade = [0,1,3,7,10]$ $physics\_grade = [1,5,8,10,10]$ The first step is normalizing them: $ \frac{stat\_grade - mean(stat\_grade)}{std(stat\_grade)} = [-1.69131435 -0.52489066 0.34992711 0.93313895 0.93313895]$ $ \frac{physics\_grade - mean(physics\_grade)}{std(physics\_grade)} = [-1.11613741 -0.85039041 -0.3188964 …

Topic: linear-algebra pca linear-regression dimensionality-reduction

Category: Data Science
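A minimal sketch for the dot-product question above, assuming both grade vectors are standardized with the population standard deviation (divide by n). Under that assumption, the dot product of the two standardized vectors divided by n is the Pearson correlation, which is also the least-squares slope when regressing one standardized variable on the other. The helper name `standardize` is illustrative and does not come from the post.

```python
import math

stat_grade = [0, 1, 3, 7, 10]
physics_grade = [1, 5, 8, 10, 10]

def standardize(values):
    """Subtract the mean and divide by the population standard deviation."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / std for v in values]

zx = standardize(stat_grade)
zy = standardize(physics_grade)
n = len(zx)

# Dot product of the standardized vectors, scaled by 1/n: Pearson correlation.
dot = sum(a * b for a, b in zip(zx, zy))
r = dot / n

# Least-squares slope of zy on zx (no intercept needed: both have mean 0).
slope = dot / sum(a * a for a in zx)

print(r, slope)  # the two numbers agree up to rounding
```

The slope equals the correlation here because the standardized x vector has squared length n, so the usual slope formula reduces to the scaled dot product.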

Gradient descent formula implementation in python

Manas Tripathi

March 18, 2022 23:07

So I recently started with Andrew Ng's ML Course and this is the formula that Andrew lays out for calculating gradient descent on a linear model. $$ \theta_j = \theta_j - \alpha \frac{1}{m} \sum_{i=1}^m \left( h_\theta(x^{(i)}) - y^{(i)}\right)x_j^{(i)} \qquad \text{simultaneously update } \theta_j \text{ for all } j$$ As we see, the formula asks us to sum over all the rows in the data. However, the code below doesn't work if I apply np.sum() def gradientDescent(X, y, theta, alpha, num_iters): …

Topic: linear-algebra linear-regression

Category: Data Science
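The update rule quoted above can be sketched in vectorized form. This is not the poster's code (which is truncated in the excerpt) but one possible implementation, assuming `X` has shape (m, n) with a leading column of ones, and `y` and `theta` are 1-D arrays. The matrix product `X.T @ errors` performs the sum over the m rows separately for each j, which is why a bare `np.sum()` over everything would collapse the gradient to a single number. The toy data is made up for illustration.

```python
import numpy as np

def gradient_descent(X, y, theta, alpha, num_iters):
    """Batch gradient descent for linear regression, h_theta(x) = x @ theta."""
    m = len(y)
    theta = np.asarray(theta, dtype=float).copy()
    for _ in range(num_iters):
        errors = X @ theta - y            # h_theta(x^(i)) - y^(i), shape (m,)
        gradient = (X.T @ errors) / m     # sum over i for every j, shape (n,)
        theta = theta - alpha * gradient  # simultaneous update of all theta_j
    return theta

# Hypothetical toy data generated from y = 1 + 2x.
x = np.arange(5.0)
X = np.column_stack([np.ones_like(x), x])
y = 1.0 + 2.0 * x

theta = gradient_descent(X, y, np.zeros(2), alpha=0.1, num_iters=5000)
print(theta)
```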

Linear regression with a fixed intercept and everything is in log

a0142204

March 5, 2022 14:05

I have a set of values for a surface (in pixels) that becomes bigger over time (exponentially). The surface consists of cells that divide over time. After doing some modelling, I came up with the following formula: $$S(t)=S_{initial}2^{t/a_d},$$ where $a_d$ is the age at which the cell divides. $S_{initial}$ is known. I am trying to estimate $a_d$. I simply tried the $\chi^2$ test: # Range of ages of division. a_range = np.linspace(1, 500, 100) # Set up an empty vector …

Topic: chi-square-test structural-equation-modelling linear-algebra python

Category: Data Science
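One way to read the title of the question above: taking the base-2 logarithm of $S(t)=S_{initial}2^{t/a_d}$ gives $\log_2(S(t)/S_{initial}) = t/a_d$, a straight line through the origin with slope $1/a_d$. The sketch below fits that slope by least squares without an intercept. The numbers (`S_initial = 100`, a true `a_d` of 40, noiseless samples) are invented for illustration and are not from the post.

```python
import numpy as np

# Hypothetical, noiseless synthetic data.
S_initial = 100.0
a_d_true = 40.0
t = np.linspace(0.0, 200.0, 21)
S = S_initial * 2.0 ** (t / a_d_true)

# Log-transform: z = log2(S / S_initial) = t / a_d, a line with zero intercept.
z = np.log2(S / S_initial)

# Least-squares slope for a line forced through the origin.
slope = (t @ z) / (t @ t)
a_d_est = 1.0 / slope

print(a_d_est)
```

With noisy measurements the log transform changes how errors are weighted, so this fit and a direct $\chi^2$ scan over $a_d$ need not give the same estimate.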

Backpropagation with a different sized training set?

Spinach

March 3, 2022 17:02

I'm trying to create a NN whose input is a (length m) array of 3d vectors $$\vec{x}_i = [x_{i,1},x_{i,2},x_{i,3}], \hspace{5mm}i=1:m $$ and whose output is a similarly sized array: $$\vec{h}_{\theta,i} = [h_{\theta,i1},h_{\theta,i2},h_{\theta,i3}], \hspace{5mm}i=1:m $$ BUT, my only training data is not 3d vectors but rather the magnitude/norm of such vectors (with no knowledge of the vector components ($\lambda's$) themselves): $$y_i= ||[\lambda_{i,1},\lambda_{i,2},\lambda_{i,3}]||, \hspace{5mm}i=1:m $$ So, my concept is to use the cost function: $$ J = \frac{1}{2m}\sum (||\vec{h}_{\theta,i}|| - ||y_i||)^2 $$ …

Topic: linear-algebra backpropagation neural-network machine-learning

Category: Data Science

Nearest neighbor face recognition in eigenspace when using dot product of test set with eigenvectors does not match the performance when using sklearn

zr0gravity7

March 1, 2022 22:18

I am trying to perform Face recognition using PCA (eigenfaces). I have a set of N training images (of dimensions M=wxh), which I have pre-processed into a vertical stack of grayscale intensity vectors, a matrix of dimensions NxM. For the facial recognition, I am finding the single nearest neighbour of each test image in both the high-dimensional pixel space and the lower dimensional eigenspace. I am using NearestNeighbor classifier from sklearn. For recognition in the eigenspace, I am contrasting different …

Topic: k-nn linear-algebra pca computer-vision dimensionality-reduction

Category: Data Science

A support vector machine for separating pluses from minus finds a support vector at point (1,0) and a minus support vector at x2=(0,1)

Pole_Star

March 1, 2022 10:16

Suppose a support vector machine for separating pluses from minuses finds a support vector at point (1,0) and a minus support vector at x2=(0,1). Determine the values of w and b.

Topic: linear-algebra svm

Category: Data Science
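A sketch of the two-support-vector case, under assumptions the excerpt does not state outright: a hard-margin linear SVM, the point (1,0) is the plus support vector, and these are the only two support vectors. In that setting w is parallel to the difference of the two points, and the margin conditions w·x_plus + b = 1 and w·x_minus + b = -1 fix its scale and the offset b.

```python
# Assumed roles: (1, 0) is the plus support vector, (0, 1) the minus one.
x_plus = (1.0, 0.0)
x_minus = (0.0, 1.0)

# With exactly two support vectors, w points from x_minus to x_plus.
diff = tuple(p - q for p, q in zip(x_plus, x_minus))
sq_norm = sum(d * d for d in diff)

# Scale so that w.x_plus - w.x_minus = 2 (the two margin conditions subtracted).
w = tuple(2.0 * d / sq_norm for d in diff)

# Solve w.x_plus + b = 1 for b.
b = 1.0 - sum(wi * xi for wi, xi in zip(w, x_plus))

print(w, b)
```

The closed form only works because there are exactly two support vectors; with more points or a soft margin, the support vectors are not known in advance and the optimization has to find them.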

If an SVM decision boundary is the perpendicular bisector of the line connecting the support vectors, why iterate for it using a loss function?

user132380

February 14, 2022 20:26

Would it not make more sense to do some linear algebra to find the vector of the decision boundary? Is that more computationally expensive?

Topic: linear-algebra svm machine-learning

Category: Data Science

Why linear model cannot understand the interaction between any two input features?

Gull Noor

February 7, 2022 17:38

The book Deep Learning by Ian Goodfellow states that: Linear models also have the obvious defect that the model capacity is limited to linear functions, so the model cannot understand the interaction between any two input variables. What is meant by "interaction between variables"? How do nonlinear models find it? It would be great if someone could give an intuitive/graphical/geometrical explanation.

Topic: linear-algebra deep-learning

Category: Data Science
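A small illustration of what "interaction" can mean, assuming the simplest possible case where the target is the product of the two inputs. A model that is linear in x1 and x2 cannot represent that product, while the same least-squares fit succeeds once the product is supplied as an extra feature. The grid data and the helper name `fit_mse` are made up for this sketch.

```python
import numpy as np

# All combinations of x1, x2 in {-1, 0, 1}; the target is a pure interaction.
grid = np.array([-1.0, 0.0, 1.0])
x1, x2 = (a.ravel() for a in np.meshgrid(grid, grid))
y = x1 * x2

def fit_mse(features, target):
    """Least-squares fit with an intercept; returns the mean squared error."""
    A = np.column_stack([np.ones(len(target)), features])
    coef, *_ = np.linalg.lstsq(A, target, rcond=None)
    return float(np.mean((A @ coef - target) ** 2))

# Linear in x1 and x2 only: the effect of x1 cannot depend on x2.
mse_linear = fit_mse(np.column_stack([x1, x2]), y)

# Same model with a hand-made interaction feature x1 * x2.
mse_interaction = fit_mse(np.column_stack([x1, x2, x1 * x2]), y)

print(mse_linear, mse_interaction)
```

A nonlinear model such as a neural network has to build something like the product feature internally instead of having it handed over.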

NCHW input matrix to Dm conversion logic for convolution in cuDNN

Rajesh Shashi Kumar

February 4, 2022 15:37

I have been trying to understand the convolution lowering operation shown in the cuDNN paper. I was able to understand most of it by reading through and mapping various parameters to the image below. However, I am unable to understand how the original input data (NCHW) was converted into the Dm matrix shown in red. The ordering of the elements of the Dm matrix does not make sense. Can someone please explain this?

Topic: cuda linear-algebra convolution

Category: Data Science

Pseudo inverse of the covariance matrix?

nimo96

January 4, 2022 16:49

I've been looking for methods to compute a pseudo-inverse of a covariance matrix, and found that one way is to construct a regularized inverse matrix: construct the eigensystem, remove the least significant eigenvalues, and then use the remaining eigenvalues and eigenvectors to form an approximate inverse. Could anyone explain the idea behind this? Thanks in advance.

Topic: linear-algebra

Category: Data Science
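The procedure described in the question can be sketched as follows. The function name `eig_pinv`, the relative cutoff `rel_tol`, and the synthetic rank-deficient data are assumptions made for illustration. Since a covariance matrix is symmetric, `np.linalg.eigh` gives C = V diag(λ) Vᵀ, and inverting only the retained eigenvalues yields an approximate inverse on the subspace they span.

```python
import numpy as np

def eig_pinv(cov, rel_tol=1e-10):
    """Pseudo-inverse of a symmetric PSD matrix via truncated eigendecomposition."""
    vals, vecs = np.linalg.eigh(cov)
    # Keep only eigenvalues that are significant relative to the largest one.
    keep = vals > rel_tol * vals.max()
    # V_k diag(1 / lambda_k) V_k^T: each kept column is divided by its eigenvalue.
    return (vecs[:, keep] / vals[keep]) @ vecs[:, keep].T

# Hypothetical data: 4 features that live in a 2-D subspace (singular covariance).
rng = np.random.default_rng(0)
latent = rng.normal(size=(50, 2))
data = latent @ rng.normal(size=(2, 4))
cov = np.cov(data, rowvar=False)

cov_pinv = eig_pinv(cov)
print(np.allclose(cov @ cov_pinv @ cov, cov))
```

Raising `rel_tol` discards more of the small eigenvalues, which trades accuracy for stability because the directions with the least variance are the ones that blow up when inverted.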

Why transpose of independent feature matrix is necessary in case of linear regression?

Fredrik

December 15, 2021 22:51

I can follow classical linear regression steps: $Xw=y$ $X^{-1}Xw=X^{-1}y$ $Iw=X^{-1}y$ $w=X^{-1}y$ However, on implementing in Python, I see that instead of simply using w = inv(X).dot(y) they apply w = inv(X.T.dot(X)).dot(X.T).dot(y) What is the explanation of the transpositions and the two times multiplication here? I'm confused...

Topic: linear-algebra linear-regression

Category: Data Science
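A short sketch related to the question above, using made-up data. When X has more rows than columns it is not square, so inv(X) does not exist. Minimizing the squared error ||Xw - y||² leads to the normal equations XᵀXw = Xᵀy, and XᵀX is square, which is where the transposes come from. The example compares the formula from the post with `np.linalg.solve` and `np.linalg.lstsq`, which avoid forming an explicit inverse.

```python
import numpy as np

# Hypothetical data: 4 samples, 2 columns (intercept and one feature), y = 1 + 2x.
X = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
y = np.array([1.0, 3.0, 5.0, 7.0])

# X is 4x2, so np.linalg.inv(X) would raise LinAlgError; X.T @ X is 2x2.
w_normal = np.linalg.inv(X.T @ X) @ X.T @ y

# Same normal equations, solved without forming the inverse.
w_solve = np.linalg.solve(X.T @ X, X.T @ y)

# Library least-squares solver working directly on X.
w_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

print(w_normal, w_solve, w_lstsq)
```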

How is image convolution actually implemented in deep learning libraries using simple linear algebra?

Jozef Nagy

December 10, 2021 11:39

As a clarifier, I want to implement cross-correlation, but the machine learning literature keeps referring to it as convolution so I will stick with it. I am trying to implement image convolution using linear algebra. After looking around on the internet and thinking about it, I could come up with two possible solutions for that. The first one: Create an appropriate Toeplitz-like matrix out of the kernel as it is described here. The second one: Instead of the filter, modify …

Topic: matrix linear-algebra convolutional-neural-network convolution

Category: Data Science

Difference between FDA and LDA

Nestroy

December 5, 2021 21:02

I have asked this question on Mathematics Stack Exchange, but thought it might be a better fit here: I am currently taking a Data-Analysis course and I learned about both the terms LDA (Linear Discriminant Analysis) and FDA (Fisher's Discriminant Analysis). I almost have the feeling that they are used as somewhat of synonyms in some places, which obviously is not true. Can someone explain to me how those approaches are related? Since LDA's aim is to reduce dimensionality while preserving …

Topic: linear-algebra discriminant-analysis

Category: Data Science

Is Regression Line an 1-D affine subspace of 2-D vector space?

Rakka Alhazimi

November 10, 2021 15:47

Background I am currently reading a book called "Mathematics for Machine Learning", and I have read chapter 2, which is about Linear Algebra, in particular subchapter 2.8, which is about Affine Spaces. The thing is, I learned from the book that affine subspaces are points, lines, and planes in $ \mathbb{R}^{3} $, which don't (necessarily) go through the origin. The affine subspace is defined as $$ L = x_{0} + \lambda b_{1} $$ where: $L$ is affine subspace $x_{0}$ is a support …

Topic: mathematics linear-algebra linear-regression machine-learning

Category: Data Science

Understanding Lagrangian equation for SVM

Maha

September 23, 2021 10:58

I was trying to understand the Lagrangian from the SVM section of Andrew Ng's Stanford CS229 course notes. On pages 17 and 18, he says: Given the problem $$\begin{align} min_w & \quad f(w) \\ s.t. & \quad h_i(w)=0, i=1,...,l \end{align}$$, the Lagrangian can be given as follows: $$\mathcal{L}(w,\beta)=f(w)\color{red}{+}\sum_{i=1}^l\beta_ih_i(w)\quad\quad\quad \text{...equation(1)}$$ Here, the $\beta_i$'s are Lagrange multipliers. While referring to the Lagrange multipliers article from Khan Academy, I found it says: Lagrangian is given as: $$ \mathcal{L}(x,y,…,λ)=f(x,y,…)\color{red}{−}λ(g(x,y,…)−c) \quad\quad\quad \text{...equation(2)}$$ Here, $g$ is a constraint and …

Topic: linear-algebra optimization classification svm machine-learning

Category: Data Science

Understanding SVM mathematics

Rnj

September 17, 2021 11:29

I was referring to the SVM section of Andrew Ng's course notes for the Stanford CS229 Machine Learning course. On pages 14 and 15, he says: Consider the picture below: How can we find the value of $\gamma^{(i)}$? Well, $w/\Vert w\Vert$ is a unit-length vector pointing in the same direction as $w$. Since, point $A$ represents $x^{(i)}$, we therefore find that the point $B$ is given by $x^{(i)} − \gamma^{(i)}·w/\Vert w\Vert$. But this point lies on the decision boundary, and all points $x$ …

Topic: linear-algebra classification svm machine-learning

Category: Data Science
