Confidence intervals in multivariate linear regression

Question


Jsevillamol

May 11, 2022 06:04

I am fitting my data to a multivariate linear regression $Y = XB + \Xi$, where the response is bivariate, $Y\in R^{n\times 2}$, and the predictor is univariate but lifted to homogeneous coordinates (a column of ones is appended) to account for the intercept, so $X\in R^{n\times 2}$.

Now, finding the best fit reduces to $\hat B = (X^T X)^{-1}X^T Y$.
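For concreteness, here is a minimal NumPy sketch of that estimate. It assumes the convention $Y = XB + \Xi$ (the one the formula for $\hat B$ implies), and the data, the seed, and the matrix `B_true` are made up purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: n observations of a univariate predictor.
n = 50
x = rng.uniform(-1.0, 1.0, size=n)

# Design matrix with a column of ones for the intercept (n x 2).
X = np.column_stack([np.ones(n), x])

# Hypothetical coefficients: row 0 holds intercepts, row 1 holds slopes,
# one column per response.
B_true = np.array([[1.0, -2.0],
                   [0.5, 3.0]])
Y = X @ B_true + 0.1 * rng.normal(size=(n, 2))

# Normal equations: B_hat = (X^T X)^{-1} X^T Y, solved without an explicit inverse.
B_hat = np.linalg.solve(X.T @ X, X.T @ Y)

# Same estimate via a least-squares solver, which is usually more stable.
B_lstsq, *_ = np.linalg.lstsq(X, Y, rcond=None)

print(B_hat)
```

Each column of `B_hat` is the ordinary least-squares fit for one of the two responses.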

But I am interested in finding a $0.7$ confidence region around $\hat B$. How do I do that?

Topic multivariate-distribution linear-regression regression

Category Data Science

Algo · Accepted Answer · May 11, 2022 06:04

You could build a Bayesian linear regression model, find the posterior $p(\theta\mid\mathcal{D})$ over the model parameters $\theta$, and report the credible interval you're interested in. Here the dataset is $\mathcal{D} := \{ (x_i, y_i) \mid i = 1, 2, \dots, n \}$, with $x_i \in \mathbb{R}$ and $y_i \in \mathbb{R}^2$.

We will fit one regressor per target, i.e. two models, since the output is two-dimensional.

Linear model formulation

There are of course many options for choosing the underlying likelihood and priors of our model, but for clarity we will go for simple linear regression with both Gaussian likelihood and prior.

Likelihood: $$ p(y_{ij} \mid x_i ,\theta_j) = \mathcal{N}(\theta_{j_0} + \theta_{j_1}x_i, \sigma_j) $$

Priors:

$$ \theta_j \sim \mathcal{N}\left(\begin{bmatrix} 0 \\ 0 \end{bmatrix}, I\right) $$

$$ \sigma_j \sim \text{HalfNormal}(10) $$

Posterior: $$ p(\theta_j \mid \mathcal{D}) \propto \prod_{i=1}^{n}p(y_{ij} \mid x_i ,\theta_j) \ p(\theta_j)$$

This posterior is the target of your analysis: from it you can report the $0.7$ credible interval for $\theta_j$.

If you're using Python, this blog post illustrates how to build a Bayesian linear regression model using pymc3.
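As a rough, self-contained illustration of the model above, the sketch below samples the posterior for a single response column $j$ with a hand-rolled random-walk Metropolis sampler in NumPy. This is a stand-in for a proper pymc3 model, not what the blog post does. The synthetic data, the seed, the step size, and the number of draws are all assumptions chosen for the example. The sampler works on $(\theta_{j_0}, \theta_{j_1}, \log\sigma_j)$ jointly and then reads the equal-tailed $0.7$ credible interval for $\theta_j$ off the draws.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data for one response column j (all values hypothetical).
n = 200
x = rng.uniform(-2.0, 2.0, size=n)
y = 1.0 + 2.0 * x + 0.5 * rng.normal(size=n)


def log_posterior(params):
    """Unnormalised log posterior over (theta_0, theta_1, log sigma)."""
    theta0, theta1, log_sigma = params
    sigma = np.exp(log_sigma)
    resid = y - (theta0 + theta1 * x)
    # Gaussian likelihood with mean theta_0 + theta_1 * x and scale sigma.
    log_lik = -n * log_sigma - 0.5 * np.sum(resid**2) / sigma**2
    # Prior theta ~ N(0, I).
    log_prior_theta = -0.5 * (theta0**2 + theta1**2)
    # Prior sigma ~ HalfNormal(10), with 10 taken as the scale parameter.
    log_prior_sigma = -0.5 * (sigma / 10.0) ** 2
    # Jacobian term because we sample log sigma instead of sigma.
    return log_lik + log_prior_theta + log_prior_sigma + log_sigma


# Start at the least-squares fit so the burn-in stays short.
X = np.column_stack([np.ones(n), x])
theta_ls, *_ = np.linalg.lstsq(X, y, rcond=None)
current = np.array([theta_ls[0], theta_ls[1], 0.0])
current_lp = log_posterior(current)

step = 0.03                      # proposal scale (hand-tuned assumption)
n_draws, burn_in = 20000, 5000
samples = np.empty((n_draws, 3))

for t in range(n_draws):
    proposal = current + step * rng.normal(size=3)
    proposal_lp = log_posterior(proposal)
    # Metropolis accept/reject step for a symmetric proposal.
    if np.log(rng.uniform()) < proposal_lp - current_lp:
        current, current_lp = proposal, proposal_lp
    samples[t] = current

posterior = samples[burn_in:]

# Equal-tailed 0.7 credible interval: 15th and 85th percentiles per coefficient.
lower, upper = np.quantile(posterior[:, :2], [0.15, 0.85], axis=0)
print("intercept:", lower[0], upper[0])
print("slope:    ", lower[1], upper[1])
```

Running the same code on the second response column gives the interval for the other $\theta_j$. A library such as pymc3 would replace the sampling loop with a better-tuned sampler and add convergence diagnostics.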

Brian Spiering · Accepted Answer · March 23, 2020 14:17


Bayesian linear regression gives a posterior distribution over the regression coefficients, from which you can obtain a credible region (the Bayesian counterpart of a confidence region) for the estimate.

lcrmorin · Accepted Answer · January 24, 2020 19:19

Looking at https://en.wikipedia.org/wiki/Simple_linear_regression :

This t-value has a Student's t-distribution with $n-2$ degrees of freedom. Using it we can construct a confidence interval for $\beta$:

$$ \beta \in \left[\widehat\beta - s_{\widehat\beta} t^*_{n - 2},\ \widehat\beta + s_{\widehat\beta} t^*_{n - 2}\right] $$

at confidence level $1-\gamma$, where $t^*_{n - 2}$ is the $(1-\frac{\gamma}{2})$-th quantile of the $t_{n-2}$ distribution.
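Here is a small sketch of that interval for the slope at confidence level $0.7$ (so $\gamma = 0.3$), using NumPy and `scipy.stats.t.ppf` for the quantile. The data and seed are invented for the example, and it handles a single response. With a bivariate response you would apply it to each column of $Y$ separately, which gives per-coefficient intervals rather than a joint region.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical data for a simple linear regression y = alpha + beta * x + noise.
n = 40
x = rng.uniform(0.0, 10.0, size=n)
y = 1.5 + 0.8 * x + rng.normal(scale=1.0, size=n)

# Least-squares estimates of slope and intercept.
x_bar, y_bar = x.mean(), y.mean()
s_xx = np.sum((x - x_bar) ** 2)
beta_hat = np.sum((x - x_bar) * (y - y_bar)) / s_xx
alpha_hat = y_bar - beta_hat * x_bar

# Standard error of the slope, using the residual variance with n - 2 degrees of freedom.
resid = y - (alpha_hat + beta_hat * x)
s_beta = np.sqrt(np.sum(resid**2) / (n - 2) / s_xx)

# Confidence level 1 - gamma = 0.7, so we need the (1 - gamma / 2) quantile of t_{n-2}.
gamma = 0.3
t_star = stats.t.ppf(1 - gamma / 2, df=n - 2)

ci = (beta_hat - s_beta * t_star, beta_hat + s_beta * t_star)
print(ci)
```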
