How to compute the Hessian matrix of the log-likelihood function for logistic regression

I am currently studying The Elements of Statistical Learning. The following equation is on page 120.

It gives the Hessian matrix of the log-likelihood function as follows:

\begin{equation} \dfrac{\partial^2 \ell(\beta)}{\partial\beta\partial\beta^T} = -\sum_{i=1}^{N}{x_ix_i^Tp(x_i;\beta)(1-p(x_i;\beta))} \end{equation}

But this calculation appears to produce only the $\dfrac{\partial^2\ell(\beta)}{\partial\beta_i^2}$ terms, whereas the Hessian matrix should also contain the mixed partial derivatives $\dfrac{\partial^2\ell(\beta)}{\partial\beta_i\partial\beta_j}$ with $i\neq j$.

Please explain why these terms seem to be missing.

Tags: matrix, esl, mathematics, logistic-regression, statistics

Category: Data Science


$\beta$ is a vector of parameters, therefore:

$ \frac{\partial \ell(\beta)}{\partial\beta}= \left[\frac{\partial \ell(\beta)}{\partial\beta_1}\quad\frac{\partial \ell(\beta)}{\partial\beta_2}\quad\frac{\partial \ell(\beta)}{\partial\beta_3}\quad\dots\quad\frac{\partial \ell(\beta)}{\partial\beta_n}\right]^T$ and so

$ \frac{\partial}{\partial\beta^{T}}\left(\frac{\partial \ell(\beta)}{\partial\beta}\right)= \begin{bmatrix} \frac{\partial^2 \ell(\beta)}{\partial\beta_1^2} & \frac{\partial^2 \ell(\beta)}{\partial\beta_1\partial\beta_2} & \dots & \frac{\partial^2 \ell(\beta)}{\partial\beta_1\partial\beta_n} \\ \frac{\partial^2 \ell(\beta)}{\partial\beta_2\partial\beta_1} & \frac{\partial^2 \ell(\beta)}{\partial\beta_2^2} & \dots & \frac{\partial^2 \ell(\beta)}{\partial\beta_2\partial\beta_n} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{\partial^2 \ell(\beta)}{\partial\beta_n\partial\beta_1} & \frac{\partial^2 \ell(\beta)}{\partial\beta_n\partial\beta_2} & \dots & \frac{\partial^2 \ell(\beta)}{\partial\beta_n^2} \end{bmatrix}$, which is your Hessian.
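For the logistic log-likelihood in particular, $\ell(\beta)=\sum_{i=1}^{N}\left\{y_i\beta^Tx_i-\log\left(1+e^{\beta^Tx_i}\right)\right\}$ (with the responses coded as $y_i \in \{0,1\}$), the gradient is the vector

\begin{equation} \dfrac{\partial \ell(\beta)}{\partial\beta} = \sum_{i=1}^{N}{x_i\left(y_i - p(x_i;\beta)\right)}. \end{equation}

Differentiating this once more with respect to $\beta^T$, and using $\dfrac{\partial p(x_i;\beta)}{\partial\beta^T} = p(x_i;\beta)\left(1-p(x_i;\beta)\right)x_i^T$, recovers exactly the matrix equation in your question.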

The term on the right-hand side of your equation is also a matrix, because it contains an outer product of vectors: $x_i x_i^T$, which gives an $n \times n$ matrix.
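Writing out the $(j,k)$ entry of that sum makes the mixed partials explicit:

\begin{equation} \dfrac{\partial^2 \ell(\beta)}{\partial\beta_j\partial\beta_k} = -\sum_{i=1}^{N}{x_{ij}\,x_{ik}\,p(x_i;\beta)(1-p(x_i;\beta))}, \end{equation}

where $x_{ij}$ is the $j$-th component of $x_i$. Taking $j = k$ gives the diagonal terms you noticed, and $j \neq k$ gives the off-diagonal terms, so nothing is missing.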

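As a quick numerical check, here is a minimal NumPy sketch of that formula; the data, `beta`, and the variable names are made up for illustration, not taken from the book:

```python
import numpy as np

# Toy data: N = 5 observations, n = 3 features (illustrative values only)
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))        # rows are the x_i
beta = np.array([0.5, -1.0, 2.0])  # an arbitrary parameter vector

p = 1.0 / (1.0 + np.exp(-X @ beta))  # p(x_i; beta) for every i

# Hessian of the log-likelihood: -sum_i x_i x_i^T p_i (1 - p_i)
w = p * (1.0 - p)                  # per-observation weights
H = -(X.T * w) @ X                 # same as -sum_i w_i * outer(x_i, x_i)

print(H.shape)              # (3, 3): a full n x n matrix
print(H)                    # off-diagonal entries are nonzero in general
print(np.allclose(H, H.T))  # True: the Hessian is symmetric
```

Printing `H` shows that the mixed-partial entries are present and generally nonzero, which is the point of the answer above.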