For Logistic regression, why is that particular logistic function chosen as opposed to other logistic functions?

The logistic function used in logistic regression is: $\frac{e^{B_{0} + B_{1}x}}{1 + e^{B_{0} + B_{1}x}}$. Why is this particular one used?



You can derive the logistic regression model from the assumption of a latent variable following the logistic distribution; see https://sciprincess.wordpress.com/2019/03/01/what-is-logistic-in-the-logistic-regression/


We require some link function to map a real-valued output $u \in \mathbb{R}$ to $[0,1]$ so that we may interpret it as a probability. Obviously there are many such functions, but the standard logistic (sometimes called sigmoid) is simple and convenient since its scale is log-odds, which is easy to interpret. It is also symmetric.
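To make the log-odds interpretation concrete, here is a minimal sketch of the sigmoid and its inverse (the logit), showing the round trip between a real-valued score and a probability:

```python
import math

def sigmoid(u):
    """Standard logistic function: maps any real u into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-u))

def logit(p):
    """Inverse of the sigmoid: recovers the log-odds from a probability."""
    return math.log(p / (1.0 - p))

# The scale is log-odds: a one-unit increase in u multiplies the odds by e.
print(sigmoid(0.0))           # 0.5, i.e. even odds
print(logit(sigmoid(2.5)))    # round-trips back to 2.5 (up to float error)
print(sigmoid(-3.0) + sigmoid(3.0))  # symmetry: sigmoid(-u) = 1 - sigmoid(u)
```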

In economics, we might view $u$ as a latent utility, $$ u = f(x;\beta) + \epsilon $$ where $f(x;\beta)$ is some model of observed covariates (e.g. $f(x;\beta) = \beta'x$).

It is common to assume $\epsilon$ follows the standard logistic distribution, which is more "robust" than the normal distribution since it has fatter tails. Then we end up with exactly the canonical sigmoid link function and the logit model. If we had assumed some other distribution for $\epsilon$, for instance the normal, we would end up with the normal CDF as the link and the probit model. If we believe the errors are asymmetric and assume $\epsilon$ is Gompertz-distributed, we end up with the extreme value model.
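The "fatter tails" claim can be checked directly by comparing tail probabilities of the two latent-error distributions. A small sketch (note the standard logistic has variance $\pi^2/3$ while the standard normal has variance 1, so this comparison is qualitative, not scale-matched):

```python
import math

def logistic_cdf(x):
    # CDF of the standard logistic distribution (the sigmoid itself)
    return 1.0 / (1.0 + math.exp(-x))

def normal_cdf(x):
    # CDF of the standard normal, via the error function
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

# Tail mass beyond x: the logistic decays like exp(-x), the normal
# like exp(-x^2/2), so the logit link is less surprised by extreme
# latent errors than the probit link.
for x in (2.0, 4.0):
    print(x, 1 - logistic_cdf(x), 1 - normal_cdf(x))
```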

Different fields use different variants of the logistic function to model their problems. For instance, in epidemiology one might use a Richards growth curve (a generalized logistic function) so that infection rates start off small, then grow rapidly before saturating.
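As an illustration, here is one common parameterization of the Richards curve (parameter names here are just a convention, not from the original answer); setting the shape parameter to 1 recovers the standard logistic:

```python
import math

def richards(t, K=1.0, B=1.0, nu=1.0, Q=1.0):
    """Richards (generalized logistic) growth curve.

    K: upper asymptote, B: growth rate, nu: shape (controls asymmetry),
    Q: horizontal shift. With nu = 1 and Q = 1 this reduces to the
    standard logistic K * sigmoid(B * t).
    """
    return K / (1.0 + Q * math.exp(-B * t)) ** (1.0 / nu)

# Smaller nu pushes the inflection point later: the curve stays low
# longer before taking off, the epidemic-curve shape mentioned above.
print(richards(0.0))          # 0.5 in the standard logistic case
print(richards(0.0, nu=0.5))  # 0.25: lower value at t = 0, slower start
```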

In computer science/machine learning, for prediction problems we usually don't have an interpretation for $u$ (e.g. output of a neural network) and so we typically just use the standard logistic activation for convenience.


Generalized linear models operate on the idea that the expected value of the response, conditioned on (or parameterized by) the features, is linearly related to the features once a link function, often called $g$, is applied to that expectation. In math:

$$g(\mathbb E[Y\vert X])=X\beta$$

In logistic regression, we use the logit link $g(p)=\log\left(\dfrac{p}{1-p}\right)$. Inverting $g$ to solve for $p$ gives the activation function used in logistic regression.
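Carrying out that inversion explicitly, with $p = \mathbb E[Y\vert X]$, recovers exactly the function from the question:

$$\log\left(\frac{p}{1-p}\right) = X\beta \;\Rightarrow\; \frac{p}{1-p} = e^{X\beta} \;\Rightarrow\; p = \frac{e^{X\beta}}{1 + e^{X\beta}}$$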
