Derivation of the expected value formula for variance

Hi, I'm taking a course on probability distributions in data science, and below is the derivation of the expected value formula for the variance.

  1. Variance is defined as the expected value of the squared difference between a value and the mean. But intuitively, variance is just the difference between a value and its mean.

Why are we squaring that difference and wrapping it in the expected value operator?

$$\sigma^2 = E((Y - \mu)^2) = E(Y^2) - \mu^2$$
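The second equality follows from expanding the square and applying linearity of expectation, using the fact that $E(Y) = \mu$ and that $\mu$ is a constant:

$$E\big((Y-\mu)^2\big) = E(Y^2 - 2\mu Y + \mu^2) = E(Y^2) - 2\mu E(Y) + \mu^2 = E(Y^2) - 2\mu^2 + \mu^2 = E(Y^2) - \mu^2.$$

Squaring is what makes every deviation count positively (otherwise positive and negative deviations would cancel out), and the expected value averages those squared deviations over the whole distribution.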

  1. For the first step of the derivation, why do we multiply $(x - \mu)^2$ by $p(x)$ inside the summation?

  2. How is this substitution valid? I cannot understand it. I know that $E(X) = \sum_x p(x)\,x$.

$$E(X^2) = \sum_x P(X = x)\, x^2$$
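To see the substitution work numerically, here is a minimal Python sketch (the support and probabilities are made-up illustrative values, not from the course) that computes the variance both ways and confirms they agree:

```python
# Minimal sketch: check that E((X - mu)^2) == E(X^2) - mu^2 for a discrete pmf.
# The support and probabilities below are made-up illustrative values.
xs = [1, 2, 3, 4]          # values X can take
ps = [0.1, 0.2, 0.3, 0.4]  # P(X = x) for each value; must sum to 1

mu = sum(p * x for p, x in zip(ps, xs))       # E(X)   = sum of p(x) * x
ex2 = sum(p * x**2 for p, x in zip(ps, xs))   # E(X^2) = sum of p(x) * x^2

var_definition = sum(p * (x - mu) ** 2 for p, x in zip(ps, xs))  # E((X - mu)^2)
var_shortcut = ex2 - mu ** 2                                     # E(X^2) - mu^2

print(var_definition, var_shortcut)  # the two values agree (variance = 1.0)
```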



The expected value of a random variable is defined as follows (without getting into probability/measure theory):

  • For discrete distribution $F$, we have $E(X)=\sum_x P_F(X=x) x$
  • For an (absolutely) continuous distribution $F$ with density $f$, i.e. $F(t)=\int_{-\infty}^{t} f(x)\,dx$, we have $E(X) = \int x\, f(x)\, dx$

Therefore, the first step of the derivation, writing the variance as $\sum_x p(x)\,(x-\mu)^2$, is just this definition of expected value applied to $(x - \mu)^2$ in the discrete case: each squared deviation is weighted by its probability.
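For example, for a fair six-sided die, $E(X) = \sum_{x=1}^{6} \tfrac{1}{6}\,x = 3.5$, and $\sigma^2 = \sum_{x=1}^{6} \tfrac{1}{6}\,(x - 3.5)^2 = \tfrac{35}{12} \approx 2.92$.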

And finally, the law of the unconscious statistician (LOTUS) states that $$E(g(X)) = \int g(x) f(x) dx$$ (for continuous F), or
$$E(g(X)) = \sum g(x) P(X=x)$$ (for discrete F).
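LOTUS is exactly what justifies the substitution you asked about: to compute $E\big((X-\mu)^2\big)$, you plug $g(x) = (x-\mu)^2$ into the sum, without first having to derive the distribution of $(X-\mu)^2$ itself. As a sanity check, here is a small Python sketch (reusing the same made-up pmf as above) that compares the LOTUS sum against a Monte Carlo estimate from samples:

```python
import random

# Minimal sketch of the discrete LOTUS: E(g(X)) = sum over x of g(x) * P(X = x).
# The support and probabilities below are made-up illustrative values.
xs = [1, 2, 3, 4]
ps = [0.1, 0.2, 0.3, 0.4]

mu = sum(p * x for p, x in zip(ps, xs))  # E(X)

def g(x):
    return (x - mu) ** 2  # the function whose expectation we want

# Exact value via LOTUS: no need for the distribution of g(X) itself.
exact = sum(p * g(x) for p, x in zip(ps, xs))

# Monte Carlo check: average g over samples drawn from the pmf of X.
samples = random.choices(xs, weights=ps, k=100_000)
estimate = sum(g(x) for x in samples) / len(samples)

print(exact, estimate)  # estimate should be close to exact (about 1.0)
```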
