Does the t-test require the standard deviation or the variance for its calculation?

This might be a novice question. The main difference between a t-test and a z-test, as far as I was able to understand, is that the z-test calculation requires the SD of the sample, whereas in a t-test we do not have the SD (apart from the difference between large and small sample sizes).

But the formula for the t-test statistic requires an SD value as well. So what is the difference between a t-test and a z-test? Can someone please clear this up?

Topic hypothesis-testing mathematics pvalue statistics

Category Data Science


The z-test requires the population standard deviation.

$$ z=\dfrac{ \bar x -\mu_0 }{\sigma/\sqrt n} $$

You don’t estimate $\sigma$ from the data; you know it. If this sounds unreasonable, you’re right,$^{\dagger}$ so we have the t-test, which uses the sample standard deviation, which is calculated from the data.

$$ t=\dfrac{ \bar x-\mu_0 }{s/\sqrt n}, \qquad s=\sqrt{\dfrac{1}{n-1}\sum_{i=1}^{n}\big( x_i-\bar x \big)^2} $$

$^{\dagger}$There are situations where this is reasonable, but I would consider them the exception.
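
As a small worked example (the numbers are made up for illustration and are not from the question), take the sample $x = (2, 4, 6, 8)$, so $n = 4$, and test against $\mu_0 = 3$. For the z-test, assume additionally that $\sigma = 2$ is known.

$$ \bar x = 5, \qquad s = \sqrt{\dfrac{(2-5)^2+(4-5)^2+(6-5)^2+(8-5)^2}{4-1}} = \sqrt{\dfrac{20}{3}} \approx 2.58 $$

$$ t = \dfrac{5-3}{2.58/\sqrt 4} \approx 1.55, \qquad z = \dfrac{5-3}{2/\sqrt 4} = 2 $$

The numerators are identical; only the standard deviation in the denominator differs, estimated from the data for $t$ and assumed known for $z$.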

We can simulate this to show that the equations give different values.

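set.seed(2022)
n <- 31
true_mean <- 0.2
mu_0 <- 0    # mean under the null hypothesis
true_sd <- 1 # population SD, treated as known for the z-test
x <- rnorm(n, true_mean, true_sd)
z_stat <- (mean(x) - mu_0) / (true_sd / sqrt(n)) # known sigma
t_stat <- (mean(x) - mu_0) / (sd(x) / sqrt(n))   # sample SD estimated from x
z_stat - t_stat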

I get a difference of about $0.05$.
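
The two statistics are also compared against different reference distributions, which is where the tests differ in practice. The sketch below regenerates the same simulated sample so that it runs on its own, then computes two-sided p-values: from the standard normal for $z$, and from Student's t distribution with $n-1$ degrees of freedom for $t$. The names `p_z`, `p_t`, and `built_in` are introduced here for illustration; `t.test` is used only as a cross-check.

```r
# Regenerate the simulated sample so this block is self-contained.
set.seed(2022)
n <- 31
true_mean <- 0.2
mu_0 <- 0
true_sd <- 1
x <- rnorm(n, true_mean, true_sd)

z_stat <- (mean(x) - mu_0) / (true_sd / sqrt(n))  # uses the known sigma
t_stat <- (mean(x) - mu_0) / (sd(x) / sqrt(n))    # uses the estimate s

# Two-sided p-values: standard normal for z, Student's t with n - 1 df for t.
p_z <- 2 * pnorm(-abs(z_stat))
p_t <- 2 * pt(-abs(t_stat), df = n - 1)

# Cross-check the manual t-test against R's built-in one-sample t-test.
built_in <- t.test(x, mu = mu_0)
c(p_z = p_z, p_t = p_t, p_builtin = built_in$p.value)
```

The manual `p_t` should agree with `built_in$p.value`, since `t.test` computes the same statistic and uses the same reference distribution.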

Addressing the question in the title, if you know variance or standard deviation, you know the other by either squaring (to get the variance from the standard deviation) or taking the square root (to get the standard deviation from the variance).

REFERENCES

https://online.stat.psu.edu/stat200/lesson/8/8.2/8.2.3/8.2.3.3

https://online.stat.psu.edu/stat200/lesson/8/8.2/8.2.3/8.2.3.1


The t-test is most often associated with small samples (a common rule of thumb is fewer than 30 observations), although it is valid for any sample size. It uses the standard error of the mean (S.E.) to compute the test statistic: the sample standard deviation divided by the square root of n, the number of observations in the sample. Under the null hypothesis, and assuming normally distributed data, the t-statistic follows Student's t distribution with n - 1 degrees of freedom rather than the normal distribution. The z-test uses a z-score. The p-value of the z-score is obtained from the standard normal distribution table and compared against the chosen alpha to reach a decision. Using the standard normal distribution is justified when the population standard deviation is known or, by large-sample theory, when n is large enough that the t distribution is close to the standard normal.
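
The role of sample size can be checked numerically. The following is a minimal sketch, with the degrees of freedom chosen arbitrarily for illustration, that compares the two-sided 5% critical value of the t distribution with that of the standard normal.

```r
# Degrees of freedom to compare (arbitrary illustrative choices).
df <- c(5, 10, 30, 100, 1000)

# Upper 2.5% quantiles, i.e. the two-sided 5% critical values.
t_crit <- qt(0.975, df = df)
z_crit <- qnorm(0.975)

# The t critical value is always larger, and the gap shrinks as df grows.
data.frame(df = df, t_crit = t_crit, excess_over_z = t_crit - z_crit)
```

The gap between the two critical values narrows as the degrees of freedom increase, which is why the distinction between the tests matters most for small samples.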
