A Student's t-distribution is a bell-shaped probability distribution symmetrical about its mean. It is regarded as the most suitable distribution to use in the construction of confidence intervals when the population variance is unknown and the sample size is small.
Apart from being used in the construction of confidence intervals, a t-distribution is also used in hypothesis tests, such as tests of a single population mean or of the difference between two population means.
In the absence of explicit normality of a given distribution, a t-distribution may still be appropriate for use if the sample size is large enough for the central limit theorem to be applied. In such a case, the sampling distribution is considered approximately normal. The t-statistic, also called the t-score, is given by:

$$ t = \cfrac {(\bar{x} - \mu)}{\left(\cfrac {S}{\sqrt n} \right)} $$

Where:

\(\bar{x}\) = Sample mean.
\(\mu\) = Population mean.
\(S\) = Sample standard deviation.
\(n\) = Sample size.

Relationship between the t-distribution and the Normal Distribution

A t-distribution allows us to analyze distributions that are not perfectly normal. A t-distribution has thicker tails relative to a normal distribution, and its shape depends on the number of degrees of freedom: as the number of d.f. increases, the distribution becomes more 'spiked,' and its tails become thinner, closer to those of the normal distribution. Like the normal distribution, the t-distribution is symmetric about its mean of zero; its variance exceeds 1 (for more than 2 d.f.) and approaches 1 as the d.f. grow.
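As a quick illustration, here is a minimal Python sketch of the t-score formula above; the sample values and the hypothesized population mean are made up for the example, not taken from the text:

```python
import math
import statistics

# Sketch: compute t = (x̄ − μ) / (S / √n) for a small, hypothetical sample.
sample = [12.1, 11.4, 12.8, 11.9, 12.3, 11.7, 12.5, 12.0]  # assumed data
mu = 11.5                                # assumed population mean under test

n = len(sample)
x_bar = statistics.mean(sample)          # sample mean x̄
s = statistics.stdev(sample)             # sample std. dev. S (n − 1 divisor)
t = (x_bar - mu) / (s / math.sqrt(n))    # t-score with n − 1 = 7 d.f.
print(round(t, 3))
```

The resulting t-score would then be compared against a t-distribution with \(n - 1\) degrees of freedom.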
The Degrees of Freedom

A t-distribution, just like several other distributions, has only one parameter: the degrees of freedom (d.f.). The number of degrees of freedom refers to the number of independent observations (total number of observations less 1):

$$ v = n-1 $$

Hence, a sample of 10 observations or elements would be analyzed using a t-distribution with 9 degrees of freedom. Similarly, a 6 d.f. distribution would be used for a sample size of 7 observations.

Notations

It is the standard practice for statisticians to use \(t_{\alpha}\) to represent the t-score with a cumulative probability of \((1 – \alpha)\). Therefore, if we were interested in a t-score with a 0.9 cumulative probability, \(\alpha\) would be equal to 1 – 0.9 = 0.1. We would denote the statistic as \(t_{0.1}\). However, the value of \(t_{\alpha}\) depends on the number of degrees of freedom and is often written as \(t_{\alpha,n-1}\). For example, we could write \(t_{0.05,2}= 2.92\), where the second subscript (2) represents the number of d.f.

Important Relationships

Just like the normal distribution, the t-distribution is symmetrical about the mean. As such,

$$ t_{\alpha}= -t_{1 – \alpha} \text{ and } t_{1 – \alpha} = -t_{\alpha} $$

The table below gives one-tailed critical values at various cumulative probabilities for a range of degrees of freedom \(v\); the \(\infty\) row corresponds to the standard normal quantiles.

$$ \begin{array}{c|c|c|c|c} \textbf{v} & \textbf{90%} & \textbf{95%} & \textbf{97.5%} & \textbf{99.5%} \\ \hline {1} & {3.07768} & {6.31375} & {12.7062} & {63.6567} \\ \hline {2} & {1.88562} & {2.91999} & {4.30265} & {9.92484} \\ \hline {3} & {1.63774} & {2.35336} & {3.18245} & {5.84091} \\ \hline {4} & {1.53321} & {2.13185} & {2.77645} & {4.60409} \\ \hline {5} & {1.47588} & {2.01505} & {2.57058} & {4.03214} \\ \hline {10} & {1.37218} & {1.81246} & {2.22814} & {3.16927} \\ \hline {30} & {1.31042} & {1.69726} & {2.04227} & {2.75000} \\ \hline {100} & {1.29007} & {1.66023} & {1.98397} & {2.62589} \\ \hline {\infty} & {1.28155} & {1.64485} & {1.95996} & {2.57583} \\ \end{array} $$
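The \(t_{\alpha,v}\) notation and the symmetry rule can be sketched in Python. The small lookup table below hard-codes a few values from the table above; the function name `t_score` is my own choice for illustration:

```python
# Sketch of the t_{α,v} notation: a few one-tailed critical values
# hard-coded from the table above (v = degrees of freedom).
T_TABLE = {
    # v: {α: t_{α,v}}, where t_{α,v} has cumulative probability 1 − α
    1: {0.10: 3.07768, 0.05: 6.31375, 0.025: 12.7062, 0.005: 63.6567},
    2: {0.10: 1.88562, 0.05: 2.91999, 0.025: 4.30265, 0.005: 9.92484},
    5: {0.10: 1.47588, 0.05: 2.01505, 0.025: 2.57058, 0.005: 4.03214},
}

def t_score(alpha, v):
    """Return t_{α,v}; by symmetry, t_{1−α,v} = −t_{α,v}."""
    table = T_TABLE[v]
    if alpha in table:
        return table[alpha]
    upper = round(1 - alpha, 10)   # lower-tail α: use the symmetry rule
    return -table[upper]

print(t_score(0.05, 2))    # t_{0.05,2} ≈ 2.92, as in the example above
print(t_score(0.95, 2))    # symmetry: t_{0.95,2} = −t_{0.05,2}
```

In practice one would use a statistics library (e.g. `scipy.stats.t.ppf`) rather than a hand-coded table; the sketch only demonstrates the notation.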
I'll try to give an intuitive explanation.

The t-statistic* has a numerator and a denominator. For example, the statistic in the one-sample t-test is

$$\frac{\bar{x}-\mu_0}{s/\sqrt{n}}$$

*(there are several, but this discussion should hopefully be general enough to cover the ones you are asking about)

Under the assumptions, the numerator has a normal distribution with mean 0 and some unknown standard deviation. Under the same set of assumptions, the denominator is an estimate of the standard deviation of the distribution of the numerator (the standard error of the statistic in the numerator), and it is independent of the numerator. Its square is $\sigma^2_\text{numerator}$ times a chi-square random variable divided by its degrees of freedom (which is also the d.f. of the t-distribution).

When the degrees of freedom are small, the denominator tends to be fairly right-skew. It has a high chance of being less than its mean, and a relatively good chance of being quite small; at the same time, it also has some chance of being much, much larger than its mean.

Under the assumption of normality, the numerator and denominator are independent. So if we draw randomly from the distribution of this t-statistic, we have a normal random number divided by a second randomly* chosen value from a right-skew distribution that's on average around 1.

* chosen without regard to the normal term

Because it sits in the denominator, the small values in the distribution of the denominator produce very large t-values, so the denominator's right-skew makes the t-statistic heavy-tailed. That same right tail of the denominator's distribution also makes the t-distribution more sharply peaked than a normal with the same standard deviation as the t. However, as the degrees of freedom become large, the distribution becomes much more normal-looking and much more "tight" around its mean.
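This "normal divided by a right-skew quantity" picture can be checked by simulation. The sketch below draws t-like values as a standard normal divided by $\sqrt{\chi^2_v/v}$ and compares the tail weight with a plain standard normal; the choice of $v = 3$ and the sample size are arbitrary:

```python
import random

# Simulate t-values as (standard normal) / sqrt(chi²_v / v) and compare
# tail weight with a plain standard normal. v and N are arbitrary choices.
random.seed(1)
v, N = 3, 100_000

def chi2(v):
    # chi-square with v d.f.: a sum of v squared standard normals
    return sum(random.gauss(0, 1) ** 2 for _ in range(v))

t_draws = [random.gauss(0, 1) / ((chi2(v) / v) ** 0.5) for _ in range(N)]
z_draws = [random.gauss(0, 1) for _ in range(N)]

t_tail = sum(abs(t) > 3 for t in t_draws) / N   # roughly 0.06 for v = 3
z_tail = sum(abs(z) > 3 for z in z_draws) / N   # roughly 0.003 for a normal
print(t_tail, z_tail)
```

The simulated t-values land beyond ±3 far more often than the normal draws do, which is exactly the heavy-tail effect described above.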
As such, the effect that dividing by the denominator has on the shape of the numerator's distribution diminishes as the degrees of freedom increase. Eventually, as Slutsky's theorem might suggest to us, the effect of the denominator becomes more like dividing by a constant, and the distribution of the t-statistic is very close to normal.

Considered in terms of the reciprocal of the denominator

whuber suggested in comments that it might be more illuminating to look at the reciprocal of the denominator. That is, we could write our t-statistic as numerator (normal) times reciprocal-of-denominator (right-skew). For example, our one-sample t-statistic above would become:

$${\sqrt{n}(\bar{x}-\mu_0)}\cdot{1/s}$$

Now consider the population standard deviation of the original $X_i$, $\sigma_x$. We can multiply and divide by it, like so:

$${\sqrt{n}(\bar{x}-\mu_0)/\sigma_x}\cdot{\sigma_x/s}$$

The first term is standard normal. The second term (the square root of a scaled inverse-chi-squared random variable) then scales that standard normal by values that are either larger or smaller than 1, "spreading it out".

Under the assumption of normality, the two terms in the product are independent. So if we draw randomly from the distribution of this t-statistic, we have a normal random number (the first term in the product) times a second randomly-chosen value (without regard to the normal term) from a right-skew distribution that's 'typically' around 1. When the d.f. are large, this scaling factor tends to be very close to 1; when the d.f. are small, its distribution is quite skew and its spread is large, with its big right tail making the tails of the t-distribution quite fat.
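The behaviour of this scaling factor can also be simulated. Since $s^2/\sigma_x^2$ is distributed as $\chi^2_v/v$, the factor $\sigma_x/s$ is $\sqrt{v/\chi^2_v}$; the sketch below measures how its spread shrinks as the d.f. grow (the particular $v$ values and sample size are arbitrary):

```python
import random

# Sketch of the scaling-factor view: σ_x/s = sqrt(v / chi²_v) multiplies a
# standard normal. Its spread shrinks toward 0 (factor → 1) as v grows.
random.seed(2)
N = 10_000

def scale_factor(v):
    # one draw of σ_x/s, with chi²_v a sum of v squared standard normals
    chi2 = sum(random.gauss(0, 1) ** 2 for _ in range(v))
    return (v / chi2) ** 0.5

spreads = {}
for v in (3, 30, 100):
    draws = sorted(scale_factor(v) for _ in range(N))
    spreads[v] = draws[int(0.9 * N)] - draws[int(0.1 * N)]  # 10%–90% spread
    print(v, round(spreads[v], 3))
```

At 3 d.f. the 10%–90% spread of the factor is wide (its right tail reaches well above 2), while at 100 d.f. the factor is tightly concentrated near 1, matching the convergence to the normal described above.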