Parametric Tests — the t-test

Hypothesis Testing


In my previous article we went through the whats, hows and whys of hypothesis testing, with a brief introduction to statistical tests and the role they play in helping us determine statistical significance. In this article and the coming few, we'll take a deeper look at statistical tests — the categories they fall into, the tests themselves and which test suits which situation.

As mentioned before, statistical tests are statistical methods that help us reject or not reject our null hypothesis. They’re based on probability distributions and can be one-tailed or two-tailed, depending on the hypotheses that we’ve chosen.

There are other ways in which statistical tests can differ and one of them is based on their assumptions of the probability distribution that the data in question follows.

  • Parametric tests are those statistical tests that assume the data approximately follows a normal distribution, amongst other assumptions (examples include z-test, t-test, ANOVA). Important note — the assumption is that the data of the whole population follows a normal distribution, not the sample data that you’re working with.
  • Nonparametric tests are those statistical tests that don't assume anything about the distribution followed by the data, and hence are also known as distribution free tests (examples include Chi-square, Mann-Whitney U). Many nonparametric tests are based on the ranks of the data points rather than their raw values.

Most common parametric tests have a nonparametric equivalent (for example, the Mann-Whitney U test is the nonparametric counterpart of the two-sample unpaired t-test), which means that for most types of problems there'll be a test in both categories to help you out.


The selection of which set of tests is apt for the problem at hand is not this black and white, though. If your data doesn't follow a normal distribution, nonparametric tests are not necessarily the right pick. The decision depends on other factors such as sample size, the type of data you have, and which measure of central tendency best represents the data. Certain parametric tests can perform well on non-normal data if the sample size is large enough — for example, if your sample size is greater than 20 and your data is not normal, a one-sample t-test will still serve you well. But if the median better represents your data, then you're better off with a nonparametric test.

In this article, we will be looking at parametric tests — particularly the t-test.

Parametric tests are those that assume that the sample data comes from a population that follows a probability distribution — the normal distribution — with a fixed set of parameters.

Common parametric tests are focused on analyzing and comparing the mean or variance of data.

The mean is the most commonly used measure of central tendency to describe data; however, it is also heavily influenced by outliers. Thus it is important to analyze your data and determine whether the mean is the best way to represent it. If yes, then parametric tests are the way to go! If not, and the median better represents your data, then nonparametric tests might be the better option.
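To see how strongly a single outlier can pull the mean while leaving the median almost untouched, here's a quick sketch (the numbers are made up purely for illustration):

import numpy as np

x = np.array([1, 2, 3, 4, 5])
np.mean(x), np.median(x)          # both are 3.0
x_out = np.append(x, 100)         # add one extreme outlier
np.mean(x_out), np.median(x_out)  # the mean jumps to ~19.17, the median only moves to 3.5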

As mentioned above, parametric tests make a few assumptions that the data needs to meet:

  1. Normality — the sample data come from a population that approximately follows a normal distribution
  2. Homogeneity of variance — the samples come from populations with the same variance
  3. Independence — the observations are independent of one another and sampled randomly
  4. No extreme outliers — the sample data don't contain any extreme outliers

Before we get into the different statistical tests, there is one important concept that should be discussed — degrees of freedom.

The degrees of freedom are essentially the number of independent values that can vary in a set of data while estimating statistical parameters.

Let’s say you like to go out every Saturday and you’ve just bought four new outfits. You want to wear a new outfit every weekend of the month. On the first Saturday, all four outfits are unworn, so you can pick any. The next Saturday you can pick from three and the third Saturday you can pick from two. On the last Saturday of the month though, you’re left with only one outfit and you have to wear it whether you want to or not, whereas on the other Saturdays you had a choice.

So basically, you had 4 − 1 = 3 Saturdays of freedom to choose an outfit — your outfit could vary.

That’s the idea behind degrees of freedom.

With respect to numerical values and the mean, the sum of the values must equal the sample size times the mean, i.e. sum = n * mean, where n is the sample size. So if you have a sample size of 20 and a mean of 40, the sum of all the observations in the sample must be 800. The first 19 values can be anything, but the 20th value has to ensure that the total adds up to 800, so it has no freedom to vary. Hence the degrees of freedom are 19.

The general formula is: degrees of freedom = sample size − number of parameters being estimated.
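Here's the same idea in a quick python sketch (arbitrary numbers, purely illustrative): once the mean of a sample of 20 is fixed at 40, only 19 values are free.

import numpy as np

rng = np.random.default_rng(0)
free_values = rng.normal(40, 5, size=19)  # the first 19 values can be anything
forced = 20 * 40 - free_values.sum()      # the 20th must bring the sum to n * mean
sample = np.append(free_values, forced)
sample.mean()                             # 40.0, since only 19 values were free to vary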

If you want to compare the means of two groups then the right tests to choose between are the z-test and the t-test.

One-sample (one-sample z-test or one-sample t-test): one group is a sample and the second group is the population. You're comparing a sample with a standard value from the population, trying to see whether the sample behaves differently from the population or not.

An example of this is the one we discussed in the previous article — the mean age of patients known to visit a dentist is 18, but we hypothesize it could be greater than this. The sample must be randomly selected from the population and the observations must be independent of one another.

Two-sample (two-sample z-test and a two-sample t-test): both groups will be separate samples. As in the case of one-sample tests, both samples must be randomly selected from the population and the observations must be independent of one another.

Two-sample tests are used when there are two groups to compare. For example, comparing the mean money spent on a shopping site between the two sexes: one sample will be female customers and the second sample will be male customers. Since means are being compared, the variable being measured has to be numerical (here, the money spent on the shopping site), while the variable defining the groups is categorical (here, sex).

Important note: don’t confuse one-sample and two-sample with one-tailed and two-tailed! The former is related to the number of samples being compared and the latter with whether your alternate hypothesis is directional. You can have a one-sample two-tailed test.

How do we choose between a z-test and a t-test though? By looking at the sample size and population variance.

  • If the population variance is known and the sample size is large (greater than or equal to 30) — we choose a z-test
  • If the population variance is known and the sample size is small (less than 30) — we can perform either a z-test or a t-test
  • If the population variance is not known and the sample size is small — we choose a t-test
  • If the population variance is not known and the sample size is large — we choose a t-test
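These rules can be condensed into a tiny helper — just a sketch of the decision logic above, not any standard library function:

def choose_test(variance_known, n):
    """Rule-of-thumb choice between a z-test and a t-test."""
    if variance_known:
        return "z-test" if n >= 30 else "z-test or t-test"
    return "t-test"  # population variance unknown: t-test regardless of sample size

choose_test(variance_known=False, n=25)
>> 't-test'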

As mentioned above, the t-test is very similar to the z-test, except that it works well with smaller samples and doesn't require the population variance to be known.

The t-test is based on the t-distribution, which is a bell-shaped curve like the normal distribution, but has heavier tails.

As the sample size increases, the degrees of freedom also increase, and the t-distribution becomes more and more similar to the normal distribution: its tails become lighter and more of its mass concentrates around the mean. Why? We'll find out in a bit.

There are three types of t-tests. Introductions for two have already been given above — one-sample and two-sample. Both of these come under the ‘unpaired t-test’ umbrella, and so the third type of t-test is the ‘paired t-test’.

The concept of paired vs unpaired has to do with the samples. Is the sample the same, or are there two different samples? Are we monitoring a variable in two different groups, or in the same group twice? If the sample is the same, then the t-test should be paired, else unpaired.

For example, let’s say you want to test whether a certain medication increases the level of progesterone in women.

If the data you have is the progesterone levels of a group of women before the medication was consumed and the progesterone levels of the same group of women after the medication was consumed, then you would conduct a paired t-test since the sample is the same.

If the data you have is the progesterone level of two groups of women of different age groups after the medication was consumed, then you would conduct a two-sample unpaired t-test since there are two different samples.

Every statistical test has a test statistic, which is used to calculate the p-value, which in turn determines whether to reject or not reject the null hypothesis. In the case of the t-test, the test statistic is known as the t-statistic. The formula to calculate the t-statistic differs depending on which t-test you're performing, so let's take a closer look at them all.

The code and data used in all the below examples can be found here.

One-sample t-test.

The average height of women in India was recorded to be 158.5cm. Is the average height of women in India today greater than 158.5cm?

To test this hypothesis I asked 25 women their height.

My hypotheses are —

  • H₀: μ = 158.5cm (the average height of women in India is 158.5cm)
  • H₁: μ > 158.5cm (the average height of women in India is greater than 158.5cm)
  • The significance level is 0.05.
  • The sample mean is 162cm and sample standard deviation is 2.4cm.
  • Since the sample size is 25, the degrees of freedom will be 24 (25 − 1).
  • Since I’m comparing a sample mean with a population mean (standard value), this will be a one-sample test.
  • Since my hypothesis has a direction — the average sample height is greater than the average population height — this will be a one-tailed test.

The formula to calculate the t-statistic is:

t = (x̄ − μ) / (s / √n)

where x̄ is the sample mean, μ is the population mean (the standard value being compared against), s is the sample standard deviation and n is the sample size.

So the t-statistic in our case will be

t = (162 − 158.5) / (2.4 / √25) = 3.5 / 0.48 ≈ 7.29

Next we look up the critical value of the t-distribution for alpha = 0.05 and 24 degrees of freedom in a t-distribution table. The critical value for our scenario is 1.711. Our t-statistic is greater than the critical value, so we can reject the null hypothesis and conclude that the mean height of women in India is greater than 158.5cm!

While it is better to calculate the p-value in hypothesis testing to reject or not reject the null hypothesis, there is no simple closed-form formula for the p-value of a t-statistic. When testing manually, you can either work with t-distribution table values or simply compare against the critical value. Otherwise, a calculator or python functions will get you the p-value. Let's see how!
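In fact, the table lookup itself can also be done in python; here's a quick sketch using scipy's t-distribution with this example's numbers:

from scipy import stats

alpha, df = 0.05, 24
stats.t.ppf(1 - alpha, df)  # one-tailed critical value, approximately 1.711
stats.t.sf(7.29, df)        # one-tailed p-value for our t-statistic, vanishingly small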

We’ll start off by reading our csv into a dataframe:

import pandas as pd
data = pd.read_csv("one-sample-t.csv")
data.head()
(output: the first five rows of the dataframe, showing the age and height columns)

We have two columns — age and height. For this one-sample t-test, we only need height since we are comparing the mean height of this sample with the population mean — 158.5cm.

Let’s check the mean and standard deviation of the height column:

data.height.mean()
>> 162.053526834564
data.height.std()
>> 2.4128517286126048

The assumptions of a t-test state that the sample data must come from a normal distribution. We can check whether the height column is normally distributed by using a probability plot (also known as a QQ plot, or quantile-quantile plot). In brief, a probability plot is a graphical method to check if a data set follows a particular distribution: it plots the quantiles of your data against the theoretical quantiles of that distribution. In our case, the data is the height column and the distribution is the normal distribution.

from scipy import stats
import pylab

stats.probplot(data.height, dist="norm", plot=pylab)
pylab.show()
(output: QQ plot of the height column against the normal distribution)

The red line represents the normal distribution and the blue dots represent our height data. The graph confirms that the height column follows a normal distribution, since the data points stick closely to the normal distribution line.
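If you'd like a numeric check to go with the visual one, scipy's Shapiro-Wilk test is a common companion here (an extra check, not part of the original walkthrough):

stats.shapiro(data.height)  # a p-value above 0.05 is consistent with normality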

Now we will perform the one-sample t-test using scipy.stats' ttest_1samp function. We need to pass it our data and the population mean:

stats.ttest_1samp(data.height,popmean=158.5)
>> Ttest_1sampResult(statistic=7.363748862859639, pvalue=1.32483697812078e-07)

The p-value is ridiculously small! So we can reject the null hypothesis. (One caveat: ttest_1samp returns a two-sided p-value by default. Since our test is one-tailed and the t-statistic is positive, the one-tailed p-value is half the reported one, smaller still. Recent versions of scipy also accept an alternative='greater' parameter to get it directly.)

Two-sample t-test.

Is there a difference in the average height of women in India across age groups?

To test this hypothesis I measured the heights of 50 women — 25 women between 27 and 30 years of age (group A), and 25 women between 37 and 40 years of age (group B).

  • My hypotheses are — H₀: μ_A = μ_B (the mean heights of the two groups are equal); H₁: μ_A ≠ μ_B (the mean heights differ).
  • The significance level is 0.05.
  • The sample mean and standard deviation for group A are 162cm and 2.4cm respectively.
  • The sample mean and standard deviation for group B are 158.6cm and 3.4cm respectively.
  • Since I’m comparing the means of two samples, this will be a two-sample test.
  • Since my hypothesis is nondirectional, this will be a two-tailed test.

It was mentioned earlier that parametric tests assume homogeneity of variance, i.e. the variance of both samples should be the same. In the example here, the variance is definitely not the same — the standard deviation of group A is 2.4cm whereas it's 3.4cm for group B. Does this mean we can't perform a two-sample t-test? No, it doesn't! Thankfully, there's a variation of the t-test that allows for unequal variances, called Welch's t-test.

When the variance of both samples is equal, the denominator used in calculating the t-statistic is based on the pooled variance. If the sample sizes of the two groups are different, the formula is:

s_p² = ((n₁ − 1)s₁² + (n₂ − 1)s₂²) / (n₁ + n₂ − 2)

If the sample sizes of the two groups are equal, the formula is simply:

s_p² = (s₁² + s₂²) / 2

It finds the common variance of the two groups to be used in the t-statistic formula. The formula for the t-statistic is:

t = (x̄₁ − x̄₂) / (s_p · √(1/n₁ + 1/n₂))

However, when the variance of the two samples is not equal, each sample's variance enters the denominator separately and the formula to calculate the t-statistic is:

t = (x̄₁ − x̄₂) / √(s₁²/n₁ + s₂²/n₂)

Furthermore, the calculation of the degrees of freedom also differs between the two tests. If the variances of both groups in the current example were equal, the degrees of freedom would be 48 (25 + 25 − 2; we subtract 2 because we are estimating two parameters — the mean of each sample).

In the case of Welch’s t-test, the degrees of freedom are fractional, always smaller than the degrees of freedom of Student’s t-test, and frankly a bit complicated to calculate.

Since our variances are not equal, we will be performing Welch’s t-test.

So the t-statistic in our case will be:

t = (162 − 158.6) / √(2.4²/25 + 3.4²/25) = 3.4 / √0.6928 ≈ 4.08

Let’s do this in python too.

import pandas as pd
df_a = pd.read_csv("one-sample-t.csv") # group A
df_b = pd.read_csv("two-sample-t.csv") # group B

Group A is the same csv we used for the one-sample t-test, so we already know its mean and standard deviation. Let’s check the same for Group B.

df_b.height.mean()
>> 158.60704061997612
df_b.height.std()
>> 3.42443022417948

Now we perform the t-test! To perform Welch's t-test we simply need to pass the equal_var parameter as False. By default it is True, so if we were performing Student's t-test we needn't pass it at all.

stats.ttest_ind(df_a.height, df_b.height, equal_var=False)
>> Ttest_indResult(statistic=4.113633648976651, pvalue=0.00017195968508873518)

The p-value is much smaller than 0.05, hence we can reject our null hypothesis.

Paired t-test.

Does nutritional drink xyz increase the height of women?

To test this hypothesis I measured the height of 25 women before they began the course of the nutritional drink and then after they completed the course.

  • My hypotheses are — H₀: μ_after = μ_before (the drink has no effect on height); H₁: μ_after > μ_before (height after the course of the drink is greater).
  • The significance level is 0.05.
  • The sample mean and standard deviation for the women before the drink are 162cm and 2.4cm respectively.
  • The sample mean and standard deviation for the women after the drink are 167cm and 3.4cm respectively.
  • Since the sample size is 25, the degrees of freedom will be 24 (25 − 1).
  • Since I’m comparing the means of the same sample but with an intervention in between, this will be a paired t-test.
  • Since my hypothesis is directional, this will be a one-tailed test.

Fun fact — a paired t-test calculates the differences between the paired observations in the two sets of data (same sample, before and after) and then performs a one-sample t-test on those differences, using the mean and standard deviation of the differences to test whether the mean difference is zero.
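You can verify this equivalence yourself; here's a quick sketch with synthetic data (the arrays below are made up, not the article's csv):

from scipy import stats
import numpy as np

rng = np.random.default_rng(1)
before = rng.normal(162, 2.4, size=25)
after = before + rng.normal(0.5, 1.0, size=25)  # same women, after an intervention

stats.ttest_rel(after, before, alternative='greater')
stats.ttest_1samp(after - before, popmean=0, alternative='greater')
# both calls return the same t-statistic and p-value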

Let’s implement this directly in python:

Read the csv into a dataframe.

import pandas as pd
data = pd.read_csv("paired-t.csv")
data.head()

Use the describe() method to check the mean and standard deviations of both the before and after columns.

data.describe()
(output: summary statistics of the height_before and height_after columns — means of 162cm and 167cm, standard deviations of 2.4cm and 3.4cm)

Perform the paired t-test using scipy.stats' ttest_rel method! We pass 'greater' in the alternative parameter since our alternate hypothesis is that the mean height after the nutritional drink will be greater than the height before the drink. Note that ttest_rel tests whether the mean of the first argument exceeds that of the second, so the after column has to be passed first:

stats.ttest_rel(data.height_after, data.height_before, alternative='greater')
>> Ttest_relResult(statistic=1.9094338173992416, pvalue=0.0341155471994887)

The p-value is smaller than 0.05, hence we can reject our null hypothesis and conclude that the drink does increase height.

Now that we've seen all the types of t-tests and their formulae for calculating the t-statistic, we can understand why the t-distribution becomes similar to the normal distribution as the sample size increases. Every t-test uses the sample's standard deviation/variance, which is only an estimate of the population's variance, since that is unknown. This extra uncertainty in the estimate is exactly what gives the t-distribution its heavier tails. As the sample size (and hence the degrees of freedom) increases, the variance estimate becomes more precise and the extra uncertainty shrinks. Additionally, the larger the sample, the closer it is to being the population itself. Since the population is assumed to be normally distributed, it makes sense that the t-distribution for a larger sample size, with a higher number of degrees of freedom, comes to resemble a normal distribution.
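You can watch this convergence happen by tracking the two-tailed 5% critical value of the t-distribution as the degrees of freedom grow; the normal distribution's corresponding value is 1.960:

from scipy import stats

for df in (5, 10, 30, 100):
    print(df, round(stats.t.ppf(0.975, df), 3))
# 5 2.571
# 10 2.228
# 30 2.042
# 100 1.984, creeping toward the normal distribution's 1.960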

