ANOVA (Analysis of Variance)
ANOVA stands for Analysis Of Variance. It was developed by Ronald Fisher in 1918. The name comes from the approach the method takes: it uses variances to determine whether the means are different or equal.
It is a statistical method used to test the differences between two or more means. It is used to test general differences rather than specific differences among means. It assesses the significance of one or more factors by comparing the response variable means at different factor levels.
The null hypothesis states that all population means are equal. The alternative hypothesis states that at least one population mean is different.
ANOVA provides a way to compare several means in a single test.
General purpose of ANOVA
The reason for performing ANOVA is to see whether any difference exists between the groups on some variable. Today researchers are using ANOVA in many ways. The usage of ANOVA totally depends on the research design.
You can use a t-test to compare the means of two samples, but when more than two samples have to be compared, ANOVA is the appropriate method.
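The link between the two methods can be checked numerically: for exactly two groups, the F ratio of a one way ANOVA equals the square of the pooled-variance t statistic. The sketch below is only an illustration; the weights are made up and the helper names pooled_t and one_way_f are hypothetical, not part of any library.

```python
from statistics import mean, variance

def pooled_t(a, b):
    """Student's t statistic for two independent samples (pooled variance)."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / (pooled_var * (1 / na + 1 / nb)) ** 0.5

def one_way_f(groups):
    """F ratio: between-group mean square divided by within-group mean square."""
    n_total = sum(len(g) for g in groups)
    k = len(groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))

# Made-up weights (kg) for two exercise groups
group_a = [70, 72, 68, 71]
group_b = [75, 77, 74, 78]

t = pooled_t(group_a, group_b)
f = one_way_f([group_a, group_b])
print(t ** 2, f)  # both values are about 21.16
```

With three or more groups there is no single t statistic to square, which is where the F ratio takes over.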
Assumptions of ANOVA
The four main assumptions are as follows:
- The expected values of the errors are zero
- The variances of all the errors are equal to each other
- The errors are independent
- The errors are normally distributed
Following are the different types of ANOVA:
1. One Way between groups
One Way ANOVA is used to check whether there is any significant difference between the means of three or more unrelated groups. It tests the null hypothesis:
H₀: µ₁ = µ₂ = µ₃ = … = µₓ
where µ is a group mean and x is the number of groups. If One Way ANOVA gives a significant result, at least two of the group means differ. One Way ANOVA is an omnibus test, so it will not tell you which specific groups differed from each other. To find out which group or groups differed from the others, you need to run a post hoc test.
Example of One Way ANOVA
20 people are selected to test the effect of five different exercises. They are divided into 5 groups of 4 members each, and each group follows one of the exercises. Their weights are recorded after a few days, and the effect of the exercises on the 5 groups is compared. Here the type of exercise is the only factor and weight is the dependent variable.
One Way ANOVA makes the following assumptions:
- The dependent variable is normally distributed in each group
- There is homogeneity of variances
- The observations are independent
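Homogeneity of variances can be screened with the rule of thumb given later in this article: divide the largest sample standard deviation by the smallest, and treat the population variances as equal if the ratio is not greater than two. A minimal sketch, assuming made-up weights and a hypothetical helper name sd_ratio:

```python
from statistics import stdev

def sd_ratio(groups):
    """Largest sample standard deviation divided by the smallest."""
    sds = [stdev(g) for g in groups]
    return max(sds) / min(sds)

# Made-up weights (kg); each inner list is one exercise group
similar_spread = [[70, 71, 72], [74, 75, 76], [78, 79, 80]]
unequal_spread = [[70, 71, 72], [66, 70, 74]]

for name, groups in [("similar", similar_spread), ("unequal", unequal_spread)]:
    ratio = sd_ratio(groups)
    verdict = "variances look equal" if ratio <= 2 else "variances look unequal"
    print(name, round(ratio, 2), verdict)
```

This is only a quick screen. It does not replace a formal test of homogeneity of variances such as the one SPSS reports.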
2. One Way ANOVA repeated measures
Repeated measures ANOVA is the counterpart of One Way ANOVA for related rather than independent groups: the same subjects are measured more than once. Repeated measures ANOVA investigates:
- changes in mean scores over three or more time points
- differences in mean scores under different conditions
Example of Repeated measures
You might study the effect of a 6 month exercise programme on weight reduction in a group of individuals. You measure their weight at three different points in time during the training period to develop a time-course for any exercise effect.
You might ask the same individuals to eat different types of weight reducing food and rate each one for taste.
In both examples the same set of people is measured more than once on the same dependent variable.
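The following sketch shows how a one way repeated measures ANOVA splits the variation: the differences between subjects are removed first, and the effect of time is then compared with the remaining error. The weights are made up and the function name repeated_measures_f is hypothetical.

```python
def repeated_measures_f(scores):
    """scores[i][j] is the measurement of subject i at time point j.

    Returns (F, df_time, df_error) for a one way repeated measures ANOVA.
    """
    n = len(scores)       # number of subjects
    k = len(scores[0])    # number of time points (or conditions)
    grand_mean = sum(sum(row) for row in scores) / (n * k)
    time_means = [sum(row[j] for row in scores) / n for j in range(k)]
    subject_means = [sum(row) / k for row in scores]

    ss_total = sum((x - grand_mean) ** 2 for row in scores for x in row)
    ss_time = n * sum((m - grand_mean) ** 2 for m in time_means)
    ss_subjects = k * sum((m - grand_mean) ** 2 for m in subject_means)
    ss_error = ss_total - ss_time - ss_subjects  # what is left over

    df_time, df_error = k - 1, (n - 1) * (k - 1)
    f = (ss_time / df_time) / (ss_error / df_error)
    return f, df_time, df_error

# Made-up weights (kg) of four individuals at three points in time
weights = [
    [80, 78, 75],
    [90, 87, 85],
    [70, 69, 66],
    [85, 82, 80],
]
f, df_time, df_error = repeated_measures_f(weights)
print(round(f, 2), df_time, df_error)
```

Because each person acts as their own control, large differences between individuals do not inflate the error term.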
3. Two way between groups
The two way ANOVA compares the mean differences between groups that have been split on two factors (independent variables). The main objective of a two way ANOVA is to find out whether there is any interaction between the two independent variables on the dependent variable. It also lets you know whether the effect of one of your independent variables on the dependent variable is the same for all the values of your other independent variable.
Consider research into the effect of fertilizers on the yield of rice. You apply five fertilizers of different quality to five plots of land, each cultivating rice. The yield from each plot is recorded and the differences between the plots are observed. Here the effect of the fertility of the plots can also be studied. Thus there are two factors, Fertilizer and Fertility.
Before starting your two way ANOVA, check that your data meets six assumptions so that the test gives a valid result. The six assumptions are listed below:
- Your dependent variable should be measured at the continuous level
- Each of your two independent variables should consist of two or more categorical, independent groups
- You should have independence of observations
- There should be no significant outliers
- Your dependent variable should be normally distributed for each combination of the groups of the two independent variables
- Homogeneity of variances
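A two way ANOVA splits the variation into a part for each factor, a part for their interaction and an error part. The sketch below assumes a balanced design with the same number of replicate observations in every cell; with only one observation per combination the interaction cannot be separated from the error. The yields are made up and the function name two_way_anova is hypothetical.

```python
def two_way_anova(cells):
    """cells[i][j] holds the replicate observations for level i of factor A
    and level j of factor B. Assumes equal replicates (at least 2) per cell."""
    a, b, r = len(cells), len(cells[0]), len(cells[0][0])
    all_obs = [x for row in cells for cell in row for x in cell]
    grand = sum(all_obs) / len(all_obs)
    mean_a = [sum(x for cell in cells[i] for x in cell) / (b * r) for i in range(a)]
    mean_b = [sum(x for i in range(a) for x in cells[i][j]) / (a * r) for j in range(b)]
    mean_cell = [[sum(cells[i][j]) / r for j in range(b)] for i in range(a)]

    ss_a = b * r * sum((m - grand) ** 2 for m in mean_a)
    ss_b = a * r * sum((m - grand) ** 2 for m in mean_b)
    ss_ab = r * sum((mean_cell[i][j] - mean_a[i] - mean_b[j] + grand) ** 2
                    for i in range(a) for j in range(b))
    ss_error = sum((x - mean_cell[i][j]) ** 2
                   for i in range(a) for j in range(b) for x in cells[i][j])

    ms_error = ss_error / (a * b * (r - 1))
    return {
        "F_A": (ss_a / (a - 1)) / ms_error,
        "F_B": (ss_b / (b - 1)) / ms_error,
        "F_AB": (ss_ab / ((a - 1) * (b - 1))) / ms_error,
    }

# Made-up rice yields: 2 fertilizers x 2 fertility levels, 2 plots per cell
yields = [
    [[4, 6], [8, 10]],    # fertilizer 1: low fertility, high fertility
    [[6, 8], [14, 16]],   # fertilizer 2: low fertility, high fertility
]
result = two_way_anova(yields)
print(result)  # F_A = 16.0, F_B = 36.0, F_AB = 4.0
```

F_AB is the interaction term: it asks whether the effect of the fertilizer depends on the fertility of the plot.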
4. Two way repeated measures
A two way repeated measures ANOVA compares the mean differences between groups that have been split on two within subject factors (independent variables). It is often used in research where a dependent variable is measured more than twice on the same subjects under two or more conditions.
A health researcher wants to find the best way to reduce chronic joint pain. The researcher selects two different types of treatment to reduce the level of pain. The two types of treatment are known as ‘conditions’. Treatment A is a massage programme and Treatment B is an acupuncture programme. Both treatments are given to all the patients for 8 weeks.
The patients are tested at three points of time – at the beginning of the programme, in the middle of the programme and at the end of the programme.
The researcher selects 30 patients to take part in the research. The first 15 patients undergo Treatment A while the other 15 patients undergo Treatment B, and then the two groups swap treatments.
At the end of 8 weeks, the researcher uses a two way repeated measures ANOVA to find out whether there is any change in pain as a result of the interaction between the type of treatment and the point of time.
Your data should pass five assumptions for a two way repeated measures ANOVA to give a valid result.
- Your dependent variable should be measured at the continuous level
- Each of your two within subject factors should consist of at least two categorical related groups
- There should be no outliers
- The dependent variable should be normally distributed among each combination of the related groups
- The variances of the differences between all combinations of related groups should be equal (this is known as sphericity)
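The last assumption can be inspected directly by computing the variance of the paired differences for every pair of related groups. The sketch below uses made-up pain scores and a hypothetical helper name; it only lists the variances for eyeballing and is not a formal test (Mauchly's test is the usual formal check).

```python
from itertools import combinations
from statistics import variance

def difference_variances(scores):
    """scores[i][j] is the measurement of subject i in related group j.

    Returns {(j1, j2): sample variance of the paired differences}.
    """
    k = len(scores[0])
    return {
        (j1, j2): variance([row[j1] - row[j2] for row in scores])
        for j1, j2 in combinations(range(k), 2)
    }

# Made-up pain scores of four patients at the beginning, middle and end
pain = [
    [8, 6, 4],
    [7, 6, 3],
    [9, 6, 5],
    [6, 5, 3],
]
for pair, var in difference_variances(pain).items():
    print(pair, round(var, 3))
```

If the printed variances are very different from each other, the assumption is in doubt.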
Parametric and Non Parametric ANOVA test
If the test assumes that the population follows a known distribution described by its parameters (for ANOVA, a normal distribution with a mean and a variance), then the statistical test performed is called a parametric test.
If nothing is known or assumed about the population distribution or its parameters but it is still required to test the hypothesis, then it is called a non parametric test.
When the dependent variable is categorical you cannot use the ANOVA method; you need to use a test for categorical data such as the Chi square test.
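For a table of counts, the Chi square statistic compares the observed counts with the counts expected if the two categorical variables were unrelated. A minimal sketch with made-up counts and a hypothetical function name:

```python
def chi_square_statistic(table):
    """Pearson Chi square statistic and degrees of freedom for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df

# Made-up counts: rows are two treatments, columns are improved / not improved
counts = [
    [10, 20],
    [20, 10],
]
statistic, df = chi_square_statistic(counts)
print(round(statistic, 3), df)  # 6.667 1
```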
Hypothesis testing procedure – One way ANOVA
- Check any necessary assumptions and write the null and alternative hypotheses
One way ANOVA requires certain assumptions to hold. The assumptions are as follows:
- Each sample is an independent random sample
- The distribution of the response variable follows a normal distribution
- The population variances are equal across the group levels. As a rule of thumb, divide the largest sample standard deviation by the smallest sample standard deviation; if the ratio is not greater than two, assume that the population variances are equal.
- Calculate an appropriate test statistic
One way ANOVA uses the F test statistic. Hand calculation requires many steps to compute the F ratio, but statistical software like SPSS will compute the F ratio for you and will produce the ANOVA source table.
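The ANOVA table gives you information about the variability between groups and within groups, along with the formulas used. A one way ANOVA table has the following layout:
Source | Sum of squares | Degrees of freedom | Mean square | F
Treatments | SST | DFT = k-1 | MST = SST/DFT | F = MST/MSE
Error | SSE | DFE = N-k | MSE = SSE/DFE |
Total | SS (Total) | N-1 | |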
SST means sum of squares for treatments and SSE means sum of squares for errors.
DFT, which is k-1, means degrees of freedom for treatments, and DFE, which is N-k, means degrees of freedom for errors. Here k is the number of groups and N is the total number of observations.
- Determine a p value associated with the test statistic
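- Decide between the null and alternative hypothesis
If the null hypothesis is false, then MST should be larger than MSE, so F will be greater than 1. Reject the null hypothesis when the p value is smaller than the chosen significance level.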
- Give a conclusion
Based on your result, write a conclusion that answers your ANOVA research question.
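The p value step is normally left to software, but it can be sketched with the standard library alone. The p value of an F statistic is the upper tail of the F distribution, which can be written with the regularized incomplete beta function. The code below evaluates that function with a continued fraction; the function names are hypothetical and the routine is an illustrative sketch, not a replacement for a tested statistics package.

```python
from math import exp, lgamma, log

def _beta_cf(a, b, x, max_iter=300, eps=1e-14, tiny=1e-300):
    """Continued fraction for the incomplete beta function (modified Lentz method)."""
    def clamp(v):
        return v if abs(v) > tiny else tiny
    c = 1.0
    d = 1.0 / clamp(1.0 - (a + b) * x / (a + 1.0))
    h = d
    for m in range(1, max_iter + 1):
        even = m * (b - m) * x / ((a + 2 * m - 1) * (a + 2 * m))
        odd = -(a + m) * (a + b + m) * x / ((a + 2 * m) * (a + 2 * m + 1))
        for term in (even, odd):
            d = 1.0 / clamp(1.0 + term * d)
            c = clamp(1.0 + term / c)
            delta = d * c
            h *= delta
        if abs(delta - 1.0) < eps:  # converged after the odd step
            break
    return h

def regularized_incomplete_beta(a, b, x):
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    front = exp(lgamma(a + b) - lgamma(a) - lgamma(b)
                + a * log(x) + b * log(1.0 - x))
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _beta_cf(a, b, x) / a
    return 1.0 - front * _beta_cf(b, a, 1.0 - x) / b

def f_p_value(f, df_treatment, df_error):
    """P(F >= f) for an F distribution with (df_treatment, df_error) degrees of freedom."""
    if f <= 0.0:
        return 1.0
    x = df_error / (df_error + df_treatment * f)
    return regularized_incomplete_beta(df_error / 2.0, df_treatment / 2.0, x)

# Example: F = 12 with k = 3 groups and N = 9 observations (DFT = 2, DFE = 6)
alpha = 0.05  # assumed significance level
p_value = f_p_value(12.0, 2, 6)
print(round(p_value, 4), "reject H0" if p_value < alpha else "do not reject H0")
```

For this input the p value is about 0.008, so the null hypothesis of equal means would be rejected at the 0.05 level.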
Multiple comparison tests
If you find a significant difference between the groups that is not due to sampling error, you need to compare the group means with each other to find out which ones differ. Running several ordinary t tests inflates the type one error rate, so multiple comparison tests that control it are used instead:
- Scheffe’s Test
- Modified Bonferroni test
- Dunnett’s test
- Tukey’s test
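To see why the type one error rate needs controlling, the sketch below computes the chance of at least one false positive when every pair of groups is compared at the same significance level. It assumes the comparisons are independent, which is a simplification, and it shows the plain Bonferroni adjustment rather than any of the specific procedures listed above. The function names are hypothetical.

```python
from math import comb

def familywise_error_rate(alpha, n_comparisons):
    """Chance of at least one type one error across independent comparisons."""
    return 1 - (1 - alpha) ** n_comparisons

def bonferroni_alpha(alpha, n_comparisons):
    """Per-comparison significance level under the plain Bonferroni adjustment."""
    return alpha / n_comparisons

alpha = 0.05
n_groups = 5
n_pairs = comb(n_groups, 2)  # number of pairwise comparisons

print(n_pairs)                                          # 10
print(round(familywise_error_rate(alpha, n_pairs), 3))  # 0.401
print(bonferroni_alpha(alpha, n_pairs))                 # per-comparison level, 0.05 / 10
```

With five groups, ten unadjusted t tests would give roughly a 40% chance of at least one false positive under this assumption, compared with the intended 5%.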
ANOVA calculations can be done in three ways: by hand, in an Excel sheet and in SPSS software. Let’s look at each of them in detail below.
1. ANOVA hand calculations
- Step 1
Compute CM (the correction for the mean)
CM = (Total of all observations)² / N
where N is the total number of observations
- Step 2
Compute the total SS
Total SS = Sum of squares of all observations – CM
- Step 3
Compute SST (Sum of Squares for Treatment)
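SST = ∑ (Tᵢ² / nᵢ) – CM, summed over the k groups (i = 1, 2, 3 when there are three groups), where Tᵢ is the total of the observations in group i and nᵢ is the number of observations in group i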
- Step 4
Compute SSE (Sum of Squares for errors)
SSE = SS (Total) – SST
- Step 5
Compute MST, MSE and their ratio F
MST = SST / (k-1)
MSE = SSE / (N-k)
F = MST/MSE
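The five steps above can be followed line by line in a short script. The data are made up (three groups of three observations) and the function name anova_by_hand is hypothetical.

```python
def anova_by_hand(groups):
    """One way ANOVA using the shortcut formulas from Steps 1 to 5."""
    n_total = sum(len(g) for g in groups)   # N
    k = len(groups)                         # number of groups
    grand_total = sum(sum(g) for g in groups)

    cm = grand_total ** 2 / n_total                          # Step 1
    total_ss = sum(x ** 2 for g in groups for x in g) - cm   # Step 2
    sst = sum(sum(g) ** 2 / len(g) for g in groups) - cm     # Step 3
    sse = total_ss - sst                                     # Step 4
    mst = sst / (k - 1)                                      # Step 5
    mse = sse / (n_total - k)
    return {"CM": cm, "Total SS": total_ss, "SST": sst, "SSE": sse,
            "MST": mst, "MSE": mse, "F": mst / mse}

# Made-up observations for three groups
groups = [
    [2, 3, 4],
    [4, 5, 6],
    [6, 7, 8],
]
for name, value in anova_by_hand(groups).items():
    print(name, value)
# CM 225.0, Total SS 30.0, SST 24.0, SSE 6.0, MST 12.0, MSE 1.0, F 12.0
```

The parentheses in Step 5 matter: SST is divided by (k-1) and SSE by (N-k), not by k and N alone.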
2. ANOVA using Excel
To perform a single factor ANOVA in Excel, follow these simple steps:
- Go to Data Tab
- Click Data Analysis (this button appears once the Analysis ToolPak add-in is enabled)
- Select Anova: Single Factor and click Ok (there are also other options like Anova: Two-Factor With Replication and Anova: Two-Factor Without Replication)
- Click the Input Range box and select the range
- Click the Output range box and select the output range and click Ok
- You will get the result displayed in the Excel sheet
- If F is greater than F crit, the null hypothesis is rejected
3. ANOVA using SPSS
First install the SPSS software to perform the ANOVA. Here we look at how to perform a One way ANOVA using SPSS.
SPSS assumes that the independent variable is represented numerically. In the sample data set, MAJOR is a string, so first convert the string variable into a numerical variable. Once the conversion is done you are ready to run the ANOVA.
- Open the SPSS software
- Click Analyze → Compare Means → One Way ANOVA
- One way ANOVA dialog box appears on the screen
- On the left side of the dialog box you will see a list of all the variables in your data set. Select the dependent variable that you measured and move it into the Dependent list on the right side by using the upper arrow button
- In the same way move the independent variable in the left side list to the Factor box on the right side.
- Click on the Post Hoc button to select the type of multiple comparison you want to do.
- Select any Post hoc test that suits your research by clicking on the check box next to the test
- Click Continue and it will take you to the One way ANOVA dialog box
- Select any statistics you want by clicking the check box to the left of each option
- Click Means plot to get a graph of the means of the conditions
- Click Continue and Click Ok
The SPSS output window will appear with several major sections, including:
- Descriptive section
- Test of Homogeneity of Variances
- Multiple Comparisons
- Grade Point Average
Things to be considered when running an ANOVA
Data level and assumptions play a crucial role in ANOVA.
The researcher should find out whether the factors in the data are crossed or nested. If they are crossed, every group receives every level of the factor.
If they are nested, each group receives only some of the levels, and a different ANOVA model is needed.
It is also important to calculate the ANOVA effect size. The effect size tells you the degree to which the null hypothesis is false, that is, how large the differences between the groups are. Effect sizes are conventionally described as small, medium or large.
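One widely used effect size for a one way ANOVA is eta squared, the share of the total variation that is explained by the groups (the sum of squares for treatments divided by the total sum of squares). A minimal sketch with made-up data and a hypothetical function name:

```python
def eta_squared(groups):
    """Eta squared = between-group sum of squares / total sum of squares."""
    all_obs = [x for g in groups for x in g]
    grand_mean = sum(all_obs) / len(all_obs)
    ss_total = sum((x - grand_mean) ** 2 for x in all_obs)
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    return ss_between / ss_total

# Made-up observations for three groups
groups = [
    [2, 3, 4],
    [4, 5, 6],
    [6, 7, 8],
]
print(eta_squared(groups))  # 0.8
```

Here 80% of the variation in the made-up data is explained by group membership, which is far larger than what real studies usually show.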
Hope this article gave you a brief overview of ANOVA and interpreting results using it.