Correlation Coefficient Formula
What is the Correlation Coefficient Formula?
In statistics, the correlation coefficient measures the strength and direction of the linear association between two variables. Two variables have a positive correlation coefficient when they tend to move in the same direction, and a negative correlation coefficient when they tend to move in opposite directions. For example, if the market’s interest rate falls, corporate loans become cheaper and the economy tends to grow. Interest rates and economic growth move in opposite directions here, so they have a negative correlation coefficient. The value of the correlation coefficient indicates the strength of the relationship and always lies between -1 and +1. A value of +1 means the variables are perfectly positively correlated, a value of -1 means they are perfectly negatively correlated, and a value near 0 means there is little or no linear relationship.
Suppose that we have 2 sets of data given by X (X1, X2 … Xn) and Y (Y1, Y2 … Yn).
The formula for the correlation coefficient is given by:
Correlation Coefficient = Σ [(X – Xm) * (Y – Ym)] / √ [Σ (X – Xm)² * Σ (Y – Ym)²]
Where:
- X – Data points in Data set X
- Y – Data points in Data set Y
- Xm – Mean of Data set X
- Ym – Mean of Data set Y
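The definitions above translate fairly directly into code. The following is a minimal Python sketch, not a reference implementation; the function name `correlation_coefficient` and the error handling are assumptions made for illustration.

```python
import math

def correlation_coefficient(x, y):
    """Correlation coefficient of two equal-length sequences of numbers."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    xm = sum(x) / len(x)  # Xm: mean of data set X
    ym = sum(y) / len(y)  # Ym: mean of data set Y
    # Numerator: sum of (X - Xm) * (Y - Ym) over all pairs
    numerator = sum((xi - xm) * (yi - ym) for xi, yi in zip(x, y))
    # Denominator: square root of [sum of (X - Xm)^2 times sum of (Y - Ym)^2]
    sxx = sum((xi - xm) ** 2 for xi in x)
    syy = sum((yi - ym) ** 2 for yi in y)
    if sxx == 0 or syy == 0:
        # A constant data set has no variation, so the ratio is undefined
        raise ValueError("correlation is undefined for a constant data set")
    return numerator / math.sqrt(sxx * syy)

# Y rises in step with X, then falls in step with X
print(correlation_coefficient([1, 2, 3], [2, 4, 6]))  # perfectly positive
print(correlation_coefficient([1, 2, 3], [6, 4, 2]))  # perfectly negative
```

The check for a constant data set is a design choice: when every value in X (or Y) is the same, the denominator is zero and the coefficient has no meaningful value.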
Calculating this formula by hand can seem time-consuming and confusing at first.
Another way to calculate the correlation coefficient is to use the CORREL() function in Excel. The examples below illustrate both approaches.
Examples of Correlation Coefficient Formula (With Excel Template)
Let’s take an example to better understand how the correlation coefficient is calculated.
Correlation Coefficient Formula – Example #1
Let’s say we have two data sets, X and Y, each containing 20 random data points. We want to calculate the correlation coefficient for X and Y.
First, calculate the mean of each data set:
- Mean of Data Set X = 15.6
- Mean of Data Set Y = 13.8
Next, calculate the difference between each data point and the mean of its data set.
Do this for every value in data set X, and likewise for every value in data set Y.
Then calculate the square of each difference, for both X and Y.
Finally, multiply each difference in X by the corresponding difference in Y.
The correlation coefficient is then calculated using the formula given below:
Correlation Coefficient = Σ [(X – Xm) * (Y – Ym)] / √ [Σ (X – Xm)² * Σ (Y – Ym)²]
Correlation Coefficient = 0.343264
This means the two data sets have a positive correlation, with a coefficient of 0.343264.
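The same sequence of steps (means, differences, squares, products, then the formula) can be traced on a smaller data set. The Python sketch below uses five hypothetical data points chosen only for illustration; they are not the 20 points from this example, so the result differs from 0.343264.

```python
import math

# Hypothetical data for illustration only (not the article's 20-point data set)
x = [2, 4, 6, 8, 10]
y = [1, 3, 2, 5, 4]

# Step 1: mean of each data set
xm = sum(x) / len(x)
ym = sum(y) / len(y)

# Step 2: difference between each data point and its mean
dx = [xi - xm for xi in x]
dy = [yi - ym for yi in y]

# Step 3: square of each difference
dx2 = [d ** 2 for d in dx]
dy2 = [d ** 2 for d in dy]

# Step 4: product of the paired differences
dxdy = [a * b for a, b in zip(dx, dy)]

# Step 5: apply the formula
r = sum(dxdy) / math.sqrt(sum(dx2) * sum(dy2))

# Print the intermediate columns, one row per data point
print(f"{'X':>4} {'Y':>4} {'X-Xm':>6} {'Y-Ym':>6} {'(X-Xm)^2':>9} {'(Y-Ym)^2':>9} {'product':>8}")
for row in zip(x, y, dx, dy, dx2, dy2, dxdy):
    print("{:>4} {:>4} {:>6.1f} {:>6.1f} {:>9.1f} {:>9.1f} {:>8.1f}".format(*row))
print(f"Correlation Coefficient = {r:.4f}")  # 16 / sqrt(40 * 10) = 0.8000
```

Keeping each intermediate column in its own list mirrors the spreadsheet layout, which makes it easier to compare against a manual calculation.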
Correlation Coefficient Formula – Example #2
Let’s say you want to invest in 2 stocks and want to choose them so that your portfolio is diversified. That means that if one stock gives you a negative return, the other should help you earn a positive return, and vice versa. So basically, you want to invest in stocks that have a negative correlation. You have shortlisted 2 stocks and have information on their historical returns for the last 15 years.
The correlation coefficient is calculated using the Excel CORREL() function.
Correlation Coefficient = -0.45986
Here we have used the CORREL() function of Excel to find the correlation coefficient for the 2 stocks. The correlation coefficient is negative, which means that the returns of the two stocks are negatively correlated. So your choice fits your requirements.
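Outside of Excel, the same check can be sketched in Python. The return figures below are hypothetical values invented for illustration (five years rather than the 15 used in this example), and the names `pearson`, `returns_a`, and `returns_b` are assumptions, so the result will not match -0.45986.

```python
import math

def pearson(x, y):
    """Correlation coefficient using the formula from this article."""
    xm = sum(x) / len(x)
    ym = sum(y) / len(y)
    numerator = sum((a - xm) * (b - ym) for a, b in zip(x, y))
    denominator = math.sqrt(
        sum((a - xm) ** 2 for a in x) * sum((b - ym) ** 2 for b in y)
    )
    return numerator / denominator

# Hypothetical annual returns (as fractions) for two stocks, illustration only
returns_a = [0.10, -0.05, 0.08, -0.02, 0.12]
returns_b = [-0.03, 0.06, -0.01, 0.04, -0.02]

r = pearson(returns_a, returns_b)
print(f"Correlation Coefficient = {r:.5f}")

# A negative coefficient suggests the two stocks tend to offset each other
if r < 0:
    print("Returns are negatively correlated; the pair helps diversification.")
else:
    print("Returns are not negatively correlated.")
```

In this made-up data, stock B gains in exactly the years when stock A is below its average, so the coefficient comes out negative.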
As discussed, the correlation coefficient measures the extent of the relationship between two variables, but the catch is that it can only measure a linear relationship. It is not effective at capturing nonlinear relationships. There are a few other properties of the correlation coefficient:
- The correlation coefficient is unitless. This is a very useful property since it allows you to compare data that have different units. For example, stock prices depend on various parameters like inflation and interest rates, which are measured in different units, so we can use public information to determine the correlation between them.
- As discussed above, its value lies between -1 and +1, where +1 means perfectly positively correlated and -1 means perfectly negatively correlated.
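These properties can be checked on small data sets. The Python sketch below uses hypothetical values chosen for illustration: a perfect but nonlinear relationship (Y equals X squared) that yields a coefficient of zero, and a change of units (dollars to cents, an assumed example) that leaves the coefficient unchanged.

```python
import math

def pearson(x, y):
    """Correlation coefficient using the formula from this article."""
    xm = sum(x) / len(x)
    ym = sum(y) / len(y)
    numerator = sum((a - xm) * (b - ym) for a, b in zip(x, y))
    denominator = math.sqrt(
        sum((a - xm) ** 2 for a in x) * sum((b - ym) ** 2 for b in y)
    )
    return numerator / denominator

# Linear only: Y is fully determined by X, yet the relationship is not linear
x = [-2, -1, 0, 1, 2]
y_squared = [v ** 2 for v in x]
r_nonlinear = pearson(x, y_squared)
print(f"Y = X^2:          r = {r_nonlinear:.4f}")  # 0.0000

# Unitless: converting X from dollars to cents does not change the result
x_dollars = [1, 2, 3, 4, 5]
x_cents = [100 * v for v in x_dollars]
y = [2, 1, 4, 3, 5]
r_dollars = pearson(x_dollars, y)
r_cents = pearson(x_cents, y)
print(f"X in dollars:     r = {r_dollars:.4f}")  # 0.8000
print(f"X in cents:       r = {r_cents:.4f}")    # 0.8000
```

The first case shows why a coefficient near zero should not be read as "no relationship", only as "no linear relationship".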
Relevance and Uses of Correlation Coefficient Formula
The correlation coefficient helps us better understand data sets and the relationships between them, and it has many applications in finance and economics. Financial institutions, banks, companies, and even governments use correlation coefficients to analyze historical data, extract meaningful information, and predict market trends. The correlation coefficient is a very powerful tool, but it should not be used in isolation; it should be applied along with other tools. The reason is simple: we cannot rely on data alone, because data sometimes gives us meaningless information. For example, suppose you have collected data and found a positive correlation between rainfall and the death of dogs, meaning that in years with more rain, more dogs died. This correlation is not meaningful at all, because there is no reason to think one causes the other. This is called a spurious correlation. So be very careful when making decisions based only on data.
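A spurious correlation is easy to reproduce with numbers. In the Python sketch below, both series are hypothetical figures invented for illustration; they simply both happen to rise over the same five years, which is enough to produce a coefficient close to +1.

```python
import math

def pearson(x, y):
    """Correlation coefficient using the formula from this article."""
    xm = sum(x) / len(x)
    ym = sum(y) / len(y)
    numerator = sum((a - xm) * (b - ym) for a, b in zip(x, y))
    denominator = math.sqrt(
        sum((a - xm) ** 2 for a in x) * sum((b - ym) ** 2 for b in y)
    )
    return numerator / denominator

# Hypothetical yearly figures, illustration only: both series trend upward
rainfall_mm = [800, 850, 900, 950, 1000]
dog_deaths = [120, 131, 139, 152, 160]

r = pearson(rainfall_mm, dog_deaths)
print(f"Correlation Coefficient = {r:.4f}")

# The coefficient is high, but it says nothing about whether rain causes
# the deaths; a shared upward trend is enough to produce it.
```

A high coefficient here reflects only that both series increased together, which is why the result should be read alongside other evidence.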