
Correlation

If you think back to our discussion of the coefficient of variation, you will notice a similarity here: both are ratios that remove the units of measurement.

The correlation coefficient is calculated by dividing the covariance by the product of the two variables' standard deviations.

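Here is the formula:

$$ r = \frac{\mathrm{Cov}(x, y)}{s_x \, s_y} $$

where $s_x$ and $s_y$ are the standard deviations of the two variables.
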

Let's find the standard deviation for the temperature and ice cream sales. Here is the data table again for your reference.

| Day | Max temperature (°C) | Number of ice cream units sold |
|---|---|---|
| Sunday | 23 | 50 |
| Monday | 25 | 70 |
| Tuesday | 21 | 35 |
| Wednesday | 23 | 45 |
| Thursday | 26 | 66 |
| Friday | 28 | 80 |
| Saturday | 26 | 58 |

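The standard deviation of each variable, computed with the sample (n - 1) divisor, works out to about 2.37 (temperature) and 15.56 (sales).

Multiplying the two gives 36.89.

To calculate the coefficient of correlation, we divide the covariance by this product. Both must use the same divisor: the covariance of 29.59 divides by n = 7, so with the n - 1 divisor it becomes 34.52.

Dividing 34.52 by 36.89 gives about 0.94. (Dividing 29.59 by 36.89 would give 0.8, but that mixes the two conventions and understates the correlation.)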
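
The calculation above can be sketched in Python. This is a minimal illustration, not a reference implementation: the function names and the `ddof` parameter are hypothetical helpers, and the sample (n - 1) divisor is assumed by default.

```python
from math import sqrt
from statistics import mean

# Data from the table above, Sunday to Saturday
temperature = [23, 25, 21, 23, 26, 28, 26]   # max temperature (°C)
units_sold = [50, 70, 35, 45, 66, 80, 58]    # ice cream units sold

def covariance(x, y, ddof=1):
    """Covariance with divisor n - ddof (ddof=1: sample, ddof=0: population)."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - ddof)

def std_dev(x, ddof=1):
    """Standard deviation as the square root of a variable's covariance with itself."""
    return sqrt(covariance(x, x, ddof))

def correlation(x, y, ddof=1):
    """Covariance divided by the product of the two standard deviations."""
    return covariance(x, y, ddof) / (std_dev(x, ddof) * std_dev(y, ddof))

s_t = std_dev(temperature)
s_u = std_dev(units_sold)
r = correlation(temperature, units_sold)

print(round(s_t, 2), round(s_u, 2), round(s_t * s_u, 2))  # 2.37 15.56 36.89
print(round(covariance(temperature, units_sold), 2))      # 34.52
print(round(r, 2))                                        # 0.94
```

Passing `ddof=0` everywhere (the n divisor, which yields a covariance of 29.59) gives the same coefficient, because the divisor cancels between numerator and denominator. Mixing the two divisors does not cancel and gives a different, smaller value.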

The interpretation for the Coefficient of correlation is:

A correlation of 1 indicates a perfect positive correlation, and -1 a perfect negative correlation; in both cases, all of the variability in one variable is explained by a linear relationship with the other. The closer the value is to 1 or -1, the stronger the relationship; the closer it is to 0, the weaker.
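
To see the two extremes in numbers, here is a small sketch with made-up data (the lists `x`, `rising` and `falling` and the `correlation` helper are illustrative assumptions, not part of the ice cream example). When one variable is an exact straight-line function of the other, the coefficient lands on 1 or -1 depending on the slope's sign.

```python
from math import sqrt
from statistics import mean

def correlation(x, y):
    """Correlation from sums of squares; the divisor cancels, so none is needed."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

x = [1, 2, 3, 4, 5]
rising = [2 * v + 3 for v in x]     # exact line with positive slope
falling = [-2 * v + 3 for v in x]   # exact line with negative slope

print(round(correlation(x, rising), 6))    # 1.0
print(round(correlation(x, falling), 6))   # -1.0
```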

We have already seen an example of a positive correlation in our ice cream sales example. Can you think of an example of a negative correlation?

One commonly cited example: as economies do well (indicated by growth in GDP), gold prices tend to go down.

A correlation of 0 shows that there is no linear relationship between the variables; it does not by itself prove that they are independent. Of course, when the covariance is zero, the correlation is zero as well, so there is no need to calculate the coefficient any further.
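
One caveat is worth checking numerically: a correlation of 0 only rules out a straight-line relationship. The sketch below uses made-up data in which `y` is completely determined by `x` (as its square), yet the coefficient is 0. The data and the `correlation` helper are illustrative assumptions.

```python
from math import sqrt
from statistics import mean

def correlation(x, y):
    """Correlation from sums of squares; the divisor cancels, so none is needed."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

x = [-3, -2, -1, 0, 1, 2, 3]
y = [v ** 2 for v in x]    # y depends entirely on x, but not linearly

print(correlation(x, y))   # 0.0
```

The positive and negative halves of the curve cancel each other out in the covariance, which is why the coefficient misses the relationship.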

Causality has a direction; correlation doesn't.

Clearly, a change in temperature causes a change in ice cream sales; it is not the other way around. The correlation, however, has no such direction: it is the same whichever variable you list first.
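
The lack of direction can be checked with the ice cream data. In this sketch (the `correlation` helper is a hypothetical illustration), swapping the two arguments leaves the coefficient unchanged, so the number alone cannot tell you which variable drives the other.

```python
from math import sqrt
from statistics import mean

def correlation(x, y):
    """Correlation from sums of squares; the divisor cancels, so none is needed."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

temperature = [23, 25, 21, 23, 26, 28, 26]   # max temperature (°C)
units_sold = [50, 70, 35, 45, 66, 80, 58]    # ice cream units sold

r_forward = correlation(temperature, units_sold)
r_reverse = correlation(units_sold, temperature)

print(round(r_forward, 4), round(r_reverse, 4))   # same value both ways
```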

Correlation is not Causation.

It is important to understand that just because two variables are related, it does not mean that one is causing the change in the other.

For example, when data shows a correlation between two seemingly unrelated variables, such as fuel prices and rainfall in a region, you should not attribute causation unless there is a practical explanation that connects the two.