'PHPMyAdmin - Pearson's Correlation Coefficient
I need to translate the below formula into SQL using PHPMyAdmin for table 'Asia' containing two columns - SurfaceArea and Population.
[Pearson's Correlation Coefficient formula]

I got the below query but I get errors, is anyone able to let me know what is wrong in it, please?
SELECT (SUM(SurfaceArea - AVG(SurfaceArea)) * (Population - AVG(Population)) / STD ((SUM((SurfaceArea - AVG(SurfaceArea)) * (SurfaceArea - AVG(SurfaceArea)))) * (SUM((Population - AVG(Population))*(Population - AVG(Population)))) AS correlation_coefficient_population FROM Asia;
Many thanks
Solution 1:[1]
Your query will not work because a change was made in version 5.7-ish where it will now, by default, reject queries in which you aggregate using a function (sum, avg, max, etc.) in the SELECT clause and fail to put the non-aggregated fields in the GROUP BY clause. This behavior is part and parcel to every other RDBMS and MySQL is finally jumping on board.
However there is another way to get Pearson's coefficient. More on that here
Try this query:
SELECT (Avg(`Population` * `SurfaceArea`) - (Avg(`Population`) * Avg(`SurfaceArea`))) /
(STDDEV_POP(`Population`) * STD(`SurfaceArea`)) as PearsonCoefficient FROM `Asia`
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | zallura |
