Linear Log Model in R weird Regression Line?
I have the following Dataset in R.
> eh
Country PercentUrban GDPCapita UnemRate FSI HDI AvgHeight PEG AEG
1 USA 82 59.9 3.7 38.0 0.924 177.0 1 1
2 Canada 81 46.5 5.5 20.0 0.926 175.1 1 0
3 Australia 86 49.4 5.2 19.7 0.939 175.6 1 1
4 New Zealand 87 40.7 3.9 20.1 0.917 177.0 1 1
5 UK 83 44.9 3.9 36.7 0.922 175.3 0 0
6 Ireland 63 76.7 5.3 20.6 0.938 177.5 0 0
7 Iceland 94 55.3 4.4 19.8 0.935 181.0 1 1
8 Norway 82 62.2 3.8 18.0 0.953 179.7 1 1
9 Sweden 87 51.4 7.1 20.3 0.933 181.5 0 0
10 Finland 85 46.3 5.9 16.9 0.920 180.7 0 1
11 Denmark 88 54.3 3.8 19.5 0.929 180.4 1 1
12 Germany 77 52.6 3.1 24.7 0.936 178.1 0 0
13 France 80 44.0 8.5 32.0 0.901 175.6 0 0
14 Netherlands 91 54.4 3.5 24.8 0.931 180.8 0 0
15 Belgium 98 49.4 5.5 28.6 0.916 178.6 0 0
16 Luxembourg 91 107.6 5.3 20.4 0.904 179.9 1 1
17 Austria 58 53.9 6.7 25.0 0.908 179.0 1 0
18 Switzerland 74 66.3 2.1 18.7 0.944 175.4 1 1
19 Spain 80 39.0 14.1 40.7 0.891 174.2 0 0
20 Portugal 65 32.6 6.3 25.3 0.847 173.9 0 0
21 Ukraine 69 8.7 7.8 71.0 0.751 172.4 0 0
22 Russia 74 25.8 4.5 74.7 0.816 177.2 0 0
23 Italy 70 40.9 9.5 43.8 0.880 177.3 0 0
24 Slovenia 55 36.4 7.4 28.0 0.896 180.3 1 1
25 Slovakia 54 32.3 5.0 40.5 0.855 179.4 1 0
26 Czechia 74 38.0 2.7 37.6 0.888 180.2 0 1
27 Poland 60 29.9 5.2 42.8 0.865 178.7 0 0
28 Hungary 71 28.8 3.4 49.6 0.838 177.3 1 1
29 Romania 54 26.7 3.8 47.8 0.811 171.8 0 0
30 Bulgaria 75 20.9 5.3 50.6 0.813 175.2 1 1
31 Greece 79 28.6 16.9 53.9 0.870 176.9 0 0
32 Turkey 75 28.0 13.9 80.3 0.791 173.6 0 0
33 South Korea 81 38.8 3.4 33.7 0.903 173.5 1 1
34 Japan 92 42.1 2.2 34.3 0.909 172.1 1 1
35 South Africa 66 13.5 29.0 71.1 0.699 167.8 0 0
36 Nigeria 50 5.9 23.1 98.5 0.532 167.2 0 1
37 Brazil 87 15.6 11.8 71.8 0.759 172.5 1 1
38 Argentina 92 20.8 10.6 46.0 0.825 174.1 1 0
39 Indonesia 55 12.3 5.0 70.4 0.694 158.1 1 1
40 India 34 7.2 6.0 74.4 0.640 166.3 1 1
41 China 59 16.8 3.6 71.1 0.752 169.5 1 0
42 Egypt 43 11.6 7.5 88.4 0.696 170.3 0 0
43 Colombia 81 14.5 10.8 75.7 0.747 170.6 0 1
I'm trying to create a linear-log model with X being GDP per capita (GDPCapita) and Y being FSI. So far I have done:
> linlogmodel_eh<-lm(formula=eh$FSI~log(eh$GDPCapita))
> summary(linlogmodel_eh)
Call:
lm(formula = eh$FSI ~ log(eh$GDPCapita))
Residuals:
Min 1Q Median 3Q Max
-16.906 -6.335 -1.334 4.805 33.343
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 151.034 8.770 17.22 < 2e-16 ***
log(eh$GDPCapita) -31.234 2.491 -12.54 1.29e-15 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 10.54 on 41 degrees of freedom
Multiple R-squared: 0.7932, Adjusted R-squared: 0.7882
F-statistic: 157.3 on 1 and 41 DF, p-value: 1.287e-15
> plot(eh$GDPCapita, eh$FSI, xlim=c(3, 152), ylim=c(15, 100))
> abline(151.034, -31.234)
Unfortunately, when I plot the scatterplot and the regression line together, I get an oddly straight-looking regression line. Is this the correct line for this model? It looks very wrong visually.
Any advice on what I am doing wrong, or what I need to change to fix it?
Solution 1:[1]
abline(a, b) always draws the straight line y = a + b*x on the plot's untransformed axes; it doesn't know you've transformed your x-variable. You have y = a + b*log(x), so you need
curve(151.034+(-31.234)*log(x), add = TRUE)
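Rather than retyping rounded coefficients, the same curve can be drawn from the fitted model object. The following is only a sketch: to keep it runnable on its own it uses six rows copied from the question's data, and the names eh_sub, fit, a and b are made up for illustration.

```r
# Six rows copied from the question's data (USA, Canada, Australia,
# Ukraine, Nigeria, India); eh_sub is a hypothetical name for this subset.
eh_sub <- data.frame(
  GDPCapita = c(59.9, 46.5, 49.4, 8.7, 5.9, 7.2),
  FSI       = c(38.0, 20.0, 19.7, 71.0, 98.5, 74.4)
)

# Linear-log fit: FSI = a + b * log(GDPCapita)
fit <- lm(FSI ~ log(GDPCapita), data = eh_sub)
a <- unname(coef(fit)[1])  # intercept
b <- unname(coef(fit)[2])  # slope on log(GDPCapita)

# Scatterplot on the original (untransformed) x-axis
plot(eh_sub$GDPCapita, eh_sub$FSI, xlab = "GDPCapita", ylab = "FSI")

# curve() evaluates the expression over the plotted x-range, so the
# fitted relationship appears curved, unlike abline(a, b)
curve(a + b * log(x), add = TRUE)
```

Because the coefficients come straight from coef(fit), the curve stays in sync with the model if the data or the formula change.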
By the way, this is an unusual "log-linear" relationship. The more usual form (which gives rise to exponential curves) is log(y) = a + b*x, which is equivalent to y = exp(a)*exp(b*x)
Also, as a general best practice, I would recommend
lm(formula=log(FSI) ~ GDPCapita, data = eh)
(or FSI ~ log(GDPCapita) if you really want that version); using the data= argument makes your code easier to read and makes downstream methods like predict(), etc. more convenient.
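As a rough illustration of why data= helps downstream, the sketch below fits the FSI ~ log(GDPCapita) version and uses predict() with newdata to draw the fitted curve. It assumes a small data frame built from six rows of the question's data; eh_sub, fit and grid are hypothetical names. With a formula written as eh$FSI ~ log(eh$GDPCapita), predict() cannot match the columns of newdata to the model terms, so this pattern would not work as intended.

```r
# Six rows copied from the question's data; eh_sub is a hypothetical name.
eh_sub <- data.frame(
  GDPCapita = c(59.9, 46.5, 49.4, 8.7, 5.9, 7.2),
  FSI       = c(38.0, 20.0, 19.7, 71.0, 98.5, 74.4)
)

# Variables are referenced by bare column name and looked up in `data`
fit <- lm(FSI ~ log(GDPCapita), data = eh_sub)

# Evenly spaced GDPCapita values spanning the observed range
grid <- data.frame(
  GDPCapita = seq(min(eh_sub$GDPCapita), max(eh_sub$GDPCapita),
                  length.out = 100)
)

# predict() applies log() to the new GDPCapita values automatically
grid$FSI_hat <- predict(fit, newdata = grid)

plot(FSI ~ GDPCapita, data = eh_sub)
lines(grid$GDPCapita, grid$FSI_hat)
```

The same grid approach works for the log(FSI) ~ GDPCapita version, with the extra step of applying exp() to the predictions to get back to the FSI scale.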
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Ben Bolker |

