Linear-log model in R: weird regression line?

I have the following Dataset in R.

> eh
        Country PercentUrban GDPCapita UnemRate  FSI   HDI AvgHeight PEG AEG
1           USA           82      59.9      3.7 38.0 0.924     177.0   1   1
2        Canada           81      46.5      5.5 20.0 0.926     175.1   1   0
3     Australia           86      49.4      5.2 19.7 0.939     175.6   1   1
4   New Zealand           87      40.7      3.9 20.1 0.917     177.0   1   1
5            UK           83      44.9      3.9 36.7 0.922     175.3   0   0
6       Ireland           63      76.7      5.3 20.6 0.938     177.5   0   0
7       Iceland           94      55.3      4.4 19.8 0.935     181.0   1   1
8        Norway           82      62.2      3.8 18.0 0.953     179.7   1   1
9        Sweden           87      51.4      7.1 20.3 0.933     181.5   0   0
10      Finland           85      46.3      5.9 16.9 0.920     180.7   0   1
11      Denmark           88      54.3      3.8 19.5 0.929     180.4   1   1
12      Germany           77      52.6      3.1 24.7 0.936     178.1   0   0
13       France           80      44.0      8.5 32.0 0.901     175.6   0   0
14  Netherlands           91      54.4      3.5 24.8 0.931     180.8   0   0
15      Belgium           98      49.4      5.5 28.6 0.916     178.6   0   0
16   Luxembourg           91     107.6      5.3 20.4 0.904     179.9   1   1
17      Austria           58      53.9      6.7 25.0 0.908     179.0   1   0
18  Switzerland           74      66.3      2.1 18.7 0.944     175.4   1   1
19        Spain           80      39.0     14.1 40.7 0.891     174.2   0   0
20     Portugal           65      32.6      6.3 25.3 0.847     173.9   0   0
21      Ukraine           69       8.7      7.8 71.0 0.751     172.4   0   0
22       Russia           74      25.8      4.5 74.7 0.816     177.2   0   0
23        Italy           70      40.9      9.5 43.8 0.880     177.3   0   0
24     Slovenia           55      36.4      7.4 28.0 0.896     180.3   1   1
25     Slovakia           54      32.3      5.0 40.5 0.855     179.4   1   0
26      Czechia           74      38.0      2.7 37.6 0.888     180.2   0   1
27       Poland           60      29.9      5.2 42.8 0.865     178.7   0   0
28      Hungary           71      28.8      3.4 49.6 0.838     177.3   1   1
29      Romania           54      26.7      3.8 47.8 0.811     171.8   0   0
30     Bulgaria           75      20.9      5.3 50.6 0.813     175.2   1   1
31       Greece           79      28.6     16.9 53.9 0.870     176.9   0   0
32       Turkey           75      28.0     13.9 80.3 0.791     173.6   0   0
33  South Korea           81      38.8      3.4 33.7 0.903     173.5   1   1
34        Japan           92      42.1      2.2 34.3 0.909     172.1   1   1
35 South Africa           66      13.5     29.0 71.1 0.699     167.8   0   0
36      Nigeria           50       5.9     23.1 98.5 0.532     167.2   0   1
37       Brazil           87      15.6     11.8 71.8 0.759     172.5   1   1
38    Argentina           92      20.8     10.6 46.0 0.825     174.1   1   0
39    Indonesia           55      12.3      5.0 70.4 0.694     158.1   1   1
40        India           34       7.2      6.0 74.4 0.640     166.3   1   1
41        China           59      16.8      3.6 71.1 0.752     169.5   1   0
42        Egypt           43      11.6      7.5 88.4 0.696     170.3   0   0
43     Colombia           81      14.5     10.8 75.7 0.747     170.6   0   1

I'm trying to fit a linear-log model with GDP per capita (GDPCapita) as X and FSI as Y. So far I have done:

> linlogmodel_eh<-lm(formula=eh$FSI~log(eh$GDPCapita))
> summary(linlogmodel_eh)

Call:
lm(formula = eh$FSI ~ log(eh$GDPCapita))

Residuals:
    Min      1Q  Median      3Q     Max 
-16.906  -6.335  -1.334   4.805  33.343 

Coefficients:
                  Estimate Std. Error t value Pr(>|t|)    
(Intercept)        151.034      8.770   17.22  < 2e-16 ***
log(eh$GDPCapita)  -31.234      2.491  -12.54 1.29e-15 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 10.54 on 41 degrees of freedom
Multiple R-squared:  0.7932,    Adjusted R-squared:  0.7882 
F-statistic: 157.3 on 1 and 41 DF,  p-value: 1.287e-15

> plot(eh$GDPCapita, eh$FSI, xlim=c(3, 152), ylim=c(15, 100))
> abline(151.034, -31.234)

Unfortunately, when I plot the scatterplot and the regression line together, I get an oddly straight-looking regression line. Is this the correct line? It looks very wrong visually.

Any advice on what I am doing wrong? I'm not sure what the problem is or how to fix it.



Solution 1:[1]

abline() draws a straight line with the given intercept and slope on the plot's own axes; it doesn't know you've log-transformed your x-variable. Your model is y = a + b*log(x), so you need

curve(151.034+(-31.234)*log(x), add = TRUE)
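Rather than retyping the rounded estimates, the coefficients can be read off the fitted model. A minimal sketch, assuming ten rows copied from the question's table stand in for the full `eh` data frame; the names `fit`, `a`, and `b` are illustrative only:

```r
# Ten rows copied from the question's table (assumption: a stand-in for the full data)
eh <- data.frame(
  GDPCapita = c(59.9, 46.5, 76.7, 107.6, 8.7, 25.8, 5.9, 7.2, 16.8, 11.6),
  FSI       = c(38.0, 20.0, 20.6, 20.4, 71.0, 74.7, 98.5, 74.4, 71.1, 88.4)
)

# Linear-log model: FSI = a + b * log(GDPCapita)
fit <- lm(FSI ~ log(GDPCapita), data = eh)

# Pull the estimates out by name instead of hard-coding them
a <- coef(fit)[["(Intercept)"]]
b <- coef(fit)[["log(GDPCapita)"]]

# Scatterplot on the original x scale, then the fitted curve on top
plot(FSI ~ GDPCapita, data = eh)
curve(a + b * log(x), add = TRUE)
```

With `add = TRUE`, `curve()` evaluates the expression over the current plot's x range, so the line appears curved on the untransformed GDP axis.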

By the way, this is an unusual "log-linear" relationship. The more usual form (which gives rise to exponential curves) is log(y) = a + b*x, i.e. y = exp(a)*exp(b*x)

Also, as a general best practice, I would recommend

lm(formula=log(FSI) ~ GDPCapita, data = eh)

(or FSI ~ log(GDPCapita) if you really want that version); using the data= argument makes your code easier to read and makes downstream methods like predict(), etc. more convenient.
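As one illustration of that convenience, a model fitted with `data =` can be given new x values through `predict()`. This is only a sketch, assuming ten rows copied from the question's table stand in for `eh`; `fit_loglin` and `grid` are hypothetical names. It uses the `log(FSI) ~ GDPCapita` form, so the predictions are back-transformed with `exp()`:

```r
# Ten rows copied from the question's table (assumption: a stand-in for the full data)
eh <- data.frame(
  GDPCapita = c(59.9, 46.5, 76.7, 107.6, 8.7, 25.8, 5.9, 7.2, 16.8, 11.6),
  FSI       = c(38.0, 20.0, 20.6, 20.4, 71.0, 74.7, 98.5, 74.4, 71.1, 88.4)
)

# Log-linear model: log(FSI) = a + b * GDPCapita
fit_loglin <- lm(log(FSI) ~ GDPCapita, data = eh)

# Evenly spaced GDP values; the column name must match the formula's variable
grid <- data.frame(
  GDPCapita = seq(min(eh$GDPCapita), max(eh$GDPCapita), length.out = 200)
)

# predict() returns log(FSI), so exponentiate to get back to the FSI scale
grid$FSI_hat <- exp(predict(fit_loglin, newdata = grid))

plot(FSI ~ GDPCapita, data = eh)
lines(grid$GDPCapita, grid$FSI_hat)
```

The same pattern should work for `FSI ~ log(GDPCapita)` without the `exp()` step. A formula written as `eh$FSI ~ log(eh$GDPCapita)` would not pick up the values in `newdata`, because its variables are tied to the original data frame.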

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution     Source
Solution 1   Ben Bolker