'NaN in cor() metric

I have data from compustat on different economic metrics for US companies as well as lobbying expenditure for these companies. I merge these datasets with left_join() function.

I have created two additional variables: "lobby_revt_pct" (percentage of lobbying expenditure of revenue) and "lobby_gp_pct" (percentage of lobbying expenditure of gross profit). Revenue and gross profit are in millions. I checked these variables manually, and it checks out (i.e. the lobby expenditure of e.g. revenue in % is correct).

I wish to see these variables in a correlation matrix. The other variables work, but the two aforementioned variables return NaN. Why? And is there a fix?

join_df <- left_join(compustat, lobby_expenditure, by = c("gvkey", "year"))

#new variable: lobbying expenditure as a percentage share
#of respective year revenue

join_df$lobby_revt_pct <- (join_df$expend/(join_df$revt*1000000))*100

join_df$lobby_gp_pct <- (join_df$expend/(join_df$gp*1000000))*100

#cor
cor(join_df, use = "pairwise.complete.obs"[sapply(join_df, is.numeric)])

Output:


                      gvkey         year        naics          at           gp
gvkey           1.000000000  0.145972467 -0.016623718 -0.05087810 -0.085729359
year            0.145972467  1.000000000 -0.027012382  0.03681725  0.051685347
naics          -0.016623718 -0.027012382  1.000000000  0.04463169  0.004457598
at             -0.050878102  0.036817255  0.044631687  1.00000000  0.594233227
gp             -0.085729359  0.051685347  0.004457598  0.59423323  1.000000000
lct            -0.056661336  0.051043277 -0.019868730  0.88155209  0.771837888
revt           -0.084994470  0.048538321 -0.010151030  0.43787682  0.827172011
tie            -0.115622243  0.003035108 -0.002584713  0.76935365  0.753452440
txpd            0.002665236  0.006650114 -0.001791597  0.02586675  0.049040643
xdp                      NA           NA           NA          NA           NA
xsga           -0.098379359  0.047337986  0.006583750  0.43630634  0.920886086
expend         -0.098455101  0.080394812  0.033842575  0.33230920  0.496191500
lobby_revt_pct          NaN          NaN          NaN         NaN          NaN
lobby_gp_pct            NaN          NaN          NaN         NaN          NaN
                       lct        revt          tie         txpd xdp        xsga
gvkey          -0.05666134 -0.08499447 -0.115622243  0.002665236  NA -0.09837936
year            0.05104328  0.04853832  0.003035108  0.006650114  NA  0.04733799
naics          -0.01986873 -0.01015103 -0.002584713 -0.001791597  NA  0.00658375
at              0.88155209  0.43787682  0.769353646  0.025866750  NA  0.43630634
gp              0.77183789  0.82717201  0.753452440  0.049040643  NA  0.92088609
lct             1.00000000  0.81128522           NA  0.053310015  NA  0.75583507
revt            0.81128522  1.00000000  0.891790266  0.057496412  NA  0.77383781
tie                     NA  0.89179027  1.000000000  0.586165935  NA  0.67626876
txpd            0.05331001  0.05749641  0.586165935  1.000000000  NA  0.02382979
xdp                     NA          NA           NA           NA  NA          NA
xsga            0.75583507  0.77383781  0.676268756  0.023829788  NA  1.00000000
expend          0.35901711  0.38344403  0.199782257  0.244935543  NA  0.46833965
lobby_revt_pct         NaN         NaN -0.006623683          NaN  NA         NaN
lobby_gp_pct           NaN         NaN -0.010564012          NaN  NA         NaN
                    expend lobby_revt_pct lobby_gp_pct
gvkey          -0.09845510            NaN          NaN
year            0.08039481            NaN          NaN
naics           0.03384257            NaN          NaN
at              0.33230920            NaN          NaN
gp              0.49619150            NaN          NaN
lct             0.35901711            NaN          NaN
revt            0.38344403            NaN          NaN
tie             0.19978226   -0.006623683  -0.01056401
txpd            0.24493554            NaN          NaN
xdp                     NA             NA           NA
xsga            0.46833965            NaN          NaN
expend          1.00000000            NaN          NaN
lobby_revt_pct         NaN            NaN          NaN
lobby_gp_pct           NaN            NaN          NaN
Warning message:
In if (is.na(na.method)) stop("invalid 'use' argument") :
  the condition has length > 1 and only the first element will be used

edit: @quinten : I used dput(join_df). Not sure if I did it correctly? I removed the first part due to space limitation. Most of it is NA anyway.

, row.names = c(NA, -214620L
), spec = structure(list(cols = list(gvkey = structure(list(), class = c("collector_number", 
"collector")), year = structure(list(), class = c("collector_number", 
"collector")), naics = structure(list(), class = c("collector_number", 
"collector")), at = structure(list(), class = c("collector_number", 
"collector")), gp = structure(list(), class = c("collector_number", 
"collector")), lct = structure(list(), class = c("collector_number", 
"collector")), revt = structure(list(), class = c("collector_number", 
"collector")), tie = structure(list(), class = c("collector_number", 
"collector")), txpd = structure(list(), class = c("collector_number", 
"collector")), xdp = structure(list(), class = c("collector_number", 
"collector")), xsga = structure(list(), class = c("collector_number", 
"collector"))), default = structure(list(), class = c("collector_guess", 
"collector")), delim = ","), class = "col_spec"), problems = <pointer: 0x559516ae34c0>, class = c("spec_tbl_df", 
"tbl_df", "tbl", "data.frame"))

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution	Source

'NaN in cor() metric

Sources

Related Questions