NAs introduced by coercion when using as.numeric
library(curl)   # curl()
library(XML)    # readHTMLTable()
library(dplyr)  # select(), mutate(), %>%

theurl <- "https://cryptoslam.io/#sales-rankings-24h"
url <- curl(theurl, "rb")
urldata <- readLines(url, warn = FALSE)
data <- readHTMLTable(urldata, stringsAsFactors = FALSE)
close(url)
data.2 <- data.frame(Reduce(rbind, data[1]))
data.3 <- data.2 %>%
  dplyr::select(Collection, Sales, Change..24h.) %>%
  head(10) %>%
  mutate(Sales.numeric = as.numeric(gsub('[$,]', '', Sales))) %>%
  mutate(Change.numeric = as.numeric(gsub('%', '', Change..24h.)))
I keep getting the warning "NAs introduced by coercion" for Change.numeric. Even though I have removed the % sign from the column, as.numeric still returns NA and I am unable to convert the column to numeric.
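A quick way to find out which values trigger this warning is to compare the NAs before and after the conversion. The following is only a minimal sketch in base R; the vector `values` is made-up sample data, not the scraped column.

```r
# Made-up sample data: one entry cannot be parsed, one is already missing
values <- c("33.87", "27.42x", NA, "70.91")

# Convert quietly, then flag entries that became NA only through coercion
converted <- suppressWarnings(as.numeric(values))
bad <- which(is.na(converted) & !is.na(values))

# Show the offending strings so they can be inspected
print(values[bad])
```

Printing the offending strings (or passing them to `dput`) usually makes stray characters visible.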
Solution 1:[1]
We may use parse_number from readr, which extracts the number and ignores the non-numeric characters around it
library(dplyr)
data.2 %>%
dplyr::select(Collection, Sales, Change..24h.) %>%
head(10) %>%
mutate(Sales.numeric = as.numeric(gsub('[$,]', '', Sales))) %>%
mutate(Change.numeric = readr::parse_number(Change..24h.))
-output
Collection Sales Change..24h. Sales.numeric Change.numeric
1 Bored Ape Yacht ClubBored Ape YC $9,241,122 33.87% 9241122 33.87
2 Mutant Ape Yacht ClubMutant Ape Yacht Club $8,068,976 27.42% 8068976 27.42
3 CryptoPunksCryptoPunks $3,067,042 70.91% 3067042 70.91
4 CloneXCloneX $2,781,643 41.75% 2781643 41.75
5 RTFKT MNLTHRTFKT MNLTH $2,478,028 29.55% 2478028 29.55
6 AzukiAzuki $2,418,388 30.29% 2418388 30.29
7 CrabadaCrabada $2,128,350 20.20% 2128350 20.20
8 Bored Ape Kennel ClubBored Ape Kennel Club $2,112,681 2.23% 2112681 2.23
9 World Of WomenWorld Of Women $1,703,430 41.22% 1703430 41.22
10 NBA Top ShotNBA Top Shot $1,695,039 73.66% 1695039 73.66
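The reason is that there is a whitespace character before the number, and this prevents the conversion to numeric. Since as.numeric tolerates ordinary leading spaces, it is most likely a non-standard one such as a non-breaking space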
> data.2 %>%
dplyr::select(Collection, Sales, Change..24h.) %>%
head(10) %>%
mutate(Sales.numeric = as.numeric(gsub('[$,]', '', Sales))) %>%
pull(Change..24h.)
[1] " 33.87%" " 27.42%" " 70.91%" " 41.75%" " 29.55%" " 30.29%" " 20.20%" " 2.23%" " 41.22%" " 73.66%"
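To check what the leading character in the output above really is, one can look at its code point. This is a sketch under an assumption: the sample string `x` is made up and uses a non-breaking space (U+00A0) as the leading character; the scraped column may contain a different one.

```r
# Hypothetical sample: assume the leading character is a non-breaking space (U+00A0)
x <- "\u00a033.87%"

# Code point of the first character: 160 is U+00A0, a plain space would be 32
first_code <- utf8ToInt(x)[1]
print(first_code)

# For comparison, an ordinary leading space does not block the conversion
plain <- as.numeric(" 33.87")
print(plain)
```

Running `utf8ToInt` on one of the real values from `Change..24h.` would show whether the assumption holds for the scraped data.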
So, if we remove the space, it should work. The pattern below deletes every character that is not a digit or a decimal point
data.2 %>%
dplyr::select(Collection, Sales, Change..24h.) %>%
head(10) %>%
mutate(Sales.numeric = as.numeric(gsub('[$,]', '', Sales))) %>%
mutate(Change.numeric = as.numeric(gsub("[^0-9.]+", "", Change..24h.)))
-output
Collection Sales Change..24h. Sales.numeric Change.numeric
1 Bored Ape Yacht ClubBored Ape YC $9,241,122 33.87% 9241122 33.87
2 Mutant Ape Yacht ClubMutant Ape Yacht Club $8,068,976 27.42% 8068976 27.42
3 CryptoPunksCryptoPunks $3,067,042 70.91% 3067042 70.91
4 CloneXCloneX $2,781,643 41.75% 2781643 41.75
5 RTFKT MNLTHRTFKT MNLTH $2,478,028 29.55% 2478028 29.55
6 AzukiAzuki $2,418,388 30.29% 2418388 30.29
7 CrabadaCrabada $2,128,350 20.20% 2128350 20.20
8 Bored Ape Kennel ClubBored Ape Kennel Club $2,112,681 2.23% 2112681 2.23
9 World Of WomenWorld Of Women $1,703,430 41.22% 1703430 41.22
10 NBA Top ShotNBA Top Shot $1,695,039 73.66% 1695039 73.66
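One caveat with the pattern "[^0-9.]+" is that it also removes a minus sign, so a negative change such as "-12.50%" would come out positive. Below is a minimal sketch of a variant that keeps the sign; the helper name `pct_to_numeric` and the sample values are made up for illustration.

```r
# Hypothetical helper: strip everything except digits, the decimal point and a minus sign
pct_to_numeric <- function(x) {
  as.numeric(gsub("[^0-9.-]+", "", x))
}

# Made-up sample values; the first one starts with a non-breaking space (U+00A0)
changes <- c("\u00a033.87%", " 2.23%", "-12.50%")
result <- pct_to_numeric(changes)
print(result)
```

Inside the pipeline this could be used as `mutate(Change.numeric = pct_to_numeric(Change..24h.))`.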
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 |
