Need help plotting the count of two variables in a scatterplot and then fitting the line in R
I need help with all these questions, but specifically plotting the scatterplot and fitting the linear regression model.
- Filter out any zip code where the number of emergency visits was less than 20
- Plot the Count of influenza-like illness and/or pneumonia visits against Count of all emergency department visits
- Plot the line of best fit (linear regression) and the R-squared
- From the some.zips data set, aggregate the mean of ED visits by zip code.
Here is my code, but it is not working. I keep getting "Warning in abline(m) : only using the first two of 135 regression coefficients". Can someone help? The code is below, and the dataset is loaded with:
fromJSON("https://data.cityofnewyork.us/resource/2nwg-uqyg.json")
library(jsonlite)
library(tidyverse)
library(ALSM)
data(package="ALSM")
filtered_data = filter(er, emergency.visits > 20)
plot(ili_pne_visits~total_ed_visits,data=filtered_data,xlab="Total ER Visits",ylab="Influenza Visits")
m <-lm(ili_pne_visits~total_ed_visits,data=filtered_data)
abline(m)
Solution 1:[1]
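The warning most likely arises because the JSON columns are read in as character, so lm() treats total_ed_visits as a categorical predictor and fits one coefficient per distinct value; abline() can only use an intercept and one slope. Converting the columns to numbers first avoids this. Code-wise, this will do the job: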
library(jsonlite)
library(tidyverse)

df <- fromJSON("https://data.cityofnewyork.us/resource/2nwg-uqyg.json")

df %>%
  ## convert variables from character to numeric where appropriate:
  mutate(across(mod_zcta:ili_pne_admissions, ~ as.integer(.x))) %>%
  ## keep rows with at least 20 ED visits:
  filter(total_ed_visits >= 20) %>%
  ## the question asks for ILI/pneumonia visits, so plot ili_pne_visits:
  ggplot(aes(x = total_ed_visits, y = ili_pne_visits)) +
  geom_point() +
  ## add regression line and confidence band
  geom_smooth(method = 'lm')
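The question also asks for the R-squared, and the original attempt used base graphics with abline(). A minimal base-R sketch of that route follows. The `toy` data frame is a made-up stand-in (not the real data) that mimics character columns like those the JSON call returns, and `kept`, `m` and `r2` are illustrative names only.

```r
# Toy stand-in for the JSON result: all columns are character,
# mirroring how the counts arrive before conversion (illustrative values only).
toy <- data.frame(
  mod_zcta        = c("10001", "10002", "10003", "10004", "10005", "10006"),
  total_ed_visits = c("15", "40", "62", "85", "110", "130"),
  ili_pne_visits  = c("1", "4", "5", "9", "10", "14"),
  stringsAsFactors = FALSE
)

# Convert the counts to numbers before modelling
toy$total_ed_visits <- as.numeric(toy$total_ed_visits)
toy$ili_pne_visits  <- as.numeric(toy$ili_pne_visits)

# Keep rows with at least 20 ED visits
kept <- toy[toy$total_ed_visits >= 20, ]

# Simple linear regression: one intercept, one slope
m  <- lm(ili_pne_visits ~ total_ed_visits, data = kept)
r2 <- summary(m)$r.squared

plot(ili_pne_visits ~ total_ed_visits, data = kept,
     xlab = "Total ED visits", ylab = "ILI/pneumonia visits")
abline(m)  # the model has exactly two coefficients, so abline() can draw it
legend("topleft", legend = paste("R-squared =", round(r2, 3)), bty = "n")
```

With numeric columns, coef(m) has length two, which is what abline() expects; the same summary(m)$r.squared call should work on a model fitted to the real data once it has been converted.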
However, pouring the data indiscriminately into one scatterplot/linear model hides interesting patterns, e.g. seasonality. Plotting the share of ILI/pneumonia visits among total ED visits against time makes this visible:
library(lubridate) ## for easy date-time-manipulation

df %>%
  ## convert variables from character to numeric where appropriate:
  mutate(
    across(mod_zcta:ili_pne_admissions, ~ as.integer(.x)),
    date = lubridate::as_datetime(date),
    ili_pne_share = ili_pne_visits / total_ed_visits
  ) %>%
  ## keep rows with at least 20 ED visits:
  filter(total_ed_visits >= 20) %>%
  arrange(date) %>%
  ggplot(aes(x = date, y = ili_pne_share)) +
  geom_line() +
  geom_smooth(span = .1)
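The last item in the question, the mean of ED visits by zip code, could be sketched with base R's aggregate(). The `some.zips` data set from the question is not shown, so `toy_visits` below is a hypothetical stand-in with made-up values, assuming it has the same mod_zcta and total_ed_visits columns.

```r
# Hypothetical stand-in for some.zips (illustrative values only)
toy_visits <- data.frame(
  mod_zcta        = c("10001", "10001", "10002", "10002", "10003"),
  total_ed_visits = c(30, 50, 20, 40, 60)
)

# Mean of ED visits per zip code; one row per distinct mod_zcta
zip_means <- aggregate(total_ed_visits ~ mod_zcta, data = toy_visits, FUN = mean)
print(zip_means)
```

In a tidyverse pipeline, group_by(mod_zcta) followed by summarise(mean_visits = mean(total_ed_visits)) should give the same result.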
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | I_O |
