Struggling to create a box plot, histogram, and qqplot in R [closed]

I am a very new R user, and I am trying to use R to create a box plot of prices at Target vs. Walmart. I also want to create two histograms, one for the prices at each store, as well as QQ plots. I keep getting various errors, including "Error in hist.default(mydata) : 'x' must be numeric" from hist(mydata) and "Error in x[floor(d)] + x[ceiling(d)] : non-numeric argument to binary operator" from boxplot(mydata). I have loaded my CSV file correctly, and I am attaching my data for clarity. I have also added a direct copy and paste of some of my code. I have tried hist(mydata), boxplot(mydata), and qqplot(mydata) as well, all of which have returned with the "x is not numeric" error. I'm sorry if any of this is dumb; I am extremely new to R, not to mention extremely bad at it. Thank you all for your help!

#[Workspace loaded from ~/.RData]
mydata <- read.csv(file.choose(), header = T) names(mydata)
#Error: unexpected symbol in "  mydata <- read.csv(file.choose(), header = T) names"
mydata <- read.csv(file.choose(), header = T)
names(mydata)
#[1] "Product" "Walmart" "Target" 
mydata
                                                   Product
1  Sara lee artesano bread
2  Store brand dozen large eggs
3  Store brand 2% milk 1 gallon (128 fl oz)
4   12.4 oz cheez its
5   Ritz cracker fresh stacks 8ct, 11.8 oz
6  Sabra classic hummus 10 oz
7   Oreo chocolate sandwich cookies 14.3 oz
8   Motts applesauce 6 ct/4oz cups
9   Bananas (each)
10  Hass Avocado (each)
11  Chips ahoy original family size, 18.2 oz
12  Lays potato chips party size, 13 oz
13  Amy’s frozen mexican casserole, 9.5 oz
14  Jack’s frozen pizza original thin crust, 13.8 oz
15 Store brand sweet cream unsalted butter, 4 count, 16 oz
16 Sour cream and onion pringles, 5.5 oz
17 Philadelphia original cream cheese spread, 8 oz
18 Daisy sour cream, regular, 16 oz: 
19 Kraft singles, 24 ct/16 oz: 
20 Doritos nacho cheese, party size, 14.5 oz
21 Tyson Fun Chicken nuggets, 1.81 lb (29 oz), frozen
22 Kraft mac n cheese original, 7.25 oz
23 appleapple gogo squeeze, 12ct, 3.2 oz each 
24 Yoplait original french vanilla yogurt, 6oz
25 Essentia bottled water, 1 liter
26 Premium oyster crackers, 9oz
27 Aunt Jemima buttermilk pancake miz, 32 oz
28 Eggo frozen homestyle waffles, 10ct/12.3 oz
29  Kellogg's Froot Loops, 10.1 oz
30 Tostitos scoops tortilla chips, 10 oz
   Walmart Target
1     2.98   2.99
2     1.93   1.99
3     2.92   2.99
4     3.14   3.19
5     3.28   3.29
6     3.68   3.69
7     3.48   3.39
8     2.26   2.29
9     0.17   0.25
10    1.18   1.19
11    3.98   4.49
12    4.48   4.79
13    4.58   4.59
14    3.42   3.59
15    3.18   2.99
16    1.78   1.79
17    3.24   3.39
18    1.94   2.29
19    4.18   4.39
20    4.48   4.79
21    6.42   6.69
22    1.00   0.99
23    5.98   6.49
24    0.56   0.69
25    1.88   1.99
26    3.12   2.99
27    2.64   2.79
28    2.63   2.69
29    2.98   2.99
30    3.48   3.99
hist(mydata)
#Error in hist.default(mydata) : 'x' must be numeric
x<-sample(LETTERS[1:5],20,replace=TRUE)
df<-data.frame(x)
df
   x
1  E
2  B
3  A
4  B
5  E
6  B
7  A
8  A
9  C
10 E
11 A
12 B
13 A
14 B
15 C
16 D
17 C
18 E
19 A
20 D
x<-sample(LETTERS[1:5],20,replace=TRUE)
df<-data.frame(x)
hist(df$x)
#Error in hist.default(df$x) : 'x' must be numeric
x<-sample(LETTERS[1:5],20,replace=TRUE)
df<-data.frame(x)
barplot(table(df$x))
boxplot(mydata)
#Error in x[floor(d)] + x[ceiling(d)] :
#   non-numeric argument to binary operator
qqplot("Walmart")
#Error in sort(y) : argument "y" is missing, with no default
qqplot(mydata)
#Error in `[.data.frame`(x, order(x, na.last = na.last, decreasing = decreasing)) : 
#  undefined columns selected
#In addition: Warning message:
#In xtfrm.data.frame(x) : cannot xtfrm data frames



Solution 1:[1]

There seems to be a problem with the data you uploaded, but no matter. I will just create data resembling your problem and show you how to do it with some simple code. (Some may offer alternatives like ggplot, but I think my example uses shorter code and is more intuitive.)
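As a side note, the errors in the question look like what happens when a whole data frame that contains a text column is handed to hist() or boxplot(). The following is only a minimal sketch: it assumes the column names printed in the question (Product, Walmart, Target) and uses the first three rows of that printout as a stand-in for the full CSV.

```r
# First three rows of the data printed in the question (stand-in for the full CSV)
mydata <- data.frame(
  Product = c("Sara lee artesano bread",
              "Store brand dozen large eggs",
              "Store brand 2% milk 1 gallon (128 fl oz)"),
  Walmart = c(2.98, 1.93, 2.92),
  Target  = c(2.99, 1.99, 2.99)
)

# Check which columns are numeric; Product is text, so it cannot be plotted as a number
numeric.cols <- sapply(mydata, is.numeric)
print(numeric.cols)

# Keep only the numeric columns for plotting
prices <- mydata[, numeric.cols]
str(prices)
```

If this is indeed the cause, passing a single numeric column such as mydata$Walmart, rather than mydata itself, should avoid the "'x' must be numeric" error.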

First, we can load ggpubr for plotting functions:

# Load ggpubr for plotting functions:
library(ggpubr)

Then we can create the price and store values and combine them into a data frame we can use:

# Create price values and store values:
prices.1 <- c(1,2,3,4,5,3)
prices.2 <- c(8,6,4,2,0,1)
store <- c("walmart",
           "walmart",
           "walmart",
           "target",
           "target",
           "target")

# Create dataframe for these values:
store.data <- data.frame(prices.1,
                         prices.2,
                         store)
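The toy data above keeps the store name in its own column, whereas the data in the question has one price column per store. One possible way to get from that layout to this one is base R's stack(). This is a sketch only: the four rows are copied from the question's printout, and the names wide, long, price, and store are made up for illustration.

```r
# A few rows of the prices printed in the question (wide layout: one column per store)
wide <- data.frame(
  Walmart = c(2.98, 1.93, 2.92, 3.14),
  Target  = c(2.99, 1.99, 2.99, 3.19)
)

# stack() puts all prices in one column ("values") and the source column name in another ("ind")
long <- stack(wide)
names(long) <- c("price", "store")

head(long)
```

A data frame shaped like long can then be passed to the plotting functions with x = "store" and y = "price".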

Now we can plug our data into each of these plots in nearly the same way each time. The first part of the code is the plot function name, the data argument is our stored data frame, and the x and y arguments are the names of the columns we use as variables:

# Scatterplot:
ggscatter(data = store.data,
          x="prices.1",
          y="prices.2")


# Boxplot:
ggboxplot(data = store.data,
          x="store",
          y="prices.1")


# Histogram:
gghistogram(data = store.data,
            x="prices.1")


# QQ Plot:
ggqqplot(data = store.data,
         x="prices.1")


There are simpler alternatives, such as base R functions like this one, but I find them much harder to customize than ggpubr and ggplot:

plot(store.data$prices.1, store.data$prices.2)
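For completeness, base R counterparts of the other plots could look like the sketch below. It recreates the toy store.data frame so the block runs on its own. Note that base R's qqnorm() draws the one-sample normal QQ plot, while qqplot() expects two samples, which is why calling it with a single argument complains that "y" is missing.

```r
# Recreate the toy data from above so this block runs on its own
store.data <- data.frame(
  prices.1 = c(1, 2, 3, 4, 5, 3),
  prices.2 = c(8, 6, 4, 2, 0, 1),
  store    = c("walmart", "walmart", "walmart", "target", "target", "target")
)

# Boxplot of prices split by store (formula interface: numeric ~ group)
boxplot(prices.1 ~ store, data = store.data)

# Histogram of a single numeric column
hist(store.data$prices.1)

# Normal QQ plot of a single numeric column, with a reference line
qqnorm(store.data$prices.1)
qqline(store.data$prices.1)
```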


Of course, you can customize the ggpubr and ggplot output to look much better, but that's up to you and what you want to learn:

ggboxplot(data = store.data,
          x="store",
          y="prices.1",
          fill = "store",
          title = "Prices of Merchandise by Store",
          caption = "*Data obtained from Stack Overflow",
          palette = "jco",
          legend = "none",
          xlab ="Store Name",
          ylab = "Prices of Merchandise",
          ggtheme = theme_pubclean())


Hope that's helpful. Let me know if you have questions!

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1