Plotting row frequency of a column against the rows in R

I have a file with two columns, index and data, where index is of integer type and data is of string type (character type in R). The index values are unique, whereas data has many duplicate rows. The file has more than 2 million rows, so I can't inspect each unique value by printing the unique values to the console. How can I get the frequency of each unique value and plot it against the unique values themselves?

r


Solution 1:[1]

Assuming your data looks something like this:

df <- data.frame(
    index = 1:200,
    data = sample(letters, 200, replace = TRUE)
)

Using the table() function, you can easily get the frequency of each unique value:

> table(df$data)
 a  b  c  d  e  f  g  h  i  j  k  l  m  n  o  p  q  r  s  t  u  v 
 9  9  8 14  9  8  8  4  6  9  5 12  2 12  5  5  5  6  6  5 11 11 
 w  x  y  z 
 4  8  9 10

Alternatively, load the tidyverse and use dplyr and ggplot2 functions:

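library(tidyverse)

# Count the rows for each unique value of data; tally() names the count column n
freq <- df %>% 
  group_by(data) %>% 
  tally()

# Horizontal bar chart: frequency on the x axis, one bar per unique value
ggplot(freq, aes(x = n, y = data)) +
  geom_col()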

That will produce something like:

[Figure: barplot of frequencies]
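If the real data has far more unique values than fit on one axis, it may help to plot only the most frequent ones and order the bars by count. This is a rough sketch under that assumption, not part of the original answer; the cutoff `top_n_values` and the name `freq_top` are made up for illustration.

```r
library(dplyr)
library(ggplot2)

# Hypothetical example data, same shape as the df above
set.seed(42)
df <- data.frame(
  index = 1:200,
  data = sample(letters, 200, replace = TRUE)
)

# Hypothetical cutoff: how many of the most frequent values to keep
top_n_values <- 10

# count() groups and tallies in one step; sort = TRUE puts the largest counts first
freq_top <- df %>%
  count(data, sort = TRUE) %>%
  slice_head(n = top_n_values)

# reorder() sorts the bars by frequency instead of alphabetically
p <- ggplot(freq_top, aes(x = n, y = reorder(data, n))) +
  geom_col() +
  labs(x = "frequency", y = "data")
print(p)
```

Note that slice_head() keeps exactly the first rows after sorting, so values tied with the last kept count may be dropped.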

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 jmcastagnetto