Running sentiment analysis on Google News headlines: error while using udpipe

Here is my code so far:

pacman::p_load(dplyr, ggplot2, stringr, udpipe, lattice)
gnewsheadlines <- read.csv(file.choose(), stringsAsFactors = FALSE)

udmodel_english <- udpipe_load_model(file = "C:/Users/Palam/Documents/english-ewt-ud-2.5-191206.udpipe")

Step 2 – count the number of total headlines by date and plot the results to examine

headlinegoogle <- gnewsheadlines %>% filter(date >= "3/31/2022 ", date <= "4/3/2022")

s <- udpipe_annotate(udmodel_english,headlinegoogle$headline)
x <- data.frame(s)

This is the error I got while running udpipe_annotate:

Error in `[.data.table`(out, , `:=`(c("token_id", "token", "lemma", "upos",  :
Supplied 10 columns to be assigned an empty list (which may be an empty data.table or data.frame since they are lists too). To delete multiple columns use NULL instead. To add multiple empty list columns, use list(list()).

In addition: Warning message:

In strsplit(x$conllu, "\\n", fixed = TRUE) : input string 1 is invalid UTF-8


Solution 1:[1]

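It looks like headlinegoogle$headline is not in UTF-8 encoding, which udpipe_annotate expects for its text input. The warning "input string 1 is invalid UTF-8" points to this, and the data.table error is most likely a consequence of it. Convert the headlines to UTF-8 before annotating. See https://cran.r-project.org/web/packages/udpipe/vignettes/udpipe-tryitout.html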
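As a minimal sketch of that conversion in base R: the sample vector below is hypothetical and stands in for headlinegoogle$headline, and the source encoding "latin1" is an assumption (a common case for CSV files saved on Windows), so adjust `from` to match how the file was actually saved.

```r
# Hypothetical sample standing in for headlinegoogle$headline.
# The second entry contains the single byte 0xE9 (Latin-1 "e acute"),
# which is not valid UTF-8 on its own.
cafe_latin1 <- rawToChar(as.raw(c(0x43, 0x61, 0x66, 0xe9)))
headlines <- c("Markets rally after report",
               paste(cafe_latin1, "chain expands"))

# validUTF8() flags the entries that are not valid UTF-8
bad <- !validUTF8(headlines)
print(which(bad))

# Assumption: the text is Latin-1 encoded; change `from` if the file
# was saved in a different encoding
headlines_utf8 <- iconv(headlines, from = "latin1", to = "UTF-8")

# Every entry should now be valid UTF-8
stopifnot(all(validUTF8(headlines_utf8)))

# The converted vector can then be passed to udpipe, for example:
# s <- udpipe_annotate(udmodel_english, x = headlines_utf8)
# x <- as.data.frame(s)
```

Checking with validUTF8() first shows which rows are affected before anything is changed. If the whole file is in a known encoding, passing fileEncoding to read.csv when loading it may be a simpler alternative to converting afterwards.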

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 user13818093