Sentiment analysis of non-English texts

I want to analyze the sentiment of texts written in German. I found many tutorials on how to do this for English, but none on how to apply it to other languages.

My idea is to use the TextBlob Python library to first translate the sentences into English and then run sentiment analysis on the translation, but I am not sure whether this is the best way to solve the task.

Are there any other ways to approach it?



Solution 1:[1]

There are now pre-trained sentiment classifiers for German text. Two open-source models are available on the Hugging Face Hub:

  1. oliverguhr/german-sentiment-bert
  2. bert-base-german-cased-sentiment-Germeval17
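
As a rough sketch, the first model could be loaded through the `pipeline` helper of the transformers library. The function names `load_classifier` and `classify` are hypothetical helpers introduced only for this illustration, and the example assumes that transformers (with a backend such as PyTorch) is installed and that the model can be downloaded from the Hub.

```python
from typing import Callable, List, Tuple

# Model identifier taken from the list above
MODEL_NAME = "oliverguhr/german-sentiment-bert"


def load_classifier(model_name: str = MODEL_NAME):
    """Build a text-classification pipeline for the given model.

    transformers is imported lazily so that classify() below can also be
    used with any other callable that returns the same result shape.
    """
    from transformers import pipeline

    return pipeline("text-classification", model=model_name)


def classify(texts: List[str], classifier: Callable) -> List[Tuple[str, str, float]]:
    """Run the classifier and pair each text with its label and score."""
    results = classifier(texts)  # a list of {"label": ..., "score": ...} dicts
    return [(text, r["label"], r["score"]) for text, r in zip(texts, results)]


# Example usage (downloads the model on first run):
# classifier = load_classifier()
# for text, label, score in classify(["Das Essen war ausgezeichnet."], classifier):
#     print(text, label, round(score, 3))
```

Keeping the model loading separate from the classification step makes it easy to swap in the second model, or any other classifier, without touching the rest of the code.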

Solution 2:[2]

A lot of progress has been made in sentiment analysis for non-English languages since you asked your question 6 years ago. Today there are very good models based on Hugging Face Transformers, fine-tuned for sentiment analysis in many languages. In my opinion, the best one for German is https://huggingface.co/oliverguhr/german-sentiment-bert

If you can't or don't want to run your own model, you can also use an API such as NLP Cloud, which I developed recently. I have added the German model above to it for sentiment analysis.

Non-English NLP is still far from perfect. Most datasets are available in English only, but the ecosystem is gradually making progress.

Solution 3:[3]

As an alternative to a trained classifier, you could use a sentiment lexicon of German subjective terms. It would be beneficial to read this paper [1]. The advantage of a lexicon-based model is that it doesn't require any training.
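
To illustrate the idea, here is a minimal sketch of lexicon-based scoring. The lexicon entries, their weights, and the one-word negation rule are made up for this example and are not taken from the paper or from any published resource; a real lexicon would be far larger.

```python
import re

# Toy lexicon with hypothetical polarity weights in [-1.0, 1.0]
LEXICON = {
    "gut": 1.0,
    "schön": 0.8,
    "hervorragend": 1.0,
    "schlecht": -1.0,
    "furchtbar": -1.0,
    "langweilig": -0.6,
}

# Words that flip the polarity of the term directly after them (a crude assumption)
NEGATORS = {"nicht", "kein", "keine", "nie"}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def lexicon_polarity(text):
    """Return the mean polarity of all lexicon terms found, or 0.0 if none occur."""
    tokens = tokenize(text)
    scores = []
    for i, token in enumerate(tokens):
        if token in LEXICON:
            value = LEXICON[token]
            # Flip the sign if the preceding token is a negator
            if i > 0 and tokens[i - 1] in NEGATORS:
                value = -value
            scores.append(value)
    return sum(scores) / len(scores) if scores else 0.0


print(lexicon_polarity("Der Film war nicht gut"))
```

No training data is needed, but the result is only as good as the lexicon's coverage and the handling of negation, intensifiers, and word forms.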

Another option is a hybrid model: feed the lexicon terms to the classifier as features and train it on a manually annotated data set.
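
One way the feature side of such a hybrid could look is sketched below. The word lists and the feature names (`lex_pos_count`, `lex_neg_count`) are hypothetical, chosen only for illustration; the classifier itself and the annotated training data are left out.

```python
import re
from collections import Counter

# Toy lexicon split by polarity (hypothetical entries)
POSITIVE = {"gut", "schön", "hervorragend"}
NEGATIVE = {"schlecht", "furchtbar", "langweilig"}


def extract_features(text):
    """Combine bag-of-words counts with lexicon hit counts in one feature dict."""
    tokens = re.findall(r"\w+", text.lower())
    # Ordinary bag-of-words features, one per distinct token
    features = Counter(f"word={token}" for token in tokens)
    # Lexicon-derived features: how many positive and negative terms occur
    features["lex_pos_count"] = sum(token in POSITIVE for token in tokens)
    features["lex_neg_count"] = sum(token in NEGATIVE for token in tokens)
    return dict(features)


print(extract_features("Der Film war gut, aber langweilig"))
```

Feature dicts like these could then be vectorized (for example with scikit-learn's DictVectorizer) and passed to any standard classifier together with the annotated labels.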

Solution 4:[4]

There's also a dedicated German extension of TextBlob, textblob-de, which is under active development: https://textblob-de.readthedocs.io/en/latest/

Example:

from textblob_de import TextBlobDE as TextBlob

# German sentence to analyze
doc = "Es gibt kein richtiges Leben im falschen."
blob = TextBlob(doc)

# Polarity ranges from -1.0 (negative) to 1.0 (positive)
print(blob.sentiment)
# Sentiment(polarity=-1.0, subjectivity=0.0)

As of February 2022, there is still no subjectivity score available, and certain features (such as .translate()) don't work. However, .noun_phrases and .tags work very well.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Neelisha SAXENA
Solution 2 Julien Salinas
Solution 3 modarwish
Solution 4 MERose