'index out of range in self' -> Applying pre-trained model on pandas dataframe
I'm trying to apply sentiment analysis to a pandas dataframe of tweets, and I'm getting this error:
IndexError: index out of range in self.
Sample dataset: https://drive.google.com/file/d/14GuN3krdNhGDQCLShn3I6FJG-b5Zt02Z/view?usp=sharing
What I'm trying:
import pandas as pd
from tqdm import tqdm
from transformers import pipeline

tqdm.pandas()  # registers DataFrame.progress_apply

sample = pd.read_csv('sample.csv')
model_name = 'finiteautomata/bertweet-base-sentiment-analysis'
classifier = pipeline('sentiment-analysis', model=model_name)

def sentiment_analysis(row):
    # classify one tweet and return its label and score
    r = classifier(row.text)[0]
    return [r['label'], r['score']]

sample.progress_apply(sentiment_analysis, axis=1)
Some tweets raise the error IndexError: index out of range in self, and I'm not really sure why.
This time it happened on tweet 78.
Solution 1:[1]
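You need to truncate the input by passing truncation=True to the classifier call. Without it, a tweet that tokenizes to more tokens than the model's maximum sequence length is passed through at full length, which is most likely what triggers the IndexError.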
def sentiment_analysis(row):
    r = classifier(row.text, truncation=True)[0]
    return [r['label'], r['score']]
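To see which tweets fail and to check that truncation=True fixes them, it can help to run the classifier over a plain list of texts and record each failure instead of stopping at the first one. The following is only a minimal sketch: classify_texts and fake_classifier are hypothetical names, and fake_classifier is a toy stand-in (with a made-up 5-token limit and a made-up "NEU" result) so the example runs without downloading the model. It is not how the real pipeline works internally.

```python
def classify_texts(texts, classifier, **pipeline_kwargs):
    """Classify each text; return (results, failures).

    results  -- one (label, score) tuple per text, (None, None) on failure
    failures -- (position, error message) for every text that raised IndexError
    """
    results = []
    failures = []
    for position, text in enumerate(texts):
        try:
            prediction = classifier(text, **pipeline_kwargs)[0]
            results.append((prediction["label"], prediction["score"]))
        except IndexError as error:
            results.append((None, None))
            failures.append((position, str(error)))
    return results, failures


def fake_classifier(text, truncation=False, max_tokens=5):
    """Hypothetical stand-in for the real pipeline (assumption, not the library)."""
    tokens = text.split()
    if len(tokens) > max_tokens:
        if not truncation:
            # mimic the failure reported in the question for overlong input
            raise IndexError("index out of range in self")
        tokens = tokens[:max_tokens]
    return [{"label": "NEU", "score": 1.0}]


texts = [
    "short tweet",
    "a much longer tweet that exceeds the toy token limit",
]

# Without truncation the overlong text is reported as a failure.
results_plain, failures_plain = classify_texts(texts, fake_classifier)

# With truncation=True every text gets a prediction.
results_truncated, failures_truncated = classify_texts(
    texts, fake_classifier, truncation=True
)
```

With the real pipeline from the question, the call would be classify_texts(sample['text'].tolist(), classifier, truncation=True), and the results can be turned back into columns with pd.DataFrame(results, columns=['label', 'score'], index=sample.index).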
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | noob |

