ValueError when fitting a logistic regression on SentenceTransformer output (embeddings)

My code:

model = SentenceTransformer('hiiamsid/sentence_similarity_spanish_es')

I apply the model to the text column of the data frame:

prueba['encoder'] = prueba.texto.apply(lambda x: model.encode(x))

Then I fit a logistic regression on the encoder column and the label column:

clf = LogisticRegression(random_state=0).fit(prueba.encoder, prueba.label)

And I get this error:

ValueError: setting an array element with a sequence.



Solution 1:[1]

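If your setup is what I'm guessing it is, the only problem is that LogisticRegression is being given a pandas Series whose elements are NumPy arrays. scikit-learn expects the features as a 2-D array-like of shape (n_samples, n_features), and a Series of arrays is a 1-D object column that cannot be converted into one, which is what triggers the ValueError. Passing plain lists (or a NumPy array) instead fixes it: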

import pandas as pd
from sklearn.linear_model import LogisticRegression
from sentence_transformers import SentenceTransformer

# Toy data frame standing in for the real one
prueba = pd.DataFrame({'texto': ['Spanish foo bar', 'Spanish Bar Foo'], 'label': [0, 1]})
model = SentenceTransformer('hiiamsid/sentence_similarity_spanish_es')
# Each row of 'encoder' holds one embedding vector (a NumPy array)
prueba['encoder'] = prueba.texto.apply(lambda x: model.encode(x))
# Convert the Series to lists so scikit-learn can build a 2-D feature matrix
clf = LogisticRegression(random_state=0).fit(prueba.encoder.to_list(), prueba.label.to_list())
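
To see why the Series fails and the list works, here is a minimal sketch that avoids downloading the model. The function fake_encode and the random 8-dimensional vectors are hypothetical stand-ins for model.encode and real sentence embeddings; only the shapes matter here.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def fake_encode(text):
    # Hypothetical stand-in for model.encode(text): one fixed-size vector per text.
    return rng.normal(size=8)

prueba = pd.DataFrame({"texto": ["a", "b", "c", "d"], "label": [0, 1, 0, 1]})
prueba["encoder"] = prueba.texto.apply(fake_encode)

# The column is a 1-D Series of Python objects, not a numeric matrix.
print(prueba.encoder.dtype, prueba.encoder.shape)

# Forcing it into a float array fails in the same way the original fit() call did.
failed = False
try:
    np.asarray(prueba.encoder, dtype=float)
except ValueError as err:
    failed = True
    print("conversion failed:", err)

# Stacking the per-row vectors yields a proper (n_samples, n_features) array.
X = np.vstack(prueba.encoder.to_list())
y = prueba.label.to_numpy()
print(X.shape)

clf = LogisticRegression(random_state=0).fit(X, y)
```

With the real model, encode also accepts a list of sentences and returns a 2-D array, so something like model.encode(prueba.texto.to_list()) should give the feature matrix directly without going through a data frame column.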

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 meti