TfidfVectorizer: vectorizing words to train a model
I'm preparing data for text classification models and I use TfidfVectorizer to vectorize the text. I have 10k sentences and a vocabulary of 47k words, so every sentence becomes a vector of 47k TF-IDF values. The result I get and feed to the model looks like this: [0.012300 0.001234 0.000000 0.000000 0.000000...][0.012300 0.001234 0.000000 0.000000 0.000000...][0.012300 0.001234 0.000000 0.000000 0.000000...] The question is: is my vector data fine as it is, or should the values be separated by commas, like this: [0.012300, 0.001234, 0.000000, 0.000000, 0.000000...]?
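A minimal sketch of what is probably going on, assuming the output above comes from printing a NumPy array: NumPy prints its values without commas, while a plain Python list prints them with commas, but the numbers are the same. The toy corpus, the labels, and the choice of `LogisticRegression` are hypothetical stand-ins for the real data and model.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Hypothetical toy corpus and labels, standing in for the 10k sentences.
sentences = [
    "the cat sat on the mat",
    "dogs chase the cat",
    "stocks fell on monday",
    "the market rallied on friday",
]
labels = [0, 0, 1, 1]

vectorizer = TfidfVectorizer()
# fit_transform returns a SciPy sparse matrix of shape (n_sentences, n_vocabulary).
X = vectorizer.fit_transform(sentences)

dense = X.toarray()              # dense NumPy array, only needed here for printing
row_as_array = dense[0]          # NumPy array
row_as_list = dense[0].tolist()  # plain Python list

print(row_as_array)  # NumPy display style: values separated by spaces
print(row_as_list)   # list display style: values separated by commas

# Both hold the same numbers; only the printed representation differs.
same_values = np.allclose(row_as_array, np.array(row_as_list))

# scikit-learn estimators accept the sparse matrix directly.
model = LogisticRegression()
model.fit(X, labels)
predictions = model.predict(X)
```

If that assumption holds, the missing commas are only a display detail and no conversion is needed before training. With 10k rows and 47k columns it is usually better to keep the matrix sparse than to call `toarray()`, since most entries are zero.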
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow