How to loop through millions of rows of column data and do a task without it being slow?
```python
import pandas as pd
import spacy

nlp = spacy.load("en_core_web_sm")

token_pos = []
content_1 = rel_data["content"]
df_tags = content_1.values
for values in df_tags:
    doc = nlp(values)
    for token in doc:
        token_pos.append(token.pos_)
    # count each tag seen so far and print it as its own DataFrame
    d = {x: token_pos.count(x) for x in token_pos}
    df = pd.DataFrame([d])
    print(df)
```
This is my code, for context: content_1 holds paragraph-length text, and I want to loop through millions of these rows while counting how many of each part-of-speech tag they contain and putting the counts in a DataFrame. The code above works, but it puts each result into a separate DataFrame and it is also very slow. I need something fast that collects them all into one big DataFrame.
I am stuck :(
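One way to speed this up (a sketch, not the asker's original code): count tags per document with `collections.Counter` instead of repeated `list.count`, and build a single DataFrame once at the end instead of one per row. The helper below accepts any iterable of parsed documents whose tokens expose a `.pos_` attribute, so it works with spaCy `Doc` objects from `nlp.pipe`; the model name `en_core_web_sm` in the usage comment is an assumption.

```python
from collections import Counter

import pandas as pd


def pos_count_table(docs):
    """Build one DataFrame of POS-tag counts, one row per document.

    `docs` is any iterable of parsed documents whose tokens expose a
    `.pos_` attribute (for example spaCy Doc objects from `nlp.pipe`).
    """
    # Counter over each document's tags: O(tokens) per row instead of
    # the O(tokens^2) repeated list.count() in the original loop.
    rows = [Counter(token.pos_ for token in doc) for doc in docs]
    # One DataFrame built once at the end; tags missing from a row become 0.
    return pd.DataFrame(rows).fillna(0).astype(int)


# Hedged usage sketch with spaCy (assumes a model such as "en_core_web_sm"
# is installed; let nlp.pipe batch the texts instead of calling nlp() per row,
# and disable pipeline components you do not need):
#
#   import spacy
#   nlp = spacy.load("en_core_web_sm", disable=["parser", "ner", "lemmatizer"])
#   df = pos_count_table(nlp.pipe(rel_data["content"], batch_size=1000))
```

The two big wins are batching the texts through `nlp.pipe` (much faster than calling `nlp()` once per row) and deferring DataFrame construction to a single `pd.DataFrame(rows)` call, since appending to or printing a DataFrame inside the loop is expensive.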
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
