Runtime error when running an LDA model of Gensim on Windows
Okay, so I know this is a Windows error and not a Gensim error. Based on previous examples on the internet and other comments/solutions from Stack Overflow, I came up with the code below. However, the code never reaches the line that prints the coherence score. My setup: Windows 10, Visual Studio Code, Python 3.8.13.
My question: any idea how to fix this, or what am I doing wrong?
from multiprocessing import Process, freeze_support
import re

from sklearn.datasets import fetch_20newsgroups
from gensim.models.coherencemodel import CoherenceModel
from gensim.corpora.dictionary import Dictionary


def main():
    print("start main")
    texts, _ = fetch_20newsgroups(subset='all', remove=('headers', 'footers', 'quotes'), return_X_y=True)
    # Raw string so that \w reaches the regex engine unchanged
    tokenizer = lambda s: re.findall(r'\w+', s.lower())
    texts = [tokenizer(t) for t in texts]

    # Creating some random topics
    topics = [['space', 'planet', 'mars', 'galaxy'],
              ['cold', 'medicine', 'doctor', 'health', 'water'],
              ['cats', 'health', 'keyboard', 'car', 'banana'],
              ['windows', 'mac', 'computer', 'operating', 'system']]

    # Creating a dictionary with the vocabulary
    word2id = Dictionary(texts)

    # Coherence model
    cm = CoherenceModel(topics=topics, texts=texts, coherence='c_v', dictionary=word2id)
    # Note: get_coherence() returns one aggregated score, not a score per topic
    coherence_per_topic = cm.get_coherence()
    print("coherence", coherence_per_topic)


if __name__ == '__main__':
    freeze_support()
    Process(target=main).start()
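For context on why the `if __name__ == '__main__':` guard at the bottom of the script matters: Windows starts new processes with the spawn method, so each child re-imports the main module, and any process-starting code outside the guard would run again in every child. The following is a minimal, standard-library-only sketch of that pattern, not a confirmed fix for the script above. The names `tokenize` and `tokenize_all` and the sample documents are hypothetical and exist only for illustration.

```python
import re
from multiprocessing import Pool, freeze_support

TOKEN_RE = re.compile(r"\w+")


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return TOKEN_RE.findall(text.lower())


def tokenize_all(texts, processes=1):
    """Tokenize every document (hypothetical helper).

    With processes == 1 no worker processes are started at all,
    which makes it easy to rule multiprocessing in or out as the
    cause of a stall.
    """
    if processes == 1:
        return [tokenize(t) for t in texts]
    with Pool(processes) as pool:
        return pool.map(tokenize, texts)


def main():
    # Made-up sample documents, only for illustration
    docs = ["Mars is a planet.", "Windows is an operating system."]
    print(tokenize_all(docs, processes=2))


if __name__ == "__main__":
    # freeze_support() only has an effect in frozen Windows executables
    freeze_support()
    # Everything that starts processes is reached only from inside the guard
    main()
```

Applied to the script above, two things might be worth trying, under the assumption that the stall comes from nested worker processes: call `main()` directly inside the guard instead of wrapping it in `Process(target=main).start()`, and pass `processes=1` to `CoherenceModel`, which accepts that argument, to check whether the coherence score is printed when no workers are spawned.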
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
