Singular value decomposition (SVD) of a TF-IDF DataFrame using MapReduce
Here is my DataFrame. I used np.linalg.svd, but it doesn't use MapReduce, so I want a function similar to np.linalg.svd that does.
# there are 5 documents
print(tfidf)
A portion of the output:
[{'poorly': 0.03095072908527116, 'respect': 0.0, 'got': 0.0, 'interpretation': 0.0, 'pretty': 0.0, 'regular': 0.03095072908527116, 'issues': 0.03095072908527116, 'glad': 0.0, 'lunar': 0.06190145817054232, 'one': 0.0, 'complex': 0.0, 'rockets': 0.0, 'might': 0.0, 'possible': 0.03095072908527116, 'ritual': 0.0, 'luck': 0.0, 'quite': 0.03095072908527116, 'crash': 0.03095072908527116, 'play': 0.0, 'least': 0.0, 'contest': 0.0, 'fighters': 0.0, 'corps': 0.0, 'result': 0.0, 'low': 0.017620975612964523, 'would': 0.0, 'flying': 0.0, 'missions': 0.03095072908527116...}]
Then I build the matrix that I pass to np.linalg.svd:
X = pd.DataFrame(tfidf).T
Output:
0 1 2 3 4
poorly 0.030951 0.000000 0.000000 0.000000 0.000000
respect 0.000000 0.000000 0.025547 0.000000 0.000000
got 0.000000 0.050905 0.000000 0.000000 0.029558
interpretation 0.000000 0.000000 0.000000 0.000000 0.051917
pretty 0.000000 0.000000 0.000000 0.000000 0.051917
... ... ... ... ... ...
g 0.000000 0.000000 0.051093 0.000000 0.000000
due 0.030951 0.000000 0.000000 0.000000 0.000000
development 0.000000 0.000000 0.051093 0.000000 0.000000
...
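The np.linalg.svd call itself is not shown above. A minimal sketch of what that step could look like, on a made-up three-document stand-in for tfidf (the names tfidf_toy, X_toy, U, s and Vt are assumptions, not from the original code):

```python
import numpy as np
import pandas as pd

# Toy stand-in for the real tfidf list (one dict per document); values are made up.
tfidf_toy = [
    {"lunar": 0.06, "crash": 0.03, "got": 0.0},
    {"lunar": 0.0, "crash": 0.0, "got": 0.05},
    {"lunar": 0.01, "crash": 0.02, "got": 0.03},
]

# Rows are terms, columns are documents, as in the question.
X_toy = pd.DataFrame(tfidf_toy).T

# Thin SVD: X_toy = U @ diag(s) @ Vt
U, s, Vt = np.linalg.svd(X_toy.to_numpy(), full_matrices=False)

print(U.shape, s.shape, Vt.shape)
```

With full_matrices=False, U keeps one row per term and only as many columns as there are singular values, which is the shape a MapReduce version would need to reproduce.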
Now my goal is to do the same thing, but with a function that uses MapReduce.
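One possible approach, sketched here under assumptions rather than taken from any library: because the matrix is tall and skinny (many terms, only 5 documents), the Gram matrix AᵀA is only 5 x 5. It can be accumulated with a map step (one outer product per term row) and a reduce step (summing), then decomposed locally. The sketch below simulates both jobs with Python's built-in map and functools.reduce; mapreduce_svd, map_gram and reduce_sum are hypothetical names, and on a real cluster the same two functions would be handed to the framework's mapper and reducer.

```python
from functools import reduce
import numpy as np

def map_gram(row):
    """Map step: one term row -> its contribution to the Gram matrix A^T A."""
    return np.outer(row, row)

def reduce_sum(acc, part):
    """Reduce step: add two partial Gram matrices."""
    return acc + part

def mapreduce_svd(rows, tol=1e-12):
    """Thin SVD of a tall matrix given as an iterable of rows (hypothetical helper)."""
    rows = [np.asarray(r, dtype=float) for r in rows]

    # Job 1 (map + reduce): G = A^T A, size (n_docs x n_docs).
    G = reduce(reduce_sum, map(map_gram, rows))

    # Local step: G is symmetric, so eigh gives V and the squared singular values.
    eigvals, V = np.linalg.eigh(G)
    order = np.argsort(eigvals)[::-1]          # largest first, like np.linalg.svd
    eigvals, V = eigvals[order], V[:, order]

    # Drop directions whose eigenvalue is numerically zero.
    keep = eigvals > tol * eigvals[0]
    s = np.sqrt(eigvals[keep])
    V = V[:, keep]

    # Job 2 (map only): each row of U is row @ V / s.
    U = np.array(list(map(lambda r: r @ V / s, rows)))
    return U, s, V.T

# Usage on a random 50 x 5 matrix standing in for X.to_numpy()
rng = np.random.default_rng(0)
A = rng.random((50, 5))
U, s, Vt = mapreduce_svd(A)
print(np.allclose(U @ np.diag(s) @ Vt, A))
```

Forming AᵀA squares the condition number, so small singular values come out less accurate than with np.linalg.svd, and the signs of the singular vectors may differ. This only scales when the number of columns (documents) is small enough for the Gram matrix to fit on one machine.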
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
