Python - Mrjob inlink count / MapReduce
I would like to count the number of times a word appears in a line of my document using mrjob and MapReduce.
So far I have only managed to count the number of times a song appears in my document, but I don't know how to count words within a line (the "inlink" count).
%%file wordcountX.py
from mrjob.job import MRJob  # import the mrjob library

class WordCount(MRJob):
    # create a mapper()
    def mapper(self, _, line):
        for word in line.strip().split(' '):
            if len(word) > 0:
                yield (word, 1)

    # create a reducer()
    def reducer(self, word, count):
        yield (word, sum(count))

if __name__ == "__main__":
    WordCount.run()

! python wordcountX.py songplays.txt
Do you know how to do that?
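One possible reading of the question is a per-line word count, where the key is the pair (line, word) instead of the word alone. The sketch below illustrates that idea in plain Python so it can be run without mrjob. The `run_local` helper is hypothetical and only stands in for the shuffle step that the framework would normally perform. The `mapper` and `reducer` bodies are written in the same generator style as the job above, so they could be moved into an `MRJob` subclass by adding `self` as the first parameter. Using the line text as part of the key is an assumption: mrjob does not pass a line number to the mapper for raw text input, so identical lines end up merged under one key.

```python
from collections import Counter, defaultdict


def mapper(_, line):
    """Emit ((line, word), n) for each distinct word in one input line."""
    text = line.strip()
    # str.split() with no argument drops empty strings from repeated spaces
    for word, n in Counter(text.split()).items():
        yield (text, word), n


def reducer(key, counts):
    """Sum the partial counts collected for one (line, word) key."""
    yield key, sum(counts)


def run_local(lines):
    """Hypothetical in-memory stand-in for the map, shuffle, reduce cycle."""
    grouped = defaultdict(list)
    for line in lines:
        for key, value in mapper(None, line):
            grouped[key].append(value)

    results = {}
    for key, values in grouped.items():
        for out_key, total in reducer(key, values):
            results[out_key] = total
    return results


if __name__ == "__main__":
    for (line, word), total in sorted(run_local(["a b a", "b c"]).items()):
        print(repr(line), word, total)
```

If a global count per word is wanted instead, yielding `(word, n)` from the mapper gives the same result as the original job while emitting fewer records per line.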
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
