MapReduce with Text File - not sure how to evaluate
I have two text files: one contains the document text and its metadata, and the other is a dictionary of scores. Both refer to the same document ID. I want to write a MapReduce script that runs some form of analysis on them, but I am not sure which analysis would be useful, so I am asking stats experts for ideas on what they would do. The data has parameters such as Facebook likes and shares, word counts, and the LIWC and Empath scores for each document, which are based on text signals. Maybe the top 10 scoring Empath words and the top 10 LIWC scores per document would be helpful. One sample record from each file:
[{"doc_id":"A01","URL":"https://website.com/document-file.html","website":"website.com","seeds":"michael.jackson.death","date":"2016-12-30","subcorpus":"conspiracy","title":"Title goes here","txt":"many lines of text \r\n\r\n More text.","txt_nwords":4075,"txt_nsentences":160,"txt_nparagraphs":120,"topic_k100":"k100_24","topic_k200":"k200_75","topic_k300":"k300_192","mention_conspiracy":8,"conspiracy_representative":false,"cosine_similarity":0.1768,"FB_shares":13,"FB_comments":2,"FB_reactions":4}]
{"doc_id":"A01","LIWC_WC":4051,"LIWC_Analytic":88.89,"LIWC_Clout":76.97,"LIWC_Authentic":13.98,"LIWC_Tone":5.22,"LIWC_WPS":22.26,"LIWC_Sixltr":27.18,"Empath_weapon":0,"Empath_children":0,"Empath_monster":0,"Empath_ocean":0,"Empath_giving":0,"Empath_contentment":0,"Empath_writing":0.0297,"Empath_rural":0.0011,"Empath_positive_emotion":0.0011,"Empath_musical":0.0191}
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
