'how can I scrap data from reddit (in bash)
I want to scrap titles and date from http://www.reddit.com/r/movies.json in bash
wget -q -O - "http://www.reddit.com/r/movies.json" | grep -Po '(?<="title": ").*?(?=",)' | sed 's/\"/"/'
I have titles but I don't know how to add dates, can someone help?
Solution 1:[1]
wget -q -O - "http://www.reddit.com/r/movies.json" | grep -Po '(?<="title": ").*?(?=",)' | sed 's/"/"/'
As extension suggest it is JSON (application/json) file, therefore grep and sed are poorly suited for working with it, as they are mainly for using regular expressions. If you are allowed to install tools, jq should be handy here. Try using your system package manager to install it, if it succeed you should get pretty printed version of movies.json by doing
wget -q -O - "http://www.reddit.com/r/movies.json" | jq
and then find where interesting values are placed which should allow you to grab it. See jq Cheat Sheet for example of jq usage. If you are limited to already installed tools I suggest taking look at json module of python.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Daweo |
