Python Requests: take lines from a TXT file one at a time, request each, and save the results to a new TXT file
This code fetches a list of URLs from web.archive.org and saves them to a new TXT file. Instead of prompting for a single URL with input(), I want to load a batch of URLs from a TXT file, so x = input('URL:') must be replaced with code that reads each line of a TXT file in turn.
I've been trying for a few days now and I'm stuck! Please help!
The code:
import requests

x = input('Enter your url:-')
r = requests.get('http://web.archive.org/cdx/search/cdx?url=*.{}&output=text&fl=original&collapse=urlkey'.format(x))
with open('url.txt', 'a') as f:
    f.write('\n')
    f.writelines(str(r.text))
    f.write('\n')
Solution 1:[1]
To read URLs from a file you can use the following example:
import requests

# collect non-empty, whitespace-stripped lines from the input file
urls = []
with open("something.txt", "r") as f_in:
    for line in map(str.strip, f_in):
        if line == "":
            continue
        urls.append(line)

archive_url = "http://web.archive.org/cdx/search/cdx?url=*.{}&output=text&fl=original&collapse=urlkey"

# query the CDX API for each domain and write the responses to output.txt
with open("output.txt", "w") as f_out:
    for url in urls:
        print(url)
        r = requests.get(archive_url.format(url))
        print(r.text, file=f_out)
        print("\n", file=f_out)
something.txt contains domains, for example:
google.com
yahoo.com
output.txt contains the responses from the requests.
Solution 2:[2]
First, put all the URLs in a urls.txt file, one per line, then open it and call readlines(), which returns the list of all lines. Note that readlines() keeps the trailing newline on each line, so strip it before formatting the request URL. Here is the complete code:
import requests

with open('urls.txt') as file:
    # get the list of urls (each still ends with '\n')
    urls_list = file.readlines()

for x in urls_list:
    x = x.strip()  # remove the trailing newline before building the URL
    r = requests.get('http://web.archive.org/cdx/search/cdx?url=*.{}&output=text&fl=original&collapse=urlkey'.format(x))
    print(r.status_code)
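A quick way to see why the stripping matters: readlines() keeps the trailing newline on each line, and that newline would otherwise end up inside the formatted CDX query URL. A minimal sketch (the sample file name and contents are made up for illustration):

```python
# create a hypothetical sample file with two domains
with open("urls.txt", "w") as f:
    f.write("google.com\nyahoo.com\n")

with open("urls.txt") as f:
    raw = f.readlines()

print(raw)      # each entry still carries its '\n'

# strip whitespace and drop any blank lines before making requests
cleaned = [line.strip() for line in raw if line.strip()]
print(cleaned)  # clean domain strings, safe to format into the URL
```
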
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
