Repeat a Python function on its own output

I made a function that scrapes the last 64 characters of text from a website and adds it to url1, resulting in new_url. I want to repeat the process by scraping the last 64 characters from the resulting URL (new_url) and adding it to url1 again. The goal is to repeat this until I hit a website where the last 3 characters are "END".

Here is my code so far:

from urllib import request

#function
def getlink(url):
    url1 = 'https://www.random.computer/api.php?file='
    req=request.urlopen(url)
    link = req.read().splitlines()

    for i,line in enumerate(link):
        text = line.decode('utf-8')
    
    last64= text[-64:]
    new_url= url1+last64
  
    return new_url



getlink('https://www.random/api.php?file=abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyz012345678910')
#output
'https://www.random/api.php?file=zyxwvutsrqponmlkjihgfedcba012345678910abcdefghijklmnopqrstuvwxyz'

My trouble is figuring out a way to be able to repeat the function on its output. Any help would be appreciated!



Solution 1:[1]

A simple loop should work. I've removed the first token as it may be sensitive information. Just replace the WRITE_YOUR_FIRST_TOKEN_HERE string with the token for the first link.

from urllib import request


def get_chunk(chunk, url='https://www.uchicago.computer/api.php?file='):
    with request.urlopen(url + chunk) as f:
        return f.read().decode('UTF-8').strip()


if __name__ == '__main__':
    chunk = 'WRITE_YOUR_FIRST_TOKEN_HERE'
    while chunk[-3:] != "END":
        chunk = get_chunk(chunk[-64:])
        print(chunk)
        # Chunk is a string, do whatever you want with it,
        # like chunk.splitlines() to get a list of the lines

read() gets the byte stream, decode() turns it into a string, and strip() removes leading and trailing whitespace (like \n) so that it doesn't interfere with the last 64 characters. Without strip(), a trailing \n would take up one of the 64 positions and you would only get 63 characters of the token.
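
To see the effect of strip() without touching the network, here is a small offline check. The token and page body are made-up sample values, not real responses from the API.

```python
# Made-up sample data: a 64-character token followed by the newline
# that a response body often ends with.
token = "a" * 61 + "END"           # hypothetical 64-character token
body = token + "\n"

without_strip = body[-64:]          # the newline takes one of the 64 slots
with_strip = body.strip()[-64:]     # whitespace is removed before slicing

print(len(without_strip.strip()))   # 63: one token character was lost
print(len(with_strip))              # 64: the full token
print(body[-3:] == "END")           # False: the raw body ends with "ND\n"
print(with_strip[-3:] == "END")     # True
```

The same applies to the "END" check in the loop: it only works reliably on the stripped text.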

Solution 2:[2]

Try the code below. It should do what you describe.

import requests
from bs4 import BeautifulSoup

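def getlink(url):
    url1 = 'https://www.uchicago.computer/api.php?file='
    # The question fetches pages with a plain GET, so use get() rather than post()
    response = requests.get(url)
    doc = BeautifulSoup(response.text, 'html.parser')
    # str(doc) gives the parsed document as a string; strip() drops a trailing
    # newline so the slice below holds exactly the last 64 characters
    text = str(doc).strip()
    last64 = text[-64:]
    new_url = url1 + last64

    return new_url

def caller(url):
    # Recurse until the scraped token, and therefore the new URL, ends in 'END'
    url = getlink(url)
    if url[-3:] != 'END':
        print(url)
        caller(url)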
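
The recursive caller adds one stack frame per link, so a long chain could hit Python's recursion limit. Below is a rough iterative sketch of the same idea. The names fetch_text and follow_chain, the fetch parameter, and the max_steps cap are my own assumptions, added so the loop can be tried offline and cannot run forever; the base URL is the one used in the answers.

```python
from urllib import request

BASE_URL = 'https://www.uchicago.computer/api.php?file='  # base URL from the answers


def fetch_text(url):
    """Download a page and return its text without surrounding whitespace."""
    with request.urlopen(url) as f:
        return f.read().decode('utf-8').strip()


def follow_chain(first_token, fetch=fetch_text, max_steps=1000):
    """Follow tokens until a page ends in 'END'; return the visited URLs.

    fetch and max_steps are hypothetical additions: fetch lets the loop run
    against fake pages, max_steps stops a chain that never reaches 'END'.
    """
    visited = []
    token = first_token
    for _ in range(max_steps):
        url = BASE_URL + token
        visited.append(url)
        text = fetch(url)
        if text.endswith('END'):
            return visited
        token = text[-64:]  # the last 64 characters name the next page
    raise RuntimeError('no page ending in END within max_steps requests')


# Offline usage: fake pages stand in for real responses.
fake_pages = {
    BASE_URL + 'first': 'some text ' + 'b' * 64,
    BASE_URL + 'b' * 64: 'final page END',
}
for visited_url in follow_chain('first', fetch=fake_pages.__getitem__):
    print(visited_url)
```

With the default fetch, follow_chain('WRITE_YOUR_FIRST_TOKEN_HERE') would make real requests, so put your own first token there.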

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2 chirag aggarwal