How do I download and extract a list of papers in LaTeX format from arXiv?

I have a list of papers (as arXiv links or arXiv file names) whose LaTeX source I'd like to download from arXiv. How can I do this in Python?

If we go to this page: https://arxiv.org/format/2010.11645

We can read the following text:

Source: Delivered as a gzipped tar (.tar.gz) file if there are multiple files, otherwise as a PDF file, or a gzipped TeX, DVI, PostScript or HTML (.gz, .dvi.gz, .ps.gz or .html.gz) file depending on submission format. [ Download source ]

We can download the file by clicking on [ Download source ], but I have no idea what type of file I'm getting back. The filename is simply 2010.11645, with no extension.
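One way to find out what came back is to look at the bytes rather than the filename. The following is only a rough sketch, assuming the payload is one of the two LaTeX cases from the format description above: a gzipped tar with several files, or a single gzipped TeX file. The function name `extract_tex_sources` and the placeholder name `main.tex` are made up for illustration, and a single gzipped DVI, PostScript or HTML file would be wrongly treated as TeX here.

```python
import gzip
import io
import tarfile


def extract_tex_sources(payload: bytes) -> dict[str, str]:
    """Return {filename: LaTeX text} for a downloaded source payload.

    Anything that is not gzip-compressed (for example a PDF-only
    submission) yields an empty dict.
    """
    # Every gzip stream starts with the two magic bytes 0x1f 0x8b.
    if payload[:2] != b"\x1f\x8b":
        return {}

    inner = gzip.decompress(payload)

    try:
        # Mode "r:" reads an uncompressed tar; gzip was already undone above.
        with tarfile.open(fileobj=io.BytesIO(inner), mode="r:") as archive:
            sources = {}
            for member in archive.getmembers():
                if member.isfile() and member.name.endswith(".tex"):
                    data = archive.extractfile(member).read()
                    sources[member.name] = data.decode("utf-8", errors="replace")
            return sources
    except tarfile.ReadError:
        # Not a tar archive. ASSUMPTION: it is a single gzipped TeX file.
        # "main.tex" is a hypothetical name; the payload does not carry one.
        return {"main.tex": inner.decode("utf-8", errors="replace")}
```

Decoding with `errors="replace"` is a deliberate shortcut: older submissions are not always UTF-8, and this keeps the sketch from failing on them at the cost of a few replaced characters.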

I'd like to download the file in LaTeX format (which I believe is .tex) and then convert it into .txt using pandoc. I believe I'd need to download the files via requests somehow?
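To make the idea concrete, here is a minimal sketch of the download and conversion steps. It rests on assumptions that are not stated on the page quoted above: that the [ Download source ] link resolves to `https://arxiv.org/e-print/<id>`, that the IDs are new-style ones such as 2010.11645 (old-style IDs like `hep-th/9901001` are not handled), and that `pandoc` is installed and on the PATH. It uses the standard library's `urllib.request` instead of requests to stay dependency-free; `requests.get(url).content` should give the same bytes. All function names and the User-Agent string are made up for illustration.

```python
import re
import subprocess
import urllib.request

# New-style arXiv identifier, optionally followed by a version such as "v2".
ARXIV_ID = re.compile(r"\d{4}\.\d{4,5}(?:v\d+)?")


def arxiv_id(link_or_name: str) -> str:
    """Pull the identifier out of an arXiv link or a bare file name."""
    match = ARXIV_ID.search(link_or_name)
    if match is None:
        raise ValueError(f"no arXiv identifier found in {link_or_name!r}")
    return match.group(0)


def source_url(link_or_name: str) -> str:
    # ASSUMPTION: the source download lives under /e-print/<id>.
    return f"https://arxiv.org/e-print/{arxiv_id(link_or_name)}"


def download_source(link_or_name: str, timeout: float = 30.0) -> bytes:
    """Fetch the raw source payload (still compressed) as bytes."""
    request = urllib.request.Request(
        source_url(link_or_name),
        headers={"User-Agent": "latex-fetch-sketch/0.1"},  # hypothetical name
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()


def tex_to_txt(tex_path: str, txt_path: str) -> None:
    """Convert one .tex file to plain text by calling the pandoc CLI."""
    subprocess.run(
        ["pandoc", "-f", "latex", "-t", "plain", tex_path, "-o", txt_path],
        check=True,  # raise CalledProcessError if pandoc exits non-zero
    )
```

For a list of papers, something like `for link in links: payload = download_source(link)` would work, ideally with a pause between requests so the server is not hammered. The returned bytes are still compressed, so they need to be unpacked into a .tex file before `tex_to_txt` can be called on it.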

How can I do this? Thanks!



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
