Downloading archives from a virtual machine takes too much time to start

I wrote a program in Python which downloads a .tar.gz archive from a virtual machine. The download works fine, but the problem is that it takes too much time to start. Here's the part of my code that handles the downloading:

import os
from wsgiref.util import request_uri
from wsgiref.simple_server import make_server

def download(self, dirr):
    file_url = os.environ['location'] + dirr
    headers = [('Content-Description', 'File Transfer'),
               ('Content-Type', 'application/octet-stream'),
               ('Content-Disposition', 'attachment; filename="' + os.path.basename(file_url) + '"'),
               ('Expires', '0'),
               ('Cache-Control', 'must-revalidate'),
               ('Pragma', 'public'),
               ('Content-Length', str(os.stat(file_url).st_size))]

    file_download = open(file_url, 'rb')
    return headers, file_download.read()

def server_app(environ, start_response):
    crt_handler = handler(request_uri(environ))
    headers, response_body = crt_handler.get()  # this calls my download function, which is part of a class
    status = '200 OK'
    start_response(status, headers)
    return [response_body]

def start_server():
    httpd = make_server("", PORT, server_app)
    httpd.serve_forever()

Sorry if some things in my code don't make sense; I've pasted only the part that does the downloading, and the program does much more. Is it possible to make the downloads start faster?

Solution 1: [1]

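Calling file_download.read() is the likely culprit: it reads the entire archive into memory before the first byte is sent to the client, so the download cannot start until the whole file has been loaded. Most HTTP libraries support streaming from a file object instead.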

If make_server is wsgiref.simple_server.make_server, the following might work, but I cannot test it. The idea is to not call read() on the file directly but to wrap the file in wsgiref.util.FileWrapper, which turns it into an iterator of blocks, and then return that iterator from the app function.

import os
from wsgiref.util import FileWrapper, request_uri
from wsgiref.simple_server import make_server

def download(self, dirr):
    file_url = os.environ['location'] + dirr
    headers = [('Content-Description', 'File Transfer'),
               ('Content-Type', 'application/octet-stream'),
               ('Content-Disposition', 'attachment; filename="' + os.path.basename(file_url) + '"'),
               ('Expires', '0'),
               ('Cache-Control', 'must-revalidate'),
               ('Pragma', 'public'),
               ('Content-Length', str(os.stat(file_url).st_size))]

    file_download = open(file_url, 'rb')
    return headers, FileWrapper(file_download)  # ***

def server_app(environ, start_response):
    crt_handler = handler(request_uri(environ))
    headers, response_body = crt_handler.get()  # this calls the download function, which is part of a class
    status = '200 OK'
    start_response(status, headers)
    return response_body  # ***

def start_server():
    httpd = make_server("", PORT, server_app)
    httpd.serve_forever()

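Note the two changes above, marked with # ***: download returns a FileWrapper instead of the file's full contents, and server_app returns that iterator directly rather than wrapping it in a list.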
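
To see the streaming behaviour without starting a server, here is a minimal self-contained sketch that calls a WSGI app directly. It assumes a plain file on disk. The names make_streaming_app, fake_start_response and demo_path are hypothetical and not part of the original program, and the 20000-byte temporary file only stands in for the archive.

```python
import os
import tempfile
from wsgiref.util import FileWrapper, setup_testing_defaults

def make_streaming_app(path, block_size=8192):
    """Build a WSGI app that streams `path` in block_size-byte chunks (hypothetical helper)."""
    def app(environ, start_response):
        headers = [('Content-Type', 'application/octet-stream'),
                   ('Content-Length', str(os.path.getsize(path)))]
        start_response('200 OK', headers)
        # Return an iterator of blocks instead of the whole file contents.
        return FileWrapper(open(path, 'rb'), block_size)
    return app

# Create a stand-in file for the archive (assumption: 20000 bytes of filler).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b'x' * 20000)
    demo_path = tmp.name

# Build a minimal WSGI environ and capture what the app passes to start_response.
environ = {}
setup_testing_defaults(environ)
captured = {}

def fake_start_response(status, headers):
    captured['status'] = status
    captured['headers'] = dict(headers)

body = make_streaming_app(demo_path)(environ, fake_start_response)
chunk_sizes = [len(block) for block in body]  # the server would send each block as it is read
body.close()  # FileWrapper forwards close() to the underlying file
os.remove(demo_path)

print(captured['status'], chunk_sizes)  # 200 OK [8192, 8192, 3616]
```

Because the app returns after opening the file rather than after reading it, the server can begin sending the first block right away, which is what should shorten the delay before the download starts.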

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution	Source
Solution 1
