Printing PDF with selenium on Heroku (Django)

I'm really struggling with an issue. I spent the evening trying to solve it and did a lot of research, but I can't find a hint that helps.

My project: print a specific webpage as a PDF, using Selenium and the Chrome webdriver, in a Heroku app built with Django. The idea is that the user (me) gets a PDF file right after filling in a form on a website with a link of their choice. My printing script works perfectly locally (the PDF is downloaded with the right parameters), and my Heroku app works well (I managed to retrieve the link from my Django form).

Here is my index method, linked to the form I use in my template. Once the form is filled, it calls the function (MyPdfSaver()) that is supposed to print the pdf.

def index(request):
    submitbutton = request.POST.get("submit")

    study_url = ''
    test_value = ''
    form = UserForm(request.POST or None)
    if form.is_valid():
        study_url = form.cleaned_data.get("study_url")
        test_value = MyPdfSaver(study_url)  # calling the function that prints as PDF

    context = {
        'form': form,
        'study_url': study_url,  # the link I want to print a PDF from
        'submitbutton': submitbutton,
        'test_value': str(test_value),  # just a test value
    }

    return render(request, 'index.html', context)

Here is my MyPdfSaver function (the idea would be to upload the pdf directly via this function and return a simple link to the pdf) :

import json
import os
import time

from selenium import webdriver


def MyPdfSaver(study_url):
    # set webdriver and set chrome_options/settings
    chrome_options = webdriver.ChromeOptions()
    settings = {
        "recentDestinations": [{
            "id": "Save as PDF",
            "origin": "local",
            "account": "",
        }],
        "selectedDestinationId": "Save as PDF",
        "version": 2,
        "isCssBackgroundEnabled": True,
    }
    prefs = {
        'printing.print_preview_sticky_settings.appState': json.dumps(settings)
    }
    chrome_options.add_experimental_option('prefs', prefs)
    chrome_options.add_argument('--kiosk-printing')
    chrome_options.add_argument('--headless')
    chrome_options.add_argument("--disable-dev-shm-usage")
    chrome_options.add_argument("--no-sandbox")

    # set chrome bin location & driver (change paths to run locally)
    chrome_options.binary_location = os.environ.get("GOOGLE_CHROME_BIN")
    driver = webdriver.Chrome(
        executable_path=os.environ.get("CHROMEDRIVER_PATH"),
        chrome_options=chrome_options)
    driver.get(study_url)

    # necessary for webpage to fully load
    time.sleep(3)

    html_source = driver.page_source
    driver.execute_script('return window.print();')  # is supposed to run the pdf printing

    return str(type(html_source))  # just a try to see if it returns my driver well

So! Almost everything works fine: the link is sent to my MyPdfSaver() function, and MyPdfSaver() runs the webdriver without any error and returns the driver content (for example the HTML source, as in my code). But this line of code doesn't seem to work:

driver.execute_script('return window.print();') #I also tried without the "return"

While it is supposed to download the PDF "in" my Heroku app, I can't find the file anywhere... I tried heroku run bash and then ls -R, but that starts a new dyno, so it doesn't help me.

So the main question: is my driver.execute_script('return window.print();') supposed to work? And if so, where does the downloaded file go?
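
For reference, one commonly suggested way to avoid relying on window.print() is to ask Chrome for the PDF bytes directly through the DevTools Protocol; headless Chrome has no print dialog to drive, which may be why no file appears. The sketch below assumes a Selenium Chrome driver (which exposes execute_cdp_cmd); the helper name print_to_pdf_bytes and its parameters are hypothetical, and this is not confirmed as the fix for this particular setup.

```python
import base64


def print_to_pdf_bytes(driver, url, print_background=True):
    """Load `url` and return the rendered page as raw PDF bytes.

    `driver` is assumed to be a Selenium Chrome driver; the helper name
    and its parameters are hypothetical.
    """
    driver.get(url)
    # Page.printToPDF is a Chrome DevTools Protocol command. Its result is
    # a dict whose "data" entry holds the PDF as a base64-encoded string.
    result = driver.execute_cdp_cmd(
        "Page.printToPDF", {"printBackground": print_background}
    )
    return base64.b64decode(result["data"])
```

Because the bytes come back to Python, nothing depends on where Chrome would have saved a download.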

Don't hesitate to ask if you need further information!

Thaaank you :-)

PS: I saw that Selenium 4 provides a driver.print() function, but I didn't find any Python documentation on how to use it properly, and I think the problem will be the same: I don't know where my files go in my Heroku app when Selenium downloads them.
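
On the Selenium 4 point: in the Python bindings the method is print_page(), which returns the PDF as a base64 string instead of downloading a file. A minimal sketch, assuming a Selenium 4 driver; the helper name save_page_as_pdf, the default temp directory, and the file name are all hypothetical. Keep in mind that a Heroku dyno's filesystem is ephemeral and not shared with the one-off dyno started by heroku run bash, so a file written this way is only visible to the dyno that wrote it.

```python
import base64
import tempfile
from pathlib import Path


def save_page_as_pdf(driver, url, out_dir=None, name="page.pdf"):
    """Print `url` to PDF with Selenium 4 and write it to a known path.

    `driver` is assumed to be a Selenium 4 WebDriver; `out_dir` and `name`
    are hypothetical parameters chosen for this sketch.
    """
    out_dir = Path(out_dir or tempfile.gettempdir())
    out_dir.mkdir(parents=True, exist_ok=True)

    driver.get(url)
    # print_page() returns the PDF as a base64-encoded string.
    pdf_base64 = driver.print_page()

    pdf_path = out_dir / name
    pdf_path.write_bytes(base64.b64decode(pdf_base64))
    return pdf_path
```

Writing the file explicitly means the location is chosen by the code rather than by Chrome's download settings.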



Solution 1:[1]

Create a new view function,

from django.http import FileResponse, Http404


def get_pdf(request, filename):
    try:
        return FileResponse(open('path_to_pdf/%s.pdf' % (filename), 'rb'), content_type='application/pdf')
    except FileNotFoundError:
        raise Http404()

And define a URL path,

path('pdf/<str:filename>', get_pdf)

Return the link to that file to the user. Since the view appends .pdf to the name itself, the link leaves the extension off.

Example: app.com/pdf/file
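
To connect this with the printing function, the PDF bytes have to be written where get_pdf looks for them, and the matching link handed back. A minimal sketch, assuming the bytes are already available in Python; store_pdf is a hypothetical helper, pdf_dir stands in for the path_to_pdf placeholder above, and the random name is just one way to avoid collisions. The view appends .pdf to the name it receives, so the returned link leaves the extension off.

```python
import uuid
from pathlib import Path


def store_pdf(pdf_bytes, pdf_dir):
    """Write `pdf_bytes` under `pdf_dir` and return the link path for it.

    Hypothetical helper: `pdf_dir` plays the role of the 'path_to_pdf'
    placeholder used by the get_pdf view.
    """
    pdf_dir = Path(pdf_dir)
    pdf_dir.mkdir(parents=True, exist_ok=True)

    # A random hex name keeps two requests from overwriting each other.
    name = uuid.uuid4().hex
    (pdf_dir / f"{name}.pdf").write_bytes(pdf_bytes)

    # Matches the 'pdf/<str:filename>' route; the view adds '.pdf' itself.
    return f"/pdf/{name}"
```

The index view could then put the returned path into its context so the template renders it as a download link.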

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
