'How to decrease the time taken by html-pdf for report generation?
I'm dealing with high volume data. I want to generate a PDF report which contains 1 year data. I'm using html-pdf to do that. The problem is that it is taking around 20 minutes to generate the PDF. Can I decrease it?
I don't want to make my user wait for 20 minutes.
Note:I want to generate PDF on server side only and I want to mail that to user.
I tried for some solutions,but I didn't fond any. But what I observed is that when I'm downloading some file from google, the workflow is very smooth.
I want to know how can I make my workflow smooth
app.post('/api', (req, res) => {
var json = req.body;
var sqlGet = 'SELECT StationId,TimeStamp,data->>"$.temperature" as temperature,data->"$.humidity" as humidity,data->"$.so2" as so2,data->"$.no2" as no2,data->"$.co" as co,data->"$.o3" as o3,data->"$.co2" as co2,data->"$.pm10" as pm10,data->"$.noise" as noise,data->"$.rain" as rain,data->"$.wdir" as wdir,data->"$.wspeed" as wspeed,data->"$.press" as press FROM TwentyStationFiveYearData WHERE StationId = ' + json.stationid + ' AND TimeStamp >= ' + json.start + ' AND TimeStamp <= ' + json.end;
console.log('' + sqlGet);
connection.query(sqlGet, function(err, result) {
if (err) {
console.log('error' + err);
} else {
var htmlCode = generateTable(result, 15);
//var output = fs.createWriteStream('reports/' + json.start + '-' + json.end + '.pdf');
pdf.create(htmlCode, {
orientation: 'landscape',
type: 'pdf',
timeout: '1800000'
}).toStream(function(err, stream) {
if (err) {
console.log('stream did not work ' + err);
} else {
stream.pipe(fs.createWriteStream('output.pdf'));
console.log('pdf generated');
res.send("report will be mailed");
}
});
}
});
});
function generateTable(jsonData, columns) {
var tab = '';
var headers = ['StationId', 'TimeStamp', 'temperature', 'humidity', 'so2', 'no2', 'co', 'o3', 'co2', 'pm10', 'noise', 'rain', 'wdir', 'wspeed', 'press'];
tab += '<tr>';
for (var c = 0; c < columns; c++) {
tab += '<th>';
tab += headers[c];
tab += '</th>';
}
tab += '<tr>';
jsonData.forEach(obj => {
tab += '<tr>';
for (var c = 0; c < columns; c++) {
tab += '<td>';
tab += obj[headers[c]];
tab += '</td>';
}
tab += '</tr>';
});
tab = '<table border = "1">' + tab + '</table>';
return html(tab);
}
function html(table) {
var htmlCode = '<!DOCTYPE html><html lang="en"><head><meta charset="utf-8"><title>Title</title></head><body>' + table + '</body></html>';
return htmlCode;
}
I want to make my workflow smoother and to decrease the time taken for PDF report generation
Solution 1:[1]
Have you narrowed down the time-consuming operations? is it the pdf rendering the most time-consuming?
What I would do is check the raw query to see how much time it takes to be executed.
Secondly, I would check the size of the result and if it can be compressed somehow and finally, I would check the PDF rendering.
I have used dompdf for PHP and there was a performance issue when I was rendering png instead of jpg but I suppose this is not the case as I don't see any image rendering.
So if flexible to use another library you better do it or else check for performance tuning in the one you use.
I suppose the font rendering can take a lot of time.
Hope I gave you some hints!
Solution 2:[2]
npm's html-pdf uses phantomjs . When you use it, it makes a displayless web browser instance in a subprocess, then uses it to render the html data you give it, then uses the same instance to print it to a pdf file. That works. But as you have discovered, it's slow when you give it a large html object.
Still, 20 minutes is a long time. Is it possible your machine is RAM-constrained? Is the phantomjs subprocess running out of RAM and thrashing?
Maybe pdf-puppeteer is faster. It's worth a try, and it's more future-proof than relying on the now-deprecated phantomjs.
You might try PDFKit. It generates pdf files directly from within your node program. That will be faster because it needs no subprocess. But, you must rework your report program so it uses PDFKit's API rather than a browser instance.
There are multiple software-as-a-service solutions available that work like html-pdf but faster. You might consider using one of those, especially if quick delivery of pdf files is critical to your business.
Finally, even though you don't want to do it, you can write your HTML and CSS so it does a really good job formatting for printing. And ask your user to print.
Solution 3:[3]
pdf.create(htmlCode, { orientation: 'landscape', type: 'pdf', timeout: '1800000'
You have set the timeout as 1800000 milliseconds, so its is taking long time to generate pdf. So
var options = { format: "A3", orientation: "landscape", border: "10mm", timeout: "3000" };
var document = {
type: 'buffer',
template: html,
context: {
Wellsfargo: wellsfargo
},
};
if (document === null) {
return null;
} else {
pdf.create(document, options).then(response => {
res.writeHead(200, {
"Content-Disposition": "attachment;filename=" + "wellsFargo.pdf",
'Content-Type': 'application/pdf'
});
return res.end(response);
})
.catch(error => {
console.error(error);
return res.status(200).type('application/pdf').send(error.toString());
});
};
});
try these timeout or remove the timeout..
because phanthomjs default timeout is 3000milliseconds.
use dynamic-html-pdf package it is very fast..
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Konstantinos A |
| Solution 2 | |
| Solution 3 |
