Saving to image with an explicit format creates a huge file
I am working on a tool for detecting objects in PDF files. One of the first steps is converting the PDF file to an image file.
Here's my original code, which reads a PDF from a file path, renders the content into an Image / Bitmap object, extracts a specific area of the image, and saves the extracted area to a new file.
string pathToSave = fileName.Replace(".pdf", "");
PdfDocument doc = new PdfDocument();
doc.LoadFromFile(fileName);
Image emf = (Bitmap)doc.SaveAsImage(0, Spire.Pdf.Graphics.PdfImageType.Metafile, 1200, 1200);
Rectangle r = new Rectangle(13555, 615, 5860, 7240);
Bitmap target = new Bitmap(r.Width, r.Height);
using (Graphics g = Graphics.FromImage(target))
{
    g.DrawImage(emf, new Rectangle(0, 0, target.Width, target.Height), r, GraphicsUnit.Pixel);
    target.Save(pathToSave + ".bmp");
}
What I've noticed is that saving the Bitmap to a file with an explicit image format argument creates a much larger file than saving it without one.
Saving the file this way results in a file of about 900 KB:
target.Save(pathToSave + ".bmp");
Saving the file this way results in a file of about 165 MB:
target.Save(pathToSave + ".bmp", System.Drawing.Imaging.ImageFormat.Bmp);
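The larger figure is roughly what an uncompressed bitmap of this size would occupy. Below is a minimal sketch of that arithmetic. It assumes 32 bits per pixel (the default pixel format of `new Bitmap(width, height)`) and a 54-byte BMP header; neither was measured from the actual file, and `EstimateBmpBytes` is a hypothetical helper introduced only for illustration.

```csharp
using System;

public static class BmpSizeEstimate
{
    // Estimates the size of an uncompressed BMP file in bytes.
    // Assumption: a 14-byte file header plus a 40-byte info header (54 bytes total),
    // with each pixel row padded to a multiple of 4 bytes.
    public static long EstimateBmpBytes(int width, int height, int bitsPerPixel)
    {
        long stride = ((long)width * bitsPerPixel + 31) / 32 * 4; // bytes per padded row
        return 54 + stride * height;
    }

    public static void Main()
    {
        // Dimensions of the cropped rectangle from the code above.
        long bytes = EstimateBmpBytes(5860, 7240, 32);
        double mib = bytes / (1024.0 * 1024.0);
        Console.WriteLine(bytes + " bytes, about " + Math.Round(mib) + " MiB");
    }
}
```

For the 5860 x 7240 rectangle this gives about 169.7 million bytes (roughly 162 MiB), which is in the same range as the observed ~165 MB.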
Both files are in a usable format and produce the same results when passed on to Tesseract. Why is there such a huge difference between these two ways of saving the file?
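One way to check what each call actually wrote is to look at the first bytes of each file instead of its extension. The documentation for `Image.Save(String)` notes that the PNG encoder is used when no encoder exists for the image's own format. That would be consistent with the ~900 KB file containing PNG data under a `.bmp` name, but this has not been verified for the files above. The sketch below uses only `System.IO`; the `ImageSignature` class and its `Detect` method are hypothetical names, not part of any library.

```csharp
using System;
using System.IO;

public static class ImageSignature
{
    // Returns "BMP", "PNG", or "unknown" based on the leading bytes of the file.
    public static string Detect(string path)
    {
        byte[] header = new byte[8];
        int read;
        using (FileStream fs = File.OpenRead(path))
        {
            read = fs.Read(header, 0, header.Length);
        }

        // BMP files start with the ASCII characters "BM".
        if (read >= 2 && header[0] == 0x42 && header[1] == 0x4D)
        {
            return "BMP";
        }

        // PNG files start with the fixed 8-byte signature 89 50 4E 47 0D 0A 1A 0A.
        byte[] png = { 0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A };
        if (read == png.Length)
        {
            bool match = true;
            for (int i = 0; i < png.Length; i++)
            {
                if (header[i] != png[i]) { match = false; break; }
            }
            if (match) return "PNG";
        }

        return "unknown";
    }

    public static void Main(string[] args)
    {
        // Pass the paths of the two saved files as command-line arguments.
        foreach (string path in args)
        {
            Console.WriteLine(path + ": " + Detect(path));
        }
    }
}
```

Running this against both saved files would show whether they really share the same container format, independent of the `.bmp` extension.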
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
