How to decode PDF binary data to PostScript-readable text
I'm currently searching for a way to extract the code (PostScript) behind a PDF just from its binary data. I have an input of type file that has an onChange event handler attached. The event handler looks like this:
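    const handleFileChange = (e) => {
      const [file] = e.target.files
      const reader = new FileReader()
      reader.addEventListener("loadend", () => {
        // reader.result is an ArrayBuffer holding the raw bytes of the PDF
        const buffer = reader.result
        const view = new DataView(buffer)
        // Per the Encoding Standard, the "ascii" label selects windows-1252
        const decoder = new TextDecoder("ascii")
        // Every byte becomes one character, whether or not it was ever text
        const text = decoder.decode(view)
      })
      reader.readAsArrayBuffer(file)
    }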
This code produces text that looks like this %PDF-1.3 %Äåòåë§ó ÐÄÆ 3 0 obj << /Filter /FlateDecode /Length 4573 >> stream xµ\ëŽÜ¶þϧ жVwi‚ hìM
The beginning is fine; everything after it is the problem: %Äåòåë§ó ÐÄÆ and µ\ëŽÜ¶þϧ жVwi‚ hìM are illegible. I'm not sure whether I have the wrong encoding selected or whether what I'm setting out to do is all but impossible. Am I missing something here? If so, what exactly?
I appreciate any and all help. Thank you!
(resources are also welcome)
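A note on the output shown above: the `/Filter /FlateDecode` entry in the stream dictionary means the bytes between `stream` and `endstream` are zlib-compressed, so no choice of text encoding will make them legible. They have to be inflated first. What comes out is PDF content-stream operators, which resemble PostScript but are not full PostScript. The following is a minimal sketch of that idea, not a complete PDF parser. It assumes a Node.js environment with the built-in `zlib` module, and `inflatePdfStreams` is a hypothetical helper name. It ignores encryption, chained filters, predictors and object streams, all of which a real PDF library handles.

```javascript
const zlib = require("zlib");

// Hypothetical helper: returns the inflated body of every stream whose
// dictionary mentions /FlateDecode. Accepts a Buffer, Uint8Array or ArrayBuffer.
function inflatePdfStreams(pdfBytes) {
  const buffer = Buffer.from(pdfBytes);
  // latin1 maps each byte to exactly one character, so string offsets
  // are also byte offsets into the buffer.
  const text = buffer.toString("latin1");
  // The "stream" keyword is followed by LF or CRLF; skip "endstream".
  const streamKeyword = /(?<!end)stream\r?\n/g;
  const decoded = [];
  let match;
  while ((match = streamKeyword.exec(text)) !== null) {
    const dataStart = match.index + match[0].length;
    const dataEnd = text.indexOf("endstream", dataStart);
    if (dataEnd === -1) break;
    // Resume searching after this stream so binary data is never scanned.
    streamKeyword.lastIndex = dataEnd + "endstream".length;

    // Rough approximation: treat everything since the last "obj" keyword
    // as the stream dictionary.
    const dictStart = text.lastIndexOf("obj", match.index);
    const dictionary = text.slice(dictStart === -1 ? 0 : dictStart, match.index);
    if (!dictionary.includes("/FlateDecode")) continue;

    try {
      const inflated = zlib.inflateSync(buffer.subarray(dataStart, dataEnd));
      decoded.push(inflated.toString("latin1"));
    } catch (err) {
      // Not plain zlib data (for example encrypted); skip it in this sketch.
    }
  }
  return decoded;
}
```

In the browser, the same approach would need a different inflate step, for example `DecompressionStream("deflate")` where it is available, or a JavaScript zlib library. If the goal is the visible text of the page rather than the raw operators, a PDF library such as PDF.js is likely the more practical route.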
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
