How does an operating system or program actually verify the type of a file?

I'm curious, mainly about how operating systems implement security around this, how an OS verifies that if I text you an image, the file is actually an image, or that a file ending in .pdf is actually an encoded PDF. I've found that each file has a header that declares the type of the file, but what stops me from writing in the header that my file is a JPEG and then encoding a bunch of malicious code? How do OSes decide whether a file is safe to open and is actually the type of file it says it is?



Solution 1:[1]

Even today, operating systems don't verify any of this. On Windows, when you launch an executable, the system can do some verification, such as checking where the file came from, whether that origin is known, or scanning it for malware. Beyond that, your computer is largely exposed to user-mode attacks.
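To make the forged-header worry from the question concrete, here is a minimal sketch of a signature ("magic number") check, assuming a small hand-picked table of well-known leading bytes. The names `MAGIC_SIGNATURES` and `sniff_type` are hypothetical and not part of any OS API. It shows that such a check only inspects the first few bytes, so a forged header passes it.

```python
# A few well-known leading byte signatures ("magic numbers").
MAGIC_SIGNATURES = {
    b"\xff\xd8\xff": "jpeg",
    b"\x89PNG\r\n\x1a\n": "png",
    b"%PDF-": "pdf",
    b"MZ": "windows executable",
}


def sniff_type(data):
    """Guess a file type from its leading bytes only (hypothetical helper)."""
    for signature, name in MAGIC_SIGNATURES.items():
        if data.startswith(signature):
            return name
    return "unknown"


# Arbitrary bytes behind a forged JPEG signature are indistinguishable
# from a real image as far as this check is concerned.
forged = b"\xff\xd8\xff" + b"anything at all, not image data"
print(sniff_type(forged))  # prints "jpeg"
```

A matching signature is therefore a hint about how to interpret the bytes, not proof that they are well formed or harmless; whatever program parses the rest of the file has to cope with malformed content.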

It is a common misconception that the kernel-user separation is what provides security on modern operating systems. From user mode, one can access the whole filesystem (except some privileged portions of it). Most people store their files in the Documents folder or somewhere else that is not privileged. On consumer computers, the kernel-user separation therefore does little to protect what matters to the user. It may help stop deep malware from installing itself at the root of the operating system and can limit some damage, but for the most part it doesn't prevent malware from spying on you or encrypting your data for ransom.

There have been known issues with files like PDFs, but they are not related to any check the operating system makes. They are related to the software used to interpret those PDFs, such as Adobe's reader or Firefox. Files like PDFs can embed JavaScript that is executed within the sandbox provided by the viewer. This sandbox can be escaped if there are vulnerabilities in the software. An escape does not by itself give the malware access to the kernel, but it does give it access to the full user-mode environment, which is riddled with bugs that the escaped code can then exploit to eventually end up in the kernel.

When you open a PDF file, the path to the file is passed as a command-line argument to the executable registered as the default for that type of file, where it arrives in the arguments of the program's main function. It is really that simple. If the default is Adobe's reader, when you double-click a PDF, the path is passed to that executable, which is itself responsible for providing security. The operating system has nothing to do with this.
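The launch path described above can be sketched roughly as follows. This is an illustrative model, not how any particular OS implements file associations: the table `DEFAULT_HANDLERS`, the program names in it, and the functions `build_launch_command` and `viewer_main` are all hypothetical.

```python
import os

# Hypothetical association table: file extension -> default program.
DEFAULT_HANDLERS = {".pdf": "pdf_viewer", ".jpg": "image_viewer"}


def build_launch_command(path):
    """Choose a handler from the file name alone; the content is never read."""
    extension = os.path.splitext(path)[1].lower()
    handler = DEFAULT_HANDLERS.get(extension)
    if handler is None:
        return None
    # This list becomes the argument vector of the launched process.
    return [handler, path]


def viewer_main(argv):
    """Hypothetical entry point of the viewer: validation happens here."""
    if len(argv) < 2:
        return "no file given"
    path = argv[1]
    with open(path, "rb") as f:
        header = f.read(5)
    if header != b"%PDF-":
        return "refusing " + path + ": not a PDF"
    return "parsing " + path
```

In this model the launcher looks only at the extension, so a file named `report.pdf` is handed to the viewer regardless of what it contains. Any check of the content, and any sandboxing of the parser, happens inside the application that receives the path.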

In itself, a file that isn't an executable doesn't pose much of a threat because it will not be executed. Its content is interpreted by the application, typically inside a sandbox, but it is not run as a program. The software's sandbox can have bugs and possible exploits, but most known ones are fixed today and new ones are quite hard to find by yourself.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 user123