How does the OS figure out whether a DLL is already loaded in memory, or that two DLLs are the same?

As I understand it, a "dll/so" can be shared between programs (processes). For example, the first time "libprint.so" is loaded by "main" (call it main1), it is read from disk into memory. If we then start another "main" (call it main2), "libprint.so" is not read from disk again but mapped from memory, because it has already been loaded once.

So I designed an experiment:

main.cc --> main

#include <iostream>
#include <chrono>
#include <thread>
void printinfo();

int main() {
  printinfo();
  while (true) {
    std::this_thread::sleep_for(std::chrono::milliseconds(1000));
  }
  return 0;
}

print1.cc --> libprint1.so

#include <iostream>
void printinfo() {
  std::cout << "Print One" << std::endl;
}

print2.cc --> libprint2.so

#include <iostream>

void printinfo() {
  std::cout << "Print Two" << std::endl;
}

mv libprint1.so libprint.so
./main
// output is: Print One

Keep main running, and replace libprint.so with libprint2.so, like this:

mv libprint2.so libprint.so
./main
// output is: Print Two

Why is the output "Print Two"? I expected "Print One", since "libprint.so" is already loaded in memory. Although I changed the content of "libprint.so", its absolute path is the same as before. How does the operating system know that the new "libprint.so" is different from the old one?

Thanks to @Michael Chourdakis: in a Windows environment, libprint.dll cannot be replaced while main.exe is running.

But the question remains on Linux (where libprint.so can be replaced). @user253751 says there must be some tricks that Linux uses to tell different "so" files apart. I want to know exactly what the tricks are. Do I have to read the Linux OS source code?



Solution 1:[1]

"I want to know exactly what the tricks are. Do I have to read the Linux OS source code?"

What actually happens when you run main (which is presumably linked against libprint.so), or at least the "explain like I'm five" version:

  1. At the time the static link is performed (g++ main.cpp ./libprint.so or similar), the static linker records that (a) your program is dynamically linked and (b) that it requires ./libprint.so to run. You can see this by looking at the output from readelf -d main. Another thing that is recorded is the dynamic loader (aka ELF interpreter), which you can see with readelf -l main.
  2. When the Linux kernel starts the new process to run main in, it discovers (by reading program headers) which ELF interpreter is to be used, mmaps that interpreter into memory, and transfers control to it. It is the job of the interpreter to mmap other libraries (including libprint.so), relocate them, arrange for proper symbol resolution, etc. etc.
  3. The ELF interpreter uses open and mmap system calls to actually bring libprint.so into the process (this happens long before main() is called).
  4. When any process calls open("/some/path", ...), the OS (the Linux kernel) performs directory lookups to map the given path to a particular mounted filesystem and a unique inode on that filesystem. Finally it (the OS) checks to see if that inode has already been opened by any other currently running process. If the file is open, then the kernel can reuse some of the in-kernel-memory structures it already allocated for that file. This is where the kernel discovers that it may not have to allocate new memory pages for libprint.so but can reuse the ones it already has.
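To make the path-to-inode step concrete, here is a minimal user-space sketch in C++. It is not how the kernel is implemented; it only uses the POSIX stat call to show the pair (device, inode) that a path resolves to, which is the identity the explanation above relies on. The names FileIdentity, identity_of and same_file are hypothetical helpers invented for this illustration.

```cpp
#include <sys/stat.h>   // stat, struct stat (POSIX)
#include <string>

// Hypothetical helper type: the identity a path currently resolves to.
struct FileIdentity {
    dev_t device;   // which mounted filesystem the file lives on
    ino_t inode;    // inode number, unique within that filesystem
};

// Resolve a path and fill in its (device, inode) pair.
// Returns false if the path cannot be stat'ed (for example, it does not exist).
bool identity_of(const std::string& path, FileIdentity& out) {
    struct stat st {};
    if (stat(path.c_str(), &st) != 0) {
        return false;
    }
    out.device = st.st_dev;
    out.inode = st.st_ino;
    return true;
}

// Two paths name the same file only if both the device and the inode match.
// The path strings themselves are not compared at all.
bool same_file(const std::string& a, const std::string& b) {
    FileIdentity ia{}, ib{};
    if (!identity_of(a, ia) || !identity_of(b, ib)) {
        return false;
    }
    return ia.device == ib.device && ia.inode == ib.inode;
}
```

For example, same_file("/", "/.") is true even though the two strings differ, while two files with identical contents but separate inodes compare as different.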

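When you do mv libprint2.so libprint.so, the name libprint.so starts referring to a different inode (the one that used to be called libprint2.so), so the kernel knows it's a totally different file. The main that is still running keeps its mapping of the old inode, which continues to exist until its last user releases it. You can observe the change by using ls -li libprint.so before and after mv.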
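The same observation can be made from a program instead of ls -li. The sketch below is only an illustration under stated assumptions: it works in a scratch directory created under /tmp (assumed to exist and be writable), uses small text files as stand-ins for the real libraries, and the names ReplaceResult and replace_and_compare are hypothetical. It records the inode behind the name libprint.so before and after a rename, and also keeps a file descriptor open across the rename to play the role of the still-running main.

```cpp
#include <fcntl.h>      // open
#include <stdlib.h>     // mkdtemp
#include <sys/stat.h>   // stat, fstat
#include <unistd.h>     // close, rmdir
#include <cstdio>       // std::rename, std::remove
#include <fstream>
#include <string>

// Hypothetical result type for this illustration.
struct ReplaceResult {
    bool ok;            // true if every step succeeded
    ino_t before;       // inode behind "libprint.so" before the rename
    ino_t replacement;  // inode behind "libprint2.so" before the rename
    ino_t after;        // inode behind "libprint.so" after the rename
    ino_t held_open;    // inode seen through a descriptor opened before the rename
};

static ino_t inode_of_path(const std::string& path) {
    struct stat st {};
    return stat(path.c_str(), &st) == 0 ? st.st_ino : 0;
}

static void write_file(const std::string& path, const char* text) {
    std::ofstream out(path);
    out << text;
}

ReplaceResult replace_and_compare() {
    ReplaceResult r{false, 0, 0, 0, 0};
    char dir_template[] = "/tmp/inode_demo_XXXXXX";  // assumed scratch location
    const char* dir = mkdtemp(dir_template);
    if (dir == nullptr) return r;

    const std::string target = std::string(dir) + "/libprint.so";
    const std::string incoming = std::string(dir) + "/libprint2.so";
    write_file(target, "one");    // stand-in for the first library
    write_file(incoming, "two");  // stand-in for the second library

    r.before = inode_of_path(target);
    r.replacement = inode_of_path(incoming);

    // Stand-in for the running process that already has the old file in use.
    int fd = open(target.c_str(), O_RDONLY);

    // Equivalent of: mv libprint2.so libprint.so
    const bool renamed = std::rename(incoming.c_str(), target.c_str()) == 0;
    r.after = inode_of_path(target);

    // The descriptor still refers to the old inode, not to the new name target.
    struct stat held {};
    if (fd >= 0 && fstat(fd, &held) == 0) r.held_open = held.st_ino;
    if (fd >= 0) close(fd);

    std::remove(target.c_str());
    rmdir(dir);
    r.ok = renamed && fd >= 0 && r.before != 0 && r.replacement != 0;
    return r;
}
```

Calling replace_and_compare() from a main() and printing the fields should show after equal to replacement and different from before, while held_open stays equal to before. That is the situation in the experiment: the old process keeps the old file, and a newly started process resolves the name to the new one.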

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Employed Russian