What is the best way to detect changes across many files, like git does?

I am writing a program that manages many video files.

When the program starts, it reads every video on disk: file path, size, and metadata.

But this is too slow. There are a lot of files, and reading the metadata is slow (I use mutagen.mp4).

My goal is to find only the changed files, like git does.

So I want to save the info in a database such as SQLite and read only the files that changed, but I don't know how to do it.

More than 99% of the files do not change between runs. Some files can be moved to another directory, renamed, or have their metadata changed (artist name, genre, tag, ...). Sometimes files are also removed or added.

This is my first idea.

Create a table, metatable:

Dirpath          filename   modified time
C:\ ... \dir1    a.mp4      2022-02-04 12:23:59
C:\ ... \dir2    b.mp4      2022-02-02 11:11:11

Then, if the modified time of a file has changed, reload its info.

I think I would use a query like this:

SELECT * FROM metatable
WHERE dirpath = 'file_dir_path'
  AND filename = 'file_name';

If the query returns 0 rows, add the data. If it returns 1 row, check the modified time and update the row if it is not the same.
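The lookup-then-update step above could be sketched in Python with the standard sqlite3 module roughly as follows. This is only a minimal sketch: the column names, the composite primary key, and the helper names open_db and sync_file are assumptions made for illustration, and the modified time is stored as integer nanoseconds (st_mtime_ns) instead of a formatted date so that the comparison is exact.

```python
import os
import sqlite3


def open_db(path=":memory:"):
    """Open the database and create metatable if needed (schema is an assumption)."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS metatable (
               dirpath  TEXT    NOT NULL,
               filename TEXT    NOT NULL,
               mtime_ns INTEGER NOT NULL,
               PRIMARY KEY (dirpath, filename)
           )"""
    )
    return conn


def sync_file(conn, file_path):
    """Return 'added', 'updated', or 'unchanged' for a single file."""
    dirpath, filename = os.path.split(os.path.abspath(file_path))
    mtime_ns = os.stat(file_path).st_mtime_ns
    row = conn.execute(
        "SELECT mtime_ns FROM metatable WHERE dirpath = ? AND filename = ?",
        (dirpath, filename),
    ).fetchone()
    if row is None:
        # 0 rows: new file. The slow metadata read would happen here.
        conn.execute(
            "INSERT INTO metatable (dirpath, filename, mtime_ns) VALUES (?, ?, ?)",
            (dirpath, filename, mtime_ns),
        )
        return "added"
    if row[0] != mtime_ns:
        # 1 row with a different modified time: reload and update the row.
        conn.execute(
            "UPDATE metatable SET mtime_ns = ? WHERE dirpath = ? AND filename = ?",
            (mtime_ns, dirpath, filename),
        )
        return "updated"
    return "unchanged"
```

Only files reported as 'added' or 'updated' would need their metadata read again. This still runs one query per file, which is the cost described further down.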

And add one more table:

Dirpath          filecount
C:\ ... \dir1    3
C:\ ... \dir2    1

If the file count in a directory has changed, reload all of its files and drop its old rows from the table.
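A minimal sketch of that directory check, assuming the stored counts have already been loaded into a dict that maps each directory path to its saved file count (the function names here are hypothetical):

```python
import os


def count_files(dirpath):
    """Count regular files directly inside dirpath (not recursive)."""
    with os.scandir(dirpath) as entries:
        return sum(1 for entry in entries if entry.is_file())


def dirs_needing_reload(stored_counts):
    """Return the directories whose current file count differs from the stored one.

    stored_counts: dict of dirpath -> filecount saved on the previous run.
    A directory that no longer exists is also reported.
    """
    changed = []
    for dirpath, old_count in stored_counts.items():
        if not os.path.isdir(dirpath) or count_files(dirpath) != old_count:
            changed.append(dirpath)
    return changed
```

This avoids one query per file for unchanged directories, but it compares only the count, not the names.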

But I think this is too complex. I need to run so many queries. Plus, I have no idea how to detect changed files in a directory when the count stays the same.

Example: a directory changes from [a.mp4, b.mp4] to [c.mp4, d.mp4]. With my algorithm, the table ends up with [a.mp4, b.mp4, c.mp4, d.mp4].
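The example can be reproduced with plain Python sets. The sketch below is only an illustration of the problem, not a full solution: merged mimics an algorithm that only ever inserts rows, while a set difference between the stored names and the names on disk (a hypothetical helper, diff_names) would expose the stale rows even though the count is unchanged.

```python
def diff_names(stored, on_disk):
    """Return (added, removed) filenames between the table and the disk."""
    stored, on_disk = set(stored), set(on_disk)
    return sorted(on_disk - stored), sorted(stored - on_disk)


stored = ["a.mp4", "b.mp4"]    # rows saved on the previous run
on_disk = ["c.mp4", "d.mp4"]   # files found in the directory now

# Insert-only behaviour: old rows are never dropped.
merged = sorted(set(stored) | set(on_disk))
print(merged)    # ['a.mp4', 'b.mp4', 'c.mp4', 'd.mp4']

added, removed = diff_names(stored, on_disk)
print(added)     # ['c.mp4', 'd.mp4']
print(removed)   # ['a.mp4', 'b.mp4']
```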

I'm so confused...

Please help me!



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
