Optimal way to approach the logic

I have two array lists of files (each may hold a large number of files, roughly 1k to 5k).

This one is created on the fly when new files are added:

addedfiles=['temp.java', 'TEMP.java', 'DENT.java', 'Seal.java']
Note: files such as temp.java and TEMP.java are the same name added with different letter case, so they count as duplicates.

These files are already present in the system:

ExistingFiles=['dent.java', 'temp1.java','comp.java']
Note: these are all distinct from each other.

I am trying to come up with optimal logic to add the distinct, unique files from addedfiles to ExistingFiles.

So, in the example above, only Seal.java will be added to ExistingFiles, as it is the only name in addedfiles that is unique and not already present.

My logic:

    1. Create a hashmap from addedfiles like [name:count], ignoring case
        {temp.java:2, DENT.java:1, Seal.java:1}
    2. Create a non-duplicate array = [DENT.java, Seal.java]
    3. Compare ExistingFiles and the non-duplicate array using sort and binary search (ignoring case);
       if the search result is negative (not found), add the value from the non-duplicate array to ExistingFiles.
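
A minimal sketch of the three steps above in Java, assuming names are compared case-insensitively and the first spelling seen is the one to keep. The class name `CountThenSearch`, the method `merge`, and the use of `Locale.ROOT` for lower-casing are illustrative assumptions, not part of the original logic.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper class; the names are illustrative only.
class CountThenSearch {

    // Returns the existing files plus every added name that occurs exactly
    // once (ignoring case) and is not already present (ignoring case).
    static List<String> merge(List<String> addedFiles, List<String> existingFiles) {
        // Step 1: count occurrences, keyed by the lower-cased name.
        Map<String, Integer> counts = new LinkedHashMap<>();
        // Assumption: keep the first spelling seen for each lower-cased key.
        Map<String, String> firstSpelling = new LinkedHashMap<>();
        for (String name : addedFiles) {
            String key = name.toLowerCase(Locale.ROOT);
            counts.merge(key, 1, Integer::sum);
            firstSpelling.putIfAbsent(key, name);
        }

        // Sort a lower-cased copy of the existing names for binary search.
        List<String> sortedExisting = new ArrayList<>();
        for (String name : existingFiles) {
            sortedExisting.add(name.toLowerCase(Locale.ROOT));
        }
        Collections.sort(sortedExisting);

        List<String> result = new ArrayList<>(existingFiles);
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            // Step 2: skip names that were added more than once.
            if (entry.getValue() != 1) {
                continue;
            }
            // Step 3: a negative index means the name is not present yet.
            if (Collections.binarySearch(sortedExisting, entry.getKey()) < 0) {
                result.add(firstSpelling.get(entry.getKey()));
            }
        }
        return result;
    }
}
```

With the example lists from the question, `CountThenSearch.merge(addedfiles, ExistingFiles)` returns the three existing names followed by Seal.java. Sorting costs O(m log m) for m existing files and each lookup costs O(log m), so a hash-based lookup would likely be cheaper for repeated calls.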

Is there a better way to do this, for example by using union, intersection, or threads? Thanks :)



Solution 1:[1]

Assuming you don't need to retain the case of the filenames, I would store the lower-cased names of the existing files in a Set, say existingFiles, and do it as follows:

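// Requires: import java.util.HashSet; import java.util.Set;
Set<String> newFiles = new HashSet<>();  // candidates not yet in existingFiles
Set<String> dupFiles = new HashSet<>();  // names that occur more than once in addedFiles
for (String filename : addedFiles) {
    filename = filename.toLowerCase();
    // Already present in the system: skip it.
    if (existingFiles.contains(filename)) { continue; }
    if (newFiles.contains(filename)) {
        // Second or later occurrence: remember it as a duplicate.
        dupFiles.add(filename);
    } else {
        newFiles.add(filename);
    }
}
// Drop every name that was added more than once, then merge the rest.
newFiles.removeAll(dupFiles);
existingFiles.addAll(newFiles);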

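This solution uses some extra memory for the two sets, but every lookup is a hash operation that takes constant time on average, so the whole pass is linear in the number of added files. If speed is critical, it works well.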
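
If the original spelling of the names does need to be retained, the same idea can be adapted by mapping each lower-cased key to the spelling first seen. This is only a sketch under that assumption: the class `CasePreservingMerge`, the method `distinctNewFiles`, and the choice of `Locale.ROOT` are hypothetical and not part of the answer above.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical variant that keeps the original case of the added names.
class CasePreservingMerge {

    // existingLowerCased is assumed to hold the lower-cased existing names.
    // Returns the added names (original spelling) that occur exactly once,
    // ignoring case, and are not already present.
    static List<String> distinctNewFiles(List<String> addedFiles,
                                         Set<String> existingLowerCased) {
        // Lower-cased key -> first spelling seen, in insertion order.
        Map<String, String> candidates = new LinkedHashMap<>();
        Set<String> duplicates = new HashSet<>();
        for (String name : addedFiles) {
            String key = name.toLowerCase(Locale.ROOT);
            if (existingLowerCased.contains(key)) {
                continue; // already in the system
            }
            // putIfAbsent returns the previous value when the key was present.
            if (candidates.putIfAbsent(key, name) != null) {
                duplicates.add(key);
            }
        }
        // Removing keys from the key set also removes the map entries.
        candidates.keySet().removeAll(duplicates);
        return new ArrayList<>(candidates.values());
    }
}
```

With the lists from the question, this returns a list containing only Seal.java. The caller would then append the result to the stored list and add the lower-cased keys to the lookup set.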

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 bui