Filtering strings based on replicated numerical substrings at specific places in the string (in R)

I have a list of file names/paths and I want to remove the ones where the file name begins with the same six digits that appear immediately after the first "/" in the path. For example, in the list below, entries [1], [2], and [6] would be retained, whereas entries [3], [4], and [5] would be removed from the new list. I imagine it should be possible to split each string at each "/" and compare the first six characters of the second piece with those of the last piece, but I'm not sure how to implement this. Any suggestions would be appreciated.

    tail(processed_ARL_list)
    [1] "220204/220204 2022-02-04 09-32-30/ARL2200660.D/ARL2200660.pdf"
    [2] "220204/220204 2022-02-04 09-32-30/ARL2200661.D/ARL2200661.pdf"
    [3] "220204/220204 2022-02-04 09-32-30/REFTTO220204_.D/220204 2022-02-04 09-32-30_REFTTO220204_.pdf"
    [4] "220207/220204 2022-02-07 12-51-02/REFTTO220207_.D/220204 2022-02-07 12-51-02_REFTTO220207_.pdf"
    [5] "220207/220204 2022-02-07 12-51-02/SREF0186 METHYL EUGENOL.D/220204 2022-02-07 12-51-02_SREF0186 METHYL EUGENOL.pdf"
    [6] "220207/220204 2022-02-07 12-51-02/SREF0186 METHYL EUGENOL.D/SREF0186 METHYL EUGENOL.pdf"


Solution 1:[1]

So I got the result I was after using this looping method. I feel like there might be a better way, but this will do for now.

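    library(stringr)  # provides str_split()

    # One keep/drop flag per path
    processed_results <- logical(length(ARL_list))
    for (i in seq_along(ARL_list)) {
      # Split the path into its components on "/"
      parts <- unlist(str_split(ARL_list[i], pattern = "/"))
      # Keep the path (TRUE) when the first six characters of the 2nd component
      # differ from the first six characters of the file name (last component)
      processed_results[i] <- substr(parts[2], 1, 6) != substr(parts[length(parts)], 1, 6)
    }
    processed_ARL_list <- ARL_list[processed_results]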

Output

    tail(processed_ARL_list)
    [1] "220128/220128 2022-01-28 07-53-13/ARL2200536.D/ARL2200536.pdf"
    [2] "220128/220128 2022-01-28 07-53-13/ARL2200537.D/ARL2200537.pdf"
    [3] "220128/220131 2022-01-31 16-10-36/REFTTO220131_.D/REFTTO220131_.pdf"
    [4] "220204/220204 2022-02-04 09-32-30/ARL2200660.D/ARL2200660.pdf"
    [5] "220204/220204 2022-02-04 09-32-30/ARL2200661.D/ARL2200661.pdf"
    [6] "220207/220204 2022-02-07 12-51-02/SREF0186 METHYL EUGENOL.D/SREF0186 METHYL EUGENOL.pdf"
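For what it's worth, the same comparison can probably be done without a loop. The following is only a sketch in base R, not the method used above: it assumes every path contains at least two "/"-separated components, and the three-element `ARL_list` here is a small sample taken from the question purely for illustration. The names `parts`, `folder_prefix`, `file_prefix`, and `keep` are my own.

```r
# Illustrative sample only; in practice ARL_list is the full vector of paths.
ARL_list <- c(
  "220204/220204 2022-02-04 09-32-30/ARL2200660.D/ARL2200660.pdf",
  "220204/220204 2022-02-04 09-32-30/REFTTO220204_.D/220204 2022-02-04 09-32-30_REFTTO220204_.pdf",
  "220207/220204 2022-02-07 12-51-02/SREF0186 METHYL EUGENOL.D/SREF0186 METHYL EUGENOL.pdf"
)

# Split every path on "/" in one call; the result is a list of character vectors.
parts <- strsplit(ARL_list, "/", fixed = TRUE)

# First six characters of the second path component.
# Assumption: each path has at least two components, otherwise this yields NA.
folder_prefix <- substr(vapply(parts, function(p) p[2], character(1)), 1, 6)

# First six characters of the file name, i.e. the last path component.
file_prefix <- substr(basename(ARL_list), 1, 6)

# Keep only the paths whose file name does not start with the folder prefix.
keep <- folder_prefix != file_prefix
processed_ARL_list <- ARL_list[keep]
```

With this sample, the second path is dropped because both its second component and its file name start with "220204". Since `strsplit()`, `substr()`, and `basename()` are all vectorized, no explicit index bookkeeping is needed.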

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 benson23