How to extract 4 letters from file names and use in substitution in multiple files

my $v3test;
my $rootDir = "C:\\";
$v3test = "$rootDir"."test\\";

# directory

chdir $v3test;
opendir(V3, $v3test);
my @str0 = readdir V3;
my $str0 = @str0;

local $^I = ''; 
local @ARGV = glob "*.rnx";

File Name: GANS???????????????.rnx, YONS???????????????.rnx, GUMC???????????????.rnx

my $str5 = "CREF0001";
my $str6 = substr($ARGV[0], 0, 4);
# I want to extract 4 words from the file title

while (<>) {

  s/\Q$str5/$str6/g;
  print;
}

The *.rnx data is GPS data.

I want to extract 4 words from the title of each *.rnx file. How can I do this?


Edit: It has been confirmed in comments that it is four letters, not words. Those should be used, with four spaces appended, to replace the string $str5 in all files.



Solution 1:[1]

The following replaces CREF0001 by a string derived from the name of the file.

So in files named YONS...rnx the string CREF0001 is replaced by YONS followed by four spaces, in files named GANS...rnx it is replaced by GANS followed by four spaces, and so on.
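A minimal sketch of how that replacement string is built. The helper name replacement_for and the sample file name are made up for illustration. The result is eight characters long, the same length as CREF0001, which presumably keeps fixed-width columns aligned (an assumption; the question does not say so).

```perl
use strict;
use warnings;

# Hypothetical helper: first four characters of the file name plus four spaces.
# The x operator binds tighter than . so ' ' x 4 is evaluated first.
sub replacement_for {
    my ($filename) = @_;
    return substr($filename, 0, 4) . ' ' x 4;
}

my $new = replacement_for('YONS_demo.rnx');   # hypothetical file name
printf "[%s] has length %d\n", $new, length $new;
```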

With $^I set, the files are edited in place. I assign ~ to it in order to keep a backup, filename~. Assign an empty string '' instead if a backup is not needed, but only once this has been well tested.

use warnings;
use strict;
use feature 'say';

# Assigning other than '' (empty string) keeps a backup
local $^I = '~';
local @ARGV = glob "*.rnx";

my $bad_data = 'CREF0001';

my $filename = $ARGV[0];  # initialize

# Replace $bad_data with this string
my $gps_name_string = substr($filename, 0, 4) . ' 'x4;

while (<>) { 
    if ($filename ne $ARGV) {  # a new file
        $filename = $ARGV;
        $gps_name_string = substr($filename, 0, 4) . ' 'x4;
    }

    s/$bad_data/$gps_name_string/g;

    print;
}

This uses the $ARGV variable, which holds the name of the file currently being read, to detect when the loop has started processing lines from the next file, so that the suitable replacement string can be built.
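To try the $ARGV check without touching real data, something like the following sketch may help. It creates two sample files in a temporary directory (the file names and their one-line content are invented for the test) and runs the same loop over them. Unlike the program above, it starts with an empty $current, so the first line of the first file also triggers the "new file" branch.

```perl
use strict;
use warnings;
use Cwd qw(getcwd);
use File::Temp qw(tempdir);

# Work in a throwaway directory so no real data is touched.
my $start_dir = getcwd();
my $dir       = tempdir(CLEANUP => 1);
chdir $dir or die "Cannot chdir to $dir: $!";

# Hypothetical sample files, each containing the placeholder once.
my @names = ('GANS_demo.rnx', 'YONS_demo.rnx');
for my $name (@names) {
    open my $fh, '>', $name or die "Cannot write $name: $!";
    print {$fh} "marker CREF0001 end\n";
    close $fh or die "Cannot close $name: $!";
}

{
    local $^I   = '~';              # in-place edit, backups kept as name~
    local @ARGV = glob "*.rnx";
    my ($current, $replacement) = ('', '');

    while (<>) {
        if ($current ne $ARGV) {    # first line read from a new file
            $current     = $ARGV;
            $replacement = substr($current, 0, 4) . ' ' x 4;
        }
        s/CREF0001/$replacement/g;
        print;                      # goes to the file being edited
    }
}

# Read the edited files back and show them.
my %result;
for my $name (@names) {
    open my $fh, '<', $name or die "Cannot read $name: $!";
    $result{$name} = do { local $/; <$fh> };
    close $fh;
    print "$name: $result{$name}";
}

chdir $start_dir or die "Cannot chdir back to $start_dir: $!";
```

Each file should end up with its own four-letter prefix in place of CREF0001, and a name~ backup should sit next to it.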


I presume that there is a reason for using a local-ized @ARGV, and that is fine. I'd like to mention a couple of other options though:

  • Submit the glob on the command line, as progname *.rnx. This way @ARGV gets set (provided the shell expands the wildcard) and then while (<>) { } in the program processes lines from those files

  • Build the file list as you do, using glob, but then process files by names, not using <>

    use warnings;
    use strict;
    use Path::Tiny;  # for path()->edit_lines
    
    my $bad_data = 'CREF0001';
    
    my @files = glob "*.rnx";
    
    foreach my $file (@files) { 
        my $gps_name_string = substr($file, 0, 4) . ' 'x4;
        path($file)->edit_lines( sub { s/$bad_data/$gps_name_string/g } );
    }
    

    The edit_lines method applies the anonymous sub given as its argument, here containing just the substitution, to each line and rewrites the file. See Path::Tiny. Or one can open the files normally and iterate over lines as in the main text (except that now we know the filename).
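The last option, opening the files normally, could look roughly like this sketch. The helper name fix_file, the .tmp suffix, and the sample file are assumptions made for illustration, not part of the original answer. The helper writes the changed lines to a temporary file, keeps the original as file~, and then moves the new file into place.

```perl
use strict;
use warnings;
use File::Basename qw(basename);
use File::Copy qw(move);
use File::Temp qw(tempdir);

my $bad_data = 'CREF0001';

# Hypothetical helper: rewrite one file, keeping the original as "$file~".
sub fix_file {
    my ($file) = @_;

    # basename() drops any leading directory, so the four letters
    # come from the file name itself and not from the path.
    my $replacement = substr(basename($file), 0, 4) . ' ' x 4;
    my $tmp         = "$file.tmp";

    open my $in,  '<', $file or die "Cannot read $file: $!";
    open my $out, '>', $tmp  or die "Cannot write $tmp: $!";
    while (my $line = <$in>) {
        $line =~ s/\Q$bad_data\E/$replacement/g;
        print {$out} $line;
    }
    close $in;
    close $out or die "Cannot close $tmp: $!";

    move($file, "$file~") or die "Cannot back up $file: $!";
    move($tmp,  $file)    or die "Cannot replace $file: $!";
    return;
}

# Demonstration on a made-up file in a temporary directory.
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/GUMC_demo.rnx";
open my $fh, '>', $file or die "Cannot write $file: $!";
print {$fh} "id CREF0001 x\n";
close $fh or die "Cannot close $file: $!";

fix_file($file);

open $fh, '<', $file or die "Cannot read $file: $!";
my $content = do { local $/; <$fh> };
close $fh;
print $content;
```

Taking the basename first matters once the file list contains directory parts, since substr on a full path would pick up the first four characters of the path instead of the file name.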

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
