Perl Command to find Sequence Match [closed]

Could anyone explain to me how exactly this Perl line is built up and how it works?

perl -lane'if ($_ =~ "sequence xy") {@arr=split("right part of sequence xy"); print $arr[1]}'


Solution 1:[1]

I think "works" might be overselling it. It is not a very well-written piece of code.

  • -l enables line-ending handling. It sets the output record separator $\ to the current value of $/ (a newline by default) and, when used with -n or -p, chomps each line of input. Basically a convenience for handling newlines.
  • -a is autosplit: it splits each line of input into the array @F, by default on whitespace. Sadly, it is unused in this particular code and could safely be removed.
  • -n wraps the code in an implicit while (<>) loop. This reads input from the named files or from stdin and executes the code once for each line.
  • -e marks what follows as the program text, here the quoted string attached to -lane.

As for the code, if ($_ =~ "sequence xy") should be written if (/sequence xy/): the $_ is implied, and this is a common Perl idiom. Also, the binding operator =~ should be followed by a match, substitution, or transliteration (m//, s///, tr///). See perldoc perlop. When it is given a plain string instead, Perl treats the string as a regex pattern and applies the match operator m//. The condition therefore checks whether the current line of input matches the pattern sequence xy.

If the match succeeds, the line is split. Again, the first argument to split should be a pattern, not a string. The documentation gives the form split /PATTERN/, EXPR, LIMIT, where the last two are optional, and split operates on $_ if EXPR is omitted. Normally split would be used like this: @arr = split /,/, $foo (this splits the string in variable $foo on commas). Here the whole phrase right part of sequence xy is used as the separator. The code then prints the second element of the resulting list. So if you had

Fooo bar baz right part of sequence xy hello world

It would print hello world (preceded by the space that followed the separator).

All the lines of the input or file that do not contain sequence xy will be ignored.

This can be improved to:

perl -lne'if (/sequence xy/) { print( (split /right part of sequence xy/)[1] ) }'

You can also write this code more verbosely, put it in a file, and run it as a command: perl foo.pl inputfile.txt, othercommand | perl foo.pl, or ./foo.pl input.txt. You would then use something like:

#!/usr/bin/perl
use strict;
use warnings;         # always use these
 
while (<>) {          # reading input from file or stdin
    if (/sequence xy/) {  # for lines matching this...
        chomp;            # remove newline
        my ($left, $right) = split /right part of sequence xy/;
                             # don't need an array to print one value
        print "$right\n";    # instead of -l print with a newline
    }
}

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1