Is such type conversion of modified `$1` from `strnum` to `string` a bug or a feature in GNU Awk?

$ awk --version
GNU Awk 5.0.1, API: 2.0 (GNU MPFR 4.0.2, GNU MP 6.2.0)
Copyright (C) 1989, 1991-2019 Free Software Foundation.

I ran the following three similar commands, which try to use $1 and $2 as integers. In each of them I use sub() in Awk to strip the leading non-numeric character @.

However, if sub() operates on $1 specifically instead of on the whole $0, the result is not treated as an integer afterwards.

And if sub() finds no match in $1, the conversion works fine again:

$ echo @101 9 | awk '{sub(/^@/, "", $0); print "("$2" < "$1") is " ($2 < $1)}'
(9 < 101) is 1

$ echo @101 9 | awk '{sub(/^@/, "", $1); print "("$2" < "$1") is " ($2 < $1)}'
(9 < 101) is 0

$ echo  101 9 | awk '{sub(/^@/, "", $1); print "("$2" < "$1") is " ($2 < $1)}'
(9 < 101) is 1

Hence I am not sure whether this is a bug or the expected behavior. If it is expected, I would like to find out the reason behind it.

I expected the 2nd case to give the same result as the 1st and the 3rd case.


Update 1:

I added type dumping:

$ cat dump-args.awk

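# Print a label followed by the gawk type of $0, $1 and $2.
# The label is passed as an argument, not used as the format string,
# so a "%" in it cannot break the output.
function dump(text) {
    printf "%s, $0 is %s, $1 is %s, $2 is %s\n", text, typeof($0), typeof($1), typeof($2)
}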

$ echo @101 9 | awk '@include "dump-args.awk"; { dump("Initially"); sub(/^@/, "", $0); dump("After sub"); print "("$1" > "$2") is " ($1 > $2)}'
Initially, $0 is string, $1 is string, $2 is strnum
After sub, $0 is string, $1 is strnum, $2 is strnum
(101 > 9) is 1

$ echo @101 9 | awk '@include "dump-args.awk"; { dump("Initially"); sub(/^@/, "", $1); dump("After sub"); print "("$1" > "$2") is " ($1 > $2)}'
Initially, $0 is string, $1 is string, $2 is strnum
After sub, $0 is string, $1 is string, $2 is strnum
(101 > 9) is 0

$ echo  101 9 | awk '@include "dump-args.awk"; { dump("Initially"); sub(/^@/, "", $1); dump("After sub"); print "("$1" > "$2") is " ($1 > $2)}'
Initially, $0 is string, $1 is strnum, $2 is strnum
After sub, $0 is string, $1 is strnum, $2 is strnum
(101 > 9) is 1

Thanks to some comments and this info, it is now clearer when the type of $1 may change and when it gets fixed. But...


Update 2:

Most explanations don't highlight the following difference, which I just found while reducing the test case:

$ echo @101 9 | awk '{ sub(/^@/, "", $1); print ($1 > $2)}'
0

$ echo  @91 9 | awk '{ sub(/^@/, "", $1); print ($1 > $2)}'
1

The types are exactly the same as with @101:

$ echo  @91 9 | awk '@include "dump-args.awk"; { dump("Initially"); sub(/^@/, "", $1); dump("After sub"); print "("$1" > "$2") is " ($1 > $2)}'
Initially, $0 is string, $1 is string, $2 is strnum
After sub, $0 is string, $1 is string, $2 is strnum
(91 > 9) is 1


Solution 1:[1]

This behavior is a feature. For example,

echo 20 101 9 | awk '{sub(/20/, "", $0); print $1}'

prints

101

because awk re-splits the record into fields when $0 is changed. By contrast,

echo 20 101 9 | awk '{sub(/20/, "", $1); print $1}'

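prints an empty line: the match is deleted from $1, so $1 now contains an empty string, and changing a single field does not re-split the record. In your example, the result that sub() stores in $1 is a plain string, so $1 is no longer treated as a number: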

echo @101 9 | awk '{sub(/^@/, "", $1); print typeof($1)}'
echo @101 9 | awk '{sub(/^@/, "", $0); print typeof($1)}'
echo @101 9 | awk '{sub(/^@/, "", $1); $0=$0; print typeof($1)}'

In the last command, $0=$0 makes awk re-split the record. These commands print:

string
strnum
strnum
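
This also explains the @101 versus @91 difference from Update 2: when either operand is a string, awk compares the two values as strings, character by character, so "101" sorts before "9" while "91" sorts after it. A minimal sketch, assuming any POSIX-compatible awk (nothing here is gawk-specific); the variable names a and b are only for illustration, and adding 0 is one common way to force a numeric comparison, not the only one:

```shell
# String comparison goes character by character: "1" sorts before "9",
# so "101" > "9" is false, while "91" > "9" is true ("9" is a prefix of "91").
awk 'BEGIN { a = ("101" > "9"); b = ("91" > "9"); print a, b }'

# Adding 0 forces both operands to numbers, however $1 was produced.
echo @101 9 | awk '{ sub(/^@/, "", $1); print ($1+0 > $2+0) }'

# Reassigning $0 re-splits the record, so $1 is a field read from input again
# and the comparison becomes numeric.
echo @101 9 | awk '{ sub(/^@/, "", $1); $0 = $0; print ($1 > $2) }'
```

The first command prints 0 1, and the other two print 1. The +0 form is probably the clearer choice when only one comparison needs fixing, since $0=$0 re-splits every field.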

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1