Awk: for-loop with array of numbers

How can I use an array of numbers in a for loop with awk?

I tried:

awk '{ for (i in [10, 12, 18]) print i }' myfile.txt

But I'm getting a syntax error.


Solution 1:[1]

The in operator works on arrays, and awk has no array literal such as [10, 12, 18], which is why your attempt is a syntax error. The way to create an array from a list of numbers like 10 12 18 is to split() a string containing those numbers.
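
As a minimal sketch of that idea applied directly to the loop in the question (it needs no input file, so everything happens in BEGIN): split() returns the number of elements, so a counted loop visits the values in their original order, whereas the order of for (j in a) is unspecified in awk. The shell variable out is only there so the result can be inspected.

```shell
# Build the array with split(), then loop over it by position.
out=$(awk 'BEGIN {
    n = split("10 12 18", a, " ")          # a[1]=10, a[2]=12, a[3]=18; n=3
    for (j = 1; j <= n; j++) print a[j]    # counted loop keeps the list order
}')
printf '%s\n' "$out"
```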

To have those numbers stored as values in an array a[] with indices 1 to 3:

awk 'BEGIN{FS=OFS="|"; split("10 12 18",a," ")}
     (FNR>2)  { for(j in a) { i=a[j]; k=$i OFS $(i+1); c[k]++; d[k] = i } }
     END{for (k in c) print d[k],k,c[k] }' myfile.txt

To have those numbers stored as indices of an array b[] with all values 0-or-null (same as an uninitialized scalar variable):

awk 'BEGIN{FS=OFS="|"; split("10 12 18",a," "); for (j in a) b[a[j]]}
     (FNR>2)  { for(i in b) { k=$i OFS $(i+1); c[k]++; d[k] = i } }
     END{for (k in c) print d[k],k,c[k] }' myfile.txt
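
Storing the numbers as indices also lets you use in as a membership test instead of looping. A small sketch, assuming the made-up list "2 4" and four lines of made-up input in place of myfile.txt:

```shell
# Print only the input lines whose line number is in the list.
out=$(printf 'a\nb\nc\nd\n' | awk '
BEGIN { split("2 4", a, " "); for (j in a) b[a[j]] }   # b now has indices 2 and 4
(FNR in b) { print FNR ": " $0 }                       # membership test, no loop
')
printf '%s\n' "$out"
```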

If you didn't want to create the array once up front for some reason (e.g. the list of numbers you want to split is created dynamically) then you could create it every time you need it, e.g.:

awk 'BEGIN{FS=OFS="|"}
     (FNR>2)  { split("10 12 18",a," "); for(j in a) { i=a[j]; k=$i OFS $(i+1); c[k]++; d[k] = i } }
     END{for (k in c) print d[k],k,c[k] }' myfile.txt

but obviously creating the same array multiple times is less efficient than creating it once.
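
If the list is only known at run time but stays the same for the whole run, one option is to pass it in with -v and still split it once in BEGIN. This is a sketch with a hypothetical shell variable named list; in practice its value might come from another command:

```shell
list="10 12 18"    # hypothetical run-time value
out=$(awk -v list="$list" 'BEGIN {
    n = split(list, a, " ")                   # split once, up front
    for (j = 1; j <= n; j++) sum += a[j]      # use the values
    print n, sum
}')
printf '%s\n' "$out"
```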

Solution 2:[2]

I made a very rough emulation of a for-loop that takes a list directly, without needing a separate function call beforehand to initialize it:

  • it tries to be as flexible as possible regarding what the delimiter(s) might be, so

    foreach("[CA=MX=JP=FR=SG=AUS=N.Z.]")
    

    would actually also work.

  • Although it is shown with a gawk profile below and uses the PROCINFO array, you don't need gawk for it to work:

    • it runs on mawk 1.3.4, mawk 1.9.9.6, GNU gawk 5.1.1, and Mac OS X
  • I just added UTF-8 support, which works regardless of your locale setting and whether you're using gawk, mawk, or nawk

    • emojis work fine too
  • that said, it cannot properly parse CSV, XML, or JSON input (I didn't have time to make it that fancy)

             list 1 ::  10
             list 1 ::  12
             list 1 ::  18.
             list 1 ::  27
             list 1 ::  36
             list 1 ::  pi
             list 1 ::  hubble
             list 1 ::  kelvins
             list 1 ::  euler
             list 1 ::  higgs
             list 1 ::  9.6
    
             list 2 ::  CA
             list 2 ::  MX
             list 2 ::  JP
             list 2 ::  FR
             list 2 ::  SG
             list 2 ::  AUS
             list 2 ::  N.Z.
    
     # gawk profile, created Mon May  9 22:06:03 2022
    
     # BEGIN rule(s)
    
     BEGIN {
     11      while (i = foreach("[10, 12, 18., 27, 36, pi, hubble, kelvins, euler, higgs, 9.6]")) {
     11          print "list 1 :: ", i
         }
      1      printf ("\n\n\n")
      7      while (i = foreach("[CA, MX, JP, FR, SG, AUS, N.Z., ]")) {
      7          print "list 2 :: ", i
         }
     }
    
    
     # Functions, listed alphabetically
    
     20  function foreach(_, __)
         {
     20      if (_=="") {
             return \
                  PROCINFO["foreach", "_"] = \
                  PROCINFO["foreach", "#"] = _
             }
     20      __ = "\032\016\024"
    
     20      if (_!= PROCINFO["foreach", "_"]) { # 2
      2      
                 PROCINFO["foreach","_"]=_
    
      2          gsub("^[ \t]*[[<{(][ \t]*"\
                      "|[ \t]*[]>})][ \t]*$"\
                      "|\\300|\\301","",_)
    
                 gsub("[^"(\
                             "\333\222"~"[^\333\222]"\
                           ? "\\200-\\277"\
                              "\\302-\\364"\
                           : ""       \
                           )"[:alnum:]"\
                                        \
                             "\302\200""""-\337\277" \
                          "\340\240\200-\355\237\277"  \
                           "\356\200\200-\357\277\277"   \
                        "\360\220\200\200-\364\217\277\277"\
                                                   \
                                     ".\42\47@$&%+-]+",__,_)
    
                 gsub("^"(__)"|"\
                         (__)"$","", _)
    
      2          PROCINFO["foreach","#"]=_
             }
     20      if ((_=PROCINFO["foreach","#"])=="") { # 2
      2          return _
             }
     18      sub((__) ".*$", "", _)
             sub("^[^"(__)"]+("(__)")?","",PROCINFO["foreach","#"])
     18      return _
         }
    
             list 2 ::  CA
             list 2 ::  MX
             list 2 ::  JP
             list 2 ::  FR
             list 2 ::  SG
             list 2 ::  눷
             list 2 ::  🤡
             list 2 ::  N.Z.
    
      while(i = foreach("[CA=MX=JP=FR=SG=\353\210\267"\
                        "=\360\237\244\241=N.Z.]")) { 
    
          print "list 2 :: ", i
      }
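
For comparison, the delimiter-agnostic part of the idea can be sketched much more simply. This is not the foreach() above: tokens() is a hypothetical helper that handles ASCII only (no UTF-8 support) and fills an ordinary array instead of acting as an iterator. It treats any run of characters other than letters, digits, and "." as a separator and drops the empty pieces left by the brackets.

```shell
out=$(awk '
# Hypothetical helper (not the foreach() shown above): fills arr[1..n] with the
# non-empty tokens of s and returns n.
function tokens(s, arr,    tmp, m, j, n) {
    m = split(s, tmp, /[^A-Za-z0-9.]+/)     # brackets and "=" all act as separators
    n = 0
    for (j = 1; j <= m; j++)
        if (tmp[j] != "") arr[++n] = tmp[j]  # skip empty leading/trailing pieces
    return n
}
BEGIN {
    n = tokens("[CA=MX=JP=FR=SG=AUS=N.Z.]", t)
    for (j = 1; j <= n; j++) print t[j]
}')
printf '%s\n' "$out"
```

Because the result is a plain array, it can be reused with a counted loop as many times as needed.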
    

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution      Source
Solution 1
Solution 2    RARE Kpop Manifesto