Using sed in a shell script to extract a substring at a fixed position and replace the field with it

I'm working with data in a text file and I can't find a way with sed to select a substring at a fixed position and replace the original field with it.

This is what I have:

X|001200000000000000000098765432|1234567890|TQ

This is what I need:

'X','00000098765432','1234567890','TQ'

The following sed command produces the quoted, comma-separated format, but it does not replace the second field with the substring I need (00000098765432):

echo " X|001200000000000000000098765432|1234567890|TQ" | sed "s/ *//g;s/|/','/g;s/^/'/;s/$/'/"

Could you help me?



Solution 1:[1]

If you want to put the quotes in, I'd still use awk.

$: awk -F'|' 'BEGIN{q="\047"} {print  q $1 q","q substr($2,17,14) q","q $3 q","q $4 q}' <<< "X|001200000000000000000098765432|1234567890|TQ"
'X','00000098765432','1234567890','TQ'

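If you just want to use sed, note that the number of characters to skip depends on the width of the second field. The sample below has a 28-character second field, so it skips 14; the 30-character field in the question needs a skip of 16.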

$: sed -E "s/^(.)[|].{14}([^|]+)[|]([^|]+)[|]([^|]+)/'\1','\2','\3','\4'/" <<< "X|0012000000000000000098765432|1234567890|TQ"
'X','00000098765432','1234567890','TQ'
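
Applied to the exact line from the question, the same expression can be sketched as follows. The function name quote_fields is hypothetical (it is not part of the original answer), and the skip of 16 assumes the second field is always 30 characters wide with the wanted 14 at the end.

```shell
# Hypothetical helper: reads pipe-delimited lines on stdin.
quote_fields() {
  # (.)      -> field 1, assumed to be a single character
  # .{16}    -> the 16 leading characters of field 2, which are dropped
  # ([^|]+)  -> the remainder of field 2, then fields 3 and 4
  sed -E "s/^(.)[|].{16}([^|]+)[|]([^|]+)[|]([^|]+)/'\1','\2','\3','\4'/"
}

printf '%s\n' 'X|001200000000000000000098765432|1234567890|TQ' | quote_fields
```

If the field width can vary, a fixed skip like this will not be reliable, and taking the substring by position in awk may be easier to adjust.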

Solution 2:[2]

Rather than sed, I would use awk for this.

echo "X|001200000000000000000098765432|1234567890|TQ" | awk 'BEGIN {FS="|";OFS=","} {print $1,substr($2,17,14),$3,$4}'

Gives output:

X,00000098765432,1234567890,TQ

Here is how it works:

FS = Field separator (in the input)

OFS = Output field separator (the way you want output to be delimited)

BEGIN -> think of it as the place where configurations are set. It runs only one time. So you are saying you want output to be comma delimited and input is pipe delimited.

substr($2,17,14) -> Take $2 (i.e. the second field; awk begins counting from 1) and apply substring to it. 17 is the starting character position and 14 is the number of characters to take from that position onwards.

In my opinion, this is much more readable and maintainable than the sed version you have.
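
To make the 1-based indexing of substr concrete, here is a small sketch. The first command uses a made-up string; the function name second_field_tail is hypothetical and only wraps the substr call from the command above.

```shell
# substr(s, start, len) is 1-based: start=3, len=4 picks characters 3 to 6.
printf '%s\n' 'abcdefghij' | awk '{ print substr($0, 3, 4) }'

# Hypothetical helper: print characters 17 to 30 of the second field.
second_field_tail() {
  awk 'BEGIN { FS = "|" } { print substr($2, 17, 14) }'
}

printf '%s\n' 'X|001200000000000000000098765432|1234567890|TQ' | second_field_tail
```

The first command prints cdef and the second prints 00000098765432.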

Solution 3:[3]

Using GNU sed (\? and \| in a basic regular expression are GNU extensions)

$ sed "s/|\(0[0-9]\{15\}\)\?/','/g;s/^\|$/'/g" input_file
'X','00000098765432','1234567890','TQ'
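
Because \? and \| are GNU extensions to basic regular expressions, a variant limited to POSIX syntax may be useful on other systems. The sketch below extends the command from the question with one extra substitution. The function name quote_fields_posix is made up for illustration, and it assumes exactly 16 characters are to be dropped from the start of the second field.

```shell
# Hypothetical helper: reads pipe-delimited lines on stdin.
quote_fields_posix() {
  # s/ *//g          remove spaces (as in the command from the question)
  # s/|.\{16\}/|/    drop the 16 characters that follow the first pipe
  # s/|/','/g        turn every pipe into ','
  # s/^/'/ ; s/$/'/  add the opening and closing quote
  sed "s/ *//g;s/|.\{16\}/|/;s/|/','/g;s/^/'/;s/\$/'/"
}

printf '%s\n' ' X|001200000000000000000098765432|1234567890|TQ' | quote_fields_posix
```

The order matters: the 16 characters have to be removed while the pipe is still there to anchor on, before the pipes are rewritten as quoted commas.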

Solution 4:[4]

Using any POSIX awk:

$ echo 'X|001200000000000000000098765432|1234567890|TQ' |
awk -F'|' -v OFS="','" -v q="'" '{sub(/.{16}/,"",$2); print q $0 q}'
'X','00000098765432','1234567890','TQ'

Solution 5:[5]

Not as elegant as I hoped for, but it gets the job done:

'X','00000098765432','1234567890','TQ'

    # gawk profile, created Mon May  9 21:19:17 2022
    # BEGIN rule(s)

    'BEGIN {
     1     _ = sprintf("%*s", (__ = +2)^++__+--__*++__,__--)
     1            gsub(".", "[0-9]", _)
     1             sub("$",     "$", _)
     1    FS = "[|]"
     1   OFS = "\47,\47"
    }

    # Rule(s)

     1     (NF *= NF == __*__) * sub(_,  "|&",   $__) * \
        sub("^.*[|]", "", $__) * sub(".+", "\47&\47")    }'

Tested and confirmed working on gnu gawk 5.1.1, mawk 1.3.4, mawk 1.9.9.6, and macosx nawk
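
Read step by step, the profile above appears to build a regex of fourteen [0-9] classes anchored at the end of the string, use it to cut the second field down to its last 14 digits, and then wrap the rebuilt record in quotes. A simplified sketch of that idea follows. It is not the author's code: it skips the check that the last 14 characters are digits, and the function name last14 is hypothetical.

```shell
# Hypothetical helper: reads pipe-delimited lines on stdin.
last14() {
  awk 'BEGIN { FS = "[|]"; OFS = "\047,\047" }   # \047 is a single quote
       NF == 4 {
         $2 = substr($2, length($2) - 13)        # keep the last 14 characters
         print "\047" $0 "\047"                  # $0 was rebuilt using OFS
       }'
}

printf '%s\n' 'X|001200000000000000000098765432|1234567890|TQ' | last14
```

Assigning to $2 makes awk rebuild the record with OFS between the fields, which is what turns the pipes into quoted commas.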

The 4Chan Teller

Solution 6:[6]

awk -v del1="\047" \
    -v del2="," \
    -v start="3" \
    -v len="16" \
    '{
         # drop len characters beginning at position start (the head of field 2)
         $0 = substr($0, 1, start-1) substr($0, start+len);
         gsub(/[|]/, del1 del2 del1);
         print del1 $0 del1
    }' input_file

'X','00000098765432','1234567890','TQ'

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2
Solution 3 HatLess
Solution 4 Ed Morton
Solution 5 RARE Kpop Manifesto
Solution 6