Using sed in a shell script to extract a substring at a fixed position and replace the original
I'm working with data in a text file, and I can't find a way with sed to select a substring at a fixed position and replace the field with it.
This is what I have:
X|001200000000000000000098765432|1234567890|TQ
This is what I need:
'X','00000098765432','1234567890','TQ'
The following sed command handles the quotes and delimiters, but it does not reduce the second field to the substring I need (00000098765432):
echo " X|001200000000000000000098765432|1234567890|TQ" | sed "s/ *//g;s/|/','/g;s/^/'/;s/$/'/"
Could you help me?
Solution 1:[1]
If you want to put the quotes in, I'd still use awk.
$: awk -F'|' 'BEGIN{q="\047"} {print q $1 q","q substr($2,17,14) q","q $3 q","q $4 q}' <<< "X|001200000000000000000098765432|1234567890|TQ"
'X','00000098765432','1234567890','TQ'
If you just want to use sed, note that you say above you want to remove 16 characters, but you are actually only removing 14.
$: sed -E "s/^(.)[|].{14}([^|]+)[|]([^|]+)[|]([^|]+)/'\1','\2','\3','\4'/" <<< "X|0012000000000000000098765432|1234567890|TQ"
'X','00000098765432','1234567890','TQ'
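For the input line as given in the question, where the second field is 30 characters long and the last 14 are wanted, the number of leading characters to skip would be 16. A minimal sketch under that assumption (the `line` and `result` variable names are only for illustration):

```shell
# Input line from the question; field 2 is assumed to be 30 characters long.
line='X|001200000000000000000098765432|1234567890|TQ'

# Skip the first 16 characters of field 2, keep the rest, and quote every field.
result=$(printf '%s\n' "$line" |
  sed -E "s/^([^|]*)[|].{16}([^|]*)[|]([^|]*)[|]([^|]*)$/'\1','\2','\3','\4'/")

printf '%s\n' "$result"
```

Anchoring the pattern with `^` and `$` means a line with a different number of fields is left unchanged rather than partially rewritten.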
Solution 2:[2]
Rather than sed, I would use awk for this.
echo "X|001200000000000000000098765432|1234567890|TQ" | awk 'BEGIN {FS="|";OFS=","} {print $1,substr($2,17,14),$3,$4}'
Gives output:
X,00000098765432,1234567890,TQ
Here is how it works:
FS = Field separator (in the input)
OFS = Output field separator (the way you want output to be delimited)
BEGIN -> think of it as the place where configurations are set. It runs only one time. So you are saying you want output to be comma delimited and input is pipe delimited.
substr($2,17,14) -> Take $2 (the second field; awk counts fields from 1) and apply substring to it. 17 is the starting character position, and 14 is the number of characters to take from that position onwards.
In my opinion, this is much more readable and maintainable than the sed version you have.
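To make the roles of the two substr arguments more visible, here is a small sketch that passes them in as awk variables. The names `start` and `len` are hypothetical and not part of the answer above; it also assigns to $2 instead of printing each field, which makes awk rebuild the line with OFS.

```shell
# Input line from the question.
line='X|001200000000000000000098765432|1234567890|TQ'

# start = first character to keep in field 2, len = how many characters to keep.
result=$(printf '%s\n' "$line" |
  awk -F'|' -v OFS=',' -v start=17 -v len=14 '{
    $2 = substr($2, start, len)   # assigning a field rebuilds $0 using OFS
    print
  }')

printf '%s\n' "$result"
```

Changing `start` or `len` on the command line is then enough to pick a different slice, without touching the awk program itself.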
Solution 3:[3]
Using sed
$ sed "s/|\(0[0-9]\{15\}\)\?/','/g;s/^\|$/'/g" input_file
'X','00000098765432','1234567890','TQ'
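Note that `\?` and `\|` inside a basic regular expression are GNU sed extensions. A sketch of the same idea using extended regular expressions (`-E`, supported by both GNU and BSD sed) might look like this, assuming the unwanted prefix is always a 0 followed by 15 more digits:

```shell
# Input line from the question.
line='X|001200000000000000000098765432|1234567890|TQ'

# Replace each pipe (plus the optional 16-digit prefix after it) with ',',
# then add the opening and closing quotes in two separate substitutions.
result=$(printf '%s\n' "$line" |
  sed -E "s/[|](0[0-9]{15})?/','/g; s/^/'/; s/\$/'/")

printf '%s\n' "$result"
```

Because the prefix is optional, the same substitution handles the pipes before the third and fourth fields, which start with other characters.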
Solution 4:[4]
Using any POSIX awk:
$ echo 'X|001200000000000000000098765432|1234567890|TQ' |
awk -F'|' -v OFS="','" -v q="'" '{sub(/.{16}/,"",$2); print q $0 q}'
'X','00000098765432','1234567890','TQ'
Solution 5:[5]
Not as elegant as I hoped for, but it gets the job done:
'X','00000098765432','1234567890','TQ'
# gawk profile, created Mon May 9 21:19:17 2022
# BEGIN rule(s)
'BEGIN {
1 _ = sprintf("%*s", (__ = +2)^++__+--__*++__,__--)
1 gsub(".", "[0-9]", _)
1 sub("$", "$", _)
1 FS = "[|]"
1 OFS = "\47,\47"
}
# Rule(s)
1 (NF *= NF == __*__) * sub(_, "|&", $__) * \
sub("^.*[|]", "", $__) * sub(".+", "\47&\47") }'
Tested and confirmed working on gnu gawk 5.1.1, mawk 1.3.4, mawk 1.9.9.6, and macosx nawk
— The 4Chan Teller
Solution 6:[6]
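awk -v del1="\047" \
    -v del2="," \
    -v start="3" \
    -v len="16" \
'{
  # drop len characters starting at position start (the first 16 of field 2)
  $0 = substr($0, 1, start-1) substr($0, start+len)
  # replace each pipe with quote, comma, quote
  gsub(/[|]/, del1 del2 del1)
  print del1 $0 del1
}' input_file
'X','00000098765432','1234567890','TQ'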
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | |
| Solution 2 | |
| Solution 3 | HatLess |
| Solution 4 | Ed Morton |
| Solution 5 | RARE Kpop Manifesto |
| Solution 6 | |
