Can I replace Affx- with rs in a text file in bash?

I have a huge text file. I want to replace all strings that start with Affx- followed by some numbers (like Affx-74537382 or Affx-4374575) with rs and the same numbers (like rs74537382 or rs4374575). Is this possible with sed -i 's/Affx-/rs/'?

Since the file is so huge, I am not sure how to verify that the command is working correctly.



Solution 1:[1]

You can use

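sed -E 's/^Affx-([0-9]+)/rs\1/' file > tmp && mv tmp file

Details:

  • -E - POSIX ERE syntax enabled
  • ^ - start of a line
  • Affx- - literal text, including the hyphen
  • ([0-9]+) - Group 1 (\1 refers to the value in this group): one or more digits.

Because the hyphen sits outside the group, it is dropped from the result, so Affx-74537382 becomes rs74537382. The ^ anchor restricts the match to IDs at the start of a line. If the IDs can also appear mid-line, drop the ^ and add the g flag: sed -E 's/Affx-([0-9]+)/rs\1/g'.

See the online demo:

#!/bin/bash
s='Blah-1233455
Affx-74537382
Some line here
Affx-4374575
End of text 123456778.'

sed -E 's/^Affx-([0-9]+)/rs\1/' <<< "$s"

Output:

Blah-1233455
rs74537382
Some line here
rs4374575
End of text 123456778.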
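
To address the verification part of the question, one possible approach is to write the result to a separate file first and compare match counts before and after. The following is only a sketch: the temporary directory and the file names in.txt and out.txt are hypothetical stand-ins for the real file, and the "after" count is only meaningful if the original file has no lines that already start with rs followed by digits.

```shell
#!/bin/bash
# Hypothetical small sample standing in for the real (huge) file.
tmpdir=$(mktemp -d)
printf '%s\n' 'Blah-1233455' 'Affx-74537382' 'Some line here' 'Affx-4374575' > "$tmpdir/in.txt"

# Write to a new file so the original stays available for comparison.
sed -E 's/^Affx-([0-9]+)/rs\1/' "$tmpdir/in.txt" > "$tmpdir/out.txt"

# Lines that matched before, lines carrying the new prefix after,
# and lines that still match the old pattern (expected to be zero).
# grep -c exits non-zero when the count is 0, hence the "|| true".
before=$(grep -cE '^Affx-[0-9]+' "$tmpdir/in.txt" || true)
after=$(grep -cE '^rs[0-9]+' "$tmpdir/out.txt" || true)
leftover=$(grep -cE '^Affx-[0-9]+' "$tmpdir/out.txt" || true)
echo "before=$before after=$after leftover=$leftover"

# sed edits lines in place, so the total line count should not change.
lines_in=$(wc -l < "$tmpdir/in.txt")
lines_out=$(wc -l < "$tmpdir/out.txt")
echo "lines_in=$lines_in lines_out=$lines_out"

# Spot-check the first few changed lines.
diff "$tmpdir/in.txt" "$tmpdir/out.txt" | head

rm -rf "$tmpdir"
```

If before equals after, leftover is 0, and the line counts agree, the substitution most likely did what was intended, and the new file can then be moved over the original. On a huge file, grep -c and wc -l read the data as a stream, so they should not need to hold the whole file in memory.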

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution      Source
Solution 1    Wiktor Stribiżew