Deriving pattern for runs of letters
I have a string "1. AGGCHRUSHCCKSGDSKCGGHCSG". I would like to find all the "G" characters in it and print a pattern like this: GG-G-GG-G
If there are no Gs in the string, it should print "No G found".
I have tried basic string, substring, and print operations in Python, but that's as far as I got. I can't find an Excel formula for this either. How can I generate this pattern?
Solution 1:[1]
You can use a regular expression to replace sequences of one or more non-"G" characters with a single dash, and then use .strip() to remove any leading or trailing dashes:
import re
data = "1. AGGCHRUSHCCKSGDSKCGGHCSG"
result = re.sub(r"[^G]+", r"-", data).strip("-")
if "G" in result:
    print(result)
else:
    print("No G found")
This outputs:
GG-G-GG-G
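As a cross-check on the substitution above, the same runs can also be collected directly with `re.findall` and joined with dashes. This is a minimal sketch, not part of the original answer; the function name `g_pattern` and its `letter` parameter are hypothetical, chosen only for illustration.

```python
import re


def g_pattern(text, letter="G"):
    """Return the runs of `letter` joined by dashes, or a fallback message."""
    # re.escape guards against letters that are regex metacharacters.
    runs = re.findall(f"{re.escape(letter)}+", text)
    if not runs:
        return f"No {letter} found"
    return "-".join(runs)


print(g_pattern("1. AGGCHRUSHCCKSGDSKCGGHCSG"))  # GG-G-GG-G
print(g_pattern("ABC"))                          # No G found
```

Collecting the runs first avoids the need for `.strip("-")`, since no leading or trailing dashes are ever produced.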
Solution 2:[2]
EDIT - With @PranavHosangadi's suggestions:
from itertools import groupby
string = "1. AGGCHRUSHCCKSGDSKCGGHCSG"
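# groupby yields runs of consecutive equal characters; keep only the runs of "G".
groups = ("".join(group) for key, group in groupby(string) if key == "G")
# An empty join means there were no Gs, so fall back to the message from the question.
print("-".join(groups) or "No G found")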
Output:
GG-G-GG-G
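To see why this works, it can help to look at what `groupby` yields before any filtering: with no key function it groups consecutive equal characters, so each run of "G" arrives as a single group. The sketch below uses a shortened input; the sample string `"AGGCHG"` and the names `runs` and `g_runs` are made up for illustration.

```python
from itertools import groupby

sample = "AGGCHG"  # hypothetical shortened input

# Each item pairs a character with the run of consecutive copies of it.
runs = [(key, "".join(group)) for key, group in groupby(sample)]
print(runs)
# [('A', 'A'), ('G', 'GG'), ('C', 'C'), ('H', 'H'), ('G', 'G')]

# Keeping only the runs whose key is "G" gives the pieces to join.
g_runs = [run for key, run in runs if key == "G"]
print("-".join(g_runs))  # GG-G
```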
Solution 3:[3]
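string1 = "1. AGGCHRUSHCCKSGDSKCGGHCSG"
string2 = ""
# Keep each "G" and write a dash in place of every other character.
# Note: this emits one dash per non-G character, so runs of dashes are not collapsed.
for char in string1:
    if char == 'G':
        string2 += char
    else:
        string2 += '-'
print(string2)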
like this?
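The loop above writes one dash for every non-G character, so its output contains long runs of dashes rather than GG-G-GG-G. One possible way to keep the loop-based approach while matching the requested pattern is to add a dash only when a new run of "G" starts after an earlier one. This is a sketch under that assumption, not the answerer's code; `collapse_runs` is a hypothetical helper name.

```python
def collapse_runs(text, letter="G"):
    """Build the dash-separated pattern of `letter` runs with a plain loop."""
    parts = []       # output characters collected so far
    in_run = False   # True while the previous character was `letter`
    for char in text:
        if char == letter:
            # A new run that follows an earlier one is separated by a dash.
            if not in_run and parts:
                parts.append("-")
            parts.append(char)
            in_run = True
        else:
            in_run = False
    return "".join(parts) if parts else f"No {letter} found"


print(collapse_runs("1. AGGCHRUSHCCKSGDSKCGGHCSG"))  # GG-G-GG-G
print(collapse_runs("ABC"))                          # No G found
```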
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | |
| Solution 2 | |
| Solution 3 | alexander |
