Extract numbers and/or strings from a Python string using a regular expression
I have a string s that looks like the following:
s = "{[9 -9 '\\\\N' 28 '-2' '0.000' '\\\\N' '1.0000']\n]}"
How can I obtain a list l that contains the numbers and \\\\N, so that the list looks like the following:
l = [9, -9, '\\\\N', 28, -2, 0.000, '\\\\N', 1.0000]
I tried to use re.findall('[-]?\d+[.]?[\d]*', s) but it only extracts the numbers. How should I modify my regular expression to include \\\\N?
Solution 1:[1]
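You are almost there. Change your pattern to r"-?\d+(?:\.\d+)?|\\\\N". The first alternative matches an optionally negative number with an optional fractional part, and the second alternative matches the literal \\\\N.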
Test regex here: https://regex101.com/r/U3uEyQ/1
Python code:
import re
s = "{[9 -9 '\\\\N' 28 '-2' '0.000' '\\\\N' '1.0000']\n]}"
re.findall(r"-?\d+(?:\.\d+)?|\\\\N", s)
# ['9', '-9', '\\\\N', '28', '-2', '0.000', '\\\\N', '1.0000']
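Note that re.findall returns every match as a string, while the list in the question mixes ints, floats and strings. One possible way to get there is to let the pattern itself tell integers and decimals apart. This is only a sketch: the group names num and frac and the helper parse are made up for illustration and are not part of the original answer.

```python
import re

s = "{[9 -9 '\\\\N' 28 '-2' '0.000' '\\\\N' '1.0000']\n]}"

# "num" captures a whole number token; "frac" is set only when a
# fractional part is present. The second alternative matches the
# literal \\N marker and leaves both groups unset.
PATTERN = re.compile(r"(?P<num>-?\d+(?P<frac>\.\d+)?)|\\\\N")

def parse(text):
    """Return matched tokens as int, float, or the raw marker string."""
    values = []
    for match in PATTERN.finditer(text):
        token = match.group(0)
        if match.group("num") is None:
            values.append(token)          # the \\N marker stays a string
        elif match.group("frac"):
            values.append(float(token))   # has a decimal part
        else:
            values.append(int(token))     # plain (possibly negative) integer
    return values

l = parse(s)
print(l)
```

Floats do not keep trailing zeros, so 0.000 and 1.0000 come back as 0.0 and 1.0.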
Solution 2:[2]
Alternative solution without regex:
s = """{[9 -9 '\\\\N' 28 '-2' '0.000' '\\\\N' '1.0000']\n]}"""
def remove_chars(text):
    for ch in ['{', '}', '[', ']', '\n', '\'']:
        text = text.replace(ch, "")
    return text
result = remove_chars(s).split()
print(result)
Prints
['9', '-9', '\\\\N', '28', '-2', '0.000', '\\\\N', '1.0000']
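The same stripping step can also be written with str.translate, which deletes all unwanted characters in a single pass instead of one replace call per character. This is a minimal sketch of that alternative; the names DROP and tokens are chosen here for illustration.

```python
s = """{[9 -9 '\\\\N' 28 '-2' '0.000' '\\\\N' '1.0000']\n]}"""

# The third argument of str.maketrans lists characters to delete.
# The newline needs no special handling: split() with no arguments
# splits on any run of whitespace, including "\n".
DROP = str.maketrans("", "", "{}[]'")

tokens = s.translate(DROP).split()
print(tokens)
```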
If you would like to convert the numeric strings to int and float, extend the code with the following:
# convert str to int and float if possible
def is_digit(item):
    if item.isdigit():
        return int(item)
    else:
        try:
            return float(item)
        except ValueError:
            return item
result_types = [is_digit(x) for x in result]
print(result_types)
Prints
[9, -9.0, '\\\\N', 28, -2.0, 0.0, '\\\\N', 1.0]
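In the output above, -9 and -2 come back as floats because str.isdigit() returns False for strings with a leading minus sign, so those items fall through to float(). If negative integers should stay integers, as in the list the question asks for, one option is to try int() first and fall back to float(). The helper name to_number below is hypothetical, and the result list is repeated only so the sketch runs on its own.

```python
def to_number(item):
    """Try int first, then float; return the string unchanged if neither parses."""
    for cast in (int, float):
        try:
            return cast(item)
        except ValueError:
            continue
    return item

# Same tokens as produced by remove_chars(s).split()
result = ['9', '-9', '\\\\N', '28', '-2', '0.000', '\\\\N', '1.0000']

result_types = [to_number(x) for x in result]
print(result_types)
# [9, -9, '\\\\N', 28, -2, 0.0, '\\\\N', 1.0]
```

Keep in mind that float() also accepts strings such as "nan" and "inf", so this conversion is only safe when such tokens cannot appear in the input or should be treated as numbers.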
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | anotherGatsby |
| Solution 2 | |
