Extract numbers and/or strings from a Python string using a regular expression

I have a string s that looks like the following:

s = "{[9 -9 '\\\\N' 28 '-2' '0.000' '\\\\N' '1.0000']\n]}"

How can I obtain a list l that extracts the numbers and \\\\N, so that the list looks like the following:

l = [9, -9, '\\\\N', 28, -2, 0.000, '\\\\N', 1.0000]

I tried to use re.findall('[-]?\d+[.]?[\d]*', s) but it only extracts the numbers. How should I modify my regular expression to include \\\\N?



Solution 1:[1]

You are almost there. Change your pattern to r"-?\d+(?:\.\d+)?|\\\\N", which adds \\\\N as an alternative to the number pattern.

Test regex here: https://regex101.com/r/U3uEyQ/1

Python code:

import re

s = "{[9 -9 '\\\\N' 28 '-2' '0.000' '\\\\N' '1.0000']\n]}"
re.findall(r"-?\d+(?:\.\d+)?|\\\\N", s)
# ['9', '-9', '\\\\N', '28', '-2', '0.000', '\\\\N', '1.0000']

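Note that findall returns every match as a string, so the numeric tokens still need converting to get the list asked for in the question. The following is a minimal sketch of one way to do that, assuming the pattern above is used unchanged. The names TOKEN, MARKER, convert, and values are introduced here for illustration and are not part of the original answer. The pattern is written in verbose mode only so that each part can carry a comment.

```python
import re

# Same sample string as in the question.
s = "{[9 -9 '\\\\N' 28 '-2' '0.000' '\\\\N' '1.0000']\n]}"

# The pattern from this answer, split over several lines with re.VERBOSE.
TOKEN = re.compile(
    r"""
    -?\d+(?:\.\d+)?   # optional minus sign, digits, optional fractional part
    |                 # or
    \\\\N             # two literal backslashes followed by N
    """,
    re.VERBOSE,
)

MARKER = "\\\\N"  # the non-numeric token exactly as it appears in s


def convert(token):
    """Hypothetical helper: turn a matched token into int, float, or leave it as str."""
    if token == MARKER:
        return token
    # The pattern guarantees the token is numeric here; a dot means float.
    return float(token) if "." in token else int(token)


values = [convert(tok) for tok in TOKEN.findall(s)]
print(values)
```

Because the pattern only ever yields a number or the marker, convert does not need a try/except. If the pattern is widened to match other tokens, that assumption no longer holds.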

Solution 2:[2]

Alternative solution without regex:

s = """{[9 -9 '\\\\N' 28 '-2' '0.000' '\\\\N' '1.0000']\n]}"""

def remove_chars(text):
    for ch in ['{', '}', '[', ']', '\n', '\'']:
        text = text.replace(ch, "")
    return text

result = remove_chars(s).split()

print(result)

Prints

['9', '-9', '\\\\N', '28', '-2', '0.000', '\\\\N', '1.0000']
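The same stripping step can also be sketched with str.translate, which deletes all unwanted characters in one pass instead of calling replace once per character. This is an alternative under the assumption that only the characters {, }, [, ] and ' need removing. The name DROP is introduced here for illustration. The newline is left in place because split() already treats it as whitespace.

```python
s = """{[9 -9 '\\\\N' 28 '-2' '0.000' '\\\\N' '1.0000']\n]}"""

# Third argument of str.maketrans lists characters to delete.
DROP = str.maketrans("", "", "{}[]'")

# Remove brackets, braces and quotes, then split on any whitespace.
result = s.translate(DROP).split()
print(result)
```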

If you would like to convert the numeric strings to int and float, extend the code with the following:

# convert str to int or float if possible
def to_number(item):
    try:
        return int(item)      # also handles negative integers such as '-9'
    except ValueError:
        pass
    try:
        return float(item)
    except ValueError:
        return item           # not numeric: keep the string as-is

result_types = [to_number(x) for x in result]

print(result_types)

Prints

[9, -9, '\\\\N', 28, -2, 0.0, '\\\\N', 1.0]

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 anotherGatsby
Solution 2