Fastest way to split a concatenated string into a tuple and ignore empty strings
I have a concatenated string like this:
my_str = 'str1;str2;str3;'
and I would like to split it, convert the resulting list to a tuple, and drop any empty strings produced by the split (notice the trailing ';').
So far, I am doing this:
tuple(filter(None, my_str.split(';')))
Is there any more efficient (in terms of speed and space) way to do it?
Solution 1:[1]
That is a very reasonable way to do it. Some alternatives:
foo.strip(";").split(";") (if there won't be any empty slices inside the string)
[ x.strip() for x in foo.split(";") if x.strip() ] (to strip whitespace from each slice)
The "fastest" way to do this will depend on a lot of things... but you can easily experiment with ipython's %timeit:
In [1]: foo = "1;2;3;4;"
In [2]: %timeit foo.strip(";").split(";")
1000000 loops, best of 3: 1.03 us per loop
In [3]: %timeit filter(None, foo.split(';'))
1000000 loops, best of 3: 1.55 us per loop
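The same kind of measurement can be sketched outside IPython with the standard library's timeit module. This is a minimal sketch, not a benchmark: the candidates dictionary and the call count are illustrative choices, and each candidate is wrapped in tuple() because in Python 3 filter() returns a lazy iterator, so timing it bare would not measure the filtering itself.

```python
import timeit

foo = "1;2;3;4;"

# Candidate implementations (names are illustrative); each returns a tuple
# so that all of them do the same amount of work.
candidates = {
    "strip+split": lambda: tuple(foo.strip(";").split(";")),
    "filter": lambda: tuple(filter(None, foo.split(";"))),
    "genexpr": lambda: tuple(part for part in foo.split(";") if part),
}

for name, func in candidates.items():
    # 100_000 calls is an arbitrary choice; absolute timings vary by machine.
    seconds = timeit.timeit(func, number=100_000)
    print(f"{name}: {seconds:.4f} s for 100000 calls")
```

Results depend on the Python version and on the input, so it is worth rerunning this with strings that resemble the real data.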
Solution 2:[2]
How about this?
tuple(my_str.split(';')[:-1])
('str1', 'str2', 'str3')
You split the string at the ; character and pass all of the substrings (except the last one, the empty string) to tuple to create the result tuple.
Solution 3:[3]
If you only expect an empty string at the end, you can do:
a = 'str1;str2;str3;'
tuple(a.split(';')[:-1])
or
tuple(a[:-1].split(';'))
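The condition matters: both slicing variants remove the last element or character unconditionally. A small sketch of what happens when the input has no trailing ';' (the variable names here are illustrative):

```python
with_trailing = 'str1;str2;str3;'
without_trailing = 'str1;str2;str3'

# Slicing the list drops the last item, whether or not it is empty.
sliced_ok = tuple(with_trailing.split(';')[:-1])
sliced_lossy = tuple(without_trailing.split(';')[:-1])

# Slicing the string drops the last character, whether or not it is ';'.
chopped = tuple(without_trailing[:-1].split(';'))

print(sliced_ok)     # ('str1', 'str2', 'str3')
print(sliced_lossy)  # ('str1', 'str2')
print(chopped)       # ('str1', 'str2', 'str')
```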
Solution 4:[4]
Try tuple(my_str.split(';')[:-1])
Solution 5:[5]
Yes, that is quite a Pythonic way to do it. If you have a love for generator expressions, you could also replace the filter() with:
tuple(part for part in my_str.split(';') if part)
This has the benefit of allowing further processing on each part in-line.
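As a sketch of that in-line processing, the expression below strips whitespace from each part and upper-cases it while filtering. The sample input and the choice of upper() are only illustrative:

```python
my_str = ' str1 ; str2;;str3 ;'

# Filter on the stripped value so whitespace-only parts are dropped too,
# then transform each surviving part in the same expression.
parts = tuple(
    part.strip().upper()
    for part in my_str.split(';')
    if part.strip()
)

print(parts)  # ('STR1', 'STR2', 'STR3')
```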
It's interesting to note that the documentation for str.split() says:
... If sep is not specified or is None, any whitespace string is a separator and empty strings are removed from the result.
I wonder why this special case was done, without allowing it for other separators...
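A short sketch of the difference, using made-up sample strings. The last line shows a workaround that leans on the whitespace special case; it assumes the parts themselves never contain whitespace, so it is not a general replacement for filtering:

```python
spaced = '  str1   str2 str3 '

# No separator: runs of whitespace collapse and empty strings are removed.
no_sep = spaced.split()
print(no_sep)  # ['str1', 'str2', 'str3']

# Explicit separator: every occurrence splits, so empty strings remain.
explicit_sep = spaced.split(' ')
print(explicit_sep)

# Workaround (assumes no whitespace inside the parts).
my_str = 'str1;str2;str3;'
via_whitespace = tuple(my_str.replace(';', ' ').split())
print(via_whitespace)  # ('str1', 'str2', 'str3')
```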
Solution 6:[6]
Use split and then slicing:
my_str.split(';')[:-1]
or:
lis = [x for x in my_str.split(';') if x]
Solution 7:[7]
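If the number of items in your string is fixed, you could also destructure inline like this (strip the trailing ';' first, otherwise the split yields a fourth, empty item and the unpacking raises ValueError):
str1, str2, str3 = my_str.strip(";").split(";")
More on that here: https://blog.teclado.com/destructuring-in-python/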
Solution 8:[8]
I know this is an old question, but I just came upon this and saw that the top answer (David) doesn't return a tuple like OP requested. Although the solution works for the one example OP gave, the highest voted answer (Levon) strips the trailing semicolon with a substring, which would error on an empty string.
The most robust and pythonic solution is voithos' answer:
tuple(part for part in my_str.split(';') if part)
Here's my solution:
tuple(my_str.strip(';').split(';'))
It returns this when run against an empty string though:
('',)
So I'll be replacing mine with voithos' answer. Thanks voithos!
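To make the edge cases concrete, here is a small comparison of the two approaches. The helper name split_nonempty and the sample inputs are hypothetical, chosen only for this sketch:

```python
def split_nonempty(text, sep=';'):
    """Split text on sep and return the non-empty parts as a tuple."""
    return tuple(part for part in text.split(sep) if part)


def strip_then_split(text, sep=';'):
    """Strip leading/trailing sep, then split; keeps inner empty parts."""
    return tuple(text.strip(sep).split(sep))


for sample in ('str1;str2;str3;', 'str1;;str3', ';;', ''):
    print(repr(sample), split_nonempty(sample), strip_then_split(sample))
```

The filtering version returns an empty tuple for '' and ';;', while the strip-based version returns ('',) for both and keeps the empty middle part of 'str1;;str3'.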
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | David Wolever |
| Solution 2 | xxx |
| Solution 3 | exfizik |
| Solution 4 | googler |
| Solution 5 | voithos |
| Solution 6 | Ashwini Chaudhary |
| Solution 7 | Sonic Soul |
| Solution 8 | Zenon Anderson |
