How to get a list of strings from a list-like string that includes nan?
Here is a toy example. I have a string like this:
import numpy as np
z = str([np.nan, "ab", "abc"])
Printed, it looks like "[nan, 'ab', 'abc']", but what I have to process is z = str([np.nan, "ab", "abc"]).
From z, I want to get a list of strings excluding nan:
zz = ["ab", "abc"]
To be clear: z is the input (a string that looks list-like), zz is the wanted output (a list).
There is no problem if z doesn't contain nan; in that case ast.literal_eval(z) does the job, but with nan I get an error about a malformed node or string.
Note: np.nan doesn't have to be first.
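For reference, the failure can be reproduced with a minimal sketch like the one below. It uses float("nan") in place of np.nan so that it runs without NumPy (np.nan is a plain Python float, so both print as nan); the variable names clean, parsed_clean and error are only illustrative.

```python
import ast

# float("nan") stands in for np.nan; both appear as the bare name nan in the string.
z = str([float("nan"), "ab", "abc"])
clean = str(["ab", "abc"])

# Without nan, literal_eval parses the list-like string as expected.
parsed_clean = ast.literal_eval(clean)
print(parsed_clean)

# With nan, the bare name is not a literal, so literal_eval raises ValueError.
try:
    ast.literal_eval(z)
    error = None
except ValueError as exc:
    error = exc
print(type(error).__name__, error)
```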
Solution 1:[1]
What about:
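eval(z, {'nan': 'nan'})  # maps the name nan to the string 'nan'; if you can tolerate eval, then:
[i for i in eval(z, {'nan': 'nan'}) if i != 'nan']
Note that eval has security implications: it executes arbitrary expressions, so use it only on input you trust.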
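A possible variation, sketched under the assumption that the input is trusted: bind nan to a real float instead of the string 'nan' and keep only str items, so that a genuine 'nan' string in the list is not dropped. The helper name strings_from_listlike is hypothetical.

```python
def strings_from_listlike(text):
    """Hypothetical helper: parse a list-like string with eval, keep only str items."""
    # CAUTION: eval runs arbitrary expressions, so this is for trusted input only.
    # Emptying __builtins__ narrows what the expression can reach,
    # but it does NOT make eval safe.
    namespace = {"__builtins__": {}, "nan": float("nan")}
    values = eval(text, namespace)
    # nan is a float, so filtering on type removes it and keeps every string.
    return [v for v in values if isinstance(v, str)]


z = str([float("nan"), "ab", "nan", "abc"])
print(strings_from_listlike(z))  # ['ab', 'nan', 'abc']
```

Filtering by type rather than by comparing with 'nan' is a design choice here; it also drops any other non-string items, which may or may not be what you want.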
Solution 2:[2]
As I understand it, your goal is to parse CSV or similar.
If you want a trade-off solution that should work in most cases, you can use a regex to get rid of the "nan". It will alter strings that contain the substring nan followed by a comma, but this seems to be a reasonably unlikely edge case. It is worth exploring with your real data.
import re
from ast import literal_eval
z = str([np.nan, "ab", np.nan, "nan,", "abc", "x nan , y", "x nan y"])
literal_eval(re.sub(r'\bnan\s*,\s*', '', z))
output: ['ab', '', 'abc', 'x y', 'x nan y']
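As a rough sketch, the same pattern can be wrapped in a small helper (the names NAN_ITEM and drop_nan_items are made up for illustration). It also shows one more limitation worth checking against real data: because the pattern requires a comma after nan, a nan in the last position is left in place and literal_eval still fails.

```python
import re
from ast import literal_eval

# Same pattern as above: a bare nan followed by a comma and optional whitespace.
NAN_ITEM = re.compile(r"\bnan\s*,\s*")


def drop_nan_items(text):
    """Hypothetical helper: strip 'nan,' items from the string, then parse it."""
    return literal_eval(NAN_ITEM.sub("", text))


print(drop_nan_items("[nan, 'ab', nan, 'abc']"))

# A trailing nan has no comma after it, so it survives the substitution
# and literal_eval raises ValueError.
try:
    drop_nan_items("['ab', nan]")
    trailing_failed = False
except ValueError:
    trailing_failed = True
print("trailing nan fails:", trailing_failed)
```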
Solution 3:[3]
ast.literal_eval is suggested over eval exactly because it allows a very limited set of statements. As stated in the docs: "Safely evaluate an expression node or a string containing a Python literal or container display. The string or node provided may only consist of the following Python literal structures: strings, bytes, numbers, tuples, lists, dicts, sets, booleans, None and Ellipsis." np.nan is none of those so it cannot be evaluated.
There are a few ways to handle this.
- Remove nan by operating on the string before evaluating it. This might be problematic if you want to avoid also removing nan from inside the actual strings.
- NOT ADVISED - SECURITY RISKS - standard eval can handle this if you define a nan variable in the namespace.
- And finally, what I think is the best choice but also the hardest to implement: as explained here, you take the source code for ast, subclass it and reimplement literal_eval in such a way that it knows how to handle the nan string on its own.
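A lighter variation on the last option, offered as a sketch and not as the approach described in the linked answer: instead of reimplementing literal_eval, parse the string into a tree, replace the bare name nan with a float constant, and hand the tree to the stock ast.literal_eval (which accepts an expression node as well as a string). The names NanToConstant and parse_list_with_nan are hypothetical.

```python
import ast


class NanToConstant(ast.NodeTransformer):
    """Replace the bare name nan with a float('nan') constant node."""

    def visit_Name(self, node):
        if node.id == "nan":
            return ast.copy_location(ast.Constant(value=float("nan")), node)
        return node  # any other name is left alone and will be rejected later


def parse_list_with_nan(text):
    """Hypothetical helper: safely parse a list-like string and keep only strings."""
    tree = ast.parse(text, mode="eval")
    tree = NanToConstant().visit(tree)
    values = ast.literal_eval(tree)  # still refuses calls, other names, etc.
    return [v for v in values if isinstance(v, str)]


z = str([float("nan"), "ab", "x nan y", "nan,", "abc"])
print(parse_list_with_nan(z))  # ['ab', 'x nan y', 'nan,', 'abc']
```

Because the replacement happens on the syntax tree and not on the raw text, nan inside string items is untouched, and anything that is not a literal (for example a function call) still raises ValueError.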
Solution 4:[4]
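Use the filter() function on the list once it has been parsed (applied to the string z directly, it would iterate over its individual characters, all of which are str):
list(filter(lambda f: isinstance(f, str), [np.nan, "ab", "abc"]))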
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Jacek Błocki |
| Solution 2 | mozway |
| Solution 3 | matszwecja |
| Solution 4 | pedram |
