Unable to parse TAB in JSON files
I am running into a parsing problem when loading JSON files that seem to have the TAB character in them.
When I go to http://jsonlint.com/, and I enter the part with the TAB character:
{
"My_String": "Foo bar. Bar foo."
}
The validator complains with:
Parse error on line 2:
{ "My_String": "Foo bar. Bar foo."
------------------^
Expecting 'STRING', 'NUMBER', 'NULL', 'TRUE', 'FALSE', '{', '['
This is literally a copy/paste of the offending JSON text.
I have tried loading this file with json and simplejson without success. How can I load this properly? Should I just pre-process the file and replace TAB by \t or by a space? Or is there anything that I am missing here?
Update:
Here is also a problematic example in simplejson:
foo = '{"My_string": "Foo bar.\t Bar foo."}'
simplejson.loads(foo)
JSONDecodeError: Invalid control character '\t' at: line 1 column 24 (char 23)
Solution 1:[1]
Tabs are legal as delimiting whitespace outside of values, but not within strings. To get a tab inside a JSON string you need to use the sequence \t instead.
But beware multiple levels of interpretation. This Python string from your update:
foo = '{"My_string": "Foo bar.\t Bar foo."}'
is not valid JSON, because the Python interpreter turns that \t sequence into an actual tab character before the JSON processor ever sees it.
You can tell Python to put a literal \t in the string instead of a tab character by doubling the backslash:
foo = '{"My_string": "Foo bar.\\t Bar foo."}'
Or you can use the "raw" string syntax, which doesn't interpret any special backslash sequences:
foo = r'{"My_string": "Foo bar.\t Bar foo."}'
Either way, the JSON processor will see a string containing a backslash followed by a 't', rather than a string containing a tab.
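The two levels of interpretation can be checked directly. The following is a minimal sketch using the standard library json module (the question also mentions simplejson, which is assumed here to behave the same way for this case); the variable names are illustrative only.

```python
import json

# Python itself converts \t into a real tab before JSON parsing starts.
plain = '{"My_string": "Foo bar.\t Bar foo."}'
# Doubling the backslash leaves a backslash followed by 't' in the text.
doubled = '{"My_string": "Foo bar.\\t Bar foo."}'
# A raw string literal produces the same text as the doubled form.
raw = r'{"My_string": "Foo bar.\t Bar foo."}'

assert doubled == raw
assert "\t" in plain and "\t" not in raw

# The unescaped tab is rejected (JSONDecodeError is a ValueError subclass).
try:
    json.loads(plain)
    plain_ok = True
except ValueError:
    plain_ok = False

# The escaped form parses, and the decoded value contains a real tab.
value = json.loads(raw)["My_string"]
```

After parsing, value holds an actual tab character: the JSON decoder performs the second level of interpretation on the \t sequence.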
Solution 2:[2]
You can include tabs within values (rather than as whitespace) in JSON files by escaping them. Here's a working example with the json module in Python 2.7:
>>> import json
>>> obj = json.loads('{"MY_STRING": "Foo\\tBar"}')
>>> obj['MY_STRING']
u'Foo\tBar'
>>> print obj['MY_STRING']
Foo Bar
Leaving the '\t' unescaped, on the other hand, causes an error:
>>> json.loads('{"MY_STRING": "Foo\tBar"}')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/__init__.py", line 338, in loads
    return _default_decoder.decode(s)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/decoder.py", line 365, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/decoder.py", line 381, in raw_decode
    obj, end = self.scan_once(s, idx)
ValueError: Invalid control character at: line 1 column 19 (char 18)
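For files that already contain raw tabs inside string values, as in the original question, one possible alternative to pre-processing the text is the strict flag of the standard json decoder, which allows control characters inside strings when set to False. This is a sketch written for Python 3 with made-up sample text, not something proposed in the answers above; the input is still technically invalid JSON, so re-serializing it afterwards produces a properly escaped version.

```python
import json

# Sample text with a real tab inside the string value (invalid JSON).
text = '{"My_String": "Foo bar.\tBar foo."}'

# strict=False tells the decoder to accept control characters in strings.
obj = json.loads(text, strict=False)

# Serializing again escapes the tab, giving valid JSON text.
round_tripped = json.dumps(obj)
```

Here round_tripped contains the two-character sequence \t instead of a tab, so it loads with the default strict settings in any JSON parser.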
Solution 3:[3]
Just to share my experience:
I am using snakemake with a config file written in JSON. The file uses tabs for indentation, which is legal in JSON. Even so, I get this error message: snakemake.exceptions.WorkflowError: Config file is not valid JSON or YAML. I believe this is a bug in snakemake, but I could be wrong; please comment. After replacing all tabs with spaces, the error message went away.
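As a quick check of the claim that tabs are legal for indentation, the sketch below parses a tab-indented document with Python's standard json module. The keys and values are hypothetical, and this says nothing about how snakemake itself reads config files; it only shows that a conforming JSON parser accepts tabs between tokens.

```python
import json

# Tabs appear only as whitespace between tokens, never inside a string.
tab_indented = '{\n\t"samples": ["a", "b"],\n\t"threads": 4\n}'

config = json.loads(tab_indented)
```

If a file like this loads with json.loads but is rejected elsewhere, the tool reading it is probably not using a plain JSON parser for that file.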
Solution 4:[4]
I ran into the same kind of problem in a Node-RED flow:
flow.set("delimiter",'"\t"');
Error:
{ "status": "ERROR", "result": "Cannot parse config: String: 1: in value for key 'delimiter': JSON does not allow unescaped tab in quoted strings, use a backslash escape" }
Solution:
I doubled the backslash, writing \\t in the code:
flow.set("delimiter",'"\\t"');
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | |
| Solution 2 | mdml |
| Solution 3 | Kemin Zhou |
| Solution 4 | KARTHIKEYAN.A |
