Unable to parse TAB in JSON files

I am running into a parsing problem when loading JSON files that seem to have the TAB character in them.

When I go to http://jsonlint.com/, and I enter the part with the TAB character:

{
    "My_String": "Foo bar.  Bar foo."
}

The validator complains with:

Parse error on line 2:
{    "My_String": "Foo bar. Bar foo."
------------------^
Expecting 'STRING', 'NUMBER', 'NULL', 'TRUE', 'FALSE', '{', '['

This is literally a copy/paste of the offending JSON text.

I have tried loading this file with json and simplejson without success. How can I load this properly? Should I just pre-process the file and replace TAB by \t or by a space? Or is there anything that I am missing here?

Update:

Here is also a problematic example in simplejson:

foo = '{"My_string": "Foo bar.\t Bar foo."}'
simplejson.loads(foo)

JSONDecodeError: Invalid control character '\t' at: line 1 column 24 (char 23)


Solution 1:[1]

Tabs are legal as delimiting whitespace outside of values, but not within strings. To get a tab inside a JSON string you need to use the sequence \t instead.
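A minimal sketch of that distinction, assuming Python 3 and the standard json module (the sample documents are made up for illustration). The tab is built with chr(9) so that Python's own backslash handling does not get in the way:

```python
import json

TAB = chr(9)  # the actual tab character

# Tabs between tokens count as whitespace and are accepted.
outside = '{' + TAB + '"My_String":' + TAB + '"Foo bar."' + TAB + '}'
parsed_outside = json.loads(outside)

# A raw tab inside a quoted string is rejected by default.
inside = '{"My_String": "Foo bar.' + TAB + 'Bar foo."}'
try:
    json.loads(inside)
    inside_error = None
except ValueError as exc:  # json.JSONDecodeError is a subclass of ValueError
    inside_error = exc

print(parsed_outside)
print(inside_error)
```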

But beware multiple levels of interpretation. This Python string from your update:

foo = '{"My_string": "Foo bar.\t Bar foo."}'

is not valid JSON, because the Python interpreter turns that \t sequence into an actual tab character before the JSON processor ever sees it.

You can tell Python to put a literal \t in the string instead of a tab character by doubling the backslash:

foo = '{"My_string": "Foo bar.\\t Bar foo."}'

Or you can use the "raw" string syntax, which doesn't interpret any special backslash sequences:

foo = r'{"My_string": "Foo bar.\t Bar foo."}'

Either way, the JSON processor will see a string containing a backslash followed by a 't', rather than a string containing a tab.
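The three spellings can be compared directly. This is a small sketch for Python 3 with the standard json module; the variable names are only illustrative:

```python
import json

plain = '{"My_string": "Foo bar.\t Bar foo."}'     # Python turns \t into a tab
doubled = '{"My_string": "Foo bar.\\t Bar foo."}'  # backslash and 't' both survive
raw = r'{"My_string": "Foo bar.\t Bar foo."}'      # backslash and 't' both survive

# The doubled-backslash and raw literals produce the same Python string.
same_text = doubled == raw

# Only the plain literal contains an actual tab character.
plain_has_tab = chr(9) in plain
raw_has_tab = chr(9) in raw

# The JSON parser decodes the two-character escape into a real tab.
value = json.loads(raw)["My_string"]

print(same_text, plain_has_tab, raw_has_tab)
print(repr(value))
```

The raw and doubled forms are one character longer than the plain one, because a backslash plus 't' takes the place of a single tab.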

Solution 2:[2]

You can include tabs within values (rather than as whitespace) in JSON files by escaping them. Here's a working example with the json module in Python 2.7:

>>> import json
>>> obj = json.loads('{"MY_STRING": "Foo\\tBar"}')
>>> obj['MY_STRING']
u'Foo\tBar'
>>> print obj['MY_STRING']
Foo    Bar

Leaving the tab unescaped causes an error:

>>> json.loads('{"MY_STRING": "Foo\tBar"}')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/__init__.py", line 338, in loads
    return _default_decoder.decode(s)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/decoder.py", line 365, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/decoder.py", line 381, in raw_decode
    obj, end = self.scan_once(s, idx)
ValueError: Invalid control character at: line 1 column 19 (char 18)
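If the file cannot be fixed where it is produced, the standard json module's decoder has a strict flag; setting it to False lets control characters such as tabs appear inside strings. The sketch below is written for Python 3, and the sample text is made up. Whether other parsers offer an equivalent option is not something this example shows:

```python
import json

# A JSON document with a raw (unescaped) tab inside the string value.
text = '{"MY_STRING": "Foo' + chr(9) + 'Bar"}'

# strict=False is forwarded to json.JSONDecoder and relaxes the
# control-character check inside strings.
obj = json.loads(text, strict=False)

# Serializing again writes the tab as the two-character escape \t,
# so the result is valid JSON for strict parsers too.
reencoded = json.dumps(obj)

print(repr(obj["MY_STRING"]))
print(reencoded)
```

Loading leniently once and writing the result back out is one way to repair such a file, rather than replacing tabs by hand.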

Solution 3:[3]

Just to share my experience:

I am using snakemake with a config file written in JSON. The file is indented with tabs, which is legal in JSON. Nevertheless, I get this error message: snakemake.exceptions.WorkflowError: Config file is not valid JSON or YAML. I believe this is a bug in snakemake, but I could be wrong; please comment. After replacing all tabs with spaces, the error message went away.
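One way to swap tab indentation for spaces without touching the contents of any string is to parse the file and serialize it again. This is only a sketch with Python 3's standard json module; the config keys are hypothetical, and it says nothing about why snakemake rejected the original file:

```python
import json

TAB = chr(9)

# Hypothetical config text, indented with tabs.
tabbed = '{\n' + TAB + '"samples": ["a", "b"],\n' + TAB + '"threads": 4\n}'

config = json.loads(tabbed)            # tab indentation parses as valid JSON
spaced = json.dumps(config, indent=4)  # re-emit with four-space indentation

print(spaced)
```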

Solution 4:[4]

In a Node-RED flow I faced the same type of problem:

flow.set("delimiter",'"\t"');

Error:

{ "status": "ERROR", "result": "Cannot parse config: String: 1: in value for key 'delimiter': JSON does not allow unescaped tab in quoted strings, use a backslash escape" }  

Solution:

I escaped the backslash, writing \\t instead of \t in the code:

flow.set("delimiter",'"\\t"');

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2 mdml
Solution 3 Kemin Zhou
Solution 4 KARTHIKEYAN.A