df.to_json puts the character \u00a0 in the JSON output: how do I remove it from a pandas DataFrame?

When I call df.to_json, the output contains the character \u00a0. How can I remove it from the pandas DataFrame?

Here is the JSON output:

[
    {
        "dx_code":"A000",
        "formatted_code":"A00.0",
        "valid_for_coding":"0.0",
        "short_desc":null,
        "long_desc":null,
        "list_id":"Chronic_Body_Sys",
        "option_id":"1",
        "title":"Infectious and parasitic\u00a0"
    },
    {
        "dx_code":"A00",
        "formatted_code":"A00",
        "valid_for_coding":0.0,
        "short_desc":"Cholera",
        "long_desc":"Cholera",
        "list_id":"Chronic_Body_System",
        "option_id":"1",
        "title":"Infectious and parasitic disease\u00a0"
    },
    {
        "dx_code":"A000",
        "formatted_code":"A00.0",
        "valid_for_coding":1.0,
        "short_desc":"Cholera due to Vibrio cholerae 01, biovar cholerae",
        "long_desc":"Cholera due to Vibrio cholerae 01, biovar cholerae",
        "list_id":"Chronic_Body_System",
        "option_id":"1",
        "title":"Infectious and parasitic disease\u00a0"
    },
    {
        "dx_code":"A001",
        "formatted_code":"A00.1",
        "valid_for_coding":1.0,
        "short_desc":"Cholera due to Vibrio cholerae 01, biovar eltor",
        "long_desc":"Cholera due to Vibrio cholerae 01, biovar eltor",
        "list_id":"Chronic_Body_System",
        "option_id":"1",
        "title":"Infectious and parasitic disease\u00a0"
    }
]

This is the code I used:

testdata.to_json('testfile.json',indent=4,orient='records')

As far as I can tell, the \u00a0 character is not present in the data, and I don't know how to remove it. Any suggestions? I am working on the DataFrame in a Jupyter notebook.

Solution 1:^[1]

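The \u00a0 escape is U+00A0, the no-break space. It looks like an ordinary space when the DataFrame is displayed, so it is easy to miss, but it is already in the title values; to_json does not add it. to_json escapes non-ASCII characters by default (force_ascii=True), which is why it shows up as \u00a0. Passing force_ascii=False writes the character itself instead of the escape; it does not turn it into a normal space or remove it. Either way, deserializing (loading) this JSON should work just fine, as Python knows how to handle the character.

TL;DR: it is the Unicode no-break space, and it comes from the data rather than from to_json. Use force_ascii=False if you only want the \u00a0 escape gone from the file (the character itself stays); reading this JSON should work just fine.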
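To make the difference concrete, here is a minimal sketch. The one-row frame is a hypothetical stand-in for the question's data, not the real testdata; it only assumes that the title value ends with a no-break space.

```python
import json
import pandas as pd

# Hypothetical one-row frame; the title ends with a no-break space (U+00A0).
df = pd.DataFrame({"title": ["Infectious and parasitic disease\u00a0"]})

# Default behaviour (force_ascii=True): non-ASCII characters are escaped.
escaped = df.to_json(orient="records")

# force_ascii=False: the character is written as-is, not removed.
raw = df.to_json(orient="records", force_ascii=False)

has_escape_default = "\\u00a0" in escaped  # the six-character escape sequence
has_char_raw = "\u00a0" in raw             # the actual U+00A0 character

# Both forms decode back to the same Python data.
same_after_load = json.loads(escaped) == json.loads(raw)
print(has_escape_default, has_char_raw, same_after_load)
```

In both cases the no-break space survives a round trip through json.loads, so the escape only matters if something downstream compares or displays the raw text.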

Solution 2:^[2]

You should be able to keep this character without issue.

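If you really want to remove it, remember that to_json returns a string when no path is given. With the default force_ascii=True, that string contains the six-character escape \u00a0 rather than the character itself, so match it with a raw string:

s = df.to_json().replace(r'\u00a0', '')

Saving to file:

with open('testfile.json', 'w') as f:
    # r'...' matches the literal backslash escape that to_json writes
    f.write(df.to_json(indent=4, orient='records').replace(r'\u00a0', ''))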
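An alternative to editing the JSON text is to clean the column before serializing. The following is only a sketch under assumptions: the frame below is a hypothetical stand-in for testdata, and it assumes the no-break space occurs only in the title column.

```python
import pandas as pd

# Hypothetical stand-in for the question's testdata frame.
testdata = pd.DataFrame({
    "dx_code": ["A00", "A000"],
    "title": ["Infectious and parasitic disease\u00a0"] * 2,
})

# Replace no-break spaces with ordinary spaces, then trim the ends.
testdata["title"] = (
    testdata["title"]
    .str.replace("\u00a0", " ", regex=False)
    .str.strip()
)

# The serialized output no longer contains the \u00a0 escape.
cleaned_json = testdata.to_json(orient="records", indent=4)
print(cleaned_json)
```

Cleaning the DataFrame keeps the fix in the data itself, so later exports (CSV, database writes) are free of the character as well.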

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution	Source
Solution 1	sami-amer
Solution 2
