While using df.to_json, the character \u00a0 appears in the JSON output. How do I remove it from a pandas DataFrame?
Here is the JSON output:
[
{
"dx_code":"A000",
"formatted_code":"A00.0",
"valid_for_coding":"0.0",
"short_desc":null,
"long_desc":null,
"list_id":"Chronic_Body_Sys",
"option_id":"1",
"title":"Infectious and parasitic\u00a0"
},
{
"dx_code":"A00",
"formatted_code":"A00",
"valid_for_coding":0.0,
"short_desc":"Cholera",
"long_desc":"Cholera",
"list_id":"Chronic_Body_System",
"option_id":"1",
"title":"Infectious and parasitic disease\u00a0"
},
{
"dx_code":"A000",
"formatted_code":"A00.0",
"valid_for_coding":1.0,
"short_desc":"Cholera due to Vibrio cholerae 01, biovar cholerae",
"long_desc":"Cholera due to Vibrio cholerae 01, biovar cholerae",
"list_id":"Chronic_Body_System",
"option_id":"1",
"title":"Infectious and parasitic disease\u00a0"
},
{
"dx_code":"A001",
"formatted_code":"A00.1",
"valid_for_coding":1.0,
"short_desc":"Cholera due to Vibrio cholerae 01, biovar eltor",
"long_desc":"Cholera due to Vibrio cholerae 01, biovar eltor",
"list_id":"Chronic_Body_System",
"option_id":"1",
"title":"Infectious and parasitic disease\u00a0"
}
]
This is the code I used:
testdata.to_json('testfile.json',indent=4,orient='records')
I don't see this \u00a0 character in my data, and I don't know how to remove it. Any suggestions? I am working on a DataFrame in a Jupyter notebook.
Solution 1:[1]
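The \u00a0 sequence is the JSON escape for U+00A0, the no-break space, so the character is present in the data even though it looks like an ordinary space. to_json escapes non-ASCII characters by default (force_ascii=True); passing force_ascii=False writes the character itself instead of the escape, but it does not turn it into a normal space. Either way, deserializing (loading) this JSON should work just fine, as Python knows how to handle the character.
TL;DR
It is the Unicode no-break space, and it comes from the data itself. Pass force_ascii=False if you only want the escape sequence gone from the file; reading this JSON works fine either way.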
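To make the escaping behaviour concrete, here is a minimal sketch. The one-row DataFrame is a hypothetical stand-in for the question's testdata, and the variable names are illustrative only.

```python
import json
import pandas as pd

# Hypothetical one-row frame standing in for the question's testdata.
df = pd.DataFrame({"title": ["Infectious and parasitic disease\u00a0"]})

# The trailing character is already in the data: U+00A0, NO-BREAK SPACE.
has_nbsp = df["title"].str.contains("\u00a0", regex=False)

# Default force_ascii=True: non-ASCII characters are written as \uXXXX escapes.
escaped = df.to_json(orient="records")

# force_ascii=False: the character itself is written, with no escape.
raw = df.to_json(orient="records", force_ascii=False)

# Both forms decode to the same Python objects.
same = json.loads(escaped) == json.loads(raw)
```

Neither setting removes the character; they only change how it is written in the file.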
Solution 2:[2]
You should be able to keep this character without issue.
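If you really want to remove it, remember that to_json returns a string when no path is given, so you can use a simple replace. By default that string contains the six-character escape sequence rather than the character itself, so match it with a raw string:
s = df.to_json().replace(r'\u00a0', '')
Saving to a file:
with open('testfile.json', 'w') as f:
    f.write(df.to_json(indent=4, orient='records').replace(r'\u00a0', ''))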
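An alternative is to clean the column in the DataFrame before serializing, which is closer to what the question asks. This is a sketch under assumptions: the small frame below is made up to resemble the question's data, and replacing the no-break space with a regular space before stripping is one possible choice, not the only one.

```python
import json
import pandas as pd

# Hypothetical frame resembling the question's data.
df = pd.DataFrame({
    "dx_code": ["A00", "A000"],
    "short_desc": ["Cholera", None],
    "title": ["Infectious and parasitic disease\u00a0"] * 2,
})

# Replace every no-break space with a regular space, then trim the ends.
# The .str accessor leaves missing values untouched.
df["title"] = (
    df["title"]
    .str.replace("\u00a0", " ", regex=False)
    .str.strip()
)

# The serialized output no longer contains the \u00a0 escape.
out = df.to_json(orient="records", indent=4)
```

Cleaning the data first means the DataFrame and the JSON file stay consistent, whereas replacing on the serialized string only changes the file.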
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | sami-amer |
| Solution 2 |
