'Plpython3u breaking encoding on execute
Environment:
- Debian 11 with pl_PL locale(ISO-8859-2)
- Postgresql 13 database created with ISO 8859-2 encoding
- Plpython3u(Python 3.9.2)
A simple example of the problem:
CREATE OR REPLACE FUNCTION public.test()
RETURNS TEXT
LANGUAGE plpython3u
AS $function$
tmp = plpy.execute("SELECT field FROM table WHERE filter_pointing_at_a_single_row;")[0]['field']
plpy.execute("UPDATE table SET field='"+tmp+"' WHERE filter_pointing_at_a_single_row;")
$function$;
When the content is 'łódź', running this once results in it being changed to 'łódĹş'.
We have tens if not hundreds of functions that perform operations like that. The original solution used python2 with default encoding changed to iso8859-2, but it's time to upgrade and such trick won't work in python3.
Other observations:
CREATE OR REPLACE FUNCTION public.test()
RETURNS TEXT
LANGUAGE plpython3u
AS $function$
tmp = plpy.execute("SELECT field FROM table WHERE filter_pointing_at_a_single_row;")[0]['field']
ret = plpy.execute("SELECT '"+tmp+"' AS \"A\" ;")
return ret[0]['A']
$function$;
In psql with correct client-encoding as well as in DBeaver which forces utf-8 it returns 'łódź'.
Meanwhile:
CREATE OR REPLACE FUNCTION public.test()
RETURNS TEXT
LANGUAGE plpython3u
AS $function$
tmp = plpy.execute("SELECT field FROM table WHERE filter_pointing_at_a_single_row;")[0]['field']
return tmp
$function$;
Returns 'łódź' both in DBeaver and psql.
Edit: I didn't mention what I need. We're looking for the best solution. We're considering moving the database to UTF-8, or rewriting all the functions that need it if we find other solution. There's also hope that some smart fix exists that would minimise the amount of work needed to solve this.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|
