Dynamically convert Python datatype
I have a use case where I am reading some data from an API call, but need to transform the data before inserting it into a database. The data comes in an integer format, and I need to save it as a string. The database does not offer a datatype conversion, so the conversion needs to happen in Python before inserting.
Within a config file I have something like:
config = {"convert_fields": ["payment", "cash_flow"], "type": "str"}
Then, within Python, I use the eval() function to look up the type to convert the fields to.
So the code ends up looking like:
data['field'] = eval(config['type'])(data['field'])
Does anyone have a better suggestion for how I can dynamically convert these values, ideally without storing the Python class type in a config file?
To add: sure, I could just call str(), but at some point there may be other fields to convert to types other than string. So I want the conversion to be dynamic, driven by whatever the config file defines for the fields that need converting.
Solution 1:[1]
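How about using getattr() on the builtins module? I feel that is a little better than exec()/eval() in this instance. (Inside an imported module, __builtins__ may be a plain dict rather than a module, so importing builtins explicitly is more reliable.)
import builtins

def cast_by_name(type_name, value):
    # Look the type up by name among Python's built-ins and call it.
    return getattr(builtins, type_name)(value)

print(cast_by_name("bool", 1))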
Should spit back:
True
You will likely want to include some support for exceptions and perhaps defaults but this should get you started.
@mistermiyagi points out a critical flaw: eval is, of course, a builtin as well. We might want to limit this to safe types:
import builtins

def cast_by_name(type_name, value):
    trusted_types = ["int", "float", "complex", "bool", "str"]  # others as needed
    if type_name in trusted_types:
        return getattr(builtins, type_name)(value)
    # Unknown or untrusted type names leave the value unchanged.
    return value

print(cast_by_name("bool", 1))
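As a rough sketch of the exception handling and defaults mentioned above, the trusted names can also live in an explicit dictionary that maps each config string to its callable, which removes the getattr() lookup entirely. The names TRUSTED_TYPES and the default parameter are made up for illustration, and the sample data values are hypothetical. Raising on an unknown type name, instead of returning the value unchanged, is a design choice of this sketch and not something taken from the answer.

```python
# Explicit whitelist: config name -> callable. Only these names are accepted.
TRUSTED_TYPES = {
    "int": int,
    "float": float,
    "complex": complex,
    "bool": bool,
    "str": str,
}


def cast_by_name(type_name, value, default=None):
    """Convert value with a whitelisted type; return default if conversion fails."""
    try:
        converter = TRUSTED_TYPES[type_name]
    except KeyError:
        # A typo or an unsafe name such as "eval" is rejected loudly.
        raise ValueError(f"unsupported type name: {type_name!r}") from None
    try:
        return converter(value)
    except (TypeError, ValueError):
        # The value itself cannot be converted, e.g. int("abc").
        return default


# Config shape taken from the question; the data values are hypothetical.
config = {"convert_fields": ["payment", "cash_flow"], "type": "str"}
data = {"payment": 100, "cash_flow": 250, "id": 7}

for field in config["convert_fields"]:
    data[field] = cast_by_name(config["type"], data[field])

print(data)
```

Only the fields listed in convert_fields are touched, so "id" stays an integer. Whether a failed conversion should fall back to a default or raise is something to decide per use case.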
Solution 2:[2]
Build up a conversion lookup dictionary in advance.
- Faster
- Easier to debug
import builtins

config = {
    "convert_fields": {
        "payment": "str",
        "cash_flow": "str",
        "customer_id": "int",
        "name": "name_it",
    }
}

def name_it(s: str):
    return s.capitalize()

data_in = dict(
    payment=101.00,
    customer_id=3,
    cash_flow=1,
    name="bill",
    city="london",
)

convert_functions = {
    # support builtins and custom functions
    fieldname: globals().get(funcname) or getattr(builtins, funcname)
    for fieldname, funcname in config["convert_fields"].items()
    if funcname not in {"eval"}
}
print(f"{convert_functions=}")

data_db = {
    # if no conversion is specified, use `str`
    fieldname: convert_functions.get(fieldname, str)(value)
    for fieldname, value in data_in.items()
}
print(f"{data_db=}")
Output:
convert_functions={'payment': <class 'str'>, 'cash_flow': <class 'str'>, 'customer_id': <class 'int'>, 'name': <function name_it at 0x10f0fbe20>}
data_db={'payment': '101.0', 'customer_id': 3, 'cash_flow': '1', 'name': 'Bill', 'city': 'london'}
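One possible variation, if looking names up in globals() feels too permissive: keep an explicit registry of converters and add custom functions to it with a small decorator. This is only a sketch. CONVERTERS and register are hypothetical names, not part of any library, and the config is a trimmed copy of the one above.

```python
# Registry of allowed converters, seeded with a few safe built-in types.
CONVERTERS = {"str": str, "int": int, "float": float}


def register(func):
    """Add a custom converter to the registry under its function name."""
    CONVERTERS[func.__name__] = func
    return func


@register
def name_it(s: str) -> str:
    return s.capitalize()


config = {"convert_fields": {"payment": "str", "customer_id": "int", "name": "name_it"}}

# Resolve names once, up front; an unknown name raises KeyError here,
# before any data has been processed.
convert_functions = {
    fieldname: CONVERTERS[funcname]
    for fieldname, funcname in config["convert_fields"].items()
}

data_in = {"payment": 101.00, "customer_id": 3, "name": "bill", "city": "london"}

data_db = {
    # Fields without a configured conversion fall back to str.
    fieldname: convert_functions.get(fieldname, str)(value)
    for fieldname, value in data_in.items()
}
print(data_db)
```

With this layout nothing outside CONVERTERS can be reached from the config, so there is no need to blacklist names like eval.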
If the config could be stored in code rather than in a JSON-style file, I'd look into Pydantic, though that is not exactly your problem space here:
# Written against Pydantic v1; v2 does not coerce numbers to str by default
# and renames .dict() to .model_dump().
from pydantic import BaseModel

class Data_DB(BaseModel):
    payment: str
    customer_id: int
    cash_flow: str
    # you'd need a custom validator to handle capitalization
    name: str
    city: str

pydata = Data_DB(**data_in)
print(f"{pydata=}")
print(pydata.dict())
Output:
pydata=Data_DB(payment='101.0', customer_id=3, cash_flow='1', name='bill', city='london')
{'payment': '101.0', 'customer_id': 3, 'cash_flow': '1', 'name': 'bill', 'city': 'london'}
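If adding a dependency is not an option, a similar "types declared in code" idea can be approximated with a standard-library dataclass. This is a minimal sketch, not Pydantic's API: it assumes every annotation is a plain callable type such as str or int, and the class name DataDB is made up for the example.

```python
from dataclasses import dataclass, fields


@dataclass
class DataDB:
    payment: str
    customer_id: int
    cash_flow: str
    name: str
    city: str

    def __post_init__(self):
        # Coerce each field to its annotated type. This assumes every
        # annotation is a plain class that can be called on the raw value.
        for f in fields(self):
            setattr(self, f.name, f.type(getattr(self, f.name)))
        # Custom rule that a bare type annotation cannot express.
        self.name = self.name.capitalize()


data_in = dict(payment=101.00, customer_id=3, cash_flow=1, name="bill", city="london")
record = DataDB(**data_in)
print(record)
```

Unlike Pydantic, this does no validation beyond what the type constructors themselves reject, so it only suits simple cases.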
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | |
| Solution 2 | |
