Dynamically convert Python datatype

I have a use case where I am reading some data from an API call, but need to transform the data before inserting it into a database. The data comes in an integer format, and I need to save it as a string. The database does not offer datatype conversion, so the conversion needs to happen in Python before inserting.

Within a config file I have something like:

config = {"convert_fields": ["payment", "cash_flow"], "type": "str"}

Then within Python I am using the eval() function to look up the type to convert the fields to.

So the code ends up being like: data['field'] = eval(config['type'])(data['field'])

Does anyone have a better suggestion for how I can dynamically change these values, maybe without storing the Python class type within a config file?

To add: sure, I could just call str(), but other fields may need converting to non-string types at some point. So I want it to be dynamic, driven by whatever the config file defines for the fields that require conversion.



Solution 1:[1]

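How about using getattr() with the builtins module? I feel that is a little better than exec()/eval() in this instance. (Import builtins rather than relying on __builtins__, which is a CPython implementation detail and is a dict, not a module, outside of __main__.)

import builtins

def cast_by_name(type_name, value):
    return getattr(builtins, type_name)(value)
print(cast_by_name("bool", 1))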

Should spit back:

True

You will likely want to include some support for exceptions and perhaps defaults but this should get you started.

@mistermiyagi points out a critical flaw: eval is, of course, a builtin as well. We might want to limit this to safe types:

import builtins

def cast_by_name(type_name, value):
    trusted_types = ["int", "float", "complex", "bool", "str"]  # others as needed
    if type_name in trusted_types:
        return getattr(builtins, type_name)(value)
    return value  # untrusted type names leave the value unchanged
print(cast_by_name("bool", 1))
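
The "support for exceptions and perhaps defaults" mentioned above could look roughly like the sketch below. It swaps the getattr() lookup for an explicit dictionary of allowed callables, so nothing outside the dictionary is reachable from the config. The CASTERS name, the default parameter, and the choice to raise on unknown type names are all assumptions for illustration, not part of the original answer.

```python
# Hypothetical registry: only the callables listed here can be selected by name.
CASTERS = {"int": int, "float": float, "complex": complex, "bool": bool, "str": str}

def cast_by_name(type_name, value, default=None):
    """Cast value using the caster registered under type_name.

    Raises KeyError for an unregistered name; returns `default`
    when the cast itself fails.
    """
    caster = CASTERS[type_name]  # unknown names fail loudly, so config typos surface early
    try:
        return caster(value)
    except (TypeError, ValueError):
        return default

print(cast_by_name("str", 42))        # '42'
print(cast_by_name("int", "abc", 0))  # 0
```

Whether an unknown type name should raise or silently return the value unchanged (as the version above does) is a design choice; raising tends to make a misspelled config entry easier to spot.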

Solution 2:[2]

Build up a conversion lookup dictionary in advance.

  • Faster
  • Easier to debug

config = {"convert_fields": 
    {"payment" : "str", "cash_flow" : "str", "customer_id" : "int", "name" : "name_it"}
}

def name_it(s : str):
    return s.capitalize()

data_in = dict(
    payment = 101.00,
    customer_id = 3,
    cash_flow = 1,
    name = "bill",
    city = "london"
)

import builtins

convert_functions = {
    # support builtins and custom functions
    fieldname : globals().get(funcname) or getattr(builtins, funcname)
    for fieldname, funcname in config["convert_fields"].items()
    if funcname not in {"eval"}
}

print(f"{convert_functions=}")

data_db = {
    fieldname : 
    #if no conversion is specified, use `str`
    convert_functions.get(fieldname, str)(value) 
    for fieldname, value in data_in.items()
    }

print(f"{data_db=}")

Output:

convert_functions={'payment': <class 'str'>, 'cash_flow': <class 'str'>, 'customer_id': <class 'int'>, 'name': <function name_it at 0x10f0fbe20>}
data_db={'payment': '101.0', 'customer_id': 3, 'cash_flow': '1', 'name': 'Bill', 'city': 'london'}
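
As a variation on the lookup above, here is a minimal sketch that replaces the globals()/builtins lookup with an explicit registry and reads the config from JSON text. The CONVERTERS and build_converters names are hypothetical, and two behaviors differ from the code above by assumption: unknown converter names raise when the lookup is built, and fields with no configured converter pass through unchanged instead of defaulting to str.

```python
import json

def name_it(s: str) -> str:
    return s.capitalize()

# Hypothetical registry: the only converters a config entry may refer to.
CONVERTERS = {"str": str, "int": int, "float": float, "name_it": name_it}

# Stand-in for the contents of a JSON config file.
config = json.loads(
    '{"convert_fields": {"payment": "str", "cash_flow": "str",'
    ' "customer_id": "int", "name": "name_it"}}'
)

def build_converters(convert_fields):
    """Map each field name to its converter, rejecting unregistered names."""
    unknown = set(convert_fields.values()) - set(CONVERTERS)
    if unknown:
        raise ValueError(f"unknown converter(s) in config: {sorted(unknown)}")
    return {field: CONVERTERS[name] for field, name in convert_fields.items()}

convert_functions = build_converters(config["convert_fields"])

data_in = {"payment": 101.00, "customer_id": 3, "cash_flow": 1,
           "name": "bill", "city": "london"}

# Assumption: fields without a configured converter are left as they are.
data_db = {
    field: convert_functions.get(field, lambda v: v)(value)
    for field, value in data_in.items()
}
print(data_db)
```

With the sample data this prints the same dictionary as the output above, since "city" is already a string.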

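If the config could be stored in code rather than in a JSON-style file, I'd look into Pydantic, though that is not exactly your problem space here. (The example below uses the Pydantic v1 API; in v2, .dict() is deprecated in favor of .model_dump(), and numbers are not coerced to str by default.)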

from pydantic import BaseModel

class Data_DB(BaseModel):
    payment : str
    customer_id : int 
    cash_flow : str
    #you'd need a custom validator to handle capitalization
    name : str
    city : str

pydata = Data_DB(**data_in)
print(f"{pydata=}")
print(pydata.dict())


Output:

pydata=Data_DB(payment='101.0', customer_id=3, cash_flow='1', name='bill', city='london')
{'payment': '101.0', 'customer_id': 3, 'cash_flow': '1', 'name': 'bill', 'city': 'london'}

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2