How do I use a converter to produce categorical data in pandas?

I'm reading in a text file with pandas, and one of the columns contains IP addresses. I'd like to store this as categorical data, but convert the IPs to host names for the category labels. My plan was to use a converter and set the dtype to 'category':

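from functools import lru_cache
import socket

import pandas as pd

# Cache results so each distinct IP triggers only one reverse DNS lookup.
@lru_cache(maxsize=None)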
def lookup_hostname(ip_addr):
    try:
        return socket.gethostbyaddr(ip_addr)[0]
    except socket.herror:
        return ip_addr

measurements = pd.read_fwf(
    'measurements.log',
    names=[..., 'source', ...],
    converters={'source': lookup_hostname},
    dtype={'source': 'category'},
    ...
)

The problem is that pandas emits a warning and leaves the column's dtype as object when I do it that way:

ParserWarning: Both a converter and dtype were specified for column source - only the converter will be used.

How can I do some preprocessing on that column but still get categorical data?
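
One possible approach, sketched here under some assumptions: since the warning says only the converter is used, drop the dtype entry for that column, let the converter run during parsing, and cast the column to categorical afterwards with astype('category'). In the sketch below, fake_lookup, FAKE_DNS, the sample rows, and the 'value' column are all made up so the example runs without DNS access or the real measurements.log; in practice lookup_hostname from the question would take the place of fake_lookup.

```python
import io

import pandas as pd

# Hypothetical stand-in for lookup_hostname so the sketch needs no DNS access.
FAKE_DNS = {"10.0.0.1": "alpha.example", "10.0.0.2": "beta.example"}


def fake_lookup(ip_addr):
    # Fall back to the raw IP when no name is known, as in the question.
    return FAKE_DNS.get(ip_addr, ip_addr)


# Made-up fixed-width rows standing in for measurements.log.
sample = io.StringIO(
    "10.0.0.1  1.5\n"
    "10.0.0.2  2.5\n"
    "10.0.0.1  3.5\n"
    "10.0.0.9  4.5\n"
)

# Pass only the converter; no dtype for 'source', so no ParserWarning.
measurements = pd.read_fwf(
    sample,
    names=["source", "value"],
    converters={"source": fake_lookup},
)

# Cast after parsing: the column now holds host names as category labels.
measurements["source"] = measurements["source"].astype("category")

print(measurements["source"].dtype)
print(list(measurements["source"].cat.categories))
```

The converter still runs once per row, but with the lru_cache decorator from the question each distinct IP is only resolved once. Casting after the read also tolerates several IPs resolving to the same host name, since they simply collapse into one category.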



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
