csv reader opening files differently in Django and Apache

I need to parse a CSV file inside my Django application. The file may contain non-ASCII characters that I need to remove before processing. Here's what my code looks like:

    with open(inputFile, newline='') as f:
        reader = csv.reader(f)
        row1 = next(reader)
        for element in row1:
            columnHeader = element.encode("ascii","ignore").decode("ascii").strip()

It works perfectly fine in Django standalone. But I get

"'ascii' codec can't decode byte 0xef in position 0: ordinal not in range(128)"

when I run it in production (Apache, mod_wsgi, Django). I tried a slightly different formulation, but no luck:

            columnHeader = element.encode("ascii","ignore").decode()

I am new to Apache, Django and Python, so I am running out of ideas.

(Both environments are on the same machine - Ubuntu).

Update 1 (3 work hours later): I checked whether a different Python or csv module was being loaded under Apache than in Django standalone, by printing the values of `sys.version` and `csv.__version__`. Negative: the versions are the same in both contexts.

I looked at the logs. The failure actually occurs a couple of lines earlier than I initially suspected:

    row1 = next(reader)


Solution 1:[1]

It turns out that, for some reason, the Apache + mod_wsgi environment was defaulting to a different encoding when opening files (ASCII, judging by the error message) than the standalone environment.
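For anyone who wants to confirm this in their own setup, here is a minimal diagnostic sketch. It assumes Python 3, where text-mode `open()` without an `encoding` argument generally falls back to the locale's preferred encoding (unless UTF-8 mode is enabled). The helper name `describe_default_encoding` is hypothetical and not part of my original code. Running it once under Django standalone and once under Apache + mod_wsgi (for example, by logging the result from a view) should show whether the two environments disagree.

```python
import locale
import sys


def describe_default_encoding():
    """Collect the encodings that influence text handling in this process.

    Hypothetical helper for illustration only.
    """
    return {
        # What open() typically uses when no encoding= argument is given.
        "preferred": locale.getpreferredencoding(False),
        # Encoding used for file names, often tied to the same locale.
        "filesystem": sys.getfilesystemencoding(),
    }


info = describe_default_encoding()
print(info)
```

If the Apache process reports something like an ASCII-based encoding while the standalone process reports UTF-8, that would be consistent with the error above. Apache is often started with a minimal locale, but that is an assumption to verify on your own machine.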

Explicitly adding the encoding parameter like this solved my problem:

    with open(inputFile, newline='', encoding='utf-8') as f:

In my line of work, I realistically only expect UTF-8 or ASCII encoded CSV files (the users who upload these files use Microsoft Excel to generate them). The above solution works for both encodings, because every valid ASCII file is also valid UTF-8.
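One related detail: byte `0xef` at position 0 is the first byte of the UTF-8 byte order mark (`EF BB BF`), which Excel can write at the start of files saved as UTF-8 CSV. The sketch below is a variation, not what I originally used: it assumes the `utf-8-sig` codec, which reads plain UTF-8 and ASCII as well but drops a leading BOM if one is present. The function name `read_header` and the sample bytes are made up for illustration.

```python
import csv
import os
import tempfile


def read_header(path):
    """Return the first CSV row with non-ASCII characters removed.

    Hypothetical helper; 'utf-8-sig' strips a leading BOM if present.
    """
    with open(path, newline="", encoding="utf-8-sig") as f:
        reader = csv.reader(f)
        row1 = next(reader)
    # Same cleanup as in the question: drop non-ASCII, trim whitespace.
    return [h.encode("ascii", "ignore").decode("ascii").strip() for h in row1]


# Build a small sample file that starts with a BOM and contains "café".
with tempfile.NamedTemporaryFile("wb", suffix=".csv", delete=False) as tmp:
    tmp.write(b"\xef\xbb\xbfname,caf\xc3\xa9 \r\nAda,1\r\n")
    sample_path = tmp.name

headers = read_header(sample_path)
os.remove(sample_path)
print(headers)  # ['name', 'caf']
```

With plain `encoding='utf-8'` the BOM is decoded as the character `\ufeff` at the start of the first header. The `encode("ascii", "ignore")` step removes it anyway, so either codec gives the same cleaned headers in this case.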

If anyone has a need to support other encodings, I think the topic gets complicated fairly quickly.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Dr Phil