Trying to read a CSV with special characters from AWS S3

I'm trying to read a CSV from my bucket on S3 using AWS Lambda, but the CSV has special characters from the Brazilian Portuguese alphabet.

import boto3
import pandas as pd

def rentabilidade_sintetica():
    s3_client = boto3.client('s3')

    bucket_name = "bucket-my"
    s3_file_name = "historico/Rentabilidade Sintetica1.csv"
    resp = s3_client.get_object(Bucket=bucket_name, Key=s3_file_name)
    df_s3_data = pd.read_csv(resp['Body'], sep=';')

Because of that, I get the following error:

"errorMessage": "'utf-8' codec can't decode byte 0xf3 in position 4: unexpected end of data",

"errorType": "UnicodeDecodeError",

When I replaced the special characters in the CSV file, the code worked.

So I tried specifying the encoding parameter of read_csv:

df_s3_data = pd.read_csv(resp['Body'], sep=';', encoding='cp860')

but it gave me the same error.

Do you have any other method to read this?
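For context, a minimal sketch of what happens locally (no S3 involved): byte 0xf3 is 'ó' in Latin-1 / cp1252, so bytes produced by those encodings fail under the default utf-8 codec but decode fine when the matching encoding is passed explicitly. The column names and values below are made up for illustration:

```python
import io
import pandas as pd

# Simulated CSV bytes as S3 would return them: accented characters
# encoded as single Latin-1 bytes (e.g. 'ó' -> 0xf3), which the
# default utf-8 codec cannot decode.
raw = "nome;valor\nAplicação;10,5\n".encode('latin-1')

# Reading with an explicit matching encoding succeeds.
df = pd.read_csv(io.BytesIO(raw), sep=';', encoding='latin-1')
print(df['nome'][0])  # Aplicação
```

Reading the whole body into memory first (`resp['Body'].read()` wrapped in `io.BytesIO`) also rules out the streaming body being cut mid-character, which would match the "unexpected end of data" part of the message.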



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
