Python zipfile: file name with newline characters

Somebody somehow managed to add a newline sequence (\r\n) to the name of a file inside a zip archive, and that makes ZipFile fail when it tries to extract the archive:

2019-07-23 14:05:12,285 - __main__ - ERROR - Error desconocido: [Errno 22] Invalid argument: 'descargados\\03_26298_19\\ANEXO\r\n.pdf'. Saliendo.
Traceback (most recent call last):
  File "motor.py", line 51, in main
    procesar_descarga(zip_object, ruta_temp, ruta_final)
  File "C:\Users\david\pycharmProjects\descargueitor2\volcado.py", line 90, in procesar_descarga
    zip_object.extractall(str(ruta_temp))
  File "C:\Users\david\Anaconda3\lib\zipfile.py", line 1616, in extractall
    self._extract_member(zipinfo, path, pwd)
  File "C:\Users\david\Anaconda3\lib\zipfile.py", line 1670, in _extract_member
    open(targetpath, "wb") as target:
OSError: [Errno 22] Invalid argument: 'descargados\\03_26298_19\\ANEXO\r\n.pdf'

I tried the same file with several programs:

  • The built-in compressed-folder reader in Windows Explorer just ignores the file: it is neither listed nor extracted.
  • WinZip lists the file, but throws an error when opening or extracting it.
  • 7-Zip can read and extract the file: it just converts the bad characters to underscores.

Is there any way to deal with this in Python? As far as I can tell, the library offers no way to rename files inside a zip.
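
For reference, the 7-Zip behaviour described above (bad characters converted to underscores) could be approximated in Python roughly as follows. This is only a minimal sketch, not a confirmed solution: the helper names sanitize_member_name and extract_all_sanitized are hypothetical, and the approach assumes that ZipFile.extract() builds the output path from ZipInfo.filename while still locating the data through the unmodified ZipInfo.orig_filename, which is how the CPython implementation appears to behave.

```python
import re
import zipfile
from pathlib import Path

# Control characters (including \r and \n) are not valid in Windows file names.
_CONTROL_CHARS = re.compile(r"[\x00-\x1f\x7f]")


def sanitize_member_name(name: str, replacement: str = "_") -> str:
    """Hypothetical helper: replace each control character in a member name."""
    return _CONTROL_CHARS.sub(replacement, name)


def extract_all_sanitized(zip_object: zipfile.ZipFile, target_dir) -> list:
    """Hypothetical helper: extract every member under a cleaned name.

    Returns the list of paths that were written.
    """
    written = []
    for info in zip_object.infolist():
        # Assumption: extract() derives the target path from info.filename,
        # while the archive lookup relies on the untouched info.orig_filename.
        info.filename = sanitize_member_name(info.filename)
        written.append(Path(zip_object.extract(info, path=str(target_dir))))
    return written
```

With this sketch, the call zip_object.extractall(str(ruta_temp)) from the traceback would become extract_all_sanitized(zip_object, ruta_temp). Each of \r and \n is replaced separately, so the name in the error message would end in two underscores before ".pdf". Two members whose names differ only in control characters would end up with the same output path, so the later one would overwrite the earlier one.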



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
