Read docx file error [closed]
import docx2txt
my_text = docx2txt.process("file1.docx")
print(my_text)
When I try to read the docx file with this code, I get the following error:
File "/usr/lib/python3.5/zipfile.py", line 1093, in _RealGetContents
raise BadZipFile("File is not a zip file")
zipfile.BadZipFile: File is not a zip file
Solution 1:[1]
As @cowbert mentioned in the comments, your file has likely been corrupted or is not really a .docx file. A valid .docx file is a zip archive, which is why the error comes from zipfile. Your code itself is correct. You can also use textract, which supports .docx files:
import textract
text = textract.process("path/to/file.extension")
textract is built on top of several Python packages and other libraries. Installing it also installs several of those packages, including docx2txt, by default.
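To confirm whether the file itself is the problem, you can check it before handing it to docx2txt. The following is a minimal sketch using only the standard library. The helper name `looks_like_docx` is hypothetical, and the check for `word/document.xml` assumes the usual layout that Word produces:

```python
import zipfile


def looks_like_docx(path):
    """Return True if `path` is a zip archive containing word/document.xml."""
    # is_zipfile returns False for non-zip, truncated, or missing files
    if not zipfile.is_zipfile(path):
        return False
    with zipfile.ZipFile(path) as archive:
        # Assumption: the main document part sits at the usual location
        return "word/document.xml" in archive.namelist()
```

If `looks_like_docx("file1.docx")` returns False, the file is probably corrupted or is another format saved with a .docx extension (for example an old .doc file), and re-saving it from Word as .docx may help.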
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | micstr |
