Unable to read multiple yaml files in zipped folder
I want to read multiple YAML files contained in two zip archives. Each archive contains nested folders.
The directory structure is as below:
C:
- file1.zip
  - aa
    - bb
      - cc
        - x.yaml
        - y.yaml
        - z.yaml
- file2.zip
  - aa
    - bb
      - cc
        - x.yaml
        - y.yaml
        - z.yaml
I am unable to read the YAML files inside the zip archives. I get a FileNotFoundError even though the files are present in the archive.
I have pasted the code below.
def yaml_as_dict(filename):
    my_dict = {}
    with open(filename, 'r') as fp:
        docs = yaml.safe_load_all(fp)
        for doc in docs:
            for key, value in doc.items():
                my_dict[key] = value
    return my_dict

def extract_file1(file1):
    with zipfile.ZipFile(file1) as zip:
        text_files = zip.namelist()
        for t in text_files:
            print(t)
            yaml_as_dict(t)

def extract_file2(file2):
    with zipfile.ZipFile(file2) as zip:
        text_files = zip.namelist()
        for t in text_files:
            print(t)
            yaml_as_dict(t)
Solution 1:[1]
As I said in a comment, the members of the .zip archive must first be extracted (and uncompressed) into an actual file in order to apply your yaml_as_dict() function to them. The names returned by namelist() are paths inside the archive, not files on disk, which is why open() cannot find them. You also don't need a different extract() function for each file when the only thing that differs between them is the name of the file being processed.
The code below illustrates how all of these things can be done:
import os
from pprint import pprint
from tempfile import TemporaryDirectory
import yaml
import zipfile


def yaml_as_dict(filename):
    my_dict = {}
    with open(filename, 'r') as fp:
        docs = yaml.safe_load_all(fp)
        for doc in docs:
            for key, value in doc.items():
                my_dict[key] = value
    return my_dict


def extract_files(zip_file_path):
    """Extract YAML files from zip file and convert them to Python dicts."""
    with TemporaryDirectory() as tempdir:
        with zipfile.ZipFile(zip_file_path) as zip:
            for mbr_info in (info for info in zip.infolist() if not info.is_dir()):
                # Extract (and print) data from yaml files.
                if os.path.splitext(mbr_info.filename)[1].lower() == '.yaml':
                    tmpfile_path = zip.extract(mbr_info, path=tempdir)
                    print(f'=> Contents of {mbr_info.filename!r}:')
                    as_dict = yaml_as_dict(tmpfile_path)
                    pprint(as_dict, width=90)
                    print()


if __name__ == '__main__':
    zip_file_path1 = r'C:\path\to\YAML_Files1.zip'
    extract_files(zip_file_path1)
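
If you would rather not write temporary files at all, a possible variation is to read each member's bytes straight out of the archive and pass the decoded text to the YAML parser. The sketch below is not part of the code above: the helper names iter_yaml_texts() and yaml_dicts_from_zips() are made up for illustration, and accepting the '.yml' suffix and decoding as UTF-8 are assumptions about your files.

```python
import zipfile
from pathlib import PurePosixPath


def iter_yaml_texts(zip_file_path, suffixes=('.yaml', '.yml')):
    """Yield (member_name, text) for every YAML member of a zip archive.

    Members are read directly from the archive; nothing is written to disk.
    UTF-8 is an assumption about how the files are encoded.
    """
    with zipfile.ZipFile(zip_file_path) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue  # skip the nested folder entries themselves
            # Member names use '/' separators regardless of platform.
            if PurePosixPath(info.filename).suffix.lower() in suffixes:
                yield info.filename, archive.read(info).decode('utf-8')


def yaml_dicts_from_zips(zip_file_paths):
    """Map 'archive::member' to the merged top-level keys of each YAML file."""
    import yaml  # PyYAML; imported here so iter_yaml_texts() works without it.

    results = {}
    for zip_file_path in zip_file_paths:
        for name, text in iter_yaml_texts(zip_file_path):
            merged = {}
            for doc in yaml.safe_load_all(text):
                if isinstance(doc, dict):  # skip empty or non-mapping documents
                    merged.update(doc)
            results[f'{zip_file_path}::{name}'] = merged
    return results
```

With the layout from the question, something like yaml_dicts_from_zips([r'C:\file1.zip', r'C:\file2.zip']) would handle both archives in one call. Whether this is preferable to the temporary-directory approach depends mostly on whether you need the files on disk for anything else.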
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | martineau |
