Extracting Data from a DICOMDIR file using Pydicom
I'm unable to read a DICOM file as I usually would; pydicom raises the error:
AttributeError: 'DicomDir' object has no attribute 'DirectoryRecordSequence'
I've tried:
- pydicom.fileset.FileSet
- using specific tags with dcmread
- pydicom.filereader.read_dicomdir
- pydicom.filereader.read_partial
- using force=True in dcmread
pydicom.filereader.read_file_meta_info is about the only call that has not raised an error, and it yields:
(0002, 0000) File Meta Information Group Length UL: 172
(0002, 0001) File Meta Information Version OB: b'\x00\x01'
(0002, 0002) Media Storage SOP Class UID UI: Media Storage Directory Storage
(0002, 0003) Media Storage SOP Instance UID UI: 2.25.330614241706723499239981063503184149269
(0002, 0010) Transfer Syntax UID UI: Explicit VR Little Endian
(0002, 0012) Implementation Class UID UI: 1.3.6.1.4.1.30071.8
(0002, 0013) Implementation Version Name SH: 'fo-dicom 4.0.7'
Moreover, the file is supposed to be a regular DICOM image, not a DICOMDIR. I can open it in ImageJ and view the header information there, so I know the data is recoverable. Is there a way to read this file in Python, or alternatively to force pydicom to skip looking for DirectoryRecordSequence?
Edit: Code and stacktrace from using FileSet:
from pydicom.fileset import FileSet
fs = FileSet("unprocessed.dcm")
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-4-2b6ba2e435fe> in <module>
      1 from pydicom.fileset import FileSet
----> 2 fs = FileSet("unprocessed.dcm")

c:\****\appdata\local\programs\python\python38-32\lib\site-packages\pydicom\fileset.py in __init__(self, ds)
    998         # Check the DICOMDIR dataset and create the record tree
    999         if ds:
-> 1000             self.load(ds)
   1001         else:
   1002             # New File-set

c:\****\appdata\local\programs\python\python38-32\lib\site-packages\pydicom\fileset.py in load(self, ds_or_path, include_orphans, raise_orphans)
   1641             ds = ds_or_path
   1642         else:
-> 1643             ds = dcmread(ds_or_path)
   1644
   1645         sop_class = ds.file_meta.get("MediaStorageSOPClassUID", None)

c:\****\appdata\local\programs\python\python38-32\lib\site-packages\pydicom\filereader.py in dcmread(fp, defer_size, stop_before_pixels, force, specific_tags)
   1027         stop_when = _at_pixel_data
   1028     try:
-> 1029         dataset = read_partial(
   1030             fp,
   1031             stop_when,

c:\****\appdata\local\programs\python\python38-32\lib\site-packages\pydicom\filereader.py in read_partial(fileobj, stop_when, defer_size, force, specific_tags)
    879                 DeprecationWarning
    880             )
--> 881             ds = DicomDir(
    882                 fileobj,
    883                 dataset,

c:\****\appdata\local\programs\python\python38-32\lib\site-packages\pydicom\dicomdir.py in __init__(self, filename_or_obj, dataset, preamble, file_meta, is_implicit_VR, is_little_endian)
     94
     95         self.patient_records: List[Dataset] = []
---> 96         self.parse_records()
     97
     98     def parse_records(self) -> None:

c:\****\appdata\local\programs\python\python38-32\lib\site-packages\pydicom\dicomdir.py in parse_records(self)
    125
    126         # Build the mapping from file offsets to records
--> 127         records = self.DirectoryRecordSequence
    128         if not records:
    129             return

c:\****\appdata\local\programs\python\python38-32\lib\site-packages\pydicom\dataset.py in __getattr__(self, name)
    834                 return {}
    835         # Try the base class attribute getter (fix for issue 332)
--> 836         return object.__getattribute__(self, name)
    837
    838     @property

AttributeError: 'DicomDir' object has no attribute 'DirectoryRecordSequence'
Solution 1:[1]
pydicom reads the dataset correctly, but because its file meta information identifies it as Media Storage Directory Storage, it gets processed by the deprecated DicomDir class, even when passed directly to the FileSet class. Because the dataset isn't a valid Media Storage Directory instance (it has no Directory Record Sequence), this fails, producing the exception seen.
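To confirm what the file declares about itself without going through pydicom's DICOMDIR handling, the file meta group can be read by hand. The following is a minimal standard-library sketch, not pydicom's own method. It assumes a standard DICOM Part 10 layout (128-byte preamble, "DICM" marker, then group 0002 encoded as Explicit VR Little Endian). The function name media_storage_sop_class and the constants are illustrative names, not part of any library.

```python
import struct
from typing import BinaryIO, Optional

# UID of Media Storage Directory Storage, which is what a DICOMDIR declares
MEDIA_STORAGE_DIRECTORY = "1.2.840.10008.1.3.10"

# In Explicit VR encoding, these VRs use 2 reserved bytes plus a 4-byte length
LONG_VRS = {b"OB", b"OD", b"OF", b"OL", b"OV", b"OW",
            b"SQ", b"UC", b"UN", b"UR", b"UT"}


def media_storage_sop_class(fp: BinaryIO) -> Optional[str]:
    """Return the (0002,0002) value of a DICOM stream, or None if not found.

    Hypothetical helper: assumes the stream starts with a 128-byte preamble
    followed by b"DICM" and an Explicit VR Little Endian file meta group.
    """
    fp.seek(128)  # skip the preamble
    if fp.read(4) != b"DICM":
        return None
    while True:
        header = fp.read(6)  # group (2), element (2), VR (2)
        if len(header) < 6:
            return None  # ran out of data
        group, element = struct.unpack("<HH", header[:4])
        if group != 0x0002:
            return None  # left the file meta group without finding the tag
        if header[4:6] in LONG_VRS:
            fp.read(2)  # reserved bytes
            raw_length = fp.read(4)
            if len(raw_length) < 4:
                return None
            (length,) = struct.unpack("<I", raw_length)
        else:
            raw_length = fp.read(2)
            if len(raw_length) < 2:
                return None
            (length,) = struct.unpack("<H", raw_length)
        value = fp.read(length)
        if element == 0x0002:
            # UI values may be padded with a trailing null byte
            return value.rstrip(b"\x00 ").decode("ascii")
```

Opening the file with open("unprocessed.dcm", "rb") and comparing the result to MEDIA_STORAGE_DIRECTORY should show whether the file claims to be a DICOMDIR, which is the condition that sends pydicom down the DicomDir path.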
You should be able to fix this by changing the file meta information's (0002,0002) Media Storage SOP Class UID during read:
from pydicom import dcmread
from pydicom import config

def fix_sop_class(elem, **kwargs):
    if elem.tag == 0x00020002:
        # DigitalXRayImageStorageForProcessing
        elem = elem._replace(value=b"1.2.840.10008.5.1.4.1.1.1.1.1")
    return elem

config.data_element_callback = fix_sop_class
ds = dcmread('path/to/file')
Changing the SOP Class UID means that processing is skipped and the dataset is returned.
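The callback logic can be tried out without a DICOM file. The sketch below uses a hypothetical stand-in, FakeRawElement, whose field names are assumed to mirror the raw element named tuple that pydicom 2.x passes to the callback; it is not pydicom's class. It only shows that the callback rewrites the (0002,0002) element and returns every other element unchanged.

```python
from collections import namedtuple

# Hypothetical stand-in for the raw element passed to the callback.
# The field names are an assumption modelled on pydicom 2.x.
FakeRawElement = namedtuple(
    "FakeRawElement",
    "tag VR length value value_tell is_implicit_VR is_little_endian",
)

# Digital X-Ray Image Storage - For Processing, as used in the answer
DX_FOR_PROCESSING = b"1.2.840.10008.5.1.4.1.1.1.1.1"


def fix_sop_class(elem, **kwargs):
    """Replace the Media Storage SOP Class UID, leave other elements alone."""
    if elem.tag == 0x00020002:
        # _replace returns a new tuple; the original element is not modified
        elem = elem._replace(value=DX_FOR_PROCESSING)
    return elem


# (0002,0002) claiming to be a Media Storage Directory
directory = FakeRawElement(
    0x00020002, "UI", 20, b"1.2.840.10008.1.3.10", 0, False, True
)
# (0002,0010) Transfer Syntax UID, which the callback should not touch
transfer_syntax = FakeRawElement(
    0x00020010, "UI", 20, b"1.2.840.10008.1.2.1\x00", 0, False, True
)

patched = fix_sop_class(directory)
untouched = fix_sop_class(transfer_syntax)

print(patched.value)
print(untouched is transfer_syntax)
```

Because named tuples are immutable, the callback has to return the replaced element rather than modify it in place, which is why the answer's function reassigns elem before returning it.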
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 |
