Facing issues while parsing a 1 GB IFC file with IfcOpenShell using multiprocessing
I am parsing an IFC file (about 1 GB in size) with the IfcOpenShell Python module to extract all nodes, relations, and properties, which I will then insert into a graph database.
To speed up the parsing I am using Python's multiprocessing module. Since my CPU has 8 cores, I divide the IFC data into 8 lists, start 8 processes, and have each process build the nodes and edges for the graph database insertion.
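For context, here is a simplified sketch of my current layout (the path `model.ifc` is a stand-in for my actual file, and the node/edge construction is trimmed down to the essentials):

```python
import multiprocessing as mp
import ifcopenshell

IFC_PATH = "model.ifc"  # placeholder for my actual 1 GB file
N_PROCS = 8             # my CPU has 8 cores

def build_graph_chunk(id_chunk):
    # Each process has to reopen the file itself, because the handle
    # returned by ifcopenshell.open() cannot be pickled and passed in.
    ifc = ifcopenshell.open(IFC_PATH)
    nodes, edges = [], []
    for eid in id_chunk:
        entity = ifc.by_id(eid)
        # keep only plain values so the result can be pickled back
        attrs = {k: v for k, v in entity.get_info().items()
                 if isinstance(v, (str, int, float, bool))}
        nodes.append((eid, entity.is_a(), attrs))
        # ... relationship entities would be turned into edges here ...
    return nodes, edges

if __name__ == "__main__":
    ifc = ifcopenshell.open(IFC_PATH)
    ids = [e.id() for e in ifc.by_type("IfcRoot")]      # all rooted entities
    chunks = [ids[i::N_PROCS] for i in range(N_PROCS)]  # 8 lists of ids
    with mp.Pool(processes=N_PROCS) as pool:
        results = pool.map(build_graph_chunk, chunks)
```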
Below are the issues I am facing while using multiprocessing:
- I need to open the file for reading in each process, because the file object returned by `ifcopenshell.open()` is not picklable and therefore cannot be passed as an argument to `multiprocessing.Process()`.
- If I try to pickle the file object anyway, I get a "cannot pickle 'SwigPyObject' object" exception (a minimal reproduction is shown after this list).
- Because each process has to open the file independently, the copies consume the entire RAM of my machine (32 GB) and crash the VS Code editor.
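For reference, a minimal reproduction of the pickling error (again using the placeholder path):

```python
import pickle
import ifcopenshell

ifc = ifcopenshell.open("model.ifc")  # placeholder path
pickle.dumps(ifc)
# TypeError: cannot pickle 'SwigPyObject' object
```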
I have also tried running this single-threaded, but it takes a huge amount of time. I tried multithreading as well, but then found that threads do not give real parallelism for CPU-bound Python code because of the Global Interpreter Lock. Is there a better solution or approach for parsing a 1 GB IFC file and creating the nodes and edges?
One solution I have been considering is to split the large IFC file into a number of smaller files and parse each of them individually to get the final data; a rough sketch of the idea follows below.
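This is roughly what I had in mind, although it is an untested sketch and I am not sure it is correct; in particular I suspect `add()` would duplicate shared referenced instances (spatial structure, owner history, property sets) in every part:

```python
import ifcopenshell

src = ifcopenshell.open("model.ifc")  # placeholder path
n_parts = 8
products = src.by_type("IfcProduct")  # split on products as a rough heuristic

for part in range(n_parts):
    out = ifcopenshell.file(schema=src.schema)
    for entity in products[part::n_parts]:
        # add() copies the instance together with everything it references
        out.add(entity)
    out.write(f"model_part_{part}.ifc")
```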
Is it possible to do this? If yes, how?
If you have another good approach or solution, please suggest it.
Thank you in advance.
Sources
Source: Stack Overflow, licensed under CC BY-SA 3.0.
