AuraDB MemoryPoolOutOfMemoryError when running upload
I'm a backend engineer whose sole task is implementing a knowledge graph for SNOMED CT in Neo4j. We have a Neo4j Aura instance (the cheapest one: 1 GB RAM, 2 GB storage). When running the upload of these very large CSV files using pyingest, I get the following error:
{message: The allocation of an extra 15.1 MiB would use more than the limit 100.0 MiB. Currently using 88.4 MiB. dbms.memory.transaction.global_max_size threshold reached}
Am I not paying for 1 GB? Am I doing something terribly stupid or wrong?
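For reference, my understanding is that the setting named in the error can be inspected like this on Neo4j 4.x, though I'm not sure Aura exposes config procedures:

```
// Sketch: list the transaction-memory settings. On Aura this call may be
// restricted; dbms.memory.transaction.global_max_size caps the total
// memory available to all running transactions combined.
CALL dbms.listConfig('dbms.memory.transaction')
YIELD name, value
RETURN name, value
```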
I've read the Graph Databases book (the one with the octopus on the cover). What else should I read? I've fallen into this beautiful rabbit hole and find new insights every day, but I have to deliver and be practical 😅
I'd appreciate any practical guidance and book/learning-resource recommendations. Again, sorry if this is the wrong place to ask!
server_uri: neo4j+s://id:7687
admin_user: neo4j
admin_pass: pass
files:
  # concepts
  - url: /home/gocandra/workspace/uma/deep-learning/research/graphs/snomed-loader/csv/Concept_Snapshot.csv
    compression: none
    skip_file: false
    chunk_size: 100
    cql: |
      WITH $dict.rows AS rows UNWIND rows AS row
      MERGE (c:Concept {conceptId: row.id})
        ON CREATE SET c.term = row.term, c.descType = row.descType
        ON MATCH SET c.term = row.term, c.descType = row.descType
  # concept synonym generator
  - url: /home/gocandra/workspace/uma/deep-learning/research/graphs/snomed-loader/csv/Concept_Snapshot_add.csv
    compression: none
    skip_file: false
    chunk_size: 50
    cql: |
      WITH $dict.rows AS rows UNWIND rows AS row
      MATCH (dest:Concept) WHERE dest.conceptId = row.id
      CREATE (c:Concept:Synonym {
        conceptId: row.id,
        term: row.term,
        descType: row.descType
      })-[r:IS_A {
        relId: '116680003',
        term: 'Is a (attribute)',
        descType: '900000000000003001'
      }]->(dest);
  # relationships
  - url: /home/gocandra/workspace/uma/deep-learning/research/graphs/snomed-loader/csv/Concept_Snapshot_add.csv
    compression: none
    skip_file: false
    chunk_size: 50
    cql: |
      WITH $dict.rows AS rows UNWIND rows AS row
      MATCH (source:Concept) WHERE source.conceptId = row.sourceId
      MATCH (dest:Concept:FSA) WHERE dest.conceptId = row.destinationId
      CREATE (source)-[r:row.relLabel {relId: row.typeId, term: row.term, descType: row.descType}]->(dest)
That's the config.yml with all the queries (I'm chunking to try to avoid this issue).
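An aside on that last query: I realize `r:row.relLabel` isn't valid Cypher, since a relationship type can't be taken from a variable. My understanding is this needs APOC's `apoc.create.relationship`, which I believe is bundled with Aura; a minimal sketch, assuming the same row fields as above:

```
// Sketch only: apoc.create.relationship builds a relationship whose type
// comes from a runtime value (row.relLabel here). Assumes APOC core.
WITH $dict.rows AS rows UNWIND rows AS row
MATCH (source:Concept {conceptId: row.sourceId})
MATCH (dest:Concept:FSA {conceptId: row.destinationId})
CALL apoc.create.relationship(
  source, row.relLabel,
  {relId: row.typeId, term: row.term, descType: row.descType},
  dest
) YIELD rel
RETURN count(rel)
```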
{code: Neo.TransientError.General.MemoryPoolOutOfMemoryError} {message: The allocation of an extra 7.3 MiB would use more than the limit 100.0 MiB. Currently using 99.0 MiB. dbms.memory.transaction.global_max_size threshold reached}
Now I get this error. I'm not running any other queries on the database, nor is anyone else (I'm the only one with credentials).
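In case it helps anyone diagnosing the same thing, here are the two mitigations I'm looking at, sketched under the assumption that APOC core is available on Aura (the index name `concept_id` is just illustrative):

```
// Sketch, Neo4j 4.x syntax.
// 1. An index on the lookup key, so the MATCH clauses stop scanning
//    every :Concept node on each chunk:
CREATE INDEX concept_id IF NOT EXISTS FOR (c:Concept) ON (c.conceptId);

// 2. Server-side batching: apoc.periodic.iterate commits the action
//    statement every batchSize rows, so peak transaction memory is
//    bounded by the batch, not by the whole chunk pyingest sends.
CALL apoc.periodic.iterate(
  "UNWIND $rows AS row RETURN row",
  "MERGE (c:Concept {conceptId: row.id})
     ON CREATE SET c.term = row.term, c.descType = row.descType
     ON MATCH SET c.term = row.term, c.descType = row.descType",
  {batchSize: 500, params: {rows: $dict.rows}}
);
```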