Reliably querying “delta since last read” in Neo4j
I have a Neo4j application where an API endpoint does CRUD operations on the graph. I then materialize the reachable parts of the graph starting at known nodes, and finally send the materialized subgraphs to a bunch of other machines that don’t know how to query Neo4j directly. However, the materialized views are moderately large, and within a given minute only small parts of each one will change, so I’d like to be able to query “what has changed since last time I checked” so that I only have to send the deltas. What’s the best way to do that? I’m not sure if it helps, but my data doesn’t contain arbitrary-length paths; if needed I can explicitly write each node and edge type into my query.
One possibility I imagined was adding a “last updated” timestamp as a property on every node and edge and, instead of deleting things directly, setting a “deleted” boolean property and updating the timestamp, then using a background process to actually delete them a few minutes later (after the deltas have been sent out). My query would then select all reachable nodes and edges and filter them on the timestamp property. However:
- If there’s clock drift between two different Neo4j write servers and the Raft leader changes from one to the other, can the timestamps go back in time? Or even worse, will two concurrent writes always give me transaction times that are in commit order, or can they be reordered even within a single box? I would rather use a graph-wide monotonically increasing integer like the write commit ID, but I can’t find a function that gives me that. Theoretically I could use the cookie used for causal consistency, but since you only get that after the transaction is complete, it’d be messy to have to do every write as two separate transactions.
- Also, it just sucks to use deletion markers because then you have to explicitly remove deleted edges / nodes in every other query you do.
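To make the proposal concrete, here is a minimal in-memory Python sketch of the idea above. It is not Neo4j code: `VersionedStore`, `Element`, `delta_since`, and `purge` are hypothetical names for illustration, and the store-wide integer counter stands in for the monotonically increasing commit ID I would like to have instead of wall-clock timestamps. Whether Neo4j can provide such a counter is exactly the open question.

```python
from dataclasses import dataclass


@dataclass
class Element:
    """A node or edge carrying the two bookkeeping properties from the proposal."""
    key: str
    props: dict
    updated_at: int        # value of the store-wide counter at the last write
    deleted: bool = False  # tombstone marker instead of a physical delete


class VersionedStore:
    """Hypothetical stand-in for the graph; every write bumps one shared counter."""

    def __init__(self):
        self.elements = {}
        self.version = 0  # assumption: a single, strictly increasing write counter

    def _tick(self):
        self.version += 1
        return self.version

    def upsert(self, key, **props):
        self.elements[key] = Element(key, props, self._tick())

    def soft_delete(self, key):
        element = self.elements[key]
        element.deleted = True
        element.updated_at = self._tick()

    def delta_since(self, last_seen):
        """Return (upserts, deletes, new_cursor) for writes after `last_seen`."""
        changed = [e for e in self.elements.values() if e.updated_at > last_seen]
        upserts = {e.key: e.props for e in changed if not e.deleted}
        deletes = sorted(e.key for e in changed if e.deleted)
        return upserts, deletes, self.version

    def purge(self, older_than):
        """Physically drop tombstones that all consumers have already received."""
        stale = [k for k, e in self.elements.items()
                 if e.deleted and e.updated_at <= older_than]
        for k in stale:
            del self.elements[k]
        return stale


store = VersionedStore()
store.upsert("a", name="Alice")
store.upsert("b", name="Bob")
_, _, cursor = store.delta_since(0)      # consumer is now caught up

store.upsert("a", name="Alicia")         # one change ...
store.soft_delete("b")                   # ... and one tombstoned delete
upserts, deletes, new_cursor = store.delta_since(cursor)
purged = store.purge(new_cursor)         # background cleanup after deltas are sent
```

In this toy version the delta contains only the changed element and the tombstone, and the cursor the consumer keeps is just the counter value. The sketch sidesteps both of my concerns only because a single in-process counter cannot drift or reorder; with real timestamps across servers, the `updated_at > last_seen` filter is where changes could be missed.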
Are there other better patterns here?
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
