Event Sourced many-to-many aggregates
We have an event sourced system where we are caching data into aggregates for performance reasons.
Let's say we have 3 entities: Patient, Doctor and Appointment.
What works well for us is the one-to-many type of relation. For example, imagine the following events:
* DOCTOR_CREATED
* DOCTOR_ARRIVED
* DOCTOR_LEFT
We could aggregate these into a one-to-many type relation. When a Doctor is added to the system, we can create a row in the database.
Every time they arrive at work, we can add that time to the doctor record, and every time they leave we can add that too. So we could end up with something like this:
{
"id": 23,
"name": "Dr Bill",
"totalHoursWorked": 124,
"timesheet": [
{
"arrivedAt": "2022-01-04 09:00:00",
"leftAt": "2022-01-04 14:00:00",
"hoursWorked": 5
},
// etc...
]
}
No problems there.
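As a rough sketch of how that one-to-many aggregation can work, the reducer below folds the three doctor events into a record shaped like the one above. The event payload fields (`id`, `name`, `at`) and the function name `apply_doctor_event` are assumptions made for illustration, not a real schema.

```python
from copy import deepcopy
from datetime import datetime
from functools import reduce

FMT = "%Y-%m-%d %H:%M:%S"  # timestamp format used in the record above


def apply_doctor_event(state, event):
    """Pure reducer: returns a new doctor record and never mutates `state`."""
    kind = event["type"]
    if kind == "DOCTOR_CREATED":
        return {"id": event["id"], "name": event["name"],
                "totalHoursWorked": 0, "timesheet": []}
    new_state = deepcopy(state)
    if kind == "DOCTOR_ARRIVED":
        # Open a new timesheet entry for this shift.
        new_state["timesheet"].append({"arrivedAt": event["at"]})
    elif kind == "DOCTOR_LEFT":
        # Close the most recent entry and update the running total.
        shift = new_state["timesheet"][-1]
        shift["leftAt"] = event["at"]
        worked = (datetime.strptime(shift["leftAt"], FMT)
                  - datetime.strptime(shift["arrivedAt"], FMT))
        shift["hoursWorked"] = worked.total_seconds() / 3600
        new_state["totalHoursWorked"] += shift["hoursWorked"]
    return new_state


# Hypothetical event payloads, assumed for this example only.
events = [
    {"type": "DOCTOR_CREATED", "id": 23, "name": "Dr Bill"},
    {"type": "DOCTOR_ARRIVED", "at": "2022-01-04 09:00:00"},
    {"type": "DOCTOR_LEFT", "at": "2022-01-04 14:00:00"},
]
doctor = reduce(apply_doctor_event, events, None)
```

Every event in this stream belongs to one doctor, so the reducer never needs data from outside its own state.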
Now let's suppose that we want to track the appointments. This is a many-to-many relation between patients and doctors.
I'm interested in these events:
* DOCTOR_CREATED
* PATIENT_CREATED
* APPOINTMENT_CREATED
Because the event stream must be sequential through time, I cannot create an appointment record BEFORE either the relevant doctor or patient has been created.
How do I go about creating a view of the data model from the perspective of the appointments?
Thinking about it in GraphQL terms might help. I want to optimise this query:
query {
appointment {
day
time
patient {
name
age
}
doctor {
name
specialty
}
}
}
I'd love to be able to store this data structure in the database as an aggregate, so that fetching an appointment by ID can be done in one database query.
I'm going to run into problems here with timelines, because if I wait for the first APPOINTMENT_CREATED event before I create a row in the database, then I've missed the relevant PATIENT and DOCTOR events.
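To make the timeline problem concrete, here is a minimal sketch of a projection that only reacts to APPOINTMENT_CREATED. The payload fields (`id`, `day`, `time`, `doctorId`, `patientId`, and so on) and the name `apply_naive` are assumed for illustration.

```python
from functools import reduce


def apply_naive(rows, event):
    """Projection that ignores everything except APPOINTMENT_CREATED."""
    if event["type"] != "APPOINTMENT_CREATED":
        return rows  # DOCTOR_CREATED and PATIENT_CREATED are dropped here
    row = {k: event[k] for k in ("day", "time", "doctorId", "patientId")}
    return {**rows, event["id"]: row}


# Hypothetical event payloads, assumed for this example only.
events = [
    {"type": "DOCTOR_CREATED", "id": 23, "name": "Dr Bill",
     "specialty": "Cardiology"},
    {"type": "PATIENT_CREATED", "id": 7, "name": "Ann", "age": 41},
    {"type": "APPOINTMENT_CREATED", "id": 1, "day": "2022-01-04",
     "time": "10:30", "doctorId": 23, "patientId": 7},
]
rows = reduce(apply_naive, events, {})
```

The resulting row holds only `doctorId` and `patientId`. The names, age and specialty that the query asks for went past earlier in the stream and were never captured.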
If I capture the PATIENT and DOCTOR events first in anticipation, then I have to store all possible combinations of Doctor and Patient just on the off-chance one of them might want to have an appointment later on. I'm also faced with this issue of the aggregate having an inconsistent data structure. Rows in the table might be indexed by doctor-patient id, or appointment id depending on what stage of the event stream we have got to.
The only way I can currently think of to do this is to have the aggregate that optimises this query wait for the APPOINTMENT_CREATED event, and then asynchronously query the database to retrieve the patient and doctor records at that point in time.
However, the way we've implemented our system, all our aggregates are built from a composition of pure functions which just take the previous aggregate state and the event in question, and return the new aggregate state.
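For illustration only, here is one shape that would keep such a reducer pure: the state carries a lookup of doctors and a lookup of patients, each keyed by its own id (one entry per entity, not per doctor-patient pair), next to the appointments. The state layout, the payload fields and the name `apply_event` are all assumptions, and I don't know whether this is the right approach.

```python
from functools import reduce

EMPTY = {"doctors": {}, "patients": {}, "appointments": {}}


def apply_event(state, event):
    """Pure reducer: (previous state, event) -> new state, no I/O."""
    kind = event["type"]
    if kind == "DOCTOR_CREATED":
        doctor = {"name": event["name"], "specialty": event["specialty"]}
        return {**state, "doctors": {**state["doctors"], event["id"]: doctor}}
    if kind == "PATIENT_CREATED":
        patient = {"name": event["name"], "age": event["age"]}
        return {**state, "patients": {**state["patients"], event["id"]: patient}}
    if kind == "APPOINTMENT_CREATED":
        # Denormalise using lookups already held in state, not a db query.
        appointment = {
            "day": event["day"],
            "time": event["time"],
            "doctor": state["doctors"][event["doctorId"]],
            "patient": state["patients"][event["patientId"]],
        }
        return {**state,
                "appointments": {**state["appointments"],
                                 event["id"]: appointment}}
    return state


# Hypothetical event payloads, assumed for this example only.
events = [
    {"type": "DOCTOR_CREATED", "id": 23, "name": "Dr Bill",
     "specialty": "Cardiology"},
    {"type": "PATIENT_CREATED", "id": 7, "name": "Ann", "age": 41},
    {"type": "APPOINTMENT_CREATED", "id": 1, "day": "2022-01-04",
     "time": "10:30", "doctorId": 23, "patientId": 7},
]
state = reduce(apply_event, events, EMPTY)
```

Here `state["appointments"][1]` already has the nested shape the query wants. The embedded doctor and patient reflect what was known when the appointment was created; the sketch does not handle later changes to either.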
Is what I want impossible with the architecture we have built so far? Do I need an escape hatch to allow our aggregate hydration to perform async db queries (not keen on this)?
Or is an aggregate the wrong technique for solving this problem, and do I actually need to use something else (like a cache)? Having said that, one of the benefits sold to me of going event-sourced was that we wouldn't have to bother with caching, as we can just pre-build all our aggregates to be read-optimised for the front end.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
