Is it a bad idea to keep an EntityManager open for the duration of the application lifetime?

I am writing an application in Java SE 8 and have recently migrated the database system from raw JDBC code to JPA. The interface itself is so much simpler, but I am running into an issue with the way I have designed my code which does not work well with JPA and I am unsure of how to proceed.

The primary issue I am having is that I cannot store references to my entities in code for any period of time anymore, because they immediately become out-of-date. I used to have a central persistence context where the one "true" instance of each entity was always stored in code, and changes made to it would always be reflected everywhere because there were no duplicate instances. I realize this is not smart design when it comes to memory efficiency, but it allowed me to, for instance, implement the observer pattern and guarantee that any entity updates would be immediately visible in GUIs. But now, as soon as I load an entity from the database using JPA and close the EntityManager (as I have read so often that you must do), that instance merely represents a snapshot in time from when it was loaded, and my GUIs will be waiting for updates from a dead object. Loading that entity from elsewhere in the code and making a change will do nothing, as it is a different instance altogether, with an empty list of subscribers (transient). There are many more cases in my code where I attempt to hold a reference to an entity for whatever purpose, and a lot of them rely on those entities being up-to-date.

I know that EntityManager is intended to be a short-lived object, but now I am thinking that it maybe wouldn't be such a bad idea after all to keep an EntityManager open for the lifetime of my program to replace that construct that I had in my old code. I quite frankly don't understand what the point of closing an EntityManager so quickly is - isn't it beneficial to have your entities managed over a longer period of time? When I was first reading about how changes to managed entities are detected and persisted automatically, I hoped that that would allow me to completely detach my business logic from my persistence layer, and trust that all my changes were being saved. It was rather disillusioning to discover that in order for those entities to be managed in the first place, I would have to leave the EntityManager open for the duration of that business logic. And that would require the EntityManagers to be scoped higher than the method they are created in, so I could close them later. But all the literature urges the use of short-lived, low-scoped EntityManagers, which just seems like a direct contradiction.

I am somewhat at a loss for how to proceed. I would love to make full use of JPA and all of its extremely useful features, but I feel like I might be missing the point of EntityManager being short-lived. It seems like it would be so much more convenient long-lived. Can anyone give me some guidance?



Solution 1:[1]

Your central 'cache' with a single instance of data is a common idea, but it is difficult to manage. Some ORM/JPA providers have caching built in and maintain something similar (check out EclipseLink's shared cache), but they usually have complex mechanisms that allow for limiting and managing what could be endless amounts of data that can quickly become stale. EclipseLink has tie-ins to the database to get notifications when data changes, and can be configured for cache coordination when being run on different servers. Without such capabilities, your cache will be stale - and worse, your cache will have great difficulty maintaining transactional isolation. Any change to those cached objects is immediately visible to all processes, regardless of whether the transaction commits to the database or rolls back. Use of JPA is meant to guarantee that you only see committed data (excluding the changes you've made in the current transaction/unit of work).
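The isolation problem can be illustrated with a small plain-Java sketch. It does not use JPA at all; `SharedCacheIsolationDemo`, `Account`, and both method names are hypothetical and exist only to contrast mutating a shared "true" instance with working on a private copy, which is roughly what a unit of work gives you.

```java
import java.util.HashMap;
import java.util.Map;

// Toy illustration only: not JPA, and every name here is hypothetical.
class SharedCacheIsolationDemo {
    static final class Account {
        long balance;
        Account(long balance) { this.balance = balance; }
    }

    // Application-wide cache holding the single "true" instance per id.
    static final Map<Long, Account> sharedCache = new HashMap<>();

    // Mutates the shared instance directly, then simulates a rollback.
    // Returns what any other reader would have seen in between.
    static long readDuringUncommittedChange(long id, long newBalance) {
        Account shared = sharedCache.get(id);
        long committed = shared.balance;
        shared.balance = newBalance;                      // change made, nothing committed yet
        long seenByOthers = sharedCache.get(id).balance;  // another reader observes it anyway
        shared.balance = committed;                       // simulated rollback
        return seenByOthers;
    }

    // Changes a private working copy instead; the shared state stays untouched,
    // so a rollback would simply discard the copy.
    static long readDuringIsolatedChange(long id, long newBalance) {
        Account workingCopy = new Account(sharedCache.get(id).balance);
        workingCopy.balance = newBalance;
        return sharedCache.get(id).balance;               // other readers still see committed data
    }

    public static void main(String[] args) {
        sharedCache.put(1L, new Account(100));
        System.out.println(readDuringUncommittedChange(1L, 999)); // prints 999 (a dirty read)
        System.out.println(readDuringIsolatedChange(1L, 999));    // prints 100 (committed value)
    }
}
```

In the first method a GUI observer would briefly display a balance that never existed in the database, which is the failure mode a shared single-instance cache has to guard against.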

To answer your specific question about keeping an EM open, as it applies to JPA providers generally: EntityManagers keep hooks to the entities read in through them so that they can track and manage all changes made to them. This can lead to very large amounts of data being held - check the forum for memory leak questions, as keeping EMs open for an extended period is the cause of quite a few. You gain object identity, but have to realize it comes at the cost of tracking everything read in through them - so you will likely have to occasionally clear the memory (em.clear()) at some key points, or find provider-specific mechanisms to dereference what it might be holding onto so GC can do its thing.
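A rough model of that trade-off, written in plain Java rather than against the JPA API: `ToyPersistenceContext` and `Customer` are hypothetical names, and the `clear()` method here is only analogous in spirit to `em.clear()`. The sketch assumes a persistence context behaves like an identity map that keeps a reference to everything it has loaded.

```java
import java.util.HashMap;
import java.util.Map;

// Toy stand-in for a persistence context. This is NOT the JPA API;
// all names are hypothetical and only illustrate the trade-off.
class ToyPersistenceContext {
    static final class Customer {
        final long id;
        String name;
        Customer(long id, String name) { this.id = id; this.name = name; }
    }

    private final Map<Long, String> databaseRows;                 // simulated table: id -> name
    private final Map<Long, Customer> managed = new HashMap<>();  // identity map of tracked instances

    ToyPersistenceContext(Map<Long, String> databaseRows) {
        this.databaseRows = databaseRows;
    }

    // Returns the tracked instance if there is one; otherwise "loads" the row and starts tracking it.
    Customer find(long id) {
        Customer existing = managed.get(id);
        if (existing != null) return existing;
        String name = databaseRows.get(id);
        if (name == null) return null;
        Customer loaded = new Customer(id, name);
        managed.put(id, loaded);
        return loaded;
    }

    int managedCount() { return managed.size(); }

    // Drops every tracked instance so the garbage collector can reclaim them.
    void clear() { managed.clear(); }

    public static void main(String[] args) {
        Map<Long, String> rows = new HashMap<>();
        for (long id = 1; id <= 1000; id++) rows.put(id, "customer-" + id);

        ToyPersistenceContext context = new ToyPersistenceContext(rows);
        Customer first = context.find(1L);
        System.out.println(first == context.find(1L));  // prints true: object identity while tracked

        for (long id = 1; id <= 1000; id++) context.find(id);
        System.out.println(context.managedCount());     // prints 1000: everything read is still held

        context.clear();
        System.out.println(context.managedCount());     // prints 0
        System.out.println(first == context.find(1L));  // prints false: 'first' is now a detached snapshot
    }
}
```

Note the last line: clearing frees memory, but any reference the GUI still holds to `first` stops being the tracked instance, so a long-lived context that is cleared periodically does not restore the "one true instance" guarantee.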

Another drawback is that the EntityManager itself then becomes very large and difficult to merge changes into. Depending on how you merge changes into your app, you'll need a way to get those changes into your database. Having JPA go through a very large set of entities that builds up over time to find changes to a small dataset is very inefficient, and you'll still have to find ways to refresh these entities if changes are made through other EntityManagers or applications.
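To make the inefficiency concrete, here is a toy model of the simplest change-detection strategy, comparing each tracked entity against a snapshot taken at load time. It is plain Java with hypothetical names (`ToyDirtyCheck`, `Item`, `manage`, `flush`), not the JPA API, and real providers differ: some rely on change tracking woven into the entity classes instead of snapshot comparison.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of snapshot-based dirty checking. Not JPA; all names are hypothetical.
class ToyDirtyCheck {
    static final class Item {
        String label;
        Item(String label) { this.label = label; }
    }

    private final Map<Long, Item> managed = new HashMap<>();      // tracked instances
    private final Map<Long, String> snapshots = new HashMap<>();  // state as of load or last flush
    int comparisons = 0;                                          // work done across all flushes

    // Starts tracking a new instance and records its initial state.
    Item manage(long id, String label) {
        Item item = new Item(label);
        managed.put(id, item);
        snapshots.put(id, label);
        return item;
    }

    // Walks every tracked instance, however few of them changed, and returns the number of changes found.
    int flush() {
        int changed = 0;
        for (Map.Entry<Long, Item> entry : managed.entrySet()) {
            comparisons++;
            String current = entry.getValue().label;
            if (!current.equals(snapshots.get(entry.getKey()))) {
                snapshots.put(entry.getKey(), current);  // this is where an UPDATE would be issued
                changed++;
            }
        }
        return changed;
    }

    public static void main(String[] args) {
        ToyDirtyCheck context = new ToyDirtyCheck();
        Item target = null;
        for (long id = 1; id <= 10_000; id++) {
            Item item = context.manage(id, "item-" + id);
            if (id == 42) target = item;
        }
        target.label = "renamed";

        System.out.println(context.flush());      // prints 1: a single entity changed
        System.out.println(context.comparisons);  // prints 10000: but every tracked entity was inspected
    }
}
```

Under this model the cost of each flush grows with everything the context has ever loaded, not with the size of the change, which is why a context that lives for the whole application run tends to get slower over time.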

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Chris