'@Fetch(FetchMode.SUBSELECT) not working when using paging
I have a class which contains many OneToMany lists (they have to be lists), some of which also have OneToMany Lists embedded in them. I need to pull all of the data from the JPA repo in order to mass index it on an external service.
I noticed then when using findAll() the N+1 issue occurs, and so I am trying to reduce the time taken. EntityGraph and JOIN FETCH don't seem to be of any help due to the multipleBag issue, but @Fetch(FetchMode.SUBSELECT) works perfectly.
The issue I have occurs when I try to page the results, which I need to do in order to scale this to larger data sets, as the application runs out of memory when reaching around 200k results. Implementing paging works, however if the page size is smaller than the data set (which it will obviously be to break up the data) the same N+1 issue occurs. For a data set of 10000 objects with a page size of 10000 indexing takes around 11 seconds and 11 sql statements, page size 9999 takes 12 seconds but with 22 sql statements, page size 9800 takes 24 seconds with 350 sql statements, page size 2000 takes 56 seconds and my console only goes back 42000 sql statements, you get the idea.
Is there anyway I can implement some form of paging whilst using the FetchMode.SUBSELECT? If not is there another way that I can batch process the findAll query, whilst avoiding the N+1 issue?
Solution 1:[1]
Something like this is best solved with keyset pagination which is the perfect use case for Blaze-Persistence Entity Views.
Blaze-Persistence is a query builder on top of JPA which supports many of the advanced DBMS features on top of the JPA model. I created Entity Views on top of it to allow easy mapping between JPA models and custom interface defined models, something like Spring Data Projections on steroids. The idea is that you define your target structure the way you like and map attributes(getters) via JPQL expressions to the entity model. Since the attribute name is used as default mapping, you mostly don't need explicit mappings as 80% of the use cases is to have DTOs that are a subset of the entity model.
A mapping for your model could look as simple as the following
@EntityView(MainEntity.class)
interface MainEntityDto{
Long getId();
String getName();
@Mapping(fetch = SUBSELECT)
List<SubDto1> getSubs1();
@Mapping(fetch = SUBSELECT)
List<SubDto2> getSubs2();
}
@EntityView(SubEntity1.class)
interface SubDto1 {
Long getId();
String getName();
}
@EntityView(SubEntity2.class)
interface SubDto2 {
Long getId();
String getName();
@Mapping(fetch = SUBSELECT)
List<SubDto3> getSubs3();
}
@EntityView(SubEntity3.class)
interface SubDto3 {
Long getId();
String getName();
}
Querying is a matter of applying the entity view to a query, the simplest being just a query by id.
MainEntityDto dto = entityViewManager.find(entityManager, MainEntityDto.class, id);
The Spring Data integration allows you to use it almost like Spring Data Projections: https://persistence.blazebit.com/documentation/entity-view/manual/en_US/index.html#spring-data-features and in your particular case, keyset pagination is very helpful. Here a sample repository
@Repository
interface MainEntityRepository {
KeysetAwarePage<MainEntityDto> findAll(KeysetPageable pageable);
}
In your job code you could do something like the following
int pageSize = 100;
KeysetPageRequest pageable = new KeysetPageRequest(0, pageSize, null, Sort.asc("id"));
while (true) {
KeysetAwarePage<MainEntityDto> page = mainEntityRepository.findAll(pageable);
process(page);
if (page.size() < pageSize) {
break;
}
pageable = page.nextPageable();
}
It will only fetch the mappings that you tell it to fetch.
For more information about the underlying mechanism you can take a look at the documentation: https://persistence.blazebit.com/documentation/core/manual/en_US/index.html#anchor-keyset-pagination
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Christian Beikov |
