Need recommendations for Elasticsearch document structure

We are developing an app that lets users create and share private content (Topics). We want to make the topics and their inner content (TopicPosts) searchable, and we also want to paginate the search results. We decided to use Elasticsearch for this purpose and are looking for recommendations on the design. The relations between our SQL tables are as follows:

  1. Topics -> {topicId: number, topicTitle: string}

  2. TopicUsers -> {userId: number, topicId: FK from Topics, permission: edit | view | owner}

  3. TopicPosts -> {postId: number, topicId: FK from Topics, postTitle: string, content: string}

The most complex search operation in the app:

  • If a user searches for "Apple", find the word "Apple" in the Topics.topicTitle, TopicPosts.postTitle, or TopicPosts.content fields and return the results grouped by Topic, with a maximum of five TopicPosts per Topic.
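To make the expected result shape concrete, here is a small in-memory sketch of that search. It is only an illustration of the grouping and the five-post cap, not an Elasticsearch query; the `searchTopics` helper, the `TopicResult` type, and the plain substring matching are assumptions made for the example.

```typescript
// Row shapes mirroring the SQL tables described above.
interface Topic { topicId: number; topicTitle: string }
interface TopicPost { postId: number; topicId: number; postTitle: string; content: string }

// Hypothetical result shape: one entry per matching topic.
interface TopicResult { topic: Topic; posts: TopicPost[] }

// Case-insensitive substring check; a real analyzer would tokenize instead.
const contains = (text: string, term: string): boolean =>
  text.toLowerCase().includes(term.toLowerCase());

// Hypothetical helper: returns every topic that matches by title or by any
// of its posts, each with at most `maxPosts` matching posts.
function searchTopics(
  topics: Topic[],
  posts: TopicPost[],
  term: string,
  maxPosts = 5,
): TopicResult[] {
  const results: TopicResult[] = [];
  for (const topic of topics) {
    const matchingPosts = posts.filter(
      (p) =>
        p.topicId === topic.topicId &&
        (contains(p.postTitle, term) || contains(p.content, term)),
    );
    if (contains(topic.topicTitle, term) || matchingPosts.length > 0) {
      results.push({ topic, posts: matchingPosts.slice(0, maxPosts) });
    }
  }
  return results;
}
```

A topic whose title matches but whose posts do not is still returned, just with an empty `posts` list.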

Our current document model for Elasticsearch (it actually represents a Topics object):

{
  topicId: number,
  topicTitle: string,
  topicPosts: TopicPosts[], // contains lots of posts
  topicUsers: TopicUsers[], // contains few users
}
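For readers who want something concrete, a mapping along these lines would fit that model. This is a sketch rather than our exact mapping: the field types and the choice to make `topicUsers` a nested field (instead of a plain object) are assumptions.

```typescript
// Assumed index mapping for the Topics document; field types are illustrative.
const topicsMapping = {
  mappings: {
    properties: {
      topicId: { type: "long" },
      topicTitle: { type: "text" },
      // Nested so that each post is matched and returned as its own unit.
      topicPosts: {
        type: "nested",
        properties: {
          postId: { type: "long" },
          postTitle: { type: "text" },
          content: { type: "text" },
        },
      },
      // Assumption: nested so userId and permission can be matched together.
      topicUsers: {
        type: "nested",
        properties: {
          userId: { type: "long" },
          permission: { type: "keyword" },
        },
      },
    },
  },
};
```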

The above model works for the given search case using an inner_hits query, but we realized that storing the TopicPosts inside the Topics document is not a good idea because:

  1. We need the TopicPosts objects on their own. We search and paginate them in other parts of the app.

  2. It forces us to use inner_hits queries with nested types.

  3. Nested types have some problems with pagination (index.max_inner_result_window).

  4. It makes the documents too big.
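To illustrate points 2 and 3, a search body against the nested model would look roughly like the sketch below. The `buildTopicSearch` helper and the `InnerPage` type are hypothetical names, filtering by `topicUsers.userId` is an assumption about how access would be restricted, and the check uses the default `index.max_inner_result_window` of 100 (`from + size` of inner hits may not exceed it).

```typescript
// Default value of the index.max_inner_result_window setting.
const DEFAULT_MAX_INNER_RESULT_WINDOW = 100;

// Paging window applied to the posts returned inside each topic hit.
interface InnerPage { from: number; size: number }

// Hypothetical helper: builds a search request body for the nested model.
function buildTopicSearch(
  term: string,
  userId: number,
  page: InnerPage = { from: 0, size: 5 },
): Record<string, unknown> {
  // Elasticsearch rejects inner_hits whose from + size exceeds the window.
  if (page.from + page.size > DEFAULT_MAX_INNER_RESULT_WINDOW) {
    throw new RangeError("inner_hits from + size exceeds index.max_inner_result_window");
  }
  return {
    query: {
      bool: {
        // Assumption: only topics shared with this user are searchable.
        filter: [
          {
            nested: {
              path: "topicUsers",
              query: { term: { "topicUsers.userId": userId } },
            },
          },
        ],
        // Match the topic title, or any nested post title or content.
        should: [
          { match: { topicTitle: term } },
          {
            nested: {
              path: "topicPosts",
              query: {
                multi_match: {
                  query: term,
                  fields: ["topicPosts.postTitle", "topicPosts.content"],
                },
              },
              // Returns the matching posts for each topic hit.
              inner_hits: { from: page.from, size: page.size },
            },
          },
        ],
        minimum_should_match: 1,
      },
    },
  };
}
```

Paging deeper into the posts of a single topic means raising `from` in `inner_hits`, which is where the window limit becomes a problem.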

I would be glad if you could share your recommendations about the design.

Thanks



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow