'Group entries based on associated model fields in aggregate

I have two models, Keyword and Company, associated using has_and_belongs_to_many.

On the Keyword model, there is a boolean field included. If this is set to true, any Company associated with that Keyword should be considered included, unless it has any associated Keywords with included set to false. included can also be set to nil which I consider a state of "empty".

To summarize:

  • Company A has 2 associated Keywords, 1 included = true and 1 included = nil. This company is INCLUDED.
  • Company B has 2 associated Keywords, 1 included = false and 1 included = true. This company is EXCLUDED.
  • Company C has 2 associated Keywords, both included = nil. This company is EMPTY.

What is the best way to 1) count the number of Included, Excluded, and Empty companies, and 2) query/scope the Company model to Included, Excluded, or Empty?

The current solution I have hacked together is causing expensive queries that are resulting (usually) in request timeouts. Models follow:

class Company < ApplicationRecord
  has_and_belongs_to_many :keywords

  scope :included, -> { joins(:keywords).merge(Keyword.included).group('id').reorder('') }
  scope :excluded, -> { joins(:keywords).merge(Keyword.excluded).group('id').reorder('') }
  scope    :empty, -> { joins(:keywords).merge(Keyword.empty).group('id').reorder('') }
end
class Keyword < ApplicationRecord
  has_and_belongs_to_many :companies

  scope :excluded, -> { where(included: false) }
  scope :included, -> { where(included: true) }
  scope    :empty, -> { where(included: nil) }
end

(reorder is included in Company model to resolve a quirk with pg_search gem and grouping)



Solution 1:[1]

I´m always trying to avoid joins because they are creating duplicates that you´d have to remove with a group('id').

I´d rather use EXISTS.

class Company < ApplicationRecord
  has_and_belongs_to_many :keywords

  scope :included, -> do
    where(
      Keyword
        .included
        .joins("INNER JOIN companies_keywords ON companies_keywords.keyword_id = keywords.id AND company_id = companies.id")
        .arel.exists
    ).where.not(
      Keyword
        .excluded
        .joins("INNER JOIN companies_keywords ON companies_keywords.keyword_id = keywords.id AND company_id = companies.id")
        .arel.exists
    )
  end

  scope :excluded, -> do
    where(
      Keyword
        .excluded
        .joins("INNER JOIN companies_keywords ON companies_keywords.keyword_id = keywords.id AND company_id = companies.id")
        .arel.exists
    )
  end

  scope :empty, -> do
    where.not(
      Keyword
        .joins("INNER JOIN companies_keywords ON companies_keywords.keyword_id = keywords.id AND company_id = companies.id")
        .where(included: [true, false])
        .arel.exists
    )
  end
end

Also i recently discovered this Gem which simplifies exists queries. I find that particularly helpful. You could just write:

scope :included, -> { where_exists(:keywords, &:included).where_not_exists(:keywords, &:excluded) }

I didn´t test the performance on these but it should be pretty solid (as long as you added the right indexes).

Hope this answers your questions.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1