'How might I join two unrelated Django models via distance?

I'm working on a Django project that requires me to report, via web service, for each item in the School table all Hospitals that just so happen to be a set distance away. Here is a dummy implementation of the two models:

class School(models.Model):
    location = models.PointField(srid=4326, geography=True)
    ...some other fields...
    objects = models.GeoManager()

class Hospital
    location = models.PointField(srid=4326, geography=True)
    ...some other fields...
    objects = models.GeoManager()

I will state for the record that this is a contrived example, but the point remains that:

  1. Neither model has a foreign key relationship with the other (nor should they)
  2. The only relationship is via distance

Now, I am using Django Rest Framework (DRF) to render my requests, and I am using the Serializer classes that it provides to do something along these lines:

class SchoolSerializer(serializers.ModelSerializer):

    nearby_hospitals = serializers.SerializerMethodField('get_nearby_hospitals')

    class Meta:
        model = School
        fields = ('location', 'nearby_hospitals',)

    def get_nearby_hospitals(self, obj):

        geom = obj.location

        try:

            locations = Hospitals.objects.filter(loc__dwithin=(geom, 10000))
            return HospitalSerializer(locations, many=True).data
        except:
            return

This works but it is not efficient. In essence what happens is that DRF loads all Schools, and then from there loops over each school and runs the query in get_nearby_hospitals. The number of database queries is equal to the number of items in Schools plus one (to get the Schools).

Ideally I'd like a Django solution that performs something along these lines (obviously with the fields listed out, aliases for clashing ID column names, etc):

SELECT * FROM schools JOIN 
hospitals ON ST_DWithin(schools.location, hospitals.location, 10000)

The above query produces the intersection of all Schools and Hospitals within the proper distance. I could merge these results in manually with the School.objects.all() queryset or write a better query and call the raw QuerySet method to get what I'm after in one shot.

Is there a better or more 'Django Way' solution to this problem?



Solution 1:[1]

I would recommend using GraphQL/DataLoaders to solve this N+1 problem as DRF is not well suited for it. However, this can be done in DRF by using a custom ListSerializer (django creates one for you behind the scenes automatically) and overloading the to_representation functions of both List/Child serializers:

class SchoolListSerializer(serializers.ListSerializer):
    def to_representation(self, data):
        iterable = data.all() if isinstance(data, models.Manager) else data
        keys = [(item.id, item.location) for item in iterable]
        # perform your bulk nearby query here using the id/locations
        # the result should be a dict with school id keys and list of serialized hospital values
        # {1: [hospital.to_dict(), hospital.to_dict(), ...], ...}
        nearby_by_id = ...
        extra = {
            'nearby_hospitals_by_id': nearby_by_id
        }
        return [
            self.child.to_representation(item, **extra) for item in iterable
        ]

class SchoolSerializer(serializers.ModelSerializer):

    class Meta:
        model = School
        fields = ('location', )
        list_serializer_class = SchoolListSerializer

    def to_representation(self, instance, **extra):
        response = super().to_representation(instance)
        response['nearby_hospitals'] = extra['nearby_hospitals_by_id'][instance.id]
        return response

I left out the actual DB query, however the example will work with any queries regardless of where the data resides.

Solution 2:[2]

I'm not sure it's better, but I think you are taking the problem in the reverse direction. I see no reason for any JOIN at all.

class SchoolSerializer(serializers.ModelSerializer):

    nearby_hospitals = serializers.SerializerMethodField('get_nearby_hospitals')

    class Meta:
        model = School
        fields = ('location', 'nearby_hospitals',)

    def get_nearby_hospitals(self, obj):

        geom = obj.location

        try:
            # why do you query School model if you want nearby hospitals?
            locations = Hospital.objects.filter(location__distance_lte=(geom ,10000))
            return locations # not sure it's the good type to return though
        except:
            return

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Bemis
Solution 2 Benjamin Toueg