In Django, a related object is a model instance used in the one-to-many or many-to-many context.
For example, let’s look at the built-in User
model, which has a many-to-many relationship with the Group
model.
class User(models.Model):
groups = models.ManyToManyField(Group, related_name="groups")
For any given User
object, all linked Group
objects are called “related objects”. Here are 5 ways to fetch all User
objects with at least one related Group
object.
Iterate over each object in Python
users = []
for user in User.objects.prefetch_related("groups"):
if user.groups.exists():
users.append(user)
Probably the most popular approach but there are two problems:
- Doesn’t return a queryset so we can’t do any more work in the database, which isn’t ideal for performance.
- Makes a total of two database queries because prefetch_related does a separate lookup for each relationship.
Let’s see some other query count examples with prefetch_related
:
User.objects.all() # 1 query
User.objects.prefetch_related("groups") # 2 queries
User.objects.prefetch_related("groups__permissions") # 3 queries
User.objects.prefetch_related("groups", "user_permissions") # 3 queries
All other methods in this list make only one database query.
isnull field lookup and distinct()
users = User.objects.filter(groups__isnull=False).distinct()
Be sure to remember to use distinct()
. Without it, each User
is duplicated n
times, where n
is the number of groups the user belongs to. Definitely a little hacky. I wouldn’t recommend using it in practice but it’s an interesting approach to understand.
queryset.annotate() and Count()
from django.db.models import Count
users = (
User.objects
.annotate(group_count=Count("groups"))
.filter(group_count__gt=0)
)
Readable and makes a single database query. The only con is it’s slightly slower than the next two solutions because it counts all groups instead of stopping at the first one.
queryset.annotate() and Exists() subquery
from django.db.models import Exists, OuterRef
users = (
User.objects
.annotate(has_group=Exists(Group.objects.filter(user=OuterRef("pk"))))
.filter(has_group=True)
)
Very fast. Exists()
will stop once a single record is found. Per the Django documentation:
Exists is a Subquery subclass that uses an SQL EXISTS statement. In many cases it will perform better than a subquery since the database is able to stop evaluation of the subquery when a first matching row is found.
Exists() subquery
users = (
User.objects
.filter(Exists(Group.objects.filter(user=OuterRef("pk"))))
)
Same solution as above but slightly better if we don’t want to return the has_group
field. This is called the conditional filter:
When a conditional expression returns a boolean value, it is possible to use it directly in filters. This means that it will not be added to the SELECT columns, but you can still use it to filter results
May your Django queries be clean and performant.