The N+1 problem is a common database performance issue. It plagues ORM’s, such as Django and SQLAlchemy, because it leads to your application making more database queries than necessary.

Let’s look at a basic example in Django.

class Artist(models.Model):
    name = models.CharField(max_length=255)

class Song(models.Model):
    artist = models.ForeignKey(Artist, on_delete=models.CASCADE)
    name = models.CharField(max_length=255)

def print_songs():
    for song in Song.objects.all():
        print(f"{song.artist.name} - {song.name}")

Now let’s create a unit test to ensure print_songs runs successfully. We’re using pytest and pytest-django so we can create the test data using a pytest fixture. When we run the unit test, you’ll notice it passes and our songs are printed to the terminal. Everything seems fine on the surface.

@pytest.fixture()
def make_data():
    artist = Artist.objects.create(name="Foo Fighters")
    for i in range(100):
        Song.objects.create(artist=artist, name=f"Song {i + 1}")

@pytest.mark.django_db()  # permits db access
def test_print_songs(make_data):
    print_songs()

# Foo Fighters - Song 1
# Foo Fighters - Song 2
# Foo Fighters - Song 3
# ...

However, if we count the number of database queries, we’ll see a total of 101. Yikes!

Note, connection.queries is only available if DEBUG=True. We can set this globally for all tests by setting django_debug_mode = true in our pytest.ini configuration file (docs).

from django.db import connection

@pytest.mark.django_db()
def test_print_songs(make_data):
    print_songs()
    print(len(connection.queries))

# ...
# 101

Why are we making so many queries? We make a query to fetch all Song objects in the database. Then, for every song, we make a query to fetch its Artist object.

Django’s select_related and prefetch_related queryset methods can easily fix this. Song to Artist is a Many-to-One relationship so we use select_related.

def print_songs():
    for song in Song.objects.select_related("artist"):
        print(f"{song.artist.name} - {song.name}")

If we run the unit test against our refactored function, we’re down to a single database hit. Nice!

@pytest.mark.django_db()
def test_print_songs(make_data):
    print_songs()
    print(len(connection.queries))

# ...
# 1

Now, how can we write our unit test to catch this blunder? I’ve traditionally used two strategies.

Assert query count

We can assert the number of db hits at the end of the unit test.

@pytest.mark.django_db()
def test_print_songs(make_data):
    print_songs()
    assert len(connection.queries) == 1

This approach works but doesn’t scale well. Manually counting the expected number of queries and adding the assertion to every unit test is unreasonable.

nplusone

The nplusone package automatically finds N+1 violations for you. It comes with a generic profiler which raises an NPlusOneError when a violation is discovered. We can create a fixture in our conftest.py file to wrap all of our tests in this profiler.

# conftest.py
import pytest
from nplusone.core.profiler import Profiler

@pytest.fixture(autouse=True)
def _raise_nplusone(request):
    if request.node.get_closest_marker("skip_nplusone"):
        yield
    else:
        with Profiler():
            yield

Here’s the exception message for our old version of print_songs that did NOT use select_related.

nplusone.core.exceptions.NPlusOneError: Potential n+1 query detected on Song.artist

There may be cases where you want to opt-out of the N+1 check. For example, the package may report a false positive or you don’t want to fix a specific violation. That’s the purpose of the skip_nplusone marker.

May your database access be optimal.