Steering Django Towards an Async World

June 05, 2020 by Susmitha Gudapati

When Django's newest upgrade started making the rounds recently, one buzzword you may have heard was 'ASGI'. If you are tech-savvy and have lived with WSGI servers all along, you were probably curious about what it is and what it has in its arsenal.

With the same curiosity, I delved into Django’s source code and found some fascinating stuff. But before we dig deeper into these, let’s first see why Django has chosen to go async after all these years. Here are a few reasons:

  • Django 2.1 was the first version to require Python 3.5 or later, the release that introduced native async/await syntax. Earlier versions of Django also had to run on older Pythons, so they couldn't support async-native code.
  • High-concurrency workloads and large parallelizable queries are becoming the crux of web applications today.

The Motivation Behind Adding Async to Django

Asynchronous programming has become a mainstream approach over time, driven by the requirements of the software world. However, concurrency through Python's threads has remained inefficient, even though async support arrived with the asyncio library in Python 3.4. With this limitation and modern requirements in mind, Django core contributor Andrew Godwin developed django-channels, an async framework built around WebSockets. This brought in some async support and led to the vision of native async support for Django.

Adding async support to Django is meant to make it more performant and to let users run tasks concurrently without relying on inefficient threading. Because Django has been synchronous for so long, its community has had little chance to explore what asynchronous programming can offer. So, in addition to improving performance, the work also aims to unlock new capabilities for the community in the async world!

Phase 1 of Async Django: Introducing ASGI

The transition to async Django is planned for release in three phases. The first phase shipped with Django 3.0 in December 2019, and its main objective was to let Django communicate with asynchronous web servers. ASGI is an asynchronous sibling to WSGI: it covers everything that WSGI does and extends support to long-poll HTTP and WebSocket connections.

Django's new ASGI support includes ASGIHandler, an application object responsible for taking input events and sending the output back, much like WSGIHandler. It also comes with a separate request class, ASGIRequest, because much of the logic that maps incoming data onto Django's request.META lives there. Once the request is initialised, get_response is called, which is shared between ASGI and WSGI. This ensures that most of Django's logic remains the same while the initialisation of the request is asynchronous. This is what a basic application looks like in each protocol:

WSGI

def application(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    # The body must be an iterable of bytes chunks, not a bare bytes object
    return [b"Hello, World"]

ASGI

async def application(scope, receive, send):
    # Header names and values are both bytes in ASGI
    await send({"type": "http.response.start",
      "status": 200, "headers": [(b"content-type", b"text/plain")]})
    await send({"type": "http.response.body", "body": b"Hello World"})
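
To see the difference in calling style, both kinds of application can be exercised by hand, without any server. The sketch below is only an illustration: the helper names call_wsgi and call_asgi, and the minimal environ and scope dictionaries, are assumptions made for this example and leave out most of what a real server supplies.

```python
import asyncio


def wsgi_app(environ, start_response):
    # WSGI: one synchronous call per request; the body is an iterable of bytes.
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Hello, World"]


async def asgi_app(scope, receive, send):
    # ASGI: a coroutine that exchanges event dictionaries with the server.
    await send({
        "type": "http.response.start",
        "status": 200,
        "headers": [(b"content-type", b"text/plain")],
    })
    await send({"type": "http.response.body", "body": b"Hello, World"})


def call_wsgi(app):
    """Play the role of a WSGI server for a single GET request."""
    captured = {}

    def start_response(status, headers):
        captured["status"] = status
        captured["headers"] = headers

    environ = {"REQUEST_METHOD": "GET", "PATH_INFO": "/"}
    body = b"".join(app(environ, start_response))
    return captured["status"], body


def call_asgi(app):
    """Play the role of an ASGI server for a single GET request."""
    sent = []

    async def receive():
        # An empty request body, delivered as a single event.
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        sent.append(message)

    scope = {"type": "http", "method": "GET", "path": "/"}
    asyncio.run(app(scope, receive, send))
    return sent


wsgi_status, wsgi_body = call_wsgi(wsgi_app)
asgi_messages = call_asgi(asgi_app)
print(wsgi_status, wsgi_body)
print([message["type"] for message in asgi_messages])
```

The WSGI application returns its whole response from one function call, while the ASGI application emits separate events over time, which is what leaves room for long-lived connections.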

Phase 2 of Async Django: Views & Middleware

The goal of this phase is to make Django handle middlewares, views in HTTP, and request objects asynchronously. Synchronous views are the cornerstone of Django, and pretty much everything is implemented in this layer. There are forms, templates, database calls, and all of the business logic being executed here. So adding async support to views is believed to be the trickiest and the most meticulous change.

Views

While ASGI ensures that request initialization is asynchronous, this phase makes request processing and response handling asynchronous as well. The BaseHandler, common to WSGI and ASGI, gains an asynchronous path and is responsible for handling both sync and async views. Calling an async view from that path is straightforward because the environment is already asynchronous. To call a sync view from it, however, the call has to cross from async to sync, so the view is wrapped with the sync_to_async adapter from the asgiref library and runs in a thread. In the opposite direction, an async view served under WSGI is wrapped with async_to_sync.
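
The dispatch idea can be sketched with the standard library alone. This is not Django's BaseHandler: get_response_async here is a hypothetical stand-in, requests are plain dictionaries, and loop.run_in_executor plays the part of the asgiref adapter that lets a blocking view run without stalling the event loop.

```python
import asyncio
import inspect
import threading


def sync_view(request):
    # A blocking view; also reports the thread it ran on.
    return f"sync view for {request['path']}", threading.get_ident()


async def async_view(request):
    # A native coroutine view; runs on the event loop thread.
    await asyncio.sleep(0)
    return f"async view for {request['path']}", threading.get_ident()


async def get_response_async(view, request):
    """Hypothetical handler: await coroutine views, offload sync ones."""
    if inspect.iscoroutinefunction(view):
        return await view(request)
    loop = asyncio.get_running_loop()
    # Stand-in for the asgiref adapter: run the blocking view in a worker thread.
    return await loop.run_in_executor(None, view, request)


async def main():
    request = {"path": "/hello/"}
    loop_thread = threading.get_ident()
    sync_result = await get_response_async(sync_view, request)
    async_result = await get_response_async(async_view, request)
    return loop_thread, sync_result, async_result


loop_thread, sync_result, async_result = asyncio.run(main())
print(sync_result[0], "| ran off the loop thread:", sync_result[1] != loop_thread)
print(async_result[0], "| ran on the loop thread:", async_result[1] == loop_thread)
```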

Declaring an async view is simple: add the async keyword right before the view declaration. For function-based views, async def is sufficient. To achieve the same with class-based views, declare the __call__ method as async. Once declared async, these callables return coroutines when called, which Django then awaits.
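
A minimal sketch of both declaration styles follows. To keep it runnable without a Django project, the request is a plain dictionary and the views return strings; in a real application these would be an HttpRequest and an HttpResponse. The names hello and HelloView are made up for the example.

```python
import asyncio
import inspect


async def hello(request):
    # Function-based view: `async def` is all that is needed.
    return f"Hello, {request['user']}"


class HelloView:
    # Callable-class view: the async part is __call__, not __init__.
    def __init__(self, greeting):
        self.greeting = greeting

    async def __call__(self, request):
        return f"{self.greeting}, {request['user']}"


request = {"user": "django"}  # stand-in for an HttpRequest
view = HelloView("Hi")

pending = hello(request)  # calling the view does not run its body yet
is_coroutine = inspect.iscoroutine(pending)
function_result = asyncio.run(pending)
class_result = asyncio.run(view(request))
print(is_coroutine, function_result, class_result)
```

Calling either view only creates a coroutine object; the body runs when something awaits it, which in Django is the handler's job.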

Middlewares

Now that Django supports both sync and async middlewares, the transition between them becomes critical. We need to be cautious when switching context in order to avoid deadlocks and infinite loops. To handle this, Django relies on two adapter functions: sync_to_async and async_to_sync. We will explore these further below.
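
One way a middleware can avoid a context switch altogether is to match the calling style of whatever it wraps. The sketch below shows that pattern in plain Python, assuming dictionary responses and a hypothetical tagging_middleware factory; how a middleware declares its capabilities to Django itself is left out.

```python
import asyncio
import inspect


def tagging_middleware(get_response):
    """Return an async or a sync middleware to match the next layer."""
    if inspect.iscoroutinefunction(get_response):
        async def middleware(request):
            response = await get_response(request)
            response["X-Mode"] = "async"
            return response
    else:
        def middleware(request):
            response = get_response(request)
            response["X-Mode"] = "sync"
            return response
    return middleware


def sync_view(request):
    return {"body": "from sync view"}


async def async_view(request):
    return {"body": "from async view"}


# The same factory wraps either kind of view without an adapter in between.
sync_response = tagging_middleware(sync_view)({})
async_response = asyncio.run(tagging_middleware(async_view)({}))
print(sync_response)
print(async_response)
```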

Phase 3 of Async Django: ORM, Caching & More

When a queryset is triggered in Django, it first builds a basic query object which is compiled into SQL by the compiler, and then the connection is established with the database. This is how a basic queryset is executed in Django.
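
The three stages described above can be mimicked with a toy example. None of this is Django's ORM code: the Query class, compile_query and execute are hypothetical names, and SQLite's in-memory database stands in for the real backend. The point is only that building the query, compiling it, and touching the database are separate steps.

```python
import sqlite3
from dataclasses import dataclass, field


@dataclass
class Query:
    """Toy query description: a table plus equality filters."""
    table: str
    filters: dict = field(default_factory=dict)


def compile_query(query):
    """Turn the query object into SQL text and a parameter list."""
    # Table and column names are trusted here; only values are parameterised.
    sql = f"SELECT id, title FROM {query.table}"
    if query.filters:
        where = " AND ".join(f"{column} = ?" for column in query.filters)
        sql += f" WHERE {where}"
    sql += " ORDER BY id"
    return sql, list(query.filters.values())


def execute(connection, query):
    """Only at this point does the query reach the database."""
    sql, params = compile_query(query)
    return connection.execute(sql, params).fetchall()


connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE post (id INTEGER PRIMARY KEY, title TEXT, author TEXT)")
connection.executemany(
    "INSERT INTO post (title, author) VALUES (?, ?)",
    [("Async views", "ana"), ("ASGI basics", "ben"), ("ORM notes", "ana")],
)

query = Query("post", {"author": "ana"})  # 1. build the query object
sql, params = compile_query(query)        # 2. compile it to SQL
rows = execute(connection, query)         # 3. run it on a connection
print(sql, params)
print(rows)
```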

Now, the idea is to write an asynchronous API for this queryset and make sure that database calls and transactions work safely in the synchronous environment. Once this async queryset API is established, the plan is to tackle the rest of the flow as this would require handling the entire database backend in async. This is undoubtedly the biggest overhaul of this project.

Besides making the ORM completely async capable, the goal is to extend async support to more elements of Django in the future, including templates, some cache operations, test clients, sessions, authentication, admin and signals. You may wonder why forms aren't considered for async support even though they play an important role during application development. The main reason is that forms are mostly CPU-bound, so there is believed to be little advantage in making them async capable.


Now, let's delve into async adapter functions and understand a bit more about them.

Async Adapter Functions

The two core functions in async Django are sync_to_async and async_to_sync. They come in handy when transitioning between calling styles, async to sync or sync to async, while preserving compatibility. The adapters also propagate exceptions across the boundary: if asynchronous code raises an exception, the synchronous caller can catch it, and vice versa. These functions are used extensively in Django and are imported from the asgiref.sync module. The asgiref package has been a Django dependency since version 3.0 and is installed automatically when Django is installed via pip.
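
Exception propagation across the boundary can be illustrated with the standard library. This sketch does not use asgiref: asyncio.run and loop.run_in_executor are rough analogues of async_to_sync and sync_to_async, chosen so that the example runs without extra packages, and all function names are made up.

```python
import asyncio


async def failing_coroutine():
    raise ValueError("raised in async code")


def failing_function():
    raise KeyError("raised in sync code")


def sync_caller():
    # Sync code that runs a coroutine to completion sees its exception.
    try:
        asyncio.run(failing_coroutine())
    except ValueError as exc:
        return str(exc)


async def async_caller():
    # Async code awaiting a function run in a worker thread sees its exception.
    loop = asyncio.get_running_loop()
    try:
        await loop.run_in_executor(None, failing_function)
    except KeyError as exc:
        return exc.args[0]


caught_in_sync = sync_caller()
caught_in_async = asyncio.run(async_caller())
print(caught_in_sync)
print(caught_in_async)
```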

async_to_sync:

As the name suggests, it takes an async function and returns a sync function.

from asgiref.sync import async_to_sync

# As a direct wrapper around an existing coroutine function
sync_function = async_to_sync(async_function)

# As a decorator
@async_to_sync
async def async_function(...):
    ...

sync_to_async:

It turns a synchronous callable into an awaitable that runs in a thread pool.

from asgiref.sync import sync_to_async

# As a direct wrapper
async_function = sync_to_async(sync_function)
# thread_sensitive=True runs the function in the same thread as all other thread-sensitive code
async_function = sync_to_async(sensitive_sync_function, thread_sensitive=True)

# As a decorator
@sync_to_async
def sync_function(...):
    ...

@sync_to_async(thread_sensitive=True)
def sensitive_sync_function(...):
    ...

Both of these can be used as either a direct wrapper or a decorator. If you look at the source code, you will see that they are classes in the sync module exposed under function-style names for ease of use. This is just a basic introduction to the adapter functions and how they are used in Django. You can go through the asgiref library to learn more about how threads, event loops, and contextvars are invoked and managed.
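
To build some intuition for the thread side of this, the sketch below runs a blocking function from async code on a pool with a single worker. It is a loose analogy for the thread_sensitive=True option seen earlier, not asgiref's implementation: the pool size, the function names, and the use of loop.run_in_executor are all assumptions of the example.

```python
import asyncio
import threading
from concurrent.futures import ThreadPoolExecutor


def current_thread_id():
    # Stand-in for blocking work such as a database call.
    return threading.get_ident()


async def main():
    loop = asyncio.get_running_loop()
    loop_thread = threading.get_ident()

    # One shared worker: every call lands on the same thread,
    # loosely mirroring what thread-sensitive code relies on.
    with ThreadPoolExecutor(max_workers=1) as shared:
        pinned = [
            await loop.run_in_executor(shared, current_thread_id)
            for _ in range(3)
        ]

    return loop_thread, pinned


loop_thread, pinned = asyncio.run(main())
print("calls shared one worker thread:", len(set(pinned)) == 1)
print("worker differs from the event loop thread:", pinned[0] != loop_thread)
```

Pinning calls to one thread matters for code that keeps per-thread state, such as database connections, at the cost of running those calls one after another.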

Conclusion

Async views can run under both ASGI and WSGI. But there will be performance penalties due to context-switching if they are executed under WSGI rather than under ASGI with an entirely async-enabled request stack. Moreover, the other benefits that async views bring, like slow streaming and long-polling, are only achieved when Django is deployed using ASGI.

Similarly for middlewares, though both sync and async are supported, context switching between them results in a performance penalty. Consider a case with an ASGI server, a synchronous middleware, and asynchronous views - Django has to switch to sync mode for middleware and back to an async mode for views. Meanwhile, it also has to hold the synchronous thread open for exceptions raised by the middleware.

So, though you have the liberty to use both sync and async features under WSGI, it is always a better idea to enable ASGI mode if there is any asynchronous code in the application.
