How to build a Python backend?
Part 1: internal architecture

So you want to create a full-featured web application and you're wondering if you should use a large framework like Django or something more minimal like Flask? But what if you really need something in the middle? What if you want something simpler than Django because your frontend uses a technology like React or Angular? What if you need more than just a Web API like the one you can build with Flask because your app handles complex business logic and/or interacts with other systems asynchronously?

This article describes how we've built this kind of backend service with the following principles in mind:

  • Easy to maintain architecture
  • Ready-for-production
  • Taking full advantage of asyncio for the API, business logic, and interaction with third-party systems

Prerequisites: a recent Python (3.9 or later) with pip should be enough 😃

1. Domain Driven Design

First, let's talk about architecture!

There is a lot to learn from architecture design when you want to build real-world applications. When you look at most of the code examples provided by Flask or FastAPI, you get a very simple application with a REST API and a single simple handler per endpoint. In real applications you want to separate your business logic from the API calls so you can interact with the app via other channels such as a GraphQL API or RabbitMQ messages. You also need to deal with one or more storage systems (a database, a caching layer, an object storage service, a secret store) and with more complex systems like cloud provider APIs, Kubernetes, etc.

To properly implement separation of concerns and abstract interactions with other systems, Domain Driven Design (DDD) concepts provide a nice toolbox to look into. The book Architecture Patterns with Python (available online) is a gold mine for understanding how to implement a DDD architecture in Python. It provides tons of step-by-step examples for every concept so you can understand why you should or shouldn't apply them. This is a must-read, and most of what is presented here is based on this book.

So we'll walk you through an architecture composed of 3 layers: Domain, Application, and Infrastructure. The Domain layer defines the data structures as plain Python objects: the business objects. The Application layer holds the brain of the App: the business logic. Finally, the Infrastructure layer is the "arms and legs" of our App: the part that interacts with the external world (HTTP API, database, file system, servomotors, etc.).

So let's create our application's skeleton with the wonderful poetry:

mkdir myapp
cd myapp
pip install poetry
poetry init
mkdir -p myapp/application
mkdir myapp/domain
mkdir myapp/infrastructure

You should have something like:

├── myapp
│   ├── application
│   ├── domain
│   └── infrastructure
└── pyproject.toml

1.1 Domain

The Domain layer is a model representation of the services. It really is the core of our services and it must be able to evolve fast. This layer doesn't depend on any other layer (following the dependency inversion principle) and imports no external libraries: barring justified exceptions, it consists only of plain Python code.

A domain object is a dataclass defining a business object. Most methods of these dataclasses are helpers that manipulate the dataclass's state. Some of these classes are abstract classes, implemented by classes in the Infrastructure layer.

Methods of these classes can return Domain objects, states ("something went wrong", "no problem here", "only steps 1 and 3 worked"…), or nothing.
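To make the idea of returning a state more concrete, here is a minimal sketch. The names TagResult and TaggedItem are hypothetical and are not part of the todo app presented in this article; they only illustrate a domain method that changes the object's state and reports what happened.

```python
from dataclasses import dataclass, field
from enum import Enum


class TagResult(Enum):
    """Hypothetical outcome states for a tagging operation."""

    ADDED = "added"
    ALREADY_PRESENT = "already present"


@dataclass
class TaggedItem:
    """Hypothetical domain object relying only on the standard library."""

    content: str
    tags: set[str] = field(default_factory=set)

    def set_tag(self, tag: str) -> TagResult:
        # The domain object owns its state change and tells the caller what happened.
        if tag in self.tags:
            return TagResult.ALREADY_PRESENT
        self.tags.add(tag)
        return TagResult.ADDED


item = TaggedItem(content="write the article")
first = item.set_tag("blog")   # the tag is new
second = item.set_tag("blog")  # the tag is already there
```

The caller (typically an Application service) can then decide what to do with each state without touching the object's internals.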

The general rule is to put as much logic as possible in this layer.

For example, here is an object that represents an entry in our todo app. And yes, our example will be a todo app! (as we all do ^^).

import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class TodoEntry:
    id: str
    created_at: datetime
    content: str
    tags: set[str] = field(default_factory=set)

    @classmethod
    def create_from_content(cls, content: str) -> "TodoEntry":
        # Generate the id and the timestamp here so callers only provide the content.
        return cls(
            id=str(uuid.uuid4()),
            created_at=datetime.now(timezone.utc),
            content=content,
        )

    def set_tag(self, tag: str) -> None:
        # Tags are stored in a set, so adding the same tag twice has no effect.
        self.tags.add(tag)

Did you notice that we heavily use Python type hints? This is a really good way to get something working quickly and with confidence. We strongly advise you to use them and to enforce type checking in the CI so you won't have surprises at execution time.


1.2 Infrastructure

The Infrastructure layer manages all the interactions with external systems such as databases, file systems, networks, APIs, etc.

These services act as "wrappers" around external dependencies so that they can be used within the Application layer.

1.2.1 The repository pattern

This is also where Repositories live. A repository is simply a class that abstracts the persistence of an object. It provides at least add and get functions, giving a single way to store and retrieve data from storage systems. We can start with Pickle file storage until we hit performance limits that tell us to switch to an SQL database or something else. Because the rest of the code only depends on the repository interface, that switch does not require changing a single line of code in our Application or Domain layers.

For example, here is a 'Todo entries' repository using the Pickle library to serialize objects into files:

import pickle
from dataclasses import dataclass
from pathlib import Path

from myapp.domain.todo import TodoEntry
from myapp.domain.todo_entry_repository import ITodoEntryRepository


class TodoEntryNotFound(Exception):
    pass


@dataclass
class TodoEntryPickleRepository(ITodoEntryRepository):
    storage_dir: str

    def get(self, entry_id: str) -> TodoEntry:
        try:
            # Pickle files must be opened in binary mode.
            with open(Path(self.storage_dir) / entry_id, "rb") as entry_file:
                entry: TodoEntry = pickle.load(entry_file)
            return entry
        except Exception as err:
            raise TodoEntryNotFound(entry_id) from err

    def add(self, entry: TodoEntry) -> None:
        # One file per entry, named after the entry id.
        with open(Path(self.storage_dir) / entry.id, "wb") as entry_file:
            pickle.dump(entry, entry_file)

Note that we implement an abstract class in the Domain layer. This allows us to import the repository interface from the Application layer without knowing what the actual implementation is.
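The interface itself is not shown in this article, so here is a sketch of what it could look like. Its exact content is an assumption: get and add come from the Pickle repository above, and get_all mirrors the call made by the application service. The in-memory implementation and the trimmed-down TodoEntry are hypothetical and only there to keep the example self-contained and to show that two implementations can sit behind the same interface.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TodoEntry:
    # Trimmed-down copy of the domain object, to keep the sketch self-contained.
    id: str
    content: str
    tags: set[str] = field(default_factory=set)


class TodoEntryNotFound(Exception):
    pass


class ITodoEntryRepository(ABC):
    """Assumed shape of the interface defined in the Domain layer."""

    @abstractmethod
    def get(self, entry_id: str) -> TodoEntry: ...

    @abstractmethod
    def add(self, entry: TodoEntry) -> None: ...

    @abstractmethod
    def get_all(self, search: Optional[str] = None) -> list[TodoEntry]: ...


@dataclass
class TodoEntryInMemoryRepository(ITodoEntryRepository):
    """Hypothetical implementation that keeps entries in a dict."""

    entries: dict[str, TodoEntry] = field(default_factory=dict)

    def get(self, entry_id: str) -> TodoEntry:
        try:
            return self.entries[entry_id]
        except KeyError as err:
            raise TodoEntryNotFound(entry_id) from err

    def add(self, entry: TodoEntry) -> None:
        self.entries[entry.id] = entry

    def get_all(self, search: Optional[str] = None) -> list[TodoEntry]:
        # Without a search term, return everything; otherwise match on the content.
        return [
            entry
            for entry in self.entries.values()
            if search is None or search in entry.content
        ]


repo = TodoEntryInMemoryRepository()
repo.add(TodoEntry(id="1", content="buy milk"))
repo.add(TodoEntry(id="2", content="write tests"))
matches = repo.get_all("milk")
```

Code that is typed against ITodoEntryRepository works with either implementation, which is also convenient for tests.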

1.3 Application

Now that we have the Domain that contains the business object as well as our Repository to manage persistence of this object in the Infrastructure layer, we need to glue them together with our business logic.
The Application layer contains all the services provided by the application, using the Domain structures and the Infrastructure as a backend.

These Application services "orchestrate" the Domain's structures and the Infrastructure services so that they work together harmoniously.

Application data should not be modified directly here; that is the job of the methods of the Domain layer classes. Instead, Application services catch exceptions and call the domain objects' methods to apply the right business rules.

For example we can have a TodoService like this one:

from dataclasses import dataclass
from typing import Optional

from myapp.domain.todo import TodoEntry
from myapp.domain.todo_entry_repository import ITodoEntryRepository


@dataclass
class TodoService:
    todo_repository: ITodoEntryRepository

    def add_entry(self, content: str) -> str:
        entry = TodoEntry.create_from_content(content)
        self.todo_repository.add(entry)
        return entry.id

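    def add_tag(self, entry_id: str, tag: str) -> None:
        entry = self.todo_repository.get(entry_id)
        entry.set_tag(tag)
        # Save the entry again so the new tag is persisted by the repository.
        self.todo_repository.add(entry)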

    def get_all(self, search: Optional[str] = None) -> list[TodoEntry]:
        return self.todo_repository.get_all(search)

Wait! When was this todo_repository created, and by whom? It's now time to talk about Dependency Injection.

1.4 Dependency injection

The goal of dependency injection is to avoid creating objects everywhere or passing them in all functions in some kind of Context melting pot. To do so we'll define where all Infrastructure services are created, in one single place. We can then easily inject these services as dependencies of Application services using a default value as a singleton (e.g. for a database connection) or a one-time object from a factory (e.g. for an HTTP request handler).
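Before reaching for a library, the difference between a singleton and a factory can be sketched by hand in plain Python. Everything in this example (FakeRepository, FakeService, get_repository, get_service, the storage path) is hypothetical and only illustrates the idea of creating dependencies in one single place.

```python
from dataclasses import dataclass, field
from functools import lru_cache


@dataclass
class FakeRepository:
    """Hypothetical stand-in for an Infrastructure service."""

    storage_dir: str
    entries: list[str] = field(default_factory=list)


@dataclass
class FakeService:
    """Hypothetical stand-in for an Application service."""

    repository: FakeRepository

    def add_entry(self, content: str) -> int:
        self.repository.entries.append(content)
        return len(self.repository.entries)


@lru_cache(maxsize=None)
def get_repository() -> FakeRepository:
    # Singleton: built on the first call, then the same instance is returned.
    return FakeRepository(storage_dir="/tmp/todo")


def get_service() -> FakeService:
    # Factory: a new service on every call, each one receiving the shared repository.
    return FakeService(repository=get_repository())


service_a = get_service()
service_b = get_service()
service_a.add_entry("buy milk")
count = service_b.add_entry("write tests")  # both services see the same repository
```

Neither service creates its own repository: it is handed one from the outside, which is the whole point of dependency injection.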

The Dependency Injector library is well designed and provides everything you need to define all your services, inject them and even load configurations.

Let's install it:

poetry add dependency_injector

The configuration and the dependencies are declared in a container, in myapp/container.py:

from dependency_injector import providers, containers

from myapp.application.todo_service import TodoService
from myapp.infrastructure.database.todo_entry_repository import TodoEntryPickleRepository


class ApplicationContainer(containers.DeclarativeContainer):
    configuration = providers.Configuration()

    todo_entry_repository = providers.Singleton(
        TodoEntryPickleRepository,
        storage_dir=configuration.storage_dir
    )

    todo_service = providers.Factory(
        TodoService,
        todo_entry_repository
    )

2. Web API

Now that we have our base application, we need to create an API. FastAPI is a really nice library that helps you create your API endpoints, route them, serialize and deserialize the API objects (called models), and even generate interactive documentation pages.

Let's add it to our dependencies with:

poetry add fastapi

A clean way to organize your API is to split it into controllers, one per group of endpoints. So let's create a centralized setup file, infrastructure/api/setup.py, that aggregates the common configuration and the dependency injection for all controllers.

from fastapi import FastAPI

from myapp.container import ApplicationContainer
from myapp.infrastructure.api import todo_controller


def setup(app: FastAPI, container: ApplicationContainer) -> None:

    # Add other controllers here
    app.include_router(todo_controller.router)

    # Inject dependencies
    container.wire(
        modules=[
            todo_controller,
        ]
    )

And the controller for the /todo endpoints:

from dataclasses import asdict
from typing import Optional

from dependency_injector.wiring import Provide
from fastapi import APIRouter

from myapp.application.todo_service import TodoService
from myapp.container import ApplicationContainer
from myapp.infrastructure.api.todo_schema import TodoEntrySchema

todo_service: TodoService = Provide[ApplicationContainer.todo_service]

router = APIRouter(
    prefix="/todo",
    tags=["Todo"],
    responses={404: {"description": "Not found"}},
)


@router.get("/", response_model=list[TodoEntrySchema])
async def list_todos(search: Optional[str] = None) -> list[TodoEntrySchema]:
    todo_entries = todo_service.get_all(search)
    return [TodoEntrySchema(**asdict(todo_entry)) for todo_entry in todo_entries]

@router.post("/")
async def add_todo(content: str) -> str:
    return todo_service.add_entry(content)

Here is the todo schema used for the serialization. Note that we use a different object than the internal TodoEntry from the domain because we want to decouple it from the external API. Thus, you can change your API wording and hide internal values that are not useful for users. The schema is a Pydantic model, which relies on Python's built-in type hints. As advertised in the FastAPI documentation, it comes with plenty of advantages like static analysis with Mypy, useful IDE autocomplete, easy debugging and so on.

from pydantic import BaseModel


class TodoEntrySchema(BaseModel):
    id: str
    content: str
    tags: list[str]

3. Test it!

Note that for the sake of simplicity, we have kept testing out of the way so far... shame on us! Testability is one of the main reasons we split our code this way: every component can easily be tested either separately or in context. Let me give you an example. Our API calls the application service, which calls the repository and then returns a todo list converted from the domain objects.
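As an illustration of testing a component separately, here is a sketch of a unit test for the application service alone. The StubTodoEntryRepository test double is hypothetical, and TodoEntry and TodoService are reduced copies of the classes shown earlier so that the example runs on its own.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class TodoEntry:
    # Reduced copy of the domain object, limited to what the test needs.
    id: str
    content: str

    @classmethod
    def create_from_content(cls, content: str) -> "TodoEntry":
        return cls(id=str(uuid.uuid4()), content=content)


@dataclass
class StubTodoEntryRepository:
    """Hypothetical test double that records what the service stores."""

    added: list[TodoEntry] = field(default_factory=list)

    def add(self, entry: TodoEntry) -> None:
        self.added.append(entry)


@dataclass
class TodoService:
    # Same shape as the application service, limited to add_entry.
    todo_repository: StubTodoEntryRepository

    def add_entry(self, content: str) -> str:
        entry = TodoEntry.create_from_content(content)
        self.todo_repository.add(entry)
        return entry.id


def test_add_entry_stores_the_entry() -> None:
    repository = StubTodoEntryRepository()
    service = TodoService(todo_repository=repository)

    entry_id = service.add_entry("buy milk")

    # The service must hand exactly one entry to the repository, with the returned id.
    assert [entry.id for entry in repository.added] == [entry_id]
    assert repository.added[0].content == "buy milk"
```

No file system, database, or HTTP server is involved: the service only sees the object it was given.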

Here is a simple test for our repository in myapp/infrastructure/database/test_todo_entry_repository.py:

...


Michael Mercier

Lead Software Engineer at Ryax Technologies. Michael holds a PhD in Computer Science and is an R&D engineer with broad IT infrastructure expertise in multiple contexts: Cloud, High Performance Computing, and Big Data.