How to build a Python backend?
(Part 1: internal architecture) 2/2

3. Test it!

Note that, for the sake of simplicity, we have left testing out of the picture so far... shame on us! Testability is one of the main reasons we split our code this way: every component can easily be tested either in isolation or in context. Let me give you an example. Our API calls the application service, which calls the repository and then returns a todo list converted from the domain objects.

Here is a simple test for our repository in myapp/infrastructure/database/test_todo_entry_repository.py:

from tempfile import TemporaryDirectory

import pytest

from myapp.container import ApplicationContainer
from myapp.domain.todo import TodoEntry
from myapp.infrastructure.database.todo_entry_repository import TodoEntryPickleRepository


@pytest.fixture()
def repository():
    with TemporaryDirectory() as tmp_dir:
        container = ApplicationContainer()

        container.configuration.storage_dir.from_value(tmp_dir)
        yield container.todo_entry_repository()


def test_add_and_get(repository: TodoEntryPickleRepository):
    entry = TodoEntry.create_from_content("test")
    repository.add(entry)
    assert entry == repository.get(entry.id)

If you don't know pytest fixtures: a fixture is a function (here, a generator) whose result pytest passes to any test function that declares a parameter with the same name. It's very useful when you have some context to manage. Here, our repository uses a local directory to store our objects, so we have to create the directory before the test and delete it with all its content afterwards, even if the test fails. To do so, we use tempfile.TemporaryDirectory as a context manager: it creates the directory when we enter the with block and wipes it when we leave. With our fixture, every test function that takes the repository parameter receives, through yield, a repository backed by a fresh storage directory. Only when the test finishes does the fixture's code resume: the context is closed and the directory is deleted.
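To make the setup, yield, teardown order concrete, here is a minimal sketch that mimics the mechanism with a plain generator and no pytest at all. The names storage_dir and run_with_storage are hypothetical, and this is only a simplified imitation of what a test runner does around a fixture, not pytest's actual implementation.

```python
from pathlib import Path
from tempfile import TemporaryDirectory
from typing import Callable, Iterator


def storage_dir() -> Iterator[Path]:
    """Same shape as the fixture: setup, yield, then teardown."""
    with TemporaryDirectory() as tmp_dir:
        # Setup is done: hand the fresh directory to the caller
        yield Path(tmp_dir)
    # Leaving the with block has deleted the directory and its content


def run_with_storage(test_body: Callable[[Path], None]) -> None:
    """Hypothetical runner: drives the generator roughly like a test runner would."""
    fixture = storage_dir()
    path = next(fixture)  # runs the setup part, up to the yield
    try:
        test_body(path)
    finally:
        # Resume after the yield so the teardown runs, even if the test failed
        next(fixture, None)
```

Whether test_body returns normally or raises, the directory no longer exists once run_with_storage is done, which is the guarantee the real fixture gives to each test.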

Let's install pytest as a development dependency (so it doesn't end up in our final package) and run it. With Poetry, you can run poetry shell to get a new shell with the virtual environment activated, where all your dependencies are available, including tools like pytest:

poetry add --dev pytest
poetry shell
pytest

With this test, we find out that the repository opens its files in text mode, which is the default in Python, whereas pickle reads and writes bytes. So I've added mode="wb" to the open call in my add function and mode="rb" to the one in my get function.
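As a standalone illustration of why the mode matters, here is a small sketch using only the standard library. The helper names save, load and save_in_text_mode are hypothetical and are not the repository's actual methods.

```python
import pickle
from pathlib import Path
from tempfile import TemporaryDirectory


def save(obj: object, path: Path) -> None:
    # pickle.dump writes bytes, so the file must be opened in binary mode
    with open(path, mode="wb") as entry_file:
        pickle.dump(obj, entry_file)


def load(path: Path) -> object:
    # Same for reading: pickle.load expects a binary file
    with open(path, mode="rb") as entry_file:
        return pickle.load(entry_file)


def save_in_text_mode(obj: object, path: Path) -> None:
    # Deliberately wrong: a text-mode file rejects bytes, so this raises TypeError
    with open(path, mode="w") as entry_file:
        pickle.dump(obj, entry_file)


with TemporaryDirectory() as tmp_dir:
    entry_path = Path(tmp_dir) / "entry.pickle"
    save({"content": "test", "tags": []}, entry_path)
    restored = load(entry_path)
```

A round-trip test like test_add_and_get is exactly the kind of test that exposes this mistake, because it exercises both the write and the read path.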

4. Check them all!

A good practice when you're coding is to format and statically analyze your code before committing it to your Git repository. This is called linting. You can apply formatting, sort the imports, check for inconsistencies and errors in the code, and even check for security issues or dead code.

Here is a set of tools we use for that, all packed in a simple lint.sh script that we'll run before committing (you'll need to use slightly different options to run it in the CI, but that's another story).

echo "-- Checking import sorting"
isort .

echo "-- Checking python formatting"
black .

echo "-- Checking python with static checking"
flake8

echo "-- Checking type annotations"
mypy ./myapp --ignore-missing-imports

echo "-- Checking for dead code"
vulture ./myapp

echo "-- Checking security issues"
bandit -r ./myapp

To install all these, use poetry add --dev ... and poetry shell just like you did for pytest.
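The CI options are not spelled out in this post. As one possible reading, here is a hedged Python sketch of the same tool chain with a check-only switch, assuming that in CI you want isort and black to report problems (isort --check-only, black --check) instead of rewriting files. The function names lint_commands and run_lint are hypothetical.

```python
import subprocess


def lint_commands(check_only: bool) -> list[list[str]]:
    """Build the same commands as lint.sh; check_only is an assumed CI mode."""
    isort = ["isort", "."]
    black = ["black", "."]
    if check_only:
        # Report problems without modifying the files
        isort.insert(1, "--check-only")
        black.insert(1, "--check")
    return [
        isort,
        black,
        ["flake8"],
        ["mypy", "./myapp", "--ignore-missing-imports"],
        ["vulture", "./myapp"],
        ["bandit", "-r", "./myapp"],
    ]


def run_lint(check_only: bool = False) -> int:
    """Run every tool and return how many of them exited with an error."""
    failures = 0
    for command in lint_commands(check_only):
        print("--", " ".join(command))
        if subprocess.run(command).returncode != 0:
            failures += 1
    return failures
```

Calling run_lint(check_only=True) requires all six tools to be installed in the active environment; a missing tool makes subprocess.run raise FileNotFoundError.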

Here is an example of this script's output, coming from mypy:

myapp/infrastructure/database/todo_entry_repository.py:36: error: Unsupported operand types for in ("Optional[str]" and "str")

Here is the code:

def get_all(self, search: Optional[str]) -> list[TodoEntry]:
    entries: list[TodoEntry] = []
    for entry_file_path in Path(self.storage_dir).iterdir():
        with open(entry_file_path, mode="rb") as entry_file:
            entry: TodoEntry = pickle.load(entry_file)
            if search in entry.content or search in entry.tags:
                entries.append(entry)
    return entries

Hmm, we have an optional search parameter here, but we never check whether it is None before using the in operator on it! We have to handle the None case, so let's replace our if with:

if search:
    if search in entry.content or search in entry.tags:
        entries.append(entry)
else:
    entries.append(entry)

And now mypy is happy!

❯ mypy ./myapp --ignore-missing-imports
Success: no issues found in 16 source files
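The same filtering logic can be tried in isolation. The sketch below uses a hypothetical Entry dataclass as a stand-in for TodoEntry, with only the two fields the filter needs, and a hypothetical filter_entries function. It keeps the behaviour of the fix above, where both None and an empty string mean "no filter".

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Entry:
    """Hypothetical stand-in for TodoEntry, with only the fields used here."""

    content: str
    tags: list[str] = field(default_factory=list)


def filter_entries(entries: list[Entry], search: Optional[str] = None) -> list[Entry]:
    if not search:
        # None (or an empty string) means "no filter": keep everything
        return list(entries)
    # Past this point the type checker knows that search is a str
    return [
        entry
        for entry in entries
        if search in entry.content or search in entry.tags
    ]
```

Note that search in entry.content is a substring match, while search in entry.tags only matches a whole tag.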

This is only one example of what static checks can bring you. Use them extensively and you'll drastically reduce the number of bugs in production!

5. Application main

Now that we have a full application, we need to glue all this together in some main application function. So we'll create an app.py file at the root of our source code, loading the configuration and launching the app's main process.

import logging
from pathlib import Path

from fastapi import FastAPI
import uvicorn

from myapp.container import ApplicationContainer
from myapp.infrastructure.api.setup import setup
from myapp import __version__


def init() -> FastAPI:
    container = ApplicationContainer()

    # Setup logging
    container.configuration.log_level.from_env("TODO_APP_LOG_LEVEL", "INFO")

    str_level = container.configuration.log_level()
    numeric_level = getattr(logging, str_level.upper(), None)
    if not isinstance(numeric_level, int):
        raise ValueError("Invalid log level: %s" % str_level)
    logging.basicConfig(level=numeric_level)
    logger = logging.getLogger(__name__)
    logger.info("Logging level is set to %s" % str_level.upper())

    # init Database
    container.configuration.storage_dir.from_env("TODOAPP_STORAGE_DIR", "/tmp/todoapp")
    Path(container.configuration.storage_dir()).mkdir(parents=True, exist_ok=True)

    # Init API and attach the container
    app = FastAPI()
    app.extra["container"] = container

    # Do setup and dependencies wiring
    setup(app, container)

    # TODO add other initialization here

    return app


def start() -> None:
    """Start application"""
    # init() configures logging, so call it before emitting the first message
    app = init()
    logger = logging.getLogger(__name__)
    logger.info(f"My TODO app version: {__version__}")
    uvicorn.run(
        app, host="0.0.0.0", port=8080,
    )


if __name__ == "__main__":
    start()

You can see that we finally defined how to load the configuration. Here we use environment variables, but you can use any common configuration format supported by the configuration provider, such as YAML or INI.
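The log-level handling in init() can be isolated into a small function that is easy to test on its own. This is a minimal sketch using plain os.environ instead of the container's configuration provider. The function names resolve_log_level and log_level_from_env are hypothetical; the variable name and default are taken from the listing above.

```python
import logging
import os


def resolve_log_level(name: str) -> int:
    """Turn a level name such as "info" into its numeric logging value."""
    numeric_level = getattr(logging, name.upper(), None)
    if not isinstance(numeric_level, int):
        raise ValueError(f"Invalid log level: {name}")
    return numeric_level


def log_level_from_env(
    variable: str = "TODO_APP_LOG_LEVEL", default: str = "INFO"
) -> int:
    # Fall back to the default when the variable is not set
    return resolve_log_level(os.environ.get(variable, default))
```

Failing fast with a ValueError on an unknown level name, as init() does, avoids starting the application with a silently misconfigured logger.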

6. Run it!

At last we can run our application! We're now confident it's ready to see some real user requests.

python -m myapp.app

And in another terminal:

❯ curl localhost:8080/todo/
[]
❯ curl -H "Content-Type: application/json" -d '{"content": "my first entry"}' localhost:8080/todo/
"8b122755-530e-43c7-ae84-362c17a37fc5"
❯ curl localhost:8080/todo/
[{"id":"8b122755-530e-43c7-ae84-362c17a37fc5","content":"my first entry","tags":[]}]

It works! But the best part is the OpenAPI interactive documentation available at: http://127.0.0.1:8080/docs
There you can browse the documentation and query the backend directly.
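The curl calls above can also be made from Python. Here is a minimal sketch using only the standard library, assuming the application is running locally on port 8080 as started above. The helper names build_create_request, create_entry and list_entries are hypothetical.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080/todo/"


def build_create_request(content: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build the POST request equivalent to the curl -d call."""
    body = json.dumps({"content": content}).encode("utf-8")
    return urllib.request.Request(
        base_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def create_entry(content: str) -> str:
    # Sends the request: needs the server to be running
    with urllib.request.urlopen(build_create_request(content)) as response:
        return json.loads(response.read())


def list_entries() -> list:
    # Plain GET, like the first curl call
    with urllib.request.urlopen(BASE_URL) as response:
        return json.loads(response.read())
```

Calling create_entry("my first entry") and then list_entries() should mirror the curl session, with a different generated id.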

7. What's next?

We still have a lot to show you: how to use the SQLAlchemy ORM and Alembic to automate database migrations, how to add an internal message bus to handle synchronous and asynchronous processing properly, how to do application testing with mocks, etc.

But this post is way too long already...

Full source code for the example we discussed in this post can be found here: https://github.com/RyaxTech/example-app

The original idea and first implementation of this architecture were made by Maxime Arriaza at Ryax Technologies, and we really thank him for that!

In a future article we'll cover another level of the architecture, with a discussion of microservices. How to split your services? Who is responsible for what? How to manage inter-service communication?

Let's stay in touch!

Michael Mercier

Lead Software Engineer at Ryax Technologies. Michael is a Computer Science PhD and R&D engineer, with wide IT infrastructure expertise in multiple contexts: Cloud, High Performance Computing, and Big Data.