Async PostgreSQL with FastAPI dependency injection & SQLAlchemy

Hello!

It’s a clear-skied, crisp day here in Minnesota. Winter is coming, but not today.

While we wait for the world to turn and the seasons to change, why not pass some time thinking about database configuration?

I recently moved from SQLite to PostgreSQL as the database for pythondocs.xyz, my project that tries to bring a Google-like experience to Python’s official documentation. (The main motivation was to improve search results with Postgres’ full-text search capabilities.)

Swapping the DB engines proved to be remarkably straightforward: the SQLAlchemy ORM abstracted away dialect differences, and I was already using an abstract Database class with FastAPI, so I just needed to write another implementation.

Keep reading for a walkthrough of the code…

Prerequisites

You need Postgres running on your machine or in a Docker container.

I’m using Docker Compose for this project. I won’t go into detail about that today, but I’m happy to make it the subject of a future blog post if people are interested.

You also need these Python packages:

  • uvicorn (web server)
  • python-dotenv (settings loader)
  • fastapi (web framework)
  • sqlalchemy (ORM)
  • asyncpg (database driver)
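
If you’re starting from scratch, one pip command installs them all:

pip install uvicorn python-dotenv fastapi sqlalchemy asyncpg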

I’m presenting this example as a single Python file so that it’s easy to copy and understand. But I’ll also mention original file paths, as you’ll definitely want more organization in a real project.

Here are the imports for the example:

import os
from abc import ABC, abstractmethod
from typing import AsyncIterator, Optional

import uvicorn
from dotenv import load_dotenv
from fastapi import Depends, FastAPI
from fastapi.responses import JSONResponse
from sqlalchemy import select
from sqlalchemy.ext.asyncio import AsyncSession, create_async_engine
from sqlalchemy.orm import sessionmaker

# not included in example
from app.models import Folder

Config elation

The config module provides production or development settings, depending on an environment variable. It also loads the Postgres password from a .env file. The idea is that you add .env to your .gitignore file so that you don’t mix secrets with the rest of your code.
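
For illustration, the .env file itself is just key=value lines – something like this, where the value is a placeholder:

# .env – listed in .gitignore, never committed
POSTGRES_PASSWORD=your-secret-password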

# config.py
load_dotenv()


class Config(ABC):
    POSTGRES_USERNAME = "postgres"
    POSTGRES_DB_NAME = "postgres"
    # localhost for development purposes
    POSTGRES_HOST = "localhost"
    POSTGRES_PORT = "5432"
    # password stored in .env file
    POSTGRES_PASSWORD = os.getenv("POSTGRES_PASSWORD")
    SQL_COMMAND_ECHO = False


class DevelopmentConfig(Config):
    SQL_COMMAND_ECHO = True


class ProductionConfig(Config):
    # hostname in Docker network for production
    POSTGRES_HOST = "db"


def get_config() -> Config:
    env = os.getenv("ENV")
    if env == "development":
        return DevelopmentConfig()
    return ProductionConfig()


config = get_config()
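
So launching with ENV=development in the environment (e.g. ENV=development python fastapi_postgres_async_example.py for the single-file version of this example) selects DevelopmentConfig, with SQL command echo turned on; any other value, including an unset ENV, falls through to ProductionConfig.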

Data based

Here, we define an abstract Database class, with a __call__() method that works with FastAPI’s dependency injection. Its abstract setup() method is implemented by concrete subclasses, like PostgresDatabase.

# database/base.py
class Database(ABC):
    def __init__(self):
        self.async_sessionmaker: Optional[sessionmaker] = None

    async def __call__(self) -> AsyncIterator[AsyncSession]:
        """For use with FastAPI Depends"""
        if not self.async_sessionmaker:
            raise ValueError("async_sessionmaker not available. Run setup() first.")
        async with self.async_sessionmaker() as session:
            yield session

    @abstractmethod
    def setup(self) -> None:
        ...


# database/postgres.py
def get_connection_string(driver: str = "asyncpg") -> str:
    return f"postgresql+{driver}://{config.POSTGRES_USERNAME}:{config.POSTGRES_PASSWORD}@{config.POSTGRES_HOST}:{config.POSTGRES_PORT}/{config.POSTGRES_DB_NAME}"


class PostgresDatabase(Database):
    def setup(self) -> None:
        async_engine = create_async_engine(
            get_connection_string(),
            echo=config.SQL_COMMAND_ECHO,
        )
        self.async_sessionmaker = sessionmaker(async_engine, class_=AsyncSession)
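
(If you’re on SQLAlchemy 2.0+, sessionmaker(engine, class_=AsyncSession) still works, but the dedicated async_sessionmaker is the more idiomatic choice.)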

Stitch it all together

There are a few things going on here:

  1. An instance of PostgresDatabase is created in the depends module.
  2. The fast_api object is created in main.
  3. db.setup() is run only on FastAPI’s startup event. This means that all code files can be safely imported without side effects.
  4. A route is defined which performs a simple DB query using FastAPI’s Depends.
  5. Finally, the uvicorn web server is started.

# depends.py
db = PostgresDatabase()

# main.py
fast_api = FastAPI()


@fast_api.on_event("startup")
async def setup_db() -> None:
    db.setup()


# routes.py
@fast_api.get("/example/")
async def db_query_example(
    session: AsyncSession = Depends(db),
) -> JSONResponse:
    results = await session.execute(select(Folder))
    # .scalars() unwraps the result rows into plain Folder objects
    return results.scalars().all()


# run_uvicorn.py
if __name__ == "__main__":
    uvicorn.run(
        "fastapi_postgres_async_example:fast_api",
    )
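
Run the file directly and uvicorn serves the app on its default host and port (127.0.0.1:8000). Note that the import string "fastapi_postgres_async_example:fast_api" assumes you’ve saved this single-file example under that name.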

That’s it!

If the computer Gods smile upon you, you now have a working – and pretty fast – database configuration.

Here’s what my /example/ route looks like. You will need to supply your own data.

Thanks for reading! I hope this has been useful to someone. Please let me know in the comments.

The full code is over here on GitHub.

Async in-memory SQLite/SQLAlchemy database for FastAPI

Hello friends!

Today I’m presenting the database configuration that I (currently) use on pythondocs.xyz – real-time interactive search of Python documentation.

It copies a SQLite database from disk into memory, so it’s very fast. It’s great for read-only workflows – dashboards and the like. It’s not suitable for sites that accept user input, as it makes no attempt to preserve updates to the database.

The config works well for pythondocs.xyz: I generate the site’s database “offline”, with a standalone parser application, and I ship the resulting database file with the web application. When the web app starts up, the database is copied into memory, and you get nice fast database access (even if your queries aren’t super efficient!)

The main dependencies are sqlalchemy, the predominant Python ORM, and aiosqlite, an async replacement for the Standard Library’s sqlite3. I use the database with FastAPI but it should work in other applications.

The database copying is handled by sqlite3’s backup method. But sqlite3 is a synchronous library, and I want concurrent database access for performance reasons. Luckily, it’s possible to populate the database with sqlite3 and read it from aiosqlite by pointing the two libraries at the same shared memory location.
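
The backup_db() helper used below isn’t shown in this post (the real implementation is in the repo linked at the end). As a minimal sketch of the idea – assuming SQLAlchemy 1.4.24+, where raw_connection() exposes the underlying sqlite3 connection as driver_connection – it might look like this:

def backup_db(source_db: Engine, target_db: Engine) -> None:
    """Copy the full contents of one SQLite DB to another via sqlite3's backup API."""
    raw_source = source_db.raw_connection()
    raw_target = target_db.raw_connection()
    try:
        # Both driver_connection attributes are plain sqlite3.Connection objects
        raw_source.driver_connection.backup(raw_target.driver_connection)
    finally:
        raw_source.close()
        raw_target.close()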

Without further ado, here’s the code that sets up the database:

from typing import AsyncIterator, Optional

from sqlalchemy.engine import Engine, create_engine
from sqlalchemy.ext.asyncio import AsyncEngine, AsyncSession, create_async_engine
from sqlalchemy.orm import sessionmaker

SQLITE_SYNC_URL_PREFIX = "sqlite:///"
SQLITE_ASYNC_URL_PREFIX = "sqlite+aiosqlite:///"
MEMORY_LOCATION_START = "file:"
MEMORY_LOCATION_END = "?mode=memory&cache=shared&uri=true"


class InMemoryDatabase:
    """
    Async in-memory SQLite DB
    """

    def __init__(self, sql_echo: bool = False):
        self.sql_echo = sql_echo
        self._sync_memory_engine: Optional[Engine] = None
        self._async_memory_engine: Optional[AsyncEngine] = None
        self._async_sessionmaker: Optional[sessionmaker] = None

    def setup(self, filename: str):
        """
        Copy DB data from disk to memory and setup async session
        """
        sync_disk_engine = create_engine(
            url=SQLITE_SYNC_URL_PREFIX + filename, echo=self.sql_echo
        )
        in_memory_url = MEMORY_LOCATION_START + filename + MEMORY_LOCATION_END
        # Reference to sync in-memory engine remains open
        self._sync_memory_engine = create_engine(
            url=SQLITE_SYNC_URL_PREFIX + in_memory_url, echo=self.sql_echo
        )
        # Use sync engines to copy DB to memory
        backup_db(source_db=sync_disk_engine, target_db=self._sync_memory_engine)
        sync_disk_engine.dispose()
        # Create async engine at same memory location
        self._async_memory_engine = create_async_engine(
            url=SQLITE_ASYNC_URL_PREFIX + in_memory_url, echo=self.sql_echo
        )
        self._async_sessionmaker = sessionmaker(
            self._async_memory_engine, class_=AsyncSession
        )
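
To make the URL juggling concrete: for a hypothetical filename docs.db, setup() creates the on-disk engine at sqlite:///docs.db and the shared in-memory location file:docs.db?mode=memory&cache=shared&uri=true, prefixed with sqlite:/// for the sync engine and sqlite+aiosqlite:/// for the async one. The cache=shared URI parameter is what lets both engines see the same in-memory database, and the retained reference to the sync engine keeps SQLite from discarding it.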

Compatibility with FastAPI’s dependency injection is provided by this method:

    async def __call__(self) -> AsyncIterator[AsyncSession]:
        """Used by FastAPI Depends"""
        assert self._async_sessionmaker, "No sessionmaker. Run setup() first."
        async with self._async_sessionmaker() as session:
            yield session

(Thank you to the FastAPI Pagination project for inspiration!)

Use with FastAPI looks like this:

from fastapi import Depends, FastAPI
from sqlalchemy import select
from sqlalchemy.ext.asyncio import AsyncSession

from async_in_memory_db import InMemoryDatabase
from example_data import DB_FILENAME, User

app = FastAPI()
db = InMemoryDatabase()


@app.on_event("startup")
async def setup_db():
    db.setup(DB_FILENAME)


@app.get("/")
async def example_route(session: AsyncSession = Depends(db)) -> list[User]:
    results = await session.execute(select(User))
    return results.scalars().all()
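
(If the snippet above is saved as main.py – a filename chosen just for illustration – then uvicorn main:app will serve it.)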

And here’s what you get in your web browser: a JSON response served straight from the in-memory DB.

Please see the python_async_in_memory_db GitHub repo for the full code, including an example standalone query that doesn’t use FastAPI.

Is this technique useful to you? Can you see any potential pitfalls that I’ve overlooked?

Let me know in the comments below!

Introducing pythondocs.xyz – live search for Python documentation

Winter is long here.

It is so long that I did an accidental software development after False Spring 3 – whoops!

pythondocs.xyz demo

pythondocs.xyz is a web tool that provides live search results for Python’s official documentation.

Please try it out and let me know what you think!

It’s at “beta” stage, which means it works pretty well but it’s not perfect.

It’s fast and it looks good and the results are… fine.

It did, however, survive the front page of Hacker News without going above 2% CPU usage, which I think is pretty good.

The next big feature will be better search results: in particular, improved prominence of important language features, like built-in functions, and refined full-text search and ordering of results.

Here’s the tech stack as it currently stands, for those interested:

  • Parser: Beautiful Soup + Mozilla Bleach
  • Database: in-memory SQLite (aiosqlite) + SQLAlchemy
  • Web server: FastAPI + Uvicorn + Jinja2
  • Front end: Tailwind CSS + htmx + Alpine.js

This is my first big FastAPI project, and over the next few weeks I’ll blog about some of the tricks I used, especially to do with performance.