Part three of the joint project with my incredibly sharp wife, Olga!

This post is a part of a series, so check out the entire series for context.

In this post, we’ll dive into how we’ve implemented a Database in our application. The post starts with the design, then moves to implementation of DB connectivity, then the data models themselves, and finally some testing.

Upfront, we had to decide if we’ll choose a relational database or something else - and we decided to go relational. This could be a super interesting decisions with many factors going into it - but in reality, we chose relational because it’s to most relevant for us to practice right now :)

## Database Schema Design

Let up start by quickly analyzing what our database schema will be. Since we’ll probably change it moving forward anyway, this is just a rough sketch.

We’ll design the database using two tools and an iterative approach. The first tool is the Entity-Relationship diargram (also known as an ER model), which helps us visually describe the abstract data model. The data model will be implemented with a relational database. The second tool: reviewing the ER model by testing how well it can respond to queries required by the business logic.

### Entity-Relationship diargram

We sketched this diargram using my new favorite whiteboarding tool, Excalidraw. While there are many tools specifically designed for creating ER diargrams, I find them exhausting; just whiteboarding allows one to focus on what one cares about, instead of focusing on the tool itself.

As we mentioned before, one good way to review the ER model is to check whether it is useful for the main queries we’ll want to ask of it. These queries are derived from the main business needs from the application. So let’s check that!

#### On My Way

The main use case is someone telling the bot that they are on their way. The flow will look something like this:

1. User: I’m OMW! Here’s my Workspace ID and User ID.
2. Backend: OK. Let’s understand what Garage your headed to. Are there multiple Garages in this workspace? SELECT * FROM Garages WHERE workspace_id=workspace_id.
1. If there are multiple garages, check if the user has a preference: SELECT default garage FROM Users WHERE user_id=user_id
1. If there isn’t a preference, ask the user which garage they’re headed towards today.
2. Otherwise (there’s only one garage), just use that one.
3. Backend: What are the user’s parking preferences? SELECT Spot Requirements, Spot Preferences FROM Users WHERE user_id=user_id.
4. Backend: Are there relevant open spots? SELECT * FROM Spots INNER JOIN Reservations WHERE Garage ID=Garage ID AND Workspace ID=Workspace ID AND Reservations.date == Today AND Reservations.active == True. The backend should also filter the spots by the user’s spot requirements.
1. If there are open spots, reserve one! INSERT INTO Reservations (columns) VALUES (values). Then tell the user the arrival instructions.
2. If there aren’t any open spots, tell the user the fallback instructions.

#### Current Status

One other use case is someone asking what’s the current status of the garage. That flow is a lot simpler:

1. User: What’s the status of garage X?
2. Backend: SELECT * FROM Spots INNER JOIN Reservations WHERE Garage ID = X AND Reservations.date == Today.

#### Release a Spot

1. User: Release my spot!
2. Backend: SELECT * FROM Reservations WHERE Date == Today AND User ID = User ID
1. If there’s only one reservation, update with Active = False.
2. If there are no reservations, return an error.
3. If there are multiple reservations, ask the user which one to release.

#### Analytics

These queries will be useful for reports.

1. User: What’s my parking history?
2. Backend: SELECT * from Reservations WHERE User ID = User ID AND workspace_id == workspace_id.
3. User: Which spot did I park in the most?
4. Backend: Same query is before, just with some ORDER BY and count() sprinkled on top.

### Summary

Seems like the schema structure is OK.

There is something “uncomfortable” about often having to perform an INNER JOIN between spots are reservations. We could work around that in various ways, like a database view or duplication of data - but I think that the schema is good enough to move forward!

## Choosing a DB and an ORM (Spoiler: PostgreSQL + SQLAlchemy)

There are quite a lot of Database and ORM options out there. We decided to go with a relational database (specifically, PostgreSQL) because it’s the most relevant for us to practice. Also, it’s the fourth most popular option, behind three other relational options, so how wrong could this choice be?

Now, which ORM to use? We decided to go with SQLAlchemy as an ORM since quick research showed it’s definitely the standard for Python, very popular with close to 50M monthly downloads, and documented enjoyably:

Other options considered were:

• peewee; loses on popularity as it has only 500K monthly downloads.
• Django ORM; loses since its author hates it, appearantly.
• PonyORM; loses on various parameters, which you can check out on libhunt - but again, mostly due to popularity.

## Developing the DB session

To make using the DB in development easy, and switching to prod also easy, we’ve used the factory design pattern to create the DB session, with only the inmem_testing case implemented for now:

import enum
from sqlalchemy import create_engine
from sqlalchemy.orm import Session

class DbCase(enum.Enum):
inmem_testing = enum.auto()
docker_testing = enum.auto()
prod = enum.auto()

def get_session_for_case(case: DbCase) -> Session:
if case == DbCase.inmem_testing:
engine = create_engine('sqlite://')
session = Session(bind=engine)
return session
elif case == DbCase.docker_testing:
raise NotImplementedError("Docker session not implemented yet.")
elif case == DbCase.prod:
raise NotImplementedError("Prod session not implemented yet.")
else:
raise ValueError(f"Unknown case: {case}")

When developing the other ones, we’re going to rely on Env Vars and/or config files to populate the connection with data (username, password, address, etc.).

To test all the cases, we’ve used a feature of pytest called parametrize, but ended up only using it once (since we’ve only implemented one):

@pytest.mark.parametrize(
"db_case",
[
DbCase.inmem_testing,
] # TODO - add more DbCases once we implement them.
)
def test_db_session(db_case: DbCase):
expected = "asdfzcxv"
session = get_session_for_case(db_case)
result = session.execute(sqlalchemy.text(f"select '{expected}'"))
assert expected == result.all()[0][0]

## Developing the models

Following SQLAlchemy’s tutorials, we’ve developed the models:

import enum
from sqlalchemy.orm import registry, relationship, Session
from sqlalchemy import (
Boolean,
Column,
Date,
DateTime,
Enum,
ForeignKey,
Integer,
String,
)

mapper_registry = registry()
Base = mapper_registry.generate_base()

@enum.unique
class PricingTier(enum.Enum):
free = enum.auto()
paid = enum.auto()

TABLE_NAME_WORKSPACES = "workspaces"
TABLE_NAME_SPOTS = "spots"
TABLE_NAME_GARAGES = "garages"
TABLE_NAME_USERS = "users"
TABLE_NAME_RESERVATIONS = "reservations"

class Workspace(Base):
__tablename__ = TABLE_NAME_WORKSPACES
id = Column(Integer, primary_key=True)
slack_id = Column(String)
name = Column(String)
pricing_tier = Column(Enum(PricingTier), nullable=False)
onboard_time = Column(DateTime)

# Setting up Relationships: See
# https://docs.sqlalchemy.org/en/14/orm/relationship_api.html#sqlalchemy.orm.relationship
# https://docs.sqlalchemy.org/en/14/orm/tutorial.html#orm-tutorial-relationship
# https://docs.sqlalchemy.org/en/14/orm/basic_relationships.html#relationship-patterns
# for details.

# workspace --[1..n]-> garages
garages = relationship(
"Garage",
back_populates="workspace",
)
# workspace -[1]--[0..n]-> users
users = relationship(
"User",
back_populates="workspace",
)

def __repr__(self):
# TODO add emoji to repr
return f"Workspace({self.id=}, {self.name=}, {self.pricing_tier=}, {self.onboard_time=})"

class Garage(Base):
__tablename__ = TABLE_NAME_GARAGES
id = Column(Integer, primary_key=True)
name = Column(String)
full_instructions = Column(String)
arrival_instructions = Column(String)

# Garage <-[n..1]-- workspace
workspace_id = Column(
String,
ForeignKey(f'{TABLE_NAME_WORKSPACES}.id'),
)
workspace = relationship(
"Workspace",
back_populates=TABLE_NAME_GARAGES,
)

# Garage --[1..n]-> spots
spots = relationship(
"Spot",
back_populates="garage",
)

def __repr__(self):
return f"Garage {self.name} at {repr(self.workspace)}"

class SpotAttributes(enum.Enum):
accessible = enum.auto()
charger = enum.auto()
scooter = enum.auto()
wide = enum.auto()

class Spot(Base):
__tablename__ = TABLE_NAME_SPOTS
id = Column(Integer, primary_key=True)
name = Column(String, nullable=False)
# TODO: Figure out how to set up attributes correctly. Array data type
# isn't support in basic SQL (like sqlite), so that's not a good solution.
# options include:
# - Adding a SpotAttributes table
# - JSON list
# - "bitwise"/"flags" based on enum value
attributes = Column(String, nullable=True)

# Spot <-[n..1]-- garage
garage_id = Column(
String,
ForeignKey(f"{TABLE_NAME_GARAGES}.id"),
)
garage = relationship(
"Garage",
back_populates=TABLE_NAME_SPOTS
)

# Spot -[1]--[1..n]-> reservations
reservations = relationship(
"Reservation",
back_populates="spot",
)

def __repr__(self):
return f"Spot {self.name} at {self.garage}"

class User(Base):
__tablename__ = TABLE_NAME_USERS
id = Column(Integer, primary_key=True)
slack_id = Column(String)
name = Column(String)
# Which garage to give this user by default? The ID of the garage.
default_garage = Column(Integer, nullable=True)
spot_preferences = Column(String)
spot_requirements = Column(String)

# Workspace --[1..n]-> User
workspace_id = Column(
String,
ForeignKey(f'{TABLE_NAME_WORKSPACES}.id'),
)
workspace = relationship(
"Workspace",
back_populates=TABLE_NAME_USERS,
)

class Reservation(Base):
__tablename__ = TABLE_NAME_RESERVATIONS
id = Column(Integer, primary_key=True)
date = Column(Date, nullable=False)
active = Column(Boolean, nullable=False)
member_id = Column(String, nullable=False)
member_name = Column(String, nullable=True)

# Reservation -[1]--[n]-> spot (multiple reservations can be made for the
# same spot, on different dates).
spot_id = Column(
Integer,
ForeignKey(f'{TABLE_NAME_SPOTS}.id'),
)
spot = relationship(
"Spot",
back_populates=TABLE_NAME_RESERVATIONS,
)

def __repr__(self):
return f"Reservation: \
{'Active' if self.active else 'Inactive'} \
by {self.member_id} \
({self.member_name if self.member_name else 'unknown name'}) \
for {self.date} spot {repr(self.spot)}"

We’ve also writted some accompanied tests, to demonstrate usage (and prove that the schema was implemented successfully). In order to write the tests, we had to develop a pytest.fixture to provide a session with data for each test class. Here’s how the fixture looks, mostly setting up data:

DEMO_WORKSPACE_NAME = "NiceFam"
DEMO_WORKSPACE_SLACK_ID = "T02CPGASL8Y"
DEMO_MEMBER_ID = "U02C63W148L"
DEMO_RESERVATION_DATE = date(1994, 10, 19)

@pytest.fixture(scope="class")
def session_with_demo_models(request):
session: Session = get_session_for_case(DbCase.inmem_testing)
create_all_tables(session)

# first, set up a workspace
w1 = Workspace(
slack_id=DEMO_WORKSPACE_SLACK_ID,
name=DEMO_WORKSPACE_NAME,
pricing_tier=PricingTier.paid,
onboard_time=datetime.now()
)
g1 = Garage(
name="Main garage",
full_instructions="it's full, you're fucked.",
arrival_instructions="just get here",
)
spots = [Spot(name="1"), Spot(name="2"), Spot(name="3")]
for s in spots:
g1.spots.append(s)
w1.garages.append(g1)

r1 = Reservation(
date=DEMO_RESERVATION_DATE,
active=True,
member_id=DEMO_MEMBER_ID,
spot=spots[0],
)
r2 = Reservation(
date=datetime.now(),
active=False,
member_id=DEMO_MEMBER_ID,
spot=spots[0],
)
session.flush()
# put the session in the class that uses this fixture
request.cls.session = session
return session

And here’s how the test class that uses the fixture looks:

@pytest.mark.usefixtures("session_with_demo_models")
class TestAllModelsTogether:
def test_selecting_reservations(self, session_with_demo_models):
result = session_with_demo_models.execute(
select(Reservation).where(
Reservation.date == DEMO_RESERVATION_DATE
)
)
reservations = result.scalars().all()
assert len(reservations) == 1
logger.info(repr(reservations[0]))

result = session_with_demo_models.execute(
select(Reservation)
)
reservations = result.scalars().all()
assert len(reservations) == 2
logger.info(repr(reservations))

def test_reservation_relationships(self, session_with_demo_models: Session):
a_single_reservation = session_with_demo_models.execute(
select(Reservation)
).scalars().first()

assert a_single_reservation.spot.garage.workspace.slack_id == DEMO_WORKSPACE_SLACK_ID

As you can see, for the fixture, we had to also write a utility function to create the tables, in order to avoid the OperationalError: Table xxx not found error:

def create_all_tables(session: Session):
"""Creates all the required tables in the DB, in case it's un-initialized.

Safe to run if tables already exist."""
Base.metadata.create_all(session.get_bind())

## What’s next?

Well, we’ve done some more work around other parts of the code and set up a website. The next thing is integration and making sure it “all works together”, and then actually setting it up for a client. Let’s hope we find the time! :)

1805 Words

2021-10-31 19:04 +0200