I have been working as a backend Python engineer for several years now.
During this time, I have learned a lot about writing clean code, applying algorithms in real-world scenarios, working with both relational and non-relational databases, and most importantly, writing effective tests. These skills have allowed me to save significant time on tasks and ensure that the features I implement are reliable.
Throughout my career as a software developer, I've encountered various approaches to testing.
In this article, I would like to share which practices I found to be inefficient and demonstrate how easy it is to create reliable unit tests that ensure both high coverage and robustness. This article may interest not only developers working with Python but also software engineers across the board.
Tests are generally considered to be code that tests other code. Typically, tests are divided into two groups: unit tests and integration tests.
There are differing opinions on how to categorize tests within these groups.
Some argue that only tests for small portions of code should be considered unit tests, while more complex tests should always be classified as integration tests. I endorse the view that unit tests can cover multiple parts of the code, whereas integration testing focuses on whole modules, such as services, that work together through an interface. I refer to these broader unit tests as unit tests with real dependencies.
Moreover, in my experience, I haven't worked on a project where developers wrote tests for each individual method or small block of code. Testing larger sections of code together removes the need to write a separate test for every single method, since those methods are still covered when exercised together.
For the purposes of this article, however, I will simply refer to these as tests, because regardless of terminology, the important thing is to have them.
Throughout my career, I've worked on various projects. In some cases, I joined teams that had already been developing their projects for some time.
I've seen different implementations of tests; some of which proved to be unreliable. In this part of the article, I'll try to summarize these cases with code examples and discuss why such implementations have flaws.
For example, let's consider a simple FastAPI application with several methods that fetch data from the database, add data to the database, and update it.
from fastapi import FastAPI, Body, HTTPException

from core.db import queries
from core import schemas

app = FastAPI()


@app.get(
    "/items",
    summary="Get items",
    status_code=200,
    response_model=list[schemas.ItemSchema],
)
def get_items() -> list[schemas.ItemSchema]:
    items = queries.get_items()
    return items


@app.post(
    "/items",
    summary="Add items",
    status_code=200,
    response_model=list[schemas.ItemSchema],
)
def add_items(
    items: list[schemas.ItemBaseSchema] = Body(
        ...,
        embed=True,
    )
) -> list[schemas.ItemSchema]:
    added_items = queries.add_items(
        items=items
    )
    return added_items


@app.patch(
    "/items/{item_id}",
    summary="Update an item",
    status_code=200,
    response_model=schemas.ItemSchema,
)
def update_item(
    item_id: int,
    update_data: schemas.ItemBaseSchema = Body(
        ...,
        embed=True,
    )
) -> schemas.ItemSchema:
    if queries.get_item(
        item_id=item_id
    ) is None:
        raise HTTPException(status_code=404, detail="Item not found")
    item = queries.update_item(
        item_id=item_id,
        update_data=update_data,
    )
    return item


@app.delete(
    "/items/{item_id}",
    summary="Delete an item",
    status_code=204,
)
def delete_item(
    item_id: int
) -> None:
    if queries.get_item(
        item_id=item_id
    ) is None:
        raise HTTPException(status_code=404, detail="Item not found")
    queries.delete_item(
        item_id=item_id
    )
As you can see, this is a simple API with CRUD operations.
I will present examples of tests for this API that I've encountered during my career and discuss the drawbacks of such approaches. In this example, I will combine common testing practices that can lead to issues during testing.
import json

import pytest
from fastapi import status
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker

from core.db.models import Item


@pytest.fixture(scope='session')
def db_item(test_db_url, setup_db, setup_db_tables):
    engine = create_engine(
        test_db_url,
        echo=False,
        echo_pool=False,
    )
    session_factory = sessionmaker(autocommit=False, autoflush=False, bind=engine)
    with session_factory() as session:
        item = Item(
            name='name',
            number=1,
            is_valid=True,
        )
        session.add(
            item
        )
        session.commit()
        session.refresh(item)
        return item.as_dict()


def test_get_items(
    fastapi_test_client,
    db_item,
):
    response = fastapi_test_client.get(
        '/items',
    )
    assert response.status_code == status.HTTP_200_OK


def test_post_items(
    fastapi_test_client,
):
    item_to_add = {
        'name': 'name',
        'number': 1,
        'is_valid': False,
    }
    response = fastapi_test_client.post(
        '/items',
        data=json.dumps(
            {
                'items': [
                    item_to_add
                ],
            },
            default=str,
        ),
    )
    assert response.status_code == status.HTTP_200_OK


def test_update_item(
    fastapi_test_client,
    db_item,
):
    update_data = {
        'name': 'new_name',
        'number': 2,
        'is_valid': True,
    }
    response = fastapi_test_client.patch(
        f'/items/{db_item["id"]}',
        data=json.dumps(
            {
                'update_data': update_data,
            },
            default=str,
        ),
    )
    assert response.status_code == status.HTTP_200_OK


def test_delete_item(
    fastapi_test_client,
    db_item,
):
    response = fastapi_test_client.delete(
        f'/items/{db_item["id"]}',
    )
    assert response.status_code == status.HTTP_204_NO_CONTENT
At first glance, these tests seem fine. They cover all the APIs, test responses, and achieve high overall coverage. But are they really? Let's discuss how these tests are run.
It's important to note that setup_db, setup_db_tables, and db_item have a session scope, meaning these fixtures are destroyed only at the end of the test session, after all the tests have been completed.
The order of test execution is as follows:
The test database is created if it doesn't already exist for the entire test run.
Test database tables are created if they don't already exist for the entire test run.
A test item object is created in the test database.
The API tests are executed.
The test tables and database are destroyed.
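The article relies on several conftest fixtures (test_db_url, setup_db, setup_db_tables) that are never shown. For orientation, here is one possible sketch of the session-scoped ones; only the fixture names and scopes come from the article, while the bodies are assumptions, with SQLite standing in for the real database:

```python
import pytest
from sqlalchemy import create_engine
from sqlalchemy.orm import declarative_base

BaseModel = declarative_base()  # stand-in for the project's declarative base


@pytest.fixture(scope="session")
def test_db_url(tmp_path_factory):
    # A file-based SQLite database keeps the sketch self-contained;
    # a real project would point at a dedicated test database.
    return f"sqlite:///{tmp_path_factory.mktemp('db') / 'test.sqlite3'}"


@pytest.fixture(scope="session")
def setup_db(test_db_url):
    # With SQLite the database file is created lazily on first connect;
    # with PostgreSQL, for example, this is where the test database
    # would be created and later dropped.
    yield


@pytest.fixture(scope="session")
def setup_db_tables(setup_db, test_db_url):
    # Create all tables once for the whole session (the flawed setup
    # discussed below), and drop them when the session ends.
    engine = create_engine(test_db_url)
    BaseModel.metadata.create_all(bind=engine)
    yield
    BaseModel.metadata.drop_all(bind=engine)
```

With session scope, the create_all/drop_all pair runs exactly once per test run, which is what makes every test share the same tables and data.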
There are several issues with how these tests are designed.
The first flaw is that the database and test object are created only once for the entire test run, which can lead to potential issues with data consistency throughout the tests.
In this setup, all the tests depend on each other.
For instance, if we add a new test that fetches data from the database and it runs after the DELETE API test, it could fail because there would be no test data left in the database. Even though there is a POST test that runs before the DELETE one, there is a risk that it might be moved or deleted, leading to test failures.
The second flaw is that both the test object in the db_item fixture and the data added in the test_post_items test are hardcoded. This approach works until there is a conflict in the database. Currently, there are no constraints set in the database except for Item.id (the primary key). However, if constraints were added in the future, these tests might fail because they rely on hardcoded values that don't account for potential conflicts. This issue once again highlights the dependency of these tests on each other.
The final flaw is that none of these tests actually verify that the methods being tested work correctly.
The only thing being checked is whether the response code is as expected, without confirming that data was actually changed in the database or correctly retrieved from it.
As it stands, there is no way to be certain that the methods function correctly just by running these tests.
The only way to ensure everything works as expected is to combine manual QA with automatic testing.
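To see why a status code alone proves little, consider this toy sketch (hypothetical, stdlib only, not the article's code): the handler below reports success but never persists anything, and a status-only assertion cannot tell the difference.

```python
store = []  # stands in for the database


def add_item_broken(item):
    # Bug: the item is never appended to the store,
    # yet the handler still reports success.
    return 200


# A status-only "test" happily passes:
assert add_item_broken({"name": "name"}) == 200
# But a check on the actual state reveals that nothing was saved:
assert store == []
```

Only an assertion on the resulting state (here, the contents of store; in the article, the rows in the database) would expose the bug.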
These drawbacks can be summarized into three main points:
The tests depend on each other and on their execution order.
The test data is hardcoded and prone to conflicts.
The tests verify only response codes, not the actual results.
Earlier, we saw that simply writing tests is sometimes not enough to ensure the robustness of a system.
The good news is that Python, along with other programming languages, provides tools to enhance test reliability.
Additionally, what can't be covered by tools can often be addressed by following simple best practices.
The first key to making tests reliable is ensuring they are independent whenever possible.
One test should not affect another, meaning they should have separate data, variables, and so on.
Changes in one test should not break or alter the execution of any others.
Therefore, when using any data source in an application, it is crucial to flush all data before each test to ensure that no artifacts are left for the subsequent tests.
In our example, this can be achieved by changing the scope of the setup_db_tables fixture from session to function. As a result, the fixture will look like this:
@pytest.fixture(scope="function")
def setup_db_tables(setup_db, test_db_url):
    create_db_engine = create_engine(test_db_url)
    BaseModel.metadata.create_all(bind=create_db_engine)
    yield
    BaseModel.metadata.drop_all(bind=create_db_engine)
The second step is to create test data within the tests themselves when needed.
The Factory Boy package is useful for generating unique data on the fly and offers the option to create objects directly in the database.
With a slight modification to its default Factory class, we can create a custom factory that adds objects to the database:
import factory

from tests import conftest


class CustomSQLAlchemyModelFactory(factory.Factory):
    class Meta:
        abstract = True

    @classmethod
    def _create(cls, model_class, *args, **kwargs):
        with conftest.db_test_session() as session:
            session.expire_on_commit = False
            obj = model_class(*args, **kwargs)
            session.add(obj)
            session.commit()
            session.expunge_all()
        return obj
Creating model objects in the database now only requires inheriting from this custom factory class:
class ItemModelFactory(CustomSQLAlchemyModelFactory):
    class Meta:
        model = models.Item

    name = factory.Faker("word")
    number = factory.Faker("pyint")
    is_valid = factory.Faker("boolean")
Finally, checking that a method doesn't return any errors is not enough.
We need to verify the results of each tested method and any internal changes they might make, such as modifications to the database in our case.
Additionally, it is important to clearly define what is expected from the tests.
After making slight modifications to follow the steps provided earlier, achieving reliability becomes straightforward. The tests will look like this:
import json
from unittest.mock import ANY

from fastapi import status

from core.db.models import Item
from tests import factories


def test_get_items(
    fastapi_test_client
):
    expected_items = factories.models_factory.ItemModelFactory.create_batch(
        size=5
    )
    response = fastapi_test_client.get(
        '/items',
    )
    assert response.status_code == status.HTTP_200_OK
    response_data = response.json()
    assert response_data == [
        {
            'id': item.id,
            'name': item.name,
            'number': item.number,
            'is_valid': item.is_valid,
        } for item in expected_items
    ]


def test_post_items(
    fastapi_test_client,
    test_db_session,
):
    assert test_db_session.query(Item).first() is None
    item_to_add = factories.schemas_factory.ItemBaseSchemaFactory.create()
    response = fastapi_test_client.post(
        '/items',
        data=json.dumps(
            {
                'items': [
                    item_to_add.dict()
                ],
            },
            default=str,
        ),
    )
    assert response.status_code == status.HTTP_200_OK
    response_data = response.json()
    assert response_data == [
        {
            'id': ANY,
            'name': item_to_add.name,
            'number': item_to_add.number,
            'is_valid': item_to_add.is_valid,
        },
    ]
    assert test_db_session.query(Item).filter(
        Item.name == item_to_add.name,
        Item.number == item_to_add.number,
        Item.is_valid == item_to_add.is_valid,
    ).first()


def test_update_item(
    fastapi_test_client,
    test_db_session,
):
    item = factories.models_factory.ItemModelFactory.create()
    update_data = factories.schemas_factory.ItemBaseSchemaFactory.create()
    response = fastapi_test_client.patch(
        f'/items/{item.id}',
        data=json.dumps(
            {
                'update_data': update_data.dict(),
            },
            default=str,
        ),
    )
    assert response.status_code == status.HTTP_200_OK
    response_data = response.json()
    assert response_data == {
        'id': ANY,
        'name': update_data.name,
        'number': update_data.number,
        'is_valid': update_data.is_valid,
    }
    assert test_db_session.query(Item).filter(
        Item.name == update_data.name,
        Item.number == update_data.number,
        Item.is_valid == update_data.is_valid,
    ).first()


def test_delete_item(
    fastapi_test_client,
    test_db_session,
):
    item = factories.models_factory.ItemModelFactory.create()
    response = fastapi_test_client.delete(
        f'/items/{item.id}',
    )
    assert response.status_code == status.HTTP_204_NO_CONTENT
    assert test_db_session.query(Item).first() is None
In this article, we explored common issues with testing practices in software development, focusing on a FastAPI application example. We identified several flaws in existing tests, including dependency between tests, reliance on hardcoded values, and the lack of verification of method functionality. To address these issues, we discussed best practices such as using fixtures with function scope to ensure data isolation, utilizing tools like Factory Boy to generate unique test data, and verifying both the results and any internal changes made by the methods being tested.
By implementing these practices, we can enhance the reliability of tests and ensure that the system behaves as expected.
Here are the key points on how to write effective tests:
Keep tests independent: one test should never rely on the data or side effects of another.
Reset the data source before each test, for example with function-scoped fixtures.
Generate unique test data on the fly instead of hardcoding it.
Verify not just response codes, but the actual results and state changes of the tested methods.
I hope you found the article useful. You can clone the repository with the example provided (GitHub).