Custom Marks With Pytest in Plain English

“The only flake I want is that of a croissant.” Mariel Feldman.
Introduction
pytest is a testing package for the Python programming language, broadly used to quality-assure code logic. This article discusses custom marks: their use cases, and patterns for selecting and deselecting marked tests. This blog is the fifth and final instalment in a series called pytest in plain English, favouring accessible language and simple examples to explain the more intricate features of the pytest package.
For a wealth of documentation, guides and how-tos, please consult the pytest documentation.
What are pytest Custom Marks?
Marks are a way to conditionally run specific tests. There are a few marks that come with the pytest package. To view these, run pytest --markers from the command line. This will print a list of the pre-registered marks available within the pytest package. However, it is extremely easy to register your own markers, allowing greater control over which tests get executed.
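To make this concrete, here is a minimal sketch (with illustrative test names that are not part of the accompanying repo) using two marks that ship with pytest, skip and skipif:

```python
import sys

import pytest


@pytest.mark.skip(reason="demonstrating the built-in skip mark")
def test_not_ready_yet():
    # Never executed; pytest reports this test as skipped.
    assert False


@pytest.mark.skipif(sys.version_info < (3, 12), reason="needs Python 3.12+")
def test_requires_newer_python():
    # Only runs when the interpreter is recent enough.
    assert True
```

Custom marks use exactly the same decorator syntax; the difference is that we register the names ourselves, as the rest of this article shows.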
This article will cover:
- Reasons for marking tests
- Registering marks with pytest
- Marking tests
- Including or excluding markers from the command line
Intended Audience
Programmers with a working knowledge of Python and some familiarity with pytest and packaging. The type of programmer who has wondered how to follow best practice when testing Python code.
What You’ll Need:
Preparation
This blog is accompanied by code in this repository. The main branch provides a template with the minimum structure and requirements expected to run a pytest suite. The repo branches contain the code used in the examples of the following sections.
Feel free to fork or clone the repo and check out the example branches as needed.
The example code that accompanies this article is available in the marks branch of the repo.
Overview
Occasionally, we write tests that are a bit distinct from the rest of our test suite. They could be integration tests, calling on elements of our code from multiple modules. They could be end-to-end tests, executing a pipeline from start to finish. Or they could be flaky or brittle tests, prone to failure on specific operating systems or architectures, or when external dependencies do not provide reliable inputs.
There are multiple ways to handle these kinds of tests, including mocking, as discussed in my previous blog. Mocking can often take a bit of time, and developers don’t always have that precious commodity. So instead, they may mark the test, ensuring that it doesn’t get run on continuous integration (CI) checks. This may involve flagging any flaky test as “technical debt” to be investigated and fixed later.
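The built-in skipif mark already offers a blunt version of this. A sketch, assuming your CI provider sets a CI environment variable on its runners (GitHub Actions does, for example):

```python
import os

import pytest


@pytest.mark.skipif(
    os.environ.get("CI") == "true",
    reason="flaky on CI runners; tracked as technical debt",
)
def test_sometimes_misbehaves_on_ci():
    # Runs locally, but is skipped on CI while the root cause is investigated.
    assert True
```

Custom marks, covered below, keep that decision at the command line instead of hard-coding it into the test itself.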
In fact, there are a number of reasons that we may want to selectively run elements of a test suite. Here is a selection of scenarios that could benefit from marking.
| Category | Cause | Explanation |
|---|---|---|
| External Dependencies | Network | Network latency, outages, or Domain Name System (DNS) issues. |
| | Web APIs | Breaking changes, rate limits, or outages. |
| | Databases | Concurrency issues, data changes, or connection problems. |
| | Timeouts | Hardcoded or too-short timeouts cause failures. |
| Environment Dependencies | Environment Variables | Incorrectly set environment variables. |
| | File System | File locks, permissions, or missing files. |
| | Resource Limits | Insufficient CPU, memory, or disk space. |
| State Dependencies | Shared State | Interference between tests sharing state. |
| | Order Dependency | Tests relying on execution order. |
| Test Data | Random Data | Different results on each run due to random data and seed not set. |
| Concurrency Issues | Parallel Execution | Tests not designed for parallel execution. |
| | Locks | Deadlocks or timeouts involving locks or semaphores. |
| | Race Conditions | Tests depend on the order of execution of threads or processes. |
| | Async Operations | Improperly awaited asynchronous code. |
| Hardware and System Issues | Differences in Hardware | Variations in performance across hardware or operating systems. |
| | System Load | Failures under high system load due to resource contention. |
| Non-deterministic Inputs | Time | Variations in current time affecting test results. |
| | User Input | Non-deterministic user input causing flaky behaviour. |
| | Filepaths | CI runner filepaths may be hard to predict. |
| Test Implementation Issues | Assertions | Incorrect or overly strict assertions. |
| | Setup and Teardown | Inconsistent state due to improper setup or teardown. |
In the case of integration tests, one approach may be to group them all together and have them execute within a dedicated CI workflow. This is common practice, as developers may want to stay alert to problems with the external resources that their code depends upon, while not failing ‘core’ CI checks about changes to the source code. If your code relies on a web API, for instance, you’re probably less concerned about temporary outages in that service. However, a breaking change to that service would require your source code to be adapted. Once more, life is a compromise.
“Le mieux est l’ennemi du bien.” (The best is the enemy of the good), Voltaire
Custom Marks in pytest
Marking allows us to have greater control over which of our tests are executed when we invoke pytest. Marking is conveniently implemented in the following way (presuming you have already written your source and test code):
- Register a custom marker
- Assign the new marker name to the target test
- Invoke pytest with the -m (MARKEXPR) flag.
This section uses code available in the marks branch of the GitHub repository.
Define the Source Code
I have a motley crew of functions to consider. A sort of homage to Sergio Leone’s ‘The Good, The Bad & the Ugly’, although I’ll let you figure out which is which.

The Flaky Function
Here we define a function that will fail half the time. What a terrible test to have. The root of this unpredictable behaviour should be diagnosed as a priority, for the sake of our sanity.
import random


def croissant():
    """A very flaky function."""
    if round(random.uniform(0, 1)) == 1:
        return True
    else:
        raise Exception("Flaky test detected!")
The Slow Function
This function is going to be pretty slow. Slow test suites throttle our productivity. Once it finishes waiting for a specified number of seconds, it will return a string.
import time
from typing import Union


def take_a_nap(how_many_seconds: Union[int, float]) -> str:
    """Mimic a costly function by just doing nothing for a specified time."""
    time.sleep(float(how_many_seconds))
    return "Rise and shine!"
The Needy Function
Finally, the needy function will have an external dependency on a website. This test will simply check whether we get an HTTP status code of 200 (OK) when we request any URL.
import requests


def check_site_available(url: str, timeout: int = 5) -> bool:
    """Checks if a site is available."""
    try:
        response = requests.get(url, timeout=timeout)
        return True if response.status_code == 200 else False
    except requests.RequestException:
        return False
The Wrapper
Lastly, I’ll introduce a wrapper that will act as an opportunity for an integration test. This is a bit awkward, as none of the above functions are particularly related to each other.
This function will execute check_site_available() and take_a_nap() together. A pretty goofy example, I admit. Based on the status of the URL request, a string will be returned.
import time
from typing import Union

import requests


def goofy_wrapper(url: str, timeout: int = 5) -> str:
    """Check a site is available, pause for no good reason before summarising
    outcome with a string."""
    msg = f"Napping for {timeout} seconds.\n"
    msg = msg + take_a_nap(timeout)
    if check_site_available(url):
        msg = msg + "\nYour site is up!"
    else:
        msg = msg + "\nYour site is down!"
    return msg
Let’s Get Testing
Initially, I will define a test that does nothing other than pass. This will be a placeholder, unmarked test.
def test_nothing():
    pass
Next, I import croissant() and assert that it returns True. As you may recall from above, croissant() will do so ~50% of the time.
from example_pkg.do_something import (
    croissant,
)


def test_nothing():
    pass


def test_croissant():
    assert croissant()
Now running pytest -v will print the test results, reporting test outcomes for each test separately (-v means verbose).
% pytest -v
...
============================= test session starts =============================
platform darwin -- Python 3.12.3, pytest-8.1.1, pluggy-1.5.0 -- /...
cachedir: .pytest_cache
rootdir: /...
configfile: pyproject.toml
testpaths: ./tests
collected 2 items

tests/test_do_something.py::test_nothing PASSED [ 50%]
tests/test_do_something.py::test_croissant PASSED [100%]

============================== 2 passed in 0.05s ==============================
But note that half the time, I will also get the following output:
% pytest -v
...
============================= test session starts =============================
platform darwin -- Python 3.12.3, pytest-8.1.1, pluggy-1.5.0 -- /...
cachedir: .pytest_cache
rootdir: /...
configfile: pyproject.toml
testpaths: ./tests
collected 2 items

tests/test_do_something.py::test_nothing PASSED [ 50%]
tests/test_do_something.py::test_croissant FAILED [100%]
================================== FAILURES ===================================
_______________________________ test_croissant ________________________________

    def test_croissant():
>       assert croissant()

tests/test_do_something.py:17:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

    def croissant():
        """A very flaky function."""
        if round(random.uniform(0, 1)) == 1:
            return True
        else:
>           raise Exception("Flaky test detected!")
E           Exception: Flaky test detected!

src/example_pkg/do_something.py:13: Exception
=========================== short test summary info ===========================
FAILED tests/test_do_something.py::test_croissant - Exception: Flaky test detected!
========================= 1 failed, 1 passed in 0.07s =========================
To prevent this flaky test from failing the test suite, we can choose to mark it as flaky, and optionally skip it when invoking pytest. To go about that, we first need to register a new marker, updating our project’s pyproject.toml to include additional options for a flaky mark:
# `pytest` configurations
[tool.pytest.ini_options]
markers = [
    "flaky: tests that can randomly fail through no change to the code",
]
Note that when registering a marker in this way, text after the colon is an optional mark description. Saving the document and running pytest --markers should show that a new custom marker is available to our project:
% pytest --markers
...
@pytest.mark.flaky: tests that can randomly fail through no change to the code
...
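As an aside, a marker can also be registered in code rather than in pyproject.toml, via the pytest_configure hook in a conftest.py. A minimal sketch, equivalent to the configuration above (use one approach or the other, not both):

```python
# conftest.py
def pytest_configure(config):
    """Register the custom 'flaky' marker when pytest starts up."""
    config.addinivalue_line(
        "markers",
        "flaky: tests that can randomly fail through no change to the code",
    )
```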
Now that we have confirmed our marker is available for use, we can use it to mark test_croissant() as flaky:
import pytest

from example_pkg.do_something import (
    croissant,
)


def test_nothing():
    pass


@pytest.mark.flaky
def test_croissant():
    assert croissant()
Note that we need to import pytest into our test module in order to use the pytest.mark.<MARK_NAME> decorator.
Selecting a Single Mark
Now that we have registered and marked a test as flaky, we can adapt our pytest call to execute tests with that mark only. The pattern we will use is:
pytest -v -m "<INSERT_MARK_NAME>"
% pytest -v -m "flaky"
...
============================= test session starts =============================
platform darwin -- Python 3.12.3, pytest-8.1.1, pluggy-1.5.0 -- /...
cachedir: .pytest_cache
rootdir: /...
configfile: pyproject.toml
testpaths: ./tests
collected 2 items / 1 deselected / 1 selected

tests/test_do_something.py::test_croissant PASSED [100%]

======================= 1 passed, 1 deselected in 0.05s =======================
Now we see that test_croissant() was executed, while the unmarked test_nothing() was not.
Deselecting a Single Mark
More useful than selectively running a flaky test is to deselect it. In this way, it cannot fail our test suite. This is achieved with the following pattern:
pytest -v -m "not <INSERT_MARK_NAME>"
% pytest -v -m "not flaky"
...
============================= test session starts =============================
platform darwin -- Python 3.12.3, pytest-8.1.1, pluggy-1.5.0 -- /...
cachedir: .pytest_cache
rootdir: /...
configfile: pyproject.toml
testpaths: ./tests
collected 2 items / 1 deselected / 1 selected

tests/test_do_something.py::test_nothing PASSED [100%]

======================= 1 passed, 1 deselected in 0.05s =======================
Note that this time, test_croissant() was not executed.
Selecting Multiple Marks
In this section, we will introduce another, differently marked test to illustrate the syntax for running multiple marks. For this example, we’ll test take_a_nap():
import pytest

from example_pkg.do_something import (
    croissant,
    take_a_nap,
)


def test_nothing():
    pass


@pytest.mark.flaky
def test_croissant():
    assert croissant()


@pytest.mark.slow
def test_take_a_nap():
    out = take_a_nap(how_many_seconds=3)
    assert isinstance(out, str), f"a string was not returned: {type(out)}"
    assert out == "Rise and shine!", f"unexpected string pattern: {out}"
Our new test just makes some simple assertions about the string take_a_nap() returns after snoozing. But notice what happens when running pytest -v now:
% pytest -v
...
============================= test session starts =============================
platform darwin -- Python 3.12.3, pytest-8.1.1, pluggy-1.5.0 -- /...
cachedir: .pytest_cache
rootdir: /...
configfile: pyproject.toml
testpaths: ./tests
collected 3 items

tests/test_do_something.py::test_nothing PASSED [ 33%]
tests/test_do_something.py::test_croissant PASSED [ 66%]
tests/test_do_something.py::test_take_a_nap PASSED [100%]

============================== 3 passed in 3.07s ==============================
The test suite now takes in excess of 3 seconds to execute, as the test asks take_a_nap() to sleep for that long. Let’s update our pyproject.toml and register a new mark:
# `pytest` configurations
[tool.pytest.ini_options]
markers = [
    "flaky: tests that can randomly fail through no change to the code",
    "slow: marks tests as slow (deselect with '-m \"not slow\"')",
]
Note that the nested speech marks within the description of the slow mark had to be escaped; otherwise the file would not be valid TOML and pytest would complain. In order to run tests marked with either flaky or slow, we can use or:
pytest -v -m "<INSERT_MARK_1> or <INSERT_MARK_2>"
% pytest -v -m "flaky or slow"
...
============================= test session starts =============================
platform darwin -- Python 3.12.3, pytest-8.1.1, pluggy-1.5.0 -- /...
cachedir: .pytest_cache
rootdir: /...
configfile: pyproject.toml
testpaths: ./tests
collected 3 items / 1 deselected / 2 selected

tests/test_do_something.py::test_croissant PASSED [ 50%]
tests/test_do_something.py::test_take_a_nap PASSED [100%]

======================= 2 passed, 1 deselected in 3.06s =======================
Note that anything not marked with flaky or slow (e.g. test_nothing()) was not run. Also, test_croissant() failed 3 times in a row while I tried to get a passing run; I didn’t want the flaky exception to carry on presenting itself. While I may be sprinkling glitter, I do not want to misrepresent how frustrating flaky tests can be!
Complex Selection Rules
By adding an additional mark, we can illustrate more complex selection and deselection rules for invoking pytest. Let’s write an integration test that checks whether the domain for this blog site can be reached.
import pytest

from example_pkg.do_something import (
    croissant,
    take_a_nap,
    check_site_available,
)


def test_nothing():
    pass


@pytest.mark.flaky
def test_croissant():
    assert croissant()


@pytest.mark.slow
def test_take_a_nap():
    out = take_a_nap(how_many_seconds=3)
    assert isinstance(out, str), f"a string was not returned: {type(out)}"
    assert out == "Rise and shine!", f"unexpected string pattern: {out}"


@pytest.mark.integration
def test_check_site_available():
    url = "https://thedatasavvycorner.com/"
    assert check_site_available(url), f"site {url} is down..."
Now updating our pyproject.toml like so:
# `pytest` configurations
[tool.pytest.ini_options]
markers = [
    "flaky: tests that can randomly fail through no change to the code",
    "slow: marks tests as slow (deselect with '-m \"not slow\"')",
    "integration: tests that require external resources",
]
Now we can combine and and not statements when calling pytest to execute just the tests we need to. In the below, I choose to run the slow and integration tests while excluding that pesky flaky test. Note that and binds more tightly than or, so the expression below reads as “slow, or (integration and not flaky)”; parentheses can be added to make the intent explicit.
% pytest -v -m "slow or integration and not flaky"
...
============================= test session starts =============================
platform darwin -- Python 3.12.3, pytest-8.1.1, pluggy-1.5.0 -- /...
cachedir: .pytest_cache
rootdir: /...
configfile: pyproject.toml
testpaths: ./tests
collected 4 items / 2 deselected / 2 selected

tests/test_do_something.py::test_take_a_nap PASSED [ 50%]
tests/test_do_something.py::test_check_site_available PASSED [100%]

======================= 2 passed, 2 deselected in 3.29s =======================
Note that both test_nothing() (unmarked) and test_croissant() (deselected) were not run.
Marks and Test Classes
Note that so far, we have applied marks to test functions only. But we can also apply marks to an entire test class, or even target specific test modules. For this section, I will use the wrapper function introduced earlier and group its tests together in a test class. I will mark those tests with 2 new marks, classy and subclassy.
# `pytest` configurations
[tool.pytest.ini_options]
markers = [
    "flaky: tests that can randomly fail through no change to the code",
    "slow: marks tests as slow (deselect with '-m \"not slow\"')",
    "integration: tests that require external resources",
    "classy: tests arranged in a class",
    "subclassy: test methods",
]
Updating our test module to include tests for goofy_wrapper():
import pytest

from example_pkg.do_something import (
    croissant,
    take_a_nap,
    check_site_available,
    goofy_wrapper,
)


def test_nothing():
    pass


@pytest.mark.flaky
def test_croissant():
    assert croissant()


@pytest.mark.slow
def test_take_a_nap():
    out = take_a_nap(how_many_seconds=3)
    assert isinstance(out, str), f"a string was not returned: {type(out)}"
    assert out == "Rise and shine!", f"unexpected string pattern: {out}"


@pytest.mark.integration
def test_check_site_available():
    url = "https://thedatasavvycorner.com/"
    assert check_site_available(url), f"site {url} is down..."


@pytest.mark.classy
class TestGoofyWrapper:
    @pytest.mark.subclassy
    def test_goofy_wrapper_url_exists(self):
        assert goofy_wrapper(
            "https://thedatasavvycorner.com/", 1
        ).endswith("Your site is up!"), "The site wasn't up."

    @pytest.mark.subclassy
    def test_goofy_wrapper_url_does_not_exist(self):
        assert goofy_wrapper(
            "https://thegoofycorner.com/", 1
        ).endswith("Your site is down!"), "The site wasn't down."
Note that targeting either the classy or subclassy mark results in the same output - all tests within this test class are executed:
% pytest -v -m "classy"
...
============================= test session starts =============================
platform darwin -- Python 3.12.3, pytest-8.1.1, pluggy-1.5.0 -- /...
cachedir: .pytest_cache
rootdir: /...
configfile: pyproject.toml
testpaths: ./tests
collected 6 items / 4 deselected / 2 selected

tests/test_do_something.py::TestGoofyWrapper::test_goofy_wrapper_url_exists PASSED [ 50%]
tests/test_do_something.py::TestGoofyWrapper::test_goofy_wrapper_url_does_not_exist PASSED [100%]

======================= 2 passed, 4 deselected in 2.30s =======================
Nobody has created the domain https://thegoofycorner.com/ yet. Such a shame.
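On the note of targeting whole modules mentioned earlier, pytest also recognises a module-level pytestmark variable. A small sketch of how that could look (not something the accompanying repo does):

```python
# tests/test_do_something.py
import pytest

# Every test collected from this module is treated as though it were
# decorated with each of the marks listed here.
pytestmark = [pytest.mark.classy, pytest.mark.integration]
```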
Tests with Multiple Marks
Note that we can use multiple marks with any test or test class. Let’s update TestGoofyWrapper to be marked as integration & slow:
@pytest.mark.slow
@pytest.mark.integration
@pytest.mark.classy
class TestGoofyWrapper:
    @pytest.mark.subclassy
    def test_goofy_wrapper_url_exists(self):
        assert goofy_wrapper(
            "https://thedatasavvycorner.com/", 1
        ).endswith("Your site is up!"), "The site wasn't up."

    @pytest.mark.subclassy
    def test_goofy_wrapper_url_does_not_exist(self):
        assert goofy_wrapper(
            "https://thegoofycorner.com/", 1
        ).endswith("Your site is down!"), "The site wasn't down."
This test class can now be exclusively targeted by specifying multiple marks with and:
pytest -v -m "<INSERT_MARK_1> and <INSERT_MARK_2> ... and <INSERT_MARK_N>"
% pytest -v -m "integration and slow"
...
============================= test session starts =============================
platform darwin -- Python 3.12.3, pytest-8.1.1, pluggy-1.5.0 -- /...
cachedir: .pytest_cache
rootdir: /...
configfile: pyproject.toml
testpaths: ./tests
collected 6 items / 4 deselected / 2 selected

tests/test_do_something.py::TestGoofyWrapper::test_goofy_wrapper_url_exists PASSED [ 50%]
tests/test_do_something.py::TestGoofyWrapper::test_goofy_wrapper_url_does_not_exist PASSED [100%]

======================= 2 passed, 4 deselected in 2.30s =======================
Note that even though there are other tests marked with integration and slow separately, they are excluded on the basis that and expects them to be marked with both.
Deselecting All Marks
Now that we have introduced multiple custom markers to our test suite, what if we want to exclude all of these marked tests, just running the ‘core’ test suite? Unfortunately, there is not a way to specify ‘unmarked’ tests. There is an old pytest plugin called pytest-unmarked that allowed this functionality, but it is no longer actively maintained and is not compatible with pytest v8.0.0+. You could introduce a ‘standard’ or ‘core’ marker, but you’d need to remember to mark every unmarked test within your test suite with it.
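If you did go down the ‘core’ marker route, a small conftest.py hook could apply the mark for you. A hedged sketch, assuming a core marker has been registered in pyproject.toml alongside the others:

```python
# conftest.py
import pytest


def pytest_collection_modifyitems(config, items):
    """Add a 'core' mark to any collected test that has no marks of its own."""
    for item in items:
        if not list(item.iter_markers()):
            item.add_marker(pytest.mark.core)
```

The ‘core’ tests could then be selected with pytest -m "core".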
Alternatively, what we can do is exclude each of the marks that have been registered. There are 2 patterns for achieving this:
pytest -v -m "not <INSERT_MARK_1> ... or not <INSERT_MARK_N>"
pytest -v -m "not (<INSERT_MARK_1> ... or <INSERT_MARK_N>)"
% pytest -v -m "not (flaky or slow or integration)"
...
============================= test session starts =============================
platform darwin -- Python 3.12.3, pytest-8.1.1, pluggy-1.5.0 -- /...
cachedir: .pytest_cache
rootdir: /...
configfile: pyproject.toml
testpaths: ./tests
collected 6 items / 5 deselected / 1 selected

tests/test_do_something.py::test_nothing PASSED [100%]

======================= 1 passed, 5 deselected in 0.05s =======================
Note that using or has greedily excluded any test marked with at least one of the specified marks.
Summary
Registering marks with pytest is very easy and is useful for controlling which tests are executed. We have illustrated:
- registering marks
- marking tests and test classes
- the use of the pytest -m flag
- selection of multiple marks
- deselection of multiple marks
Overall, this feature of pytest is simple and intuitive. There are more options for marking tests; I recommend reading the pytest custom markers examples for more information.
As mentioned earlier, this is the final article in the pytest in plain English series. I will be taking a break from blogging about testing for a while. But colleagues have asked about articles on property-based testing and some of the more useful pytest plug-ins. I plan to cover these topics at a later date.
If you spot an error with this article, or have a suggested improvement then feel free to raise an issue on GitHub.
Happy testing!
Acknowledgements
To past and present colleagues who have helped to discuss pros and cons, establish practice and firm up some opinions. Particularly:
- Ethan
- Sergio
fin!