"""Demonstrating tmp_path & tmp_path_factory with a simple txt file."""
from pathlib import Path
from typing import Union


def _update_a_term(
    txt_pth: Union[Path, str], target_pattern: str, replacement: str) -> str:
    """Replace the target pattern in a body of text.

    Parameters
    ----------
    txt_pth : Union[Path, str]
        Path to a txt file.
    target_pattern : str
        The pattern to replace.
    replacement : str
        The replacement value.

    Returns
    -------
    str
        String with any occurrences of target_pattern replaced with specified
        replacement value.

    """
    # The with block closes the file on exit, so no explicit close is needed.
    with open(txt_pth, "r") as f:
        txt = f.read()
    return txt.replace(target_pattern, replacement)
Pytest With tmp_path in Plain English

Introduction
pytest is a testing package for the Python programming language. It is broadly used to quality assure code logic. This article discusses why and how we use pytest’s temporary fixtures tmp_path and tmp_path_factory. This blog is the second in a series of blogs called pytest in plain English, favouring accessible language and simple examples to explain the more intricate features of the pytest package.
For a wealth of documentation, guides and how-tos, please consult the pytest documentation.
Intended Audience
Programmers with a working knowledge of Python and some familiarity with pytest and packaging. The type of programmer who has wondered about how to follow best practice in testing Python code.
What You’ll Need:
Preparation
This blog is accompanied by code in this repository. The main branch provides a template with the minimum structure and requirements expected to run a pytest suite. The repo branches contain the code used in the examples of the following sections.
Feel free to fork or clone the repo and check out the example branches as needed.
The example code that accompanies this article is available in the temp-fixtures branch of the repo.
What Are Temporary Fixtures?
In the previous pytest in plain English article, we discussed how to write our own fixtures to serve data to our tests. But pytest comes with its own set of fixtures that are really useful in certain situations. In this article, we will consider those fixtures that are used to create temporary directories and files.
Why Do We Need Temporary Fixtures?
If the code you need to test carries out file operations, then there are a few things to consider when writing your tests. It is best practice in testing to ensure the system state is unaffected by running the test suite. In the very worst cases I have encountered, running the test suite has resulted in timestamped CSVs being written to disk every time pytest was run. As developers potentially run these tests hundreds of times while working on a code base, this thoughtless little side-effect quickly results in a messy file system.
Just to clarify - I’m not saying it’s a bad idea to use timestamped file names. Or to have functions with these kinds of side effects - these features can be really useful. The problem is when the test suite creates junk on your disk that you weren’t aware of…
By using temporary fixtures, we are ensuring the tests are isolated from each other and behave in dependable ways. If you ever encounter a test suite that behaves differently on subsequent runs, then be suspicious of a messy test suite with file operations that have changed the state of the system. In order for us to reason about the state of the code, we need to be able to rely on the answers we get from the tests, known in test engineering speak as determinism.
Let’s Compare the Available Temporary Fixtures
The 2 fixtures that we should be working with as of 2024 are tmp_path and tmp_path_factory. Both of these newer temporary fixtures return pathlib.Path objects and are included with the pytest package in order to encourage developers to use them. There is no need to import tempfile or any other dependency to get what you need: it’s all bundled up with your pytest installation.
tmp_path is a function-scoped fixture, meaning that if we use tmp_path in 2 unit tests, then we will be served with 2 separate temporary directories to work with. This should meet most developers’ needs. But if you’re doing something more complex with files, there are occasions where you may need a more persistent temporary directory. Perhaps a bunch of your functions need to work sequentially using files on disk and you need to test how all these units work together. This kind of scenario can arise if you are working on really large files where in-memory operations become too costly. This is where tmp_path_factory can be useful, as it is a session-scoped temporary structure. A tmp_path_factory structure will be created at the start of a test suite and will persist until teardown happens once the last test has been executed.
Fixture Name | Scope | Teardown after each |
---|---|---|
tmp_path | function | test function |
tmp_path_factory | session | pytest session |
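As a rough sketch of what function scope means in practice, the two tests below each request tmp_path and each receive their own empty directory. The test and file names are made up for illustration and are not part of the accompanying repo.

```python
from pathlib import Path


def test_first_gets_its_own_directory(tmp_path: Path) -> None:
    """Write a file into the directory pytest created for this test."""
    (tmp_path / "scratch.txt").write_text("hello")
    assert [p.name for p in tmp_path.iterdir()] == ["scratch.txt"]


def test_second_starts_empty(tmp_path: Path) -> None:
    """A different test is handed a fresh, empty directory."""
    assert tmp_path.is_dir()
    assert list(tmp_path.iterdir()) == []
```

Note that no import is needed for the fixture itself: naming tmp_path in the test signature is enough for pytest to inject it.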
What About tmpdir?
Ah, the eagle-eyed among you may have noticed that the pytest package contains other fixtures that are relevant to temporary structures, namely tmpdir and tmpdir_factory. These fixtures are older equivalents of the fixtures we discussed above. The main difference is that instead of returning pathlib.Path objects, they return py.path.local objects. These fixtures were written before pathlib had been adopted as the standardised approach to handling paths across multiple operating systems. Deprecating tmpdir and tmpdir_factory has been discussed. These fixtures are being sunsetted and it is advised to port old test suites over to the new tmp_path fixture instead. The pytest team has provided a utility to help developers identify these issues in their old test suites.
In summary, don’t use tmpdir any more and consider converting old code if you used it in the past…
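To give a feel for what porting involves, here is a minimal sketch with a made-up test. The legacy version appears only as a comment for comparison; the runnable version uses tmp_path and pathlib.Path methods. Real test suites will need more than this one-to-one swap, so treat it as an illustration rather than a migration recipe.

```python
from pathlib import Path

# Legacy style, shown only for comparison (tmpdir is a py.path.local):
#
#     def test_writes_note(tmpdir):
#         note = tmpdir.join("note.txt")
#         note.write("remember the milk")
#         assert note.read() == "remember the milk"


def test_writes_note(tmp_path: Path) -> None:
    """The same (hypothetical) test ported to tmp_path."""
    note = tmp_path / "note.txt"      # replaces tmpdir.join(...)
    note.write_text("remember the milk")  # replaces .write(...)
    assert note.read_text() == "remember the milk"  # replaces .read()
```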
How to Use Temporary Fixtures
Writing Source Code
As a reminder, the code for this section is located here.
In this deliberately silly example, let’s say we have a poem sitting on our disk in a text file. Thanks to chatGPT for the poem and MSFT Bing Copilot for the image, making this a trivial consideration. Or should I really thank the millions of people who wrote the content that these services trained on?
Save the text file in the chunk below to the ./tests/data/ folder, which is where you would typically keep data for your tests.
tests/data/jack-jill-2024.txt
In the realm of data, where Jack and Jill dwell,
They ventured forth, their tale to tell.
But amidst the bytes, a glitch they found,
A challenge profound, in algorithms bound.
Their circuits whirred, their processors spun,
As they analyzed the glitch, one by one.
Yet despite their prowess, misfortune struck,
A bug so elusive, like lightning struck.
Their systems faltered, errors abound,
As frustration grew with each rebound.
But Jack and Jill, with minds so keen,
Refused to let the glitch remain unseen.
With perseverance strong and logic clear,
They traced the bug to its hidden sphere.
And with precision fine and code refined,
They patched the glitch, their brilliance defined.
In the end, though misfortune came their way,
Jack and Jill triumphed, without delay.
For in the realm of AI, where challenges frown,
Their intellect prevailed, wearing victory's crown.
So let their tale inspire, in bytes and code,
Where challenges rise on the digital road.
For Jack and Jill, with their AI might,
Showed that even in darkness, there's always light.
Let’s imagine we need a program that can edit the text and write new versions of the poem to disk. Let’s go ahead and create a function that will read the poem from disk and replace any word that you’d like to change.
Now we can try using the function to rename a character in the rhyme, by running the below code in a python shell.
from pyprojroot import here

rhyme = _update_a_term(
    txt_pth=here("data/blogs/jack-jill-2024.txt"),
    target_pattern="Jill",
    replacement="Jock")
print(rhyme[0:175])
In the realm of data, where Jack and Jock dwell,
They ventured forth, their tale to tell.
But amidst the bytes, a glitch they found,
A challenge profound, in algorithms bound.
You may have noticed that the above function starts with an underscore. This convention means the function is not intended for use by the user. These internal functions would typically have fewer defensive checks than those you intend to expose to your users. It’s not an enforced thing but is considered good practice. It means “use at your own risk” as internals often have less documentation, may not be directly tested and could be less stable than functions in the API.
Great, next we need a little utility function that will take our text and write it to a file of our choosing.
def _write_string_to_txt(some_txt: str, out_pth: Union[Path, str]) -> None:
    """Write some string to a text file.

    Parameters
    ----------
    some_txt : str
        The text to write to file.
    out_pth : Union[Path, str]
        The path to the file.

    Returns
    -------
    None

    """
    # write() takes a single string; the with block closes the file for us.
    with open(out_pth, "w") as f:
        f.write(some_txt)
Finally, we need a wrapper function that will use the above functions, allowing the user to read in the text file, replace a pattern and then write the new poem to file.
def update_poem(
    poem_pth: Union[Path, str],
    target_pattern: str,
    replacement: str,
    out_file: Union[Path, str]) -> None:
    """Takes a txt file, replaces a pattern and writes to a new file.

    Parameters
    ----------
    poem_pth : Union[Path, str]
        Path to a txt file.
    target_pattern : str
        A pattern to update.
    replacement : str
        The replacement value.
    out_file : Union[Path, str]
        A file path to write to.

    """
    txt = _update_a_term(poem_pth, target_pattern, replacement)
    _write_string_to_txt(txt, out_file)
How do we know it works? We can use it and observe the output, as I did with _update_a_term() earlier, but this article is about testing. So let’s get to it.
Testing the Source Code
We need to test update_poem() but it writes files to disk. We don’t want to litter our (and our colleagues’) disks with files every time pytest runs. Therefore we need to ensure the function’s out_file parameter is pointing at a temporary directory. In that way, the files land in pytest’s own temporary area rather than in our project, and we can rely on pytest to clean up after itself (by default it keeps only the temporary directories from the three most recent runs and removes older ones).
"""Tests for update_poetry module."""
import os

import pytest

from example_pkg import update_poetry


def test_update_poem_writes_new_pattern_to_file(tmp_path):
    """Check that update_poem changes the poem pattern and writes to file."""
    new_poem_path = os.path.join(tmp_path, "new_poem.txt")
    update_poetry.update_poem(
        poem_pth="tests/data/jack-jill-2024.txt",
        target_pattern="glitch",
        replacement="bug",
        out_file=new_poem_path
    )
Before I go ahead and add a bunch of assertions in, look at how easy it is to use tmp_path. Blink and you’ll miss it. You simply reference it in the signature of the test where you wish to use it and then you are able to work with it like you would any other path object.
So far in this test function, I specified that I’d like to read the text from a file called jack-jill-2024.txt, replace the word “glitch” with “bug” wherever it occurs and then write this text to a file called new_poem.txt in a temporary directory.
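Because tmp_path is a pathlib.Path, you can also skip os.path.join() and use the / operator and the Path read and write methods directly. The sketch below assumes a small stand-in function called shout(), invented here so the example runs on its own; it is not part of the example package.

```python
from pathlib import Path


def shout(in_pth: Path, out_pth: Path) -> None:
    """Hypothetical stand-in for update_poem(): upper-case a file's text."""
    out_pth.write_text(in_pth.read_text().upper())


def test_shout_writes_upper_case(tmp_path: Path) -> None:
    # The / operator joins path segments, much like os.path.join().
    source = tmp_path / "poem.txt"
    target = tmp_path / "loud_poem.txt"
    source.write_text("Jack and Jill")

    shout(source, target)

    assert target.exists()
    assert target.read_text() == "JACK AND JILL"
```

Whether you prefer os.path or pathlib is largely a matter of taste; both work with the directory that tmp_path provides.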
Some simple tests for this little function:
- Does the file I asked for exist?
- Are the contents of that file as I expect?
Let’s go ahead and add in those assertions.
"""Tests for update_poetry module."""
import os

import pytest

from example_pkg import update_poetry


def test_update_poem_writes_new_pattern_to_file(tmp_path):
    """Check that update_poem changes the poem pattern and writes to file."""
    new_poem_path = os.path.join(tmp_path, "new_poem.txt")
    update_poetry.update_poem(
        poem_pth="tests/data/jack-jill-2024.txt",
        target_pattern="glitch",
        replacement="bug",
        out_file=new_poem_path
    )
    # Now for the assertions
    assert os.path.exists(new_poem_path)
    assert os.listdir(tmp_path) == ["new_poem.txt"]
    # let's check what pattern was written - now we need to read in the
    # contents of the new file.
    with open(new_poem_path, "r") as f:
        what_was_written = f.read()
    assert "glitch" not in what_was_written
    assert "bug" in what_was_written
Running pytest results in the below output.
collected 1 item
tests/test_update_poetry.py . [100%]
============================== 1 passed in 0.01s ==============================
So we prove that the function works how we hoped it would. But what if I want to work with the new_poem.txt file again in another test function? Let’s add another test to test_update_poetry.py and see what we get when we try to use tmp_path once more.
"""Tests for update_poetry module."""
# import statements ...

# def test_update_poem_writes_new_pattern_to_file(tmp_path): ...

def test_do_i_get_a_new_tmp_path(tmp_path):
    """Remind ourselves that tmp_path is function-scoped."""
    # os.listdir() returns file names, so compare against the full name.
    assert "new_poem.txt" not in os.listdir(tmp_path)
    assert os.listdir(tmp_path) == []
As is demonstrated when running pytest once more, tmp_path is function-scoped and we have now lost the new poem with the bugs instead of the glitches. Drat! What to do…
collected 2 items
tests/test_update_poetry.py .. [100%]
============================== 2 passed in 0.01s ==============================
As mentioned earlier, pytest provides another fixture with more flexibility, called tmp_path_factory. As this fixture is session-scoped, we can request it from a fixture of any scope and so control how long our temporary directory lives.
For a refresher on the rules of scope referencing, please see the blog Pytest Fixtures in Plain English.
"""Tests for update_poetry module."""
# import statements ...

# def test_update_poem_writes_new_pattern_to_file(tmp_path): ...

# def test_do_i_get_a_new_tmp_path(tmp_path): ...

@pytest.fixture(scope="module")
def _module_scoped_tmp(tmp_path_factory):
    yield tmp_path_factory.mktemp("put_poetry_here", numbered=False)
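A quick aside on the numbered argument used in the fixture above. As far as I understand it, mktemp() numbers its directories by default, so two calls with the same basename give two separate directories, whereas numbered=False uses the basename exactly as given and so can only be created once per session. The sketch below, with a made-up test name, illustrates the default behaviour.

```python
from pathlib import Path


def test_mktemp_numbers_directories_by_default(tmp_path_factory) -> None:
    """Two mktemp() calls with one basename give two distinct directories."""
    first: Path = tmp_path_factory.mktemp("poems")
    second: Path = tmp_path_factory.mktemp("poems")

    assert first != second
    assert first.is_dir() and second.is_dir()
    # Both are created under the same base temporary directory.
    assert first.parent == second.parent
    # The requested basename is kept as a prefix of the directory name.
    assert first.name.startswith("poems")
```

Using numbered=False, as the fixture above does, gives a predictable directory name, which can make the temporary files easier to find when debugging.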
Note that as tmp_path_factory is session-scoped, I’m free to reference it in another fixture with any scope. Here I define a module-scoped fixture, which means teardown of _module_scoped_tmp will occur once the final test in this test module completes. Now repeating the logic executed with tmp_path above, but this time with our new module-scoped temporary directory, we get a different outcome.
"""Tests for update_poetry module."""
# import statements ...

# def test_update_poem_writes_new_pattern_to_file(tmp_path): ...

# def test_do_i_get_a_new_tmp_path(tmp_path): ...

@pytest.fixture(scope="module")
def _module_scoped_tmp(tmp_path_factory):
    yield tmp_path_factory.mktemp("put_poetry_here", numbered=False)


def test_module_scoped_tmp_exists(_module_scoped_tmp):
    new_poem_path = os.path.join(_module_scoped_tmp, "new_poem.txt")
    update_poetry.update_poem(
        poem_pth="tests/data/jack-jill-2024.txt",
        target_pattern="glitch",
        replacement="bug",
        out_file=new_poem_path
    )
    assert os.path.exists(new_poem_path)
    with open(new_poem_path, "r") as f:
        what_was_written = f.read()
    assert "glitch" not in what_was_written
    assert "bug" in what_was_written
    assert os.listdir(_module_scoped_tmp) == ["new_poem.txt"]


def test_do_i_get_a_new_tmp_path_factory(_module_scoped_tmp):
    assert os.listdir(_module_scoped_tmp) != []  # not empty...
    assert os.listdir(_module_scoped_tmp) == ["new_poem.txt"]
    # module-scoped fixture still contains file made in previous test function
    with open(os.path.join(_module_scoped_tmp, "new_poem.txt")) as f:
        found_txt = f.read()
    assert "glitch" not in found_txt
    assert "bug" in found_txt
Executing pytest one final time demonstrates that the same output file written to disk with test_module_scoped_tmp_exists() is subsequently available for further testing in test_do_i_get_a_new_tmp_path_factory().
collected 4 items
tests/test_update_poetry.py .... [100%]
============================== 4 passed in 0.01s ==============================
Note that the order in which these 2 tests run is now important. These tests are no longer isolated, and trying to run the second test on its own with pytest -k "test_do_i_get_a_new_tmp_path_factory" would result in a failure. For this reason, it may be advisable to group the test functions within a common test class, or even use pytest marks to mark them as integration tests (more on this in a future blog).
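When creating the file is cheap, another option is to sidestep the ordering problem altogether by letting every test arrange its own file. The sketch below assumes a hypothetical helper, _write_bug_poem(), and made-up test names; it trades the shared directory for independence, so it will not suit the large-file scenario described earlier.

```python
from pathlib import Path


def _write_bug_poem(directory: Path) -> Path:
    """Hypothetical arrange step: create the file the tests rely on."""
    poem = directory / "new_poem.txt"
    poem.write_text("A bug so elusive, like lightning struck.")
    return poem


def test_poem_file_is_created(tmp_path: Path) -> None:
    poem = _write_bug_poem(tmp_path)
    assert poem.exists()


def test_poem_mentions_bug(tmp_path: Path) -> None:
    # Arranges its own file, so it also passes when selected alone with -k.
    poem = _write_bug_poem(tmp_path)
    assert "bug" in poem.read_text()
    assert "glitch" not in poem.read_text()
```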
Summary
The reasons we use temporary fixtures and how to use them have been demonstrated with another silly (but hopefully relatable) little example. I have not gone into the wealth of methods available in these temporary fixtures, but they have many useful utilities. Maybe you’re working with a complex nested directory structure, for example. The glob method would surely help with that.
Below are the public methods and attributes of tmp_path:
['absolute', 'anchor', 'as_posix', 'as_uri', 'chmod', 'cwd', 'drive', 'exists',
'expanduser', 'glob', 'group', 'hardlink_to', 'home', 'is_absolute',
'is_block_device', 'is_char_device', 'is_dir', 'is_fifo', 'is_file',
'is_junction', 'is_mount', 'is_relative_to', 'is_reserved', 'is_socket',
'is_symlink', 'iterdir', 'joinpath', 'lchmod', 'lstat', 'match', 'mkdir',
'name', 'open', 'owner', 'parent', 'parents', 'parts', 'read_bytes',
'read_text', 'readlink', 'relative_to', 'rename', 'replace', 'resolve',
'rglob', 'rmdir', 'root', 'samefile', 'stat', 'stem', 'suffix', 'suffixes',
'symlink_to', 'touch', 'unlink', 'walk', 'with_name', 'with_segments',
'with_stem', 'with_suffix', 'write_bytes', 'write_text']
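As a small illustration of the glob idea mentioned above, the sketch below builds a nested structure inside tmp_path and uses rglob(), the recursive cousin of glob(), to collect every txt file. The directory and file names are invented for the example.

```python
from pathlib import Path


def test_rglob_finds_nested_txt_files(tmp_path: Path) -> None:
    # Build a small nested structure inside the temporary directory.
    (tmp_path / "drafts" / "2024").mkdir(parents=True)
    (tmp_path / "drafts" / "2024" / "poem.txt").write_text("v1")
    (tmp_path / "final.txt").write_text("v2")
    (tmp_path / "notes.md").write_text("not a poem")

    # rglob() searches tmp_path and all of its subdirectories.
    found = sorted(
        p.relative_to(tmp_path).as_posix() for p in tmp_path.rglob("*.txt")
    )
    assert found == ["drafts/2024/poem.txt", "final.txt"]
```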
It is useful to read the pathlib.Path docs, as both fixtures give you objects of this type and the methods above are inherited from it. To read the tmp_path and tmp_path_factory implementation, I recommend reading the tmp docstrings on GitHub.
If you spot an error with this article, or have a suggested improvement, then feel free to raise an issue on GitHub.
Happy testing!
Acknowledgements
To past and present colleagues who have helped to discuss pros and cons, establishing practice and firming-up some opinions. Particularly:
- Charlie
- Dan
- Edward
- Ian
- Mark
fin!