Filesystem Isolation for Hermetic Tests¶

What is Filesystem Isolation?¶

Filesystem isolation is a test enforcement mechanism that prevents small tests from accessing the filesystem during execution. This keeps tests hermetic: they run entirely in memory, with no external dependencies.

When enabled, the pytest-test-categories plugin intercepts filesystem operations and either blocks them or warns about them, depending on your configuration.

Why Filesystem Isolation Matters¶

Tests that access the filesystem introduce several problems:

Side Effects¶

Filesystem-dependent tests create unpredictable side effects:

Files created by one test may persist and affect subsequent tests
Tests may overwrite or delete files needed by other tests
Parallel test execution leads to race conditions on shared files
CI environments may have different filesystem layouts than local development

State Leakage¶

External filesystem state makes tests non-deterministic:

Tests depend on specific files existing at specific paths
Configuration files vary between environments
Data files may change between test runs
Paths are often platform-specific (Windows vs. Unix)

Slow Tests¶

Disk I/O adds latency that compounds across your test suite:

File operations are orders of magnitude slower than memory operations
Even fast SSDs have access latencies far higher than RAM (typically microseconds versus nanoseconds)
Network filesystems (NFS, CIFS) add significant latency
Disk contention increases as tests run in parallel

Non-Hermeticity¶

Tests that read or write external files are not self-contained:

Cannot run reliably in isolated CI containers
May fail when paths differ between developers
Difficult to parallelize safely
Hard to reproduce failures

Google’s Test Size Definitions¶

The filesystem isolation feature implements Google’s test size definitions from “Software Engineering at Google”:

Test Size	Filesystem Access	Rationale
Small	Blocked (no exceptions)	Must be hermetic, run in memory only
Medium	Allowed	May use local filesystem for integration tests
Large	Allowed	Integration tests may access real filesystems
XLarge	Allowed	End-to-end tests may access real filesystems

Small Tests¶

Small tests are the foundation of a healthy test suite. They must be:

Fast: Complete in under 1 second
Hermetic: No external dependencies
Deterministic: Same input always produces same output
Parallelizable: Safe to run concurrently with other tests

Filesystem isolation enforces strict hermeticity by blocking all filesystem access in small tests, with no exceptions. A test that needs filesystem access should either be marked @pytest.mark.medium or replace the filesystem with pyfakefs or an in-memory io.StringIO/io.BytesIO object.

Philosophy: No Escape Hatches

The “no escape hatches” philosophy means:

If a test needs filesystem access AT ALL, it’s not a small test
Small tests must be pure: no I/O of any kind
tmp_path is still filesystem I/O, even if it’s “isolated”
Tests needing file operations should use @pytest.mark.medium or mock with pyfakefs/io.StringIO

Medium, Large, and XLarge Tests¶

Medium, large, and XLarge tests may access the filesystem freely, enabling:

File-based integration tests
Configuration file parsing tests
Log file generation tests
Data import/export tests

How It Works¶

The plugin intercepts filesystem operations by patching Python’s built-in functions and modules:

Patched Entry Points¶

The following filesystem entry points are intercepted:

builtins.open - Primary file open function
io.open - Alias for built-in open
pathlib.Path.open - pathlib file access
pathlib.Path.read_text, Path.read_bytes - Direct read methods
pathlib.Path.write_text, Path.write_bytes - Direct write methods
os.open, os.mkdir, os.remove, etc. - Low-level operations
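
To illustrate the general idea of intercepting an entry point, here is a minimal sketch that temporarily replaces builtins.open with a function that raises. It is not the plugin's implementation: block_open and FilesystemAccessBlocked are hypothetical names, and a real interceptor would also need to cover the other entry points listed above.

```python
import builtins
from contextlib import contextmanager


class FilesystemAccessBlocked(RuntimeError):
    """Hypothetical error type, standing in for the plugin's own exception."""


@contextmanager
def block_open():
    """Temporarily replace builtins.open with a function that always raises."""
    original_open = builtins.open

    def guarded_open(file, mode="r", *args, **kwargs):
        raise FilesystemAccessBlocked(f"Attempted open({file!r}, mode={mode!r})")

    builtins.open = guarded_open
    try:
        yield
    finally:
        # Always restore the real function, even if the body raised.
        builtins.open = original_open
```

Rebinding builtins.open does not affect io.open, which is a separate module attribute that refers to the same function. That is one reason an interceptor has to patch each entry point individually.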

Operation Categories¶

Filesystem operations are categorized as:

Operation	Description	Examples
READ	Read file contents	`open()` for reading, `Path.read_text()`
WRITE	Write file contents	`open()` for writing, `Path.write_text()`
CREATE	Create files/directories	`mkdir()`, `touch()`, `open(path, "x")`
DELETE	Remove files/directories	`os.remove()`, `Path.unlink()`, `shutil.rmtree()`
MODIFY	Change file attributes	`chmod()`, `chown()`, `rename()`
STAT	Read file metadata	`stat()`, `exists()`, `is_file()`
LIST	List directory contents	`listdir()`, `scandir()`, `iterdir()`

All operations are blocked for small tests, including STAT operations. This ensures tests do not depend on external filesystem state.
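
The categories in the table could be modeled as an enumeration. The sketch below is an assumption for illustration only: FilesystemOperation and classify_open_mode are hypothetical names, and the rule for mapping open() modes to categories is a simplification (for example, mode "w" also creates a missing file, but is treated as WRITE here).

```python
from enum import Enum


class FilesystemOperation(Enum):
    """Hypothetical enumeration mirroring the categories in the table."""

    READ = "read"
    WRITE = "write"
    CREATE = "create"
    DELETE = "delete"
    MODIFY = "modify"
    STAT = "stat"
    LIST = "list"


def classify_open_mode(mode: str = "r") -> FilesystemOperation:
    """Map an open() mode string to an operation category (simplified)."""
    if "x" in mode:
        # Exclusive creation: fails if the file already exists.
        return FilesystemOperation.CREATE
    if any(flag in mode for flag in "wa+"):
        # Any mode that can change file contents counts as a write.
        return FilesystemOperation.WRITE
    return FilesystemOperation.READ
```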

Enabling Filesystem Isolation¶

Filesystem isolation is controlled by the test_categories_enforcement configuration option, the same option used for network isolation.

Configuration via pyproject.toml¶

[tool.pytest.ini_options]
# Enable filesystem and network isolation enforcement
test_categories_enforcement = "strict"

Configuration via pytest.ini¶

[pytest]
test_categories_enforcement = strict

Configuration via Command Line¶

pytest --test-categories-enforcement=strict

Enforcement Modes¶

The plugin supports three enforcement modes:

STRICT Mode¶

test_categories_enforcement = "strict"

In strict mode, filesystem violations immediately fail the test with a detailed error message:

============================================================
HermeticityViolationError
============================================================
Test: tests/test_reports.py::test_save_report
Category: SMALL
Violation: Filesystem access attempted

Details:
  Attempted write on: /home/user/project/output/report.txt

Small tests have restricted resource access. Options:
  1. Use pyfakefs for comprehensive filesystem mocking (pip install pyfakefs)
  2. Use io.StringIO or io.BytesIO for in-memory file-like objects
  3. Mock file operations using pytest-mock (mocker.patch("builtins.open", ...))
  4. Embed test data as Python constants or use importlib.resources
  5. Change test category to @pytest.mark.medium (if filesystem access is required)

Documentation: See docs/architecture/adr-002-filesystem-isolation.md
============================================================

Use strict mode in CI pipelines to catch violations before merge.

WARN Mode¶

test_categories_enforcement = "warn"

In warn mode, filesystem violations emit a warning but allow the test to continue:

PytestWarning: Filesystem access violation in test_save_report:
attempted write on /home/user/project/output/report.txt

Use warn mode during migration to identify violations without breaking the build.

OFF Mode¶

test_categories_enforcement = "off"

In off mode, filesystem isolation is disabled entirely. Use this for:

Legacy test suites not yet ready for enforcement
Specific test runs that require filesystem access
Debugging filesystem-related test issues
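
The difference between the three modes can be summarized in a small dispatcher. This is a sketch under stated assumptions, not the plugin's code: EnforcementMode, HermeticityViolation, and handle_violation are hypothetical names, and UserWarning stands in for the PytestWarning shown above so the example has no pytest dependency.

```python
import warnings
from enum import Enum


class EnforcementMode(Enum):
    STRICT = "strict"
    WARN = "warn"
    OFF = "off"


class HermeticityViolation(Exception):
    """Stand-in for the plugin's HermeticityViolationError."""


def handle_violation(mode: EnforcementMode, test_name: str, operation: str, path: str) -> None:
    """React to a detected filesystem access according to the enforcement mode."""
    message = f"Filesystem access violation in {test_name}: attempted {operation} on {path}"
    if mode is EnforcementMode.STRICT:
        # Strict: fail the test immediately.
        raise HermeticityViolation(message)
    if mode is EnforcementMode.WARN:
        # Warn: report the violation but let the test continue.
        warnings.warn(message, UserWarning, stacklevel=2)
    # Off: take no action.
```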

Understanding Error Messages¶

When a filesystem violation occurs, the error message provides:

Test identification: The full pytest node ID
Category: The test size (SMALL, MEDIUM, etc.)
Operation: The type of operation attempted (read, write, etc.)
Path: The path that was accessed
Remediation options: Specific suggestions for fixing the violation
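
As a rough illustration of how these fields fit together, the sketch below renders the first four of them in the layout of the strict-mode message. ViolationReport is a hypothetical class, not part of the plugin's API, and the remediation text is omitted for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ViolationReport:
    """Hypothetical container for the fields shown in a violation message."""

    test_id: str    # full pytest node ID
    category: str   # test size, e.g. "SMALL"
    operation: str  # e.g. "read" or "write"
    path: str       # path that was accessed

    def summary(self) -> str:
        """Render the fields in the same order as the strict-mode message."""
        return (
            f"Test: {self.test_id}\n"
            f"Category: {self.category}\n"
            "Violation: Filesystem access attempted\n"
            "Details:\n"
            f"  Attempted {self.operation} on: {self.path}"
        )
```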

Example Error Analysis¶

Attempted write on: /home/user/project/output/report.txt

This tells you:

The test tried to write a file (not just read)
The path is being blocked for this small test
You need to either mock the write, use pyfakefs, or re-categorize the test with @pytest.mark.medium

Common Remediation Strategies¶

1. Use pyfakefs¶

For comprehensive filesystem mocking (recommended):

import pytest

@pytest.mark.small
def test_with_fake_filesystem(fs):  # "fs" is the fixture provided by pyfakefs
    # load_config is the application function under test
    fs.create_file("/etc/myapp/config.ini", contents="key=value")
    config = load_config("/etc/myapp/config.ini")
    assert config["key"] == "value"

2. Use io.StringIO or io.BytesIO¶

For tests that need file-like objects but not actual files:

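from io import StringIO

import pytest

@pytest.mark.small
def test_csv_writer():
    data = [{"header1": "a", "header2": "b"}]  # sample rows; the shape depends on write_csv
    buffer = StringIO()
    write_csv(buffer, data)  # write_csv is the application function under test
    assert "header1,header2" in buffer.getvalue()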
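
For a version that runs on its own, the sketch below pairs the test with a hypothetical write_csv built on the standard csv module. The function signature and the row format are assumptions made for illustration. The pytest marker is left out so the block has no third-party imports; in a real suite the test would carry @pytest.mark.small.

```python
import csv
from io import StringIO
from typing import Iterable, Sequence, TextIO


def write_csv(output: TextIO, rows: Iterable[Sequence[str]]) -> None:
    """Hypothetical function under test: writes rows to any text stream."""
    writer = csv.writer(output, lineterminator="\n")
    writer.writerows(rows)


def test_csv_writer_in_memory():
    # The buffer behaves like an open text file but never touches the disk.
    buffer = StringIO()
    write_csv(buffer, [("header1", "header2"), ("a", "b")])
    assert buffer.getvalue() == "header1,header2\na,b\n"
```

Because write_csv accepts a stream rather than a path, the same function can write to a real file in production and to a StringIO in a small test.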

3. Mock File Operations¶

Use pytest-mock to intercept file operations:

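import pytest

@pytest.mark.small
def test_config_loader(mocker):  # "mocker" is the fixture provided by pytest-mock
    # Replace open() so load_config reads from memory instead of disk
    mocker.patch("builtins.open", mocker.mock_open(read_data="key=value"))
    config = load_config("/etc/myapp/config.ini")
    assert config["key"] == "value"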
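
The same technique works with the standard library's unittest.mock, which pytest-mock wraps. The sketch below is self-contained: load_config is a hypothetical key=value loader written for this example, and the pytest marker is omitted so the block has no third-party imports. How the plugin's own patching interacts with a mocked open is not covered here.

```python
from unittest import mock


def load_config(path: str) -> dict[str, str]:
    """Hypothetical loader: parses key=value lines from a file."""
    config = {}
    with open(path, encoding="utf-8") as handle:
        for line in handle.read().splitlines():
            if "=" in line:
                key, value = line.split("=", 1)
                config[key.strip()] = value.strip()
    return config


def test_config_loader_with_mock_open():
    # mock_open returns a fake open() whose handle yields read_data.
    fake_open = mock.mock_open(read_data="key=value\n")
    with mock.patch("builtins.open", fake_open):
        config = load_config("/etc/myapp/config.ini")
    assert config == {"key": "value"}
    # Verify the loader asked for the expected path without touching the disk.
    fake_open.assert_called_once_with("/etc/myapp/config.ini", encoding="utf-8")
```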

4. Embed Test Data¶

For read-only test data, embed it in your test code:

from io import StringIO

import pytest

TEST_CONFIG = """
[database]
host = localhost
port = 5432
"""

@pytest.mark.small
def test_config_parser():
    config = parse_config(StringIO(TEST_CONFIG))
    assert config["database"]["host"] == "localhost"
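
A runnable variant of this pattern, assuming the configuration is INI-formatted, can use the standard configparser module. Here parse_config is a hypothetical function written for the example, and the pytest marker is omitted so the block has no third-party imports.

```python
import configparser
from io import StringIO

TEST_CONFIG = """
[database]
host = localhost
port = 5432
"""


def parse_config(stream) -> configparser.ConfigParser:
    """Hypothetical parser: reads INI data from any text stream."""
    parser = configparser.ConfigParser()
    parser.read_file(stream)
    return parser


def test_config_parser_from_constant():
    # The test data lives in the module, so no file is read.
    config = parse_config(StringIO(TEST_CONFIG))
    assert config["database"]["host"] == "localhost"
    assert config.getint("database", "port") == 5432
```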

5. Use importlib.resources¶

For package data files:

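import json
from importlib import resources

import pytest

@pytest.mark.small
def test_load_schema():
    # files() is the preferred API; the module-level read_text() was deprecated in Python 3.11
    schema_file = resources.files("mypackage.schemas").joinpath("user.json")
    schema = json.loads(schema_file.read_text(encoding="utf-8"))
    assert "properties" in schema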

6. Change Test Size¶

If the test legitimately requires filesystem access, change its category:

@pytest.mark.medium  # Medium tests can access filesystem
def test_large_file_processing(tmp_path):
    test_file = tmp_path / "dataset.csv"
    # ... setup and test

Best Practices¶

1. Start with WARN Mode¶

When first enabling filesystem isolation, use warn mode to identify all violations:

pytest --test-categories-enforcement=warn 2>&1 | grep "Filesystem access violation"

2. Fix Violations Systematically¶

Work through violations methodically, preferring fixes that keep the test small:

Start with the small tests that run most often
Use pyfakefs for tests that need filesystem semantics
Use io.StringIO/io.BytesIO for file-like objects
Change test size only when real filesystem access is essential

3. Use Dependency Injection¶

Design code to accept file paths or file-like objects as parameters:

import json
from io import StringIO
from pathlib import Path
from typing import TextIO

import pytest

# Production code (the "Path | TextIO" annotation requires Python 3.10+)
def save_report(data: dict, output: Path | TextIO) -> None:
    if isinstance(output, Path):
        output.write_text(json.dumps(data))
    else:
        output.write(json.dumps(data))

# Test code - small test with an in-memory stream
@pytest.mark.small
def test_save_report_to_stream():
    buffer = StringIO()
    save_report({"key": "value"}, buffer)
    assert '"key"' in buffer.getvalue()

# Test code - medium test with real file
@pytest.mark.medium
def test_save_report_to_file(tmp_path):
    output_file = tmp_path / "report.json"
    save_report({"key": "value"}, output_file)
    assert output_file.exists()

4. Consider Test Size Carefully¶

If a test genuinely requires filesystem access, consider whether it belongs in a different size category:

Small: Pure functions, in-memory operations, mocked I/O (pyfakefs, io.StringIO)
Medium: File operations with tmp_path, config file parsing, local databases
Large: Integration with real filesystem paths, external services