pytest plugin

unihan-etl ships a pytest plugin that downloads UNIHAN.zip once and reuses it across tests, and that provides an isolated home directory for cache and config setup. The plugin is auto-discovered via the pytest11 entry point: installing unihan-etl is enough to make every fixture below available in your tests. See the test suite for usage examples.

Quick Start

Add a fixture name as a test parameter — pytest creates and injects it automatically. You never call fixtures yourself.

def test_quick_packager(unihan_quick_packager) -> None:
    unihan_quick_packager.download()
    unihan_quick_packager.export()
    assert unihan_quick_packager.options.destination.exists()


def test_with_raw_snippet(unihan_quick_data: str) -> None:
    assert "kCantonese" in unihan_quick_data

Which Fixture Do I Need?


Dataset Bootstrap

The primary injection points for tests that need a working UNIHAN dataset.

fixture unihan_etl.pytest_plugin.unihan_quick_packager Packager
session fixture

Bootstrap a small but effective portion of UNIHAN and return a Packager.

Note

Created once per test session and shared across all tests. Requesting this fixture does not create a new instance per test.

Example

def test_quick(unihan_quick_packager) -> None:
    unihan_quick_packager.download()
    unihan_quick_packager.export()
    assert unihan_quick_packager.options.destination.exists()

fixture unihan_etl.pytest_plugin.unihan_full_packager Packager
session fixture

Return a Packager for the “full” UNIHAN dataset.

Note

Created once per test session and shared across all tests. Requesting this fixture does not create a new instance per test.

fixture unihan_etl.pytest_plugin.unihan_ensure_quick None
session fixture

Ensure a small but effective portion of UNIHAN is bootstrapped. The fixture itself returns None.

Note

Created once per test session and shared across all tests. Requesting this fixture does not create a new instance per test.

>>> import pathlib
>>> from unihan_etl.core import Packager
>>> from unihan_etl.options import Options as UnihanOptions
>>> def test_unihan_ensure_quick(
...     unihan_quick_path: pathlib.Path,
...     unihan_quick_options: "UnihanOptions",
...     unihan_quick_packager: "Packager",
... ) -> None:
...     unihan_quick_destination = unihan_quick_options.destination
...     assert unihan_quick_destination.exists()
...     assert unihan_quick_destination.stat().st_size >= 140_000
...     assert unihan_quick_destination.stat().st_size < 200_000
...
...     assert unihan_quick_options.work_dir.exists()
...     unihan_readings = unihan_quick_options.work_dir / 'Unihan_Readings.txt'
...     assert unihan_readings.stat().st_size >= 21_631
...     assert unihan_readings.stat().st_size < 30_000
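The paired lower and upper size assertions above can be folded into one range check. The following is a sketch only: size_within is a hypothetical helper, not part of unihan-etl, and the byte counts are illustrative.

```python
import pathlib
import tempfile


def size_within(path: pathlib.Path, minimum: int, maximum: int) -> bool:
    """Return True when minimum <= file size in bytes < maximum."""
    size = path.stat().st_size  # stat the file once instead of twice
    return minimum <= size < maximum


with tempfile.TemporaryDirectory() as tmp:
    sample = pathlib.Path(tmp) / "sample.txt"
    sample.write_bytes(b"x" * 150)  # a 150-byte stand-in file
    in_range = size_within(sample, 140, 200)
    too_small = size_within(sample, 151, 200)
```

In a test this would read as assert size_within(unihan_quick_destination, 140_000, 200_000).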

Extending fixtures:

>>> import pathlib
>>> import pytest
>>> from unihan_etl.core import Packager
>>> from unihan_etl.options import Options as UnihanOptions
>>> @pytest.fixture
... def my_unihan(
...     unihan_quick_path: pathlib.Path,
...     unihan_quick_options: "UnihanOptions",
...     unihan_quick_packager: "Packager",
... ) -> "Packager":
...     return unihan_quick_packager
>>> def test_my_extended_unihan_Fixture(my_unihan: "Packager") -> None:
...     my_unihan.download()
...     my_unihan_destination = my_unihan.options.destination
...     if not my_unihan_destination.exists():
...         my_unihan.export()
...     assert my_unihan_destination.exists()
...     assert my_unihan_destination.stat().st_size >= 140_000
...     assert my_unihan_destination.stat().st_size < 200_000
...
...     assert my_unihan.options.work_dir.exists()
...     unihan_readings = my_unihan.options.work_dir / 'Unihan_Readings.txt'
...     assert unihan_readings.stat().st_size >= 21_000
...     assert unihan_readings.stat().st_size < 30_000

fixture unihan_etl.pytest_plugin.unihan_ensure_full None
session fixture

Download and extract “full” UNIHAN. The fixture itself returns None.

Note

Created once per test session and shared across all tests. Requesting this fixture does not create a new instance per test.

>>> import pathlib
>>> from unihan_etl.core import Packager
>>> from unihan_etl.options import Options as UnihanOptions
>>> def test_unihan_ensure_full(
...     unihan_full_path: pathlib.Path,
...     unihan_full_options: "UnihanOptions",
...     unihan_full_packager: "Packager",
... ) -> None:
...     unihan_full_destination = unihan_full_options.destination
...     assert unihan_full_destination.exists()
...     assert unihan_full_destination.stat().st_size > 20_000_000
...
...     assert unihan_full_options.work_dir.exists()
...     unihan_readings = unihan_full_options.work_dir / 'Unihan_Readings.txt'
...     assert unihan_readings.stat().st_size > 6_200_000

Extending fixtures:

>>> import pathlib
>>> import pytest
>>> from unihan_etl.core import Packager
>>> from unihan_etl.options import Options as UnihanOptions
>>> @pytest.fixture
... def my_unihan(
...     unihan_full_path: pathlib.Path,
...     unihan_full_options: "UnihanOptions",
...     unihan_full_packager: "Packager",
... ) -> "Packager":
...     return unihan_full_packager
>>> def test_my_extended_unihan_Fixture(my_unihan: "Packager") -> None:
...     my_unihan.download()
...     my_unihan_destination = my_unihan.options.destination
...     if not my_unihan_destination.exists():
...         my_unihan.export()
...     assert my_unihan_destination.exists()
...     assert my_unihan_destination.stat().st_size > 20_000_000
...
...     assert my_unihan.options.work_dir.exists()
...     unihan_readings = my_unihan.options.work_dir / 'Unihan_Readings.txt'
...     assert unihan_readings.stat().st_size > 6_200_000

fixture unihan_etl.pytest_plugin.unihan_bootstrap_all None
session fixture

Noop that bootstraps all unihan_etl pytest datasets (“full” and “quick”).

Note

Created once per test session and shared across all tests. Requesting this fixture does not create a new instance per test.

This should be used like so in your project’s conftest.py:

# conftest.py
import pytest


@pytest.fixture(scope="session", autouse=True)
def bootstrap(unihan_bootstrap_all) -> None:
    return None

Dataset Options & Paths

Session-scoped fixtures exposing the dataset filesystem layout and the Options objects that drive the Packager.

fixture unihan_etl.pytest_plugin.unihan_quick_options Options
session fixture

Return UnihanOptions for “quick” test data set.

Note

Created once per test session and shared across all tests. Requesting this fixture does not create a new instance per test.

fixture unihan_etl.pytest_plugin.unihan_full_options Options
session fixture

Return UnihanOptions for “full” UNIHAN dataset.

Note

Created once per test session and shared across all tests. Requesting this fixture does not create a new instance per test.

fixture unihan_etl.pytest_plugin.unihan_quick_path Path
session fixture

Return directory path for “quick” test data set.

Note

Created once per test session and shared across all tests. Requesting this fixture does not create a new instance per test.

fixture unihan_etl.pytest_plugin.unihan_full_path Path
session fixture

Return directory path for “full” UNIHAN dataset.

Note

Created once per test session and shared across all tests. Requesting this fixture does not create a new instance per test.

fixture unihan_etl.pytest_plugin.unihan_quick_zip_path Path
session fixture

Return zip file path for “quick” test data set.

Note

Created once per test session and shared across all tests. Requesting this fixture does not create a new instance per test.

fixture unihan_etl.pytest_plugin.unihan_quick_zip ZipFile
session fixture

Return zip file for “quick” test data set.

Note

Created once per test session and shared across all tests. Requesting this fixture does not create a new instance per test.

Raw Data Accessors

Lower-level fixtures for tests that need to inspect or transform UNIHAN data without invoking the full Packager pipeline.

fixture unihan_etl.pytest_plugin.unihan_quick_data str
session fixture

Raw snippet of the UNIHAN corpus from the “quick” test data.

Note

Created once per test session and shared across all tests. Requesting this fixture does not create a new instance per test.

>>> def test_unihan_quick_data(
...     unihan_quick_data: str,
... ) -> None:
...     assert isinstance(unihan_quick_data, str)
...
...     assert isinstance(unihan_quick_data.splitlines()[1], str)
...
Used by:

unihan_mock_zip
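Because unihan_quick_data is a plain string, a test can split it into records before asserting on it. This is a minimal sketch that assumes the snippet follows the standard UNIHAN text layout (tab-separated code point, field, and value, with # comment lines); parse_unihan_lines and SAMPLE are hypothetical names, and SAMPLE is illustrative data rather than the fixture's actual content.

```python
from __future__ import annotations


def parse_unihan_lines(raw: str) -> list[tuple[str, str, str]]:
    """Split raw UNIHAN text into (codepoint, field, value) triples."""
    rows: list[tuple[str, str, str]] = []
    for line in raw.splitlines():
        # Skip blank lines and '#' comment lines.
        if not line.strip() or line.startswith("#"):
            continue
        # Split on the first two tabs only, so values may contain tabs.
        codepoint, field, value = line.split("\t", 2)
        rows.append((codepoint, field, value))
    return rows


# Illustrative stand-in for the string a test would receive.
SAMPLE = (
    "# sample comment line\n"
    "U+3400\tkCantonese\tjau1\n"
    "U+3400\tkDefinition\thillock or mound\n"
)

rows = parse_unihan_lines(SAMPLE)
```

Inside a test, the same helper would be called as parse_unihan_lines(unihan_quick_data).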

fixture unihan_etl.pytest_plugin.unihan_quick_fixture_files list[Path]
session fixture

Return files used in “quick” test data set.

Note

Created once per test session and shared across all tests. Requesting this fixture does not create a new instance per test.

fixture unihan_etl.pytest_plugin.unihan_quick_columns ColumnData
session fixture

Return columns used in “quick” test data set.

Note

Created once per test session and shared across all tests. Requesting this fixture does not create a new instance per test.

fixture unihan_etl.pytest_plugin.unihan_quick_normalized_data UntypedNormalizedData
session fixture

Return normalized test data from “quick” test data set.

Note

Created once per test session and shared across all tests. Requesting this fixture does not create a new instance per test.

fixture unihan_etl.pytest_plugin.unihan_quick_expanded_data ExpandedExport
session fixture

Return a list of expanded fields from “quick” test data.

Note

Created once per test session and shared across all tests. Requesting this fixture does not create a new instance per test.

Mock Zip Fixtures

Build a synthetic Unihan.zip on disk for tests that exercise the download/extract path without hitting the real corpus.
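For orientation, building a synthetic archive of this kind takes only the standard library. The sketch below is not the plugin's implementation: build_mock_zip is a hypothetical helper, and the member name and contents are assumptions chosen for illustration.

```python
import pathlib
import tempfile
import zipfile


def build_mock_zip(directory: pathlib.Path, content: str) -> pathlib.Path:
    """Write a one-member Unihan.zip into ``directory`` and return its path."""
    zip_path = directory / "Unihan.zip"
    with zipfile.ZipFile(zip_path, "w") as zf:
        # Assumed member name; a real archive holds several Unihan_*.txt files.
        zf.writestr("Unihan_Readings.txt", content)
    return zip_path


with tempfile.TemporaryDirectory() as tmp:
    path = build_mock_zip(pathlib.Path(tmp), "U+3400\tkCantonese\tjau1\n")
    # Read the archive back to confirm what was written.
    with zipfile.ZipFile(path) as zf:
        members = zf.namelist()
        text = zf.read("Unihan_Readings.txt").decode("utf-8")
```

The fixtures below do this work for you; the sketch only shows the kind of file they produce.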

fixture unihan_etl.pytest_plugin.unihan_mock_zip ZipFile
session fixture

Return Unihan zipfile.

Note

Created once per test session and shared across all tests. Requesting this fixture does not create a new instance per test.

fixture unihan_etl.pytest_plugin.unihan_mock_zip_path Path
session fixture

Return path to Unihan zipfile.

Note

Created once per test session and shared across all tests. Requesting this fixture does not create a new instance per test.

fixture unihan_etl.pytest_plugin.unihan_mock_zip_pathname str
session fixture

Return zip file name in “quick” test data set.

Note

Created once per test session and shared across all tests. Requesting this fixture does not create a new instance per test.

fixture unihan_etl.pytest_plugin.unihan_mock_test_dir Path
session fixture

Return temporary directory for unihan_etl pytest fixtures.

Note

Created once per test session and shared across all tests. Requesting this fixture does not create a new instance per test.

Depends on:

tmp_path_factory

Used by:

unihan_mock_zip_path

Cache Paths (Override Hooks)

Override these in your project’s conftest.py to redirect where unihan-etl caches downloaded archives, extracted files, and intermediate fixture state.

fixture unihan_etl.pytest_plugin.unihan_user_cache_path Path
session override fixture

unihan-etl cache directory, overridable.

Note

Created once per test session and shared across all tests. Requesting this fixture does not create a new instance per test.

Tip

This is an override hook. Override it in your project’s conftest.py to customise behaviour for your test suite.

# conftest.py
import pathlib

import pytest


@pytest.fixture(scope="session")
def unihan_user_cache_path() -> pathlib.Path:
    # Replace the placeholder with the directory your suite should use.
    return ...  # your value here

fixture unihan_etl.pytest_plugin.unihan_project_cache_path Path
session override fixture

Return the unihan_etl project-based cache path. Override it with a path of your choice.

Note

Created once per test session and shared across all tests. Requesting this fixture does not create a new instance per test.

Tip

This is an override hook. Override it in your project’s conftest.py to customise behaviour for your test suite.

# conftest.py
import pathlib

import pytest


@pytest.fixture(scope="session")
def unihan_project_cache_path() -> pathlib.Path:
    # Replace the placeholder with the directory your suite should use.
    return ...  # your value here

fixture unihan_etl.pytest_plugin.unihan_cache_path Path
session override fixture

Return the unihan_etl cache path. Override this to point at a destination of your choice.

Note

Created once per test session and shared across all tests. Requesting this fixture does not create a new instance per test.

Tip

This is an override hook. Override it in your project’s conftest.py to customise behaviour for your test suite.

Example

# conftest.py
import pathlib
import pytest


@pytest.fixture(scope="session")
def unihan_cache_path(tmp_path_factory: pytest.TempPathFactory) -> pathlib.Path:
    return tmp_path_factory.mktemp("unihan-cache")

fixture unihan_etl.pytest_plugin.unihan_fixture_root Path
session override fixture

Return pytest cached directory fixture root.

Note

Created once per test session and shared across all tests. Requesting this fixture does not create a new instance per test.

Tip

This is an override hook. Override it in your project’s conftest.py to customise behaviour for your test suite.

# conftest.py
import pathlib

import pytest


@pytest.fixture(scope="session")
def unihan_fixture_root() -> pathlib.Path:
    # Replace the placeholder with the directory your suite should use.
    return ...  # your value here

Home & User Environment

Create an isolated filesystem home for the duration of the test session. Override unihan_home_user_name to control the user identity.

fixture unihan_etl.pytest_plugin.unihan_home_path Path
session fixture

Return temporary /home/ path for use by unihan_etl pytest fixtures.

Note

Created once per test session and shared across all tests. Requesting this fixture does not create a new instance per test.

Depends on:

tmp_path_factory

Used by:

unihan_user_path

fixture unihan_etl.pytest_plugin.unihan_home_user_name str
session override fixture

Return username to set for unihan_user_path() fixture.

Note

Created once per test session and shared across all tests. Requesting this fixture does not create a new instance per test.

Tip

This is an override hook. Override it in your project’s conftest.py to customise behaviour for your test suite.

Example

# conftest.py
import pytest


@pytest.fixture(scope="session")
def unihan_home_user_name() -> str:
    return "ci-runner"
Used by:

unihan_user_path

fixture unihan_etl.pytest_plugin.unihan_user_path Path
session fixture

Return temporary user directory.

Note

Created once per test session and shared across all tests. Requesting this fixture does not create a new instance per test.

Used by: unihan_zshrc()

Note: You will need to set the home directory, see set_home.

fixture unihan_etl.pytest_plugin.unihan_zshrc Path
session fixture

Suppress ZSH default message.

Note

Created once per test session and shared across all tests. Requesting this fixture does not create a new instance per test.

Depends on:

unihan_user_path

Zsh needs at least one startup file (.zshenv, .zprofile, .zshrc, or .zlogin) to skip its new-user message.

Function-Scoped Helpers

fixture unihan_etl.pytest_plugin.unihan_test_options UnihanTestOptions
fixture

Return UnihanOptions for test data.


Types

unihan_etl.pytest_plugin.UnihanTestOptions: TypeAlias = unihan_etl.options.Options | collections.abc.Mapping[str, typing.Any]

Options accepted by unihan_test_options.

Either a fully-configured Options dataclass or a plain mapping of keyword arguments.
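Code that accepts this union usually normalizes it to one type first. The sketch below shows that pattern with a stand-in dataclass so it runs without unihan-etl installed: StandInOptions and coerce_options are hypothetical, and only the destination and work_dir field names are taken from the examples on this page. The default paths are made up.

```python
import pathlib
import typing as t
from collections.abc import Mapping
from dataclasses import dataclass


@dataclass
class StandInOptions:
    """Hypothetical stand-in for unihan_etl.options.Options."""

    destination: pathlib.Path = pathlib.Path("unihan.csv")
    work_dir: pathlib.Path = pathlib.Path("downloads")


def coerce_options(
    value: t.Union[StandInOptions, Mapping[str, t.Any]],
) -> StandInOptions:
    """Return ``value`` unchanged if it is already options, else build them."""
    if isinstance(value, StandInOptions):
        return value
    # Treat the mapping as keyword arguments for the dataclass.
    return StandInOptions(**value)


from_mapping = coerce_options({"work_dir": pathlib.Path("custom")})
passthrough = coerce_options(from_mapping)
```

Passing an unknown key in the mapping raises TypeError from the dataclass constructor, which surfaces typos early.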


Configuration

These conf.py values control how fixture documentation is rendered:

pytest_fixture_hidden_dependencies

Fixture names to suppress from “Depends on” lists. Default: common pytest builtins (pytestconfig, capfd, capsysbinary, capfdbinary, recwarn, tmpdir, pytester, testdir, record_property, record_xml_attribute, record_testsuite_property, cache).

URL mapping for builtin fixture external links in “Depends on” blocks. Default: links to pytest docs for tmp_path_factory, tmp_path, monkeypatch, request, capsys, caplog.

URL mapping for external fixture cross-references. Default: {}.


Note

All fixtures above are also auto-discoverable via:

.. autofixtures:: unihan_etl.pytest_plugin
   :order: source

Use autofixtures:: in your own plugin docs to document every fixture from a module without listing each one manually.