pytest plugin¶
unihan-etl ships a pytest plugin that downloads UNIHAN.zip once and reuses
it across tests, and that provides an isolated home directory for cache and config setup.
The plugin is auto-discovered via the pytest11 entry point: installing
unihan-etl is enough to make every fixture below available in your tests.
See the test suite for usage examples.
Quick Start¶
Add a fixture name as a test parameter — pytest creates and injects it automatically. You never call fixtures yourself.
def test_quick_packager(unihan_quick_packager) -> None:
    unihan_quick_packager.download()
    unihan_quick_packager.export()
    assert unihan_quick_packager.options.destination.exists()


def test_with_raw_snippet(unihan_quick_data: str) -> None:
    assert "kCantonese" in unihan_quick_data
Which Fixture Do I Need?¶
- Use unihan_quick_packager when you want a small, fast UNIHAN dataset for unit tests.
- Use unihan_full_packager when you need the complete UNIHAN corpus.
- Use unihan_bootstrap_all (wrapped in an autouse fixture) when you want both datasets pre-downloaded at session start.
- Use unihan_quick_data when you only need a raw text snippet rather than a fully bootstrapped Packager.
- Override unihan_cache_path (or unihan_project_cache_path) to redirect where cached UNIHAN data lives.
- Override unihan_home_user_name when you need a custom test user identity.
Dataset Bootstrap¶
The primary injection points for tests that need a working UNIHAN dataset.
Bootstrap a small but effective portion of UNIHAN and return a Packager.
Note
Created once per test session and shared across all tests. Requesting this fixture does not create a new instance per test.
- Depends on:
- Used by:
Example
def test_quick(unihan_quick_packager) -> None:
    unihan_quick_packager.download()
    unihan_quick_packager.export()
    assert unihan_quick_packager.options.destination.exists()
Return a Packager for the “full” UNIHAN dataset.
Note
Created once per test session and shared across all tests. Requesting this fixture does not create a new instance per test.
- Depends on:
- Used by:
Set up a small but effective portion of UNIHAN and return a UnihanOptions.
Note
Created once per test session and shared across all tests. Requesting this fixture does not create a new instance per test.
- Depends on: unihan_quick_path, unihan_quick_options, unihan_quick_packager
- Used by:
>>> import pathlib
>>> from unihan_etl.core import Packager
>>> from unihan_etl.options import Options as UnihanOptions
>>> def test_unihan_ensure_quick(
...     unihan_quick_path: pathlib.Path,
...     unihan_quick_options: "UnihanOptions",
...     unihan_quick_packager: "Packager",
... ) -> None:
...     unihan_quick_destination = unihan_quick_options.destination
...     assert unihan_quick_destination.exists()
...     assert unihan_quick_destination.stat().st_size >= 140_000
...     assert unihan_quick_destination.stat().st_size < 200_000
...
...     assert unihan_quick_options.work_dir.exists()
...     unihan_readings = unihan_quick_options.work_dir / 'Unihan_Readings.txt'
...     assert unihan_readings.stat().st_size >= 21_631
...     assert unihan_readings.stat().st_size < 30_000
Extending fixtures:
>>> import pathlib
>>> import pytest
>>> from unihan_etl.core import Packager
>>> from unihan_etl.options import Options as UnihanOptions
>>> @pytest.fixture
... def my_unihan(
...     unihan_quick_path: pathlib.Path,
...     unihan_quick_options: "UnihanOptions",
...     unihan_quick_packager: "Packager",
... ) -> "Packager":
...     return unihan_quick_packager
>>> def test_my_extended_unihan_Fixture(my_unihan: "Packager") -> None:
...     my_unihan.download()
...     my_unihan_destination = my_unihan.options.destination
...     if not my_unihan_destination.exists():
...         my_unihan.export()
...     assert my_unihan_destination.exists()
...     assert my_unihan_destination.stat().st_size >= 140_000
...     assert my_unihan_destination.stat().st_size < 200_000
...
...     assert my_unihan.options.work_dir.exists()
...     unihan_readings = my_unihan.options.work_dir / 'Unihan_Readings.txt'
...     assert unihan_readings.stat().st_size >= 21_000
...     assert unihan_readings.stat().st_size < 30_000
Download and extract “full” UNIHAN, return UnihanOptions.
Note
Created once per test session and shared across all tests. Requesting this fixture does not create a new instance per test.
- Depends on:
- Used by:
>>> import pathlib
>>> from unihan_etl.core import Packager
>>> from unihan_etl.options import Options as UnihanOptions
>>> def test_unihan_ensure_full(
...     unihan_full_path: pathlib.Path,
...     unihan_full_options: "UnihanOptions",
...     unihan_full_packager: "Packager",
... ) -> None:
...     unihan_full_destination = unihan_full_options.destination
...     assert unihan_full_destination.exists()
...     assert unihan_full_destination.stat().st_size > 20_000_000
...
...     assert unihan_full_options.work_dir.exists()
...     unihan_readings = unihan_full_options.work_dir / 'Unihan_Readings.txt'
...     assert unihan_readings.stat().st_size > 6_200_000
Extending fixtures:
>>> import pathlib
>>> import pytest
>>> from unihan_etl.core import Packager
>>> from unihan_etl.options import Options as UnihanOptions
>>> @pytest.fixture
... def my_unihan(
...     unihan_full_path: pathlib.Path,
...     unihan_full_options: "UnihanOptions",
...     unihan_full_packager: "Packager",
... ) -> "Packager":
...     return unihan_full_packager
>>> def test_my_extended_unihan_Fixture(my_unihan: "Packager") -> None:
...     my_unihan.download()
...     my_unihan_destination = my_unihan.options.destination
...     if not my_unihan_destination.exists():
...         my_unihan.export()
...     assert my_unihan_destination.exists()
...     assert my_unihan_destination.stat().st_size > 20_000_000
...
...     assert my_unihan.options.work_dir.exists()
...     unihan_readings = my_unihan.options.work_dir / 'Unihan_Readings.txt'
...     assert unihan_readings.stat().st_size > 6_200_000
Noop that bootstraps all unihan_etl pytest datasets (“full” and “quick”).
Note
Created once per test session and shared across all tests. Requesting this fixture does not create a new instance per test.
- Depends on:
This should be used like so in your project’s conftest.py:
>>> import pytest
>>> @pytest.fixture(scope="session", autouse=True)
... def bootstrap(unihan_bootstrap_all) -> None:
...     return None
Example
# conftest.py
import pytest


@pytest.fixture(scope="session", autouse=True)
def bootstrap(unihan_bootstrap_all) -> None:
    return None
Dataset Options & Paths¶
Session-scoped fixtures exposing the dataset filesystem layout and the
Options objects that drive the Packager.
Return UnihanOptions for “quick” test data set.
Note
Created once per test session and shared across all tests. Requesting this fixture does not create a new instance per test.
- Depends on:
- Used by:
Return UnihanOptions for “full” UNIHAN dataset.
Note
Created once per test session and shared across all tests. Requesting this fixture does not create a new instance per test.
- Depends on:
- Used by:
Return directory path for “quick” test data set.
Note
Created once per test session and shared across all tests. Requesting this fixture does not create a new instance per test.
- Depends on:
- Used by: unihan_ensure_quick, unihan_quick_options, unihan_quick_packager, unihan_quick_zip, unihan_quick_zip_path
Return directory path for “full” UNIHAN dataset.
Note
Created once per test session and shared across all tests. Requesting this fixture does not create a new instance per test.
- Depends on:
- Used by: unihan_ensure_full, unihan_full_options, unihan_full_packager
Return zip file path for “quick” test data set.
Note
Created once per test session and shared across all tests. Requesting this fixture does not create a new instance per test.
- Depends on:
- Used by:
Return zip file for “quick” test data set.
Note
Created once per test session and shared across all tests. Requesting this fixture does not create a new instance per test.
- Depends on: unihan_quick_path, unihan_quick_zip_path, unihan_quick_fixture_files
- Used by:
Raw Data Accessors¶
Lower-level fixtures for tests that need to inspect or transform UNIHAN data without invoking the full Packager pipeline.
Raw snippet excerpted from UNIHAN corpus from “quick” test data.
Note
Created once per test session and shared across all tests. Requesting this fixture does not create a new instance per test.
>>> def test_unihan_quick_data(
...     unihan_quick_data: str,
... ) -> None:
...     assert isinstance(unihan_quick_data, str)
...
...     assert isinstance(unihan_quick_data.splitlines()[1], str)
- Used by:
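A raw snippet such as the one unihan_quick_data provides can be inspected without the Packager. The sketch below assumes the snippet follows the standard Unihan text layout (tab-separated code point, field name, and value, with `#` comment lines). The helper name `parse_unihan_lines` and the `SAMPLE` string are illustrative only and are not part of the plugin.

```python
# Illustrative sample in Unihan's tab-separated layout (not the fixture's actual content).
SAMPLE = (
    "# Unihan_Readings.txt excerpt\n"
    "U+3400\tkCantonese\tjau1\n"
    "\n"
    "U+3400\tkMandarin\tqiū\n"
)


def parse_unihan_lines(raw: str) -> list[tuple[str, str, str]]:
    """Split raw Unihan text into (code point, field, value) rows."""
    rows: list[tuple[str, str, str]] = []
    for line in raw.splitlines():
        line = line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        parts = line.split("\t")
        # Ignore anything that is not a three-column record.
        if len(parts) != 3:
            continue
        rows.append((parts[0], parts[1], parts[2]))
    return rows
```

In a test you would pass the fixture value instead of `SAMPLE`, for example `parse_unihan_lines(unihan_quick_data)`.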
Return files used in “quick” test data set.
Note
Created once per test session and shared across all tests. Requesting this fixture does not create a new instance per test.
- Used by:
Return columns used in “quick” test data set.
Note
Created once per test session and shared across all tests. Requesting this fixture does not create a new instance per test.
- Used by:
Return normalized test data from “quick” test data set.
Note
Created once per test session and shared across all tests. Requesting this fixture does not create a new instance per test.
- Depends on:
- Used by:
Return a list of expanded fields from “quick” test data.
Note
Created once per test session and shared across all tests. Requesting this fixture does not create a new instance per test.
- Depends on:
Mock Zip Fixtures¶
Build a synthetic Unihan.zip on disk for tests that exercise the download/extract
path without hitting the real corpus.
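For orientation, a synthetic archive of this kind can be built with the standard library alone. The following is a sketch of the general technique, not the plugin's implementation; the function name `build_mock_unihan_zip` and the file contents in the usage comment are hypothetical.

```python
import pathlib
import zipfile


def build_mock_unihan_zip(
    zip_path: pathlib.Path,
    files: dict[str, str],
) -> pathlib.Path:
    """Write each name/text pair into a new zip archive and return its path."""
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for name, text in files.items():
            # writestr encodes str payloads as UTF-8.
            zf.writestr(name, text)
    return zip_path


# Usage (hypothetical contents):
# build_mock_unihan_zip(
#     pathlib.Path("Unihan.zip"),
#     {"Unihan_Readings.txt": "U+3400\tkCantonese\tjau1\n"},
# )
```

Building the archive on disk means the download/extract code path can be exercised unchanged while the test stays offline.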
Return Unihan zipfile.
Note
Created once per test session and shared across all tests. Requesting this fixture does not create a new instance per test.
- Depends on:
Return path to Unihan zipfile.
Note
Created once per test session and shared across all tests. Requesting this fixture does not create a new instance per test.
- Depends on:
- Used by:
Return zip file name in “quick” test data set.
Note
Created once per test session and shared across all tests. Requesting this fixture does not create a new instance per test.
- Used by:
Return temporary directory for unihan_etl pytest fixtures.
Note
Created once per test session and shared across all tests. Requesting this fixture does not create a new instance per test.
- Depends on:
- Used by:
Cache Paths (Override Hooks)¶
Override these in your project’s conftest.py to redirect where unihan-etl caches
downloaded archives, extracted files, and intermediate fixture state.
unihan-etl cache directory, overridable.
Note
Created once per test session and shared across all tests. Requesting this fixture does not create a new instance per test.
Tip
This is an override hook. Override it in your project’s conftest.py to customise behaviour for your test suite.
# conftest.py
import pathlib

import pytest


@pytest.fixture(scope="session")
def unihan_user_cache_path() -> pathlib.Path:
    return ...  # your value here
Return unihan_etl project-based cache path. Override to path of your choice.
Note
Created once per test session and shared across all tests. Requesting this fixture does not create a new instance per test.
Tip
This is an override hook. Override it in your project’s conftest.py to customise behaviour for your test suite.
# conftest.py
import pathlib

import pytest


@pytest.fixture(scope="session")
def unihan_project_cache_path() -> pathlib.Path:
    return ...  # your value here
- Used by:
Return unihan_etl cache path, override this to destination of your choice.
Note
Created once per test session and shared across all tests. Requesting this fixture does not create a new instance per test.
Tip
This is an override hook. Override it in your project’s conftest.py to customise behaviour for your test suite.
- Depends on:
- Used by:
Example
# conftest.py
import pathlib

import pytest


@pytest.fixture(scope="session")
def unihan_cache_path(tmp_path_factory: pytest.TempPathFactory) -> pathlib.Path:
    return tmp_path_factory.mktemp("unihan-cache")
Return pytest cached directory fixture root.
Note
Created once per test session and shared across all tests. Requesting this fixture does not create a new instance per test.
Tip
This is an override hook. Override it in your project’s conftest.py to customise behaviour for your test suite.
- Depends on:
- Used by:
# conftest.py
import pathlib

import pytest


@pytest.fixture(scope="session")
def unihan_fixture_root() -> pathlib.Path:
    return ...  # your value here
Home & User Environment¶
Create an isolated filesystem home for the duration of the test session. Override
unihan_home_user_name to control the user identity.
Return temporary /home/ path for use by unihan_etl pytest fixtures.
Note
Created once per test session and shared across all tests. Requesting this fixture does not create a new instance per test.
- Depends on:
- Used by:
Return username to set for the unihan_user_path() fixture.
Note
Created once per test session and shared across all tests. Requesting this fixture does not create a new instance per test.
Tip
This is an override hook. Override it in your project’s conftest.py to customise behaviour for your test suite.
Example
# conftest.py
import pytest


@pytest.fixture(scope="session")
def unihan_home_user_name() -> str:
    return "ci-runner"
- Used by:
Return temporary user directory.
Note
Created once per test session and shared across all tests. Requesting this fixture does not create a new instance per test.
- Depends on:
- Used by:
Used by: unihan_zshrc()
Note: You will need to set the home directory, see set_home.
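The note above says the home directory has to be set, but set_home itself is not shown on this page. As a stand-in, the sketch below points the HOME environment variable at a given directory and restores the previous value afterwards. The name `temporary_home` is hypothetical, and this is not necessarily how set_home works.

```python
import contextlib
import os
import pathlib
from collections.abc import Iterator


@contextlib.contextmanager
def temporary_home(home: pathlib.Path) -> Iterator[pathlib.Path]:
    """Point HOME at `home` for the duration of the with-block."""
    previous = os.environ.get("HOME")
    os.environ["HOME"] = str(home)
    try:
        yield home
    finally:
        # Restore the prior state, including the case where HOME was unset.
        if previous is None:
            os.environ.pop("HOME", None)
        else:
            os.environ["HOME"] = previous
```

Inside a pytest test, `monkeypatch.setenv("HOME", str(unihan_user_path))` achieves the same effect and is undone automatically at teardown.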
Suppress ZSH default message.
Note
Created once per test session and shared across all tests. Requesting this fixture does not create a new instance per test.
- Depends on:
Needs a startup file: .zshenv, .zprofile, .zshrc, or .zlogin.
Function-Scoped Helpers¶
Types¶
- unihan_etl.pytest_plugin.UnihanTestOptions: TypeAlias = unihan_etl.options.Options | collections.abc.Mapping[str, typing.Any]
  Options accepted by unihan_test_options. Either a fully-configured Options dataclass or a plain mapping of keyword arguments.
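Code that accepts a union of this shape typically normalizes both forms into one representation. The sketch below shows one way to do that, assuming Python 3.9 or newer. `StandInOptions` is a hypothetical stand-in dataclass (the real unihan_etl.options.Options has its own fields), and `options_as_kwargs` is an illustrative helper, not part of the plugin.

```python
import dataclasses
from collections.abc import Mapping
from typing import Any, Union


@dataclasses.dataclass
class StandInOptions:
    """Hypothetical stand-in for unihan_etl.options.Options."""

    destination: str = "unihan.csv"
    fields: tuple[str, ...] = ()


# Same shape as the UnihanTestOptions alias: a dataclass or a plain mapping.
StandInTestOptions = Union[StandInOptions, Mapping[str, Any]]


def options_as_kwargs(options: StandInTestOptions) -> dict[str, Any]:
    """Normalize either accepted form into a plain keyword dictionary."""
    if isinstance(options, Mapping):
        return dict(options)
    # Shallow conversion: read each declared dataclass field by name.
    return {f.name: getattr(options, f.name) for f in dataclasses.fields(options)}
```

Checking for Mapping first keeps the dataclass branch simple and leaves mapping input untouched apart from copying it.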
Configuration¶
These conf.py values control how fixture documentation is rendered:
Fixture names to suppress from “Depends on” lists. Default: common pytest builtins (pytestconfig, capfd, capsysbinary, capfdbinary, recwarn, tmpdir, pytester, testdir, record_property, record_xml_attribute, record_testsuite_property, cache).
- pytest_fixture_builtin_links
  URL mapping for builtin fixture external links in “Depends on” blocks. Default: links to pytest docs for tmp_path_factory, tmp_path, monkeypatch, request, capsys, caplog.
- pytest_external_fixture_links
  URL mapping for external fixture cross-references. Default: {}.
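As a rough illustration, the two mapping settings named above might be set in conf.py as follows. This sketch assumes each mapping goes from a fixture name to a URL string, which the descriptions suggest but do not spell out. The URLs and the `db_session` fixture name are placeholders.

```python
# conf.py (sketch; the mapping shape fixture name -> URL is an assumption)

# Where "Depends on" entries for builtin fixtures should link to.
pytest_fixture_builtin_links = {
    "tmp_path": "https://example.invalid/pytest/tmp_path",
    "monkeypatch": "https://example.invalid/pytest/monkeypatch",
}

# Cross-references for fixtures documented outside this project.
pytest_external_fixture_links = {
    "db_session": "https://example.invalid/other-plugin/db_session",
}
```

Leaving pytest_external_fixture_links unset keeps the documented default of an empty mapping.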
Note
All fixtures above are also auto-discoverable via:
.. autofixtures:: unihan_etl.pytest_plugin
:order: source
Use autofixtures:: in your own plugin docs to document every fixture from a
module without listing each one manually.