Changelog#

To install the unreleased unihan-etl version, see developmental releases.

pip:

$ pip install --user --upgrade --pre unihan-etl

pipx:

$ pipx install --suffix=@next unihan-etl --pip-args '\--pre' --force
// Usage: unihan-etl@next

unihan-etl 0.20.x (unreleased)#

  • Python 3.7 Dropped

    Python 3.7 support has been dropped (#272)

    Its end-of-life is June 27th, 2023, and Python 3.8 supports typing.TypedDict and typing.Protocol out of the box, without needing typing_extensions.
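
    As a minimal sketch of what the standard library offers from 3.8 onward, the snippet below imports both names directly from typing. The CharRecord, SupportsExport, CsvExporter and run names are hypothetical and exist only for illustration; they are not part of unihan-etl.

```python
from typing import List, Protocol, TypedDict


class CharRecord(TypedDict):
    """Hypothetical record shape, for illustration only."""

    char: str
    ucn: str


class SupportsExport(Protocol):
    """Hypothetical structural type: any object with a matching export()."""

    def export(self, records: List[CharRecord]) -> str: ...


class CsvExporter:
    # No inheritance needed: the Protocol is satisfied structurally.
    def export(self, records: List[CharRecord]) -> str:
        return "\n".join(f"{r['char']},{r['ucn']}" for r in records)


def run(exporter: SupportsExport, records: List[CharRecord]) -> str:
    return exporter.export(records)


result = run(CsvExporter(), [{"char": "好", "ucn": "U+597D"}])
```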

unihan-etl 0.19.1 (2023-05-28)#

Maintenance only, no bug fixes or features

Development#

  • Add back black for formatting

    This is still necessary to accompany ruff, until it replaces black.

unihan-etl 0.19.0 (2023-05-27)#

Maintenance only, no bug fixes or features

Internal improvements#

  • Move formatting, import sorting, and linting to ruff.

    This Rust-based checker has dramatically improved performance. Linting and formatting can be done almost instantly.

    This change replaces black, isort, flake8 and flake8 plugins.

  • poetry: 1.4.0 -> 1.5.0

    See also: https://github.com/python-poetry/poetry/releases/tag/1.5.0

  • pytest: Fix invalid escape sequence warning from zhon

Development#

  • merge_dict: Improve typing of generic params (#271)

unihan-etl 0.18.1 (2022-10-01)#

Packaging#

  • Add PyYAML dependency

Infrastructure#

  • CI speedups (#267)

    • Split out release to separate job so the PyPI Upload docker image isn’t pulled on normal runs

    • Clean up CodeQL

  • Bump poetry 1.1.x to 1.2.x

Packaging#

  • Move .coveragerc -> pyproject.toml (#268)

unihan-etl 0.18.0 (2022-09-11)#

Development#

Documentation#

unihan-etl 0.17.2 (2022-08-21)#

Documentation#

  • Add vendorized, updated fork of sphinxcontrib-issuetracker, via #261.

  • Remove sphinx-issues package

unihan-etl 0.17.1 (2022-08-21)#

Follow ups to #257.

Fixes#

  • merge_dict(): Fix merging edge case where destination key was missing

  • download(): Fix edge case when “downloading” file from local path
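
To illustrate the kind of edge case a recursive dictionary merge has to handle, here is a minimal sketch. The merge_nested helper is hypothetical and is not unihan-etl's implementation; it only shows one way to handle a key that is absent from the destination.

```python
from typing import Any, Dict


def merge_nested(dest: Dict[str, Any], src: Dict[str, Any]) -> Dict[str, Any]:
    """Recursively merge src into dest (hypothetical helper, illustration only)."""
    for key, value in src.items():
        if isinstance(value, dict) and isinstance(dest.get(key), dict):
            # Both sides hold a dict: descend and merge in place.
            merge_nested(dest[key], value)
        else:
            # Also covers the case where dest has no such key yet.
            dest[key] = value
    return dest


merged = merge_nested({"a": {"x": 1}}, {"a": {"y": 2}, "b": {"z": 3}})
```

Using dest.get(key) rather than dest[key] is what keeps a missing destination key from raising KeyError in this sketch.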

unihan-etl 0.17.0 (2022-08-21)#

Features#

  • mypy --strict annotations, via #257

unihan-etl 0.16.0 (2022-08-20)#

Features#

  • New option: --no-cache

    Disregard cached .zip / extracted files, via #259.

Development#

  • Add python 3.8 and 3.9 to CI

    This is to make way for strict type annotations, as the typings and generic behavior vary dramatically between 3.7 - 3.11.

unihan-etl 0.15.0 (2022-08-29)#

Breaking changes#

  • Python 2 compatibility module and imports removed, via #258. Python 2.x support was officially dropped in 0.12.0 (2021-06-15).

unihan-etl 0.14.0 (2022-08-16)#

Improvements#

  • load_data: Accept list of pathlib.Path in addition to list of str

Compatibility#

  • Add Python 3.10 (#248)

  • Dropped Python 3.6 (#248)

Development#

Infrastructure updates for static type checking and doctest examples.

  • Update poetry to 1.1

    • CI: Use poetry 1.1.12 and install-poetry.py installer (#237 + #248)

    • Relock poetry.lock at 1.1 (w/ 1.1.7’s fix)

  • Run pyupgrade for python 3.7

  • Tests: Move from tmpdir -> tmp_path

  • Initial doctests support added, via #255

  • Initial mypy validation, via #255

  • CI (tests, docs): Improve caching of python dependencies via the new poetry caching in actions/setup-python v3/v4, via #255

  • CI (docs): Skip if no PUBLISH condition triggered, via #255

Documentation#

  • Move to furo theme

  • Add :ref:quickstart page

  • Link to cihai’s developer documentation: https://cihai.git-pull.com/contributing/

unihan-etl 0.13.0 (2021-06-16)#

  • #236: Convert to markdown

unihan-etl 0.12.0 (2021-06-15)#

  • Update black to 21.6b0

  • Update trove classifiers to 3.9

  • #235: Drop python 2.7, 3.5. Remove python 2 modesets and __future__

unihan-etl 0.11.0 (2020-08-09)#

  • #230 Move packaging / publishing to poetry

  • #229 Self host docs

  • #229 Add metadata / icons / etc. for doc site

  • #229 Move travis -> github actions

  • #229 Overhaul Makefiles

unihan-etl 0.10.4 (2020-08-05)#

  • Update CHANGES headings to produce working links

  • Relax appdirs version constraint

  • #228 Move from Pipfile to poetry

unihan-etl 0.10.3 (2019-08-18)#

  • Fix flicker in download progress bar

unihan-etl 0.10.2 (2019-08-17)#

  • Add project_urls to setup.py

  • Use plain reStructuredText for CHANGES

  • Use collections that’s compatible with python 2 and 3

  • PEP8 tweaks

unihan-etl 0.10.1 (2017-09-08)#

  • Add code links in API

  • Add __version__ to unihan_etl

unihan-etl 0.10.0 (2017-08-29)#

  • #91 New fields from UNIHAN Revision 25.

    • kJinmeiyoKanji

    • kJoyoKanji

    • kKoreanEducationHanja

    • kKoreanName

    • kTGH

    UNIHAN Revision 25 was released 2018-05-18 and issued for Unicode 11.0.

  • Add tests and example corpus for kCCCII

  • Add configuration / make tests for isort, flake8

  • Switch tmuxp config to use pipenv

  • Add Pipfile

  • Add make sync_pipfile task to sync requirements/*.txt files with Pipfile

  • Update and sync Pipfile

  • Developer package updates (linting / docs / testing)

    • isort 4.2.15 to 4.3.4

    • flake8 3.3.0 to 3.5.0

    • vulture 0.14 to 0.27

    • sphinx 1.6.2 to 1.7.6

    • alagitpull 0.0.12 to 0.0.21

    • releases 1.3.1 to 1.6.1

    • sphinx-argparse 0.2.1 to 1.6.2

    • pytest 3.1.2 to 3.6.4

  • Move documentation over to numpy-style

  • Add sphinxcontrib-napoleon 0.6.1

  • Update LICENSE New BSD to MIT

  • All future commits and contributions are licensed to the cihai software foundation. This includes commits by Tony Narlock (creator).

unihan-etl 0.9.5 (2017-06-26)#

  • Enhance support for locations on kHDZRadBreak fields.

unihan-etl 0.9.4 (2017-06-05)#

  • Fix kIRG_GSource without location

  • Fix kFenn output

  • Fix kHanyuPinlu support output for n diacritics

unihan-etl 0.9.3 (2017-05-31)#

  • Add expansion for kIRGKangXi

unihan-etl 0.9.2 (2017-05-31)#

  • Normalize Radical-Stroke expansion for kRSUnicode

  • Migrate more fields to regular expressions

  • Normalize character field for kDaeJaweon, kHanyuPinyin, kCheungBauer, kFennIndex, kCheungBauerIndex, kIICore, and kIRGHanyuDaZidian

unihan-etl 0.9.1 (2017-05-27)#

  • Support for expanding kGSR

  • Convert some field expansions to use regexes

unihan-etl 0.9.0 (2017-05-26)#

  • Fix bug where destination file was made into directory on first run

  • Rename from unihan-tabular to unihan-etl

  • Support for expanding multi-value fields

  • Support for pruning empty fields

  • Improve help dialog

  • Added a page about UNIHAN and the project to documentation

  • Split constant values into their own module

  • Split functionality for expanding unstructured values into its own module

unihan-etl 0.8.1 (2017-05-20)#

  • Update to add kJa and adjust source file of kCompatibilityVariant per Unicode 8.0.0.

unihan-etl 0.8.0 (2017-05-17)#

  • Support for configuring logging via options and CLI

  • Convert all print statements to use logger

unihan-etl 0.7.4 (2017-05-14)#

  • Allow for local / file system sources for Unihan.zip

  • Only extract zip if unextracted

unihan-etl 0.7.3 (2017-05-13)#

  • Update package classifiers

unihan-etl 0.7.2 (2017-05-13)#

  • Add back datapackage

unihan-etl 0.7.1 (2017-05-12)#

  • Fix python 2 CSV output

  • Default to CSV output

unihan-etl 0.7.0 (2017-05-12)#

  • Move unicodecsv module to dependency package

  • Support for XDG directory specification

  • Support for custom destination output, including replacing template variable {ext}
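
As a rough sketch of how a destination template containing {ext} could be resolved, consider the snippet below. The resolve_destination helper and the example path are hypothetical; unihan-etl's own handling may differ.

```python
from pathlib import Path


def resolve_destination(template: str, fmt: str) -> Path:
    """Fill the {ext} placeholder in a destination template (hypothetical helper)."""
    # str.replace leaves any other braces in the path untouched.
    return Path(template.replace("{ext}", fmt))


dest = resolve_destination("/tmp/unihan.{ext}", "json")
```

Here dest.name is unihan.json, so the output file's extension follows the chosen export format.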

unihan-etl 0.6.3 (2017-05-11)#

  • Move about.py to module level

unihan-etl 0.6.2 (2017-05-11)#

  • Fix python package import

unihan-etl 0.6.1 (2017-05-10)#

  • Fix readme bug on pypi

unihan-etl 0.6.0 (2017-05-10)#

  • Support for exporting in YAML and JSON

  • More internal factoring and simplification

  • Return data as list

unihan-etl 0.5.1 (2017-05-08)#

  • Drop python 3.3 and 3.4 support

unihan-etl 0.5.0 (2017-05-08)#

  • Rename from cihaidata_unihan to unihan_tabular

  • Drop datapackages in favor of a universal JSON, YAML and CSV export.

  • Only use UnicodeWriter in Python 2. Fixes an issue where python would encode b in front of values

unihan-etl 0.4.2 (2017-05-07)#

  • Rename scripts/ to cihaidata_unihan/

unihan-etl 0.4.1 (2017-05-07)#

  • Enable invoking tool via $ cihaidata_unihan

unihan-etl 0.4.0 (2017-05-07)#

  • Major internal refactor and simplification

  • Convert to pytest assert statements

  • Convert full test suite to pytest functions and fixtures

  • Get CLI documentation up again

  • Improve test coverage

  • Lint code, remove unused imports

  • Switch license BSD -> MIT

unihan-etl 0.3.0 (2017-04-17)#

  • Rebooted

  • Modernize Makefile in docs

  • Add Makefile to main project

  • Modernize package metadata to use about.py

  • Update requirements to use requirements/ folder for base, testing and doc dependencies.

  • Update sphinx theme to alabaster with new logo.

  • Update travis to use coverall

  • Update links on README to use https

  • Update travis to test up to python 3.6

  • Add support for pypy (why not)

  • Lock base dependencies

  • Add dev dependencies for isort, vulture and flake8