Changelog
To install the unreleased unihan-etl version, see developmental releases.
pip:
$ pip install --user --upgrade --pre unihan-etl
pipx:
$ pipx install --suffix=@next unihan-etl --pip-args '\--pre' --force
// Usage: unihan-etl@next
unihan-etl 0.37.x (unreleased)
unihan-etl 0.36.0 (2024-11-26)
Maintenance release: No bug fixes or new features.
Breaking changes
Project and package management: poetry to uv (#329)
uv is the new package and project manager for the project, replacing Poetry.
Build system: poetry to hatchling (#329)
Build system moved from poetry to hatchling.
unihan-etl 0.35.0 (2024-11-25)
Documentation
Automatically linkify links that were previously only text.
Development
poetry: 1.8.1 -> 1.8.2
See also: https://github.com/python-poetry/poetry/blob/1.8.2/CHANGELOG.md
Code quality: Use f-strings in more places (#320)
via ruff 0.4.2.
Revision 37 updates (#330)
These changes align with Unicode Technical Report #38’s 37th revision and are part of ongoing improvements to Unihan data handling.
Add kFanqie and kZhuang
Adds support for the kFanqie and kZhuang fields.
See also:
Support kRSUnicode for apostrophes
Unihan_IRGSources: Updated kRSUnicode for apostrophes.
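As a rough illustration of what apostrophes in a radical-stroke value mean for parsing, the sketch below splits a value such as 120'.3 into its radical number, apostrophe count, and residual stroke count. The regular expression and the parse_rs helper are assumptions made for this example and are not taken from unihan-etl's own expansion code.

```python
import re

# Assumed value shape: radical number, zero or more apostrophes, ".", residual strokes.
RS_PATTERN = re.compile(
    r"^(?P<radical>[1-9][0-9]{0,2})(?P<marks>'*)\.(?P<strokes>-?[0-9]{1,2})$"
)


def parse_rs(value: str) -> dict:
    """Split a radical-stroke string into its parts (hypothetical helper)."""
    match = RS_PATTERN.match(value)
    if match is None:
        raise ValueError(f"Unrecognized radical-stroke value: {value!r}")
    return {
        "radical": int(match["radical"]),
        # Number of apostrophes following the radical number.
        "apostrophes": len(match["marks"]),
        "strokes": int(match["strokes"]),
    }


print(parse_rs("120'.3"))
```

Counting the apostrophes instead of treating them as a boolean keeps the sketch usable if a value carries more than one.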
See also:
Removals
kFrequency: Removed from Unihan_DictionaryLikeData, constants, and datapackage.json.
See also:
Tests
Added tests for simplified expansions to ensure correctness of kFanqie and kZhuang.
unihan-etl 0.34.0 (2024-03-24)
Development
Aggressive automated lint fixes via ruff (#317)
Via ruff v0.3.4, all automated lint fixes, including unsafe and preview fixes, were applied:
ruff check --select ALL . --fix --unsafe-fixes --preview --show-fixes; ruff format .
Branches were treated with:
git rebase \
    --strategy-option=theirs \
    --exec 'poetry run ruff check --select ALL . --fix --unsafe-fixes --preview --show-fixes; poetry run ruff format .; git add src tests; git commit --amend --no-edit' \
    origin/master
poetry: 1.7.1 -> 1.8.1
See also: https://github.com/python-poetry/poetry/blob/1.8.1/CHANGELOG.md
-
Related formatting changes. Update CI to use "ruff check ." instead of "ruff .".
See also: https://github.com/astral-sh/ruff/blob/v0.3.0/CHANGELOG.md
unihan-etl 0.33.1 (2024-02-09)
Maintenance release: No bug fixes or new features.
Documentation
README: Rewrite introduction, note updated UNIHAN compatibility information.
Link to UNIHAN release in v0.31.0’s changelog notes.
unihan-etl 0.33.0 (2024-02-09)
Maintenance release: No bug fixes or new features.
Documentation
Development
-
Add flake8-commas (COM)
Add flake8-builtins (A)
Add flake8-errmsg (EM)
unihan-etl 0.32.0 (2024-02-05)
Documentation
Improvements
Development
unihan-etl 0.31.0 (2024-02-04)
Breaking: UNIHAN upgrades (#305)
Bump UNIHAN compatibility from 11.0.0 to 15.1.0 (released 2023-09-01, revision 35).
Removed fields
15.1.0: kHKSCS, kIRGDaiKanwaZiten, kKPS0, kKPS1, kKSC0, kKSC1, kRSKangXi
13.0.0: kRSJapanese, kRSKanWa, kRSKorean
12.0.0: kDefaultSortKey (private property)
New fields
15.1.0: kJapanese, kMojiJoho, kSMSZD2003Index, kSMSZD2003Readings, kVietnameseNumeric, kZhuangNumeric
15.0.0: kAlternateTotalStrokes
14.0.0: kStrange
13.0.0: kIRG_SSource, kIRG_UKSource, kSpoofingVariant, kTGHZ2013, kUnihanCore2020
Development
unihan-etl 0.30.1 (2023-12-10)
Bug fix
unihan-etl 0.30.0post0 (2023-11-26)
CI
Move CodeQL from advanced configuration file to GitHub’s default
Documentation
Typo fixes
unihan-etl 0.30.0 (2023-11-26)
Maintenance only, no bug fixes or new features
Development
Documentation
unihan-etl 0.29.0 (2023-11-19)
Maintenance only, no bug fixes or new features
Packaging
Add Python 3.12 to trove classifiers
Per Poetry’s docs on managing dependencies and poetry check, we had it wrong: Instead of using extras, we should create these:
[tool.poetry.group.group-name.dependencies]
dev-dependency = "1.0.0"
Which we now do.
Development
Poetry: 1.6.1 -> 1.7.0
See also: https://github.com/python-poetry/poetry/blob/1.7.0/CHANGELOG.md
Move formatting from black to ruff format (#302)
This retains the same formatting style of black while eliminating a dev dependency by using our existing rust-based ruff linter.
CI: Update action packages to fix warnings
dorny/paths-filter: 2.7.0 -> 2.11.1
unihan-etl 0.28.1 (2023-09-02)
Bug fix
SPACE_DELIMITED_LIST_FIELDS: Fix for field name kAccountingNumeric found during automated sweep for typos.
Development
Typo fixes
typos --format brief --write-changes
One of these typos was for kAccountingNumeric in SPACE_DELIMITED_LIST_FIELDS.
ruff: Remove ERA / eradicate plugin
This rule had too many false positives to trust. Other ruff rules have been beneficial.
unihan-etl 0.28.0 (2023-07-22)
Breaking: pytest fixtures now prefixed with unihan_ (#296)
All pytest plugin fixtures are now prefixed unihan_, e.g.:
quick_unihan_path -> unihan_quick_path
quick_unihan_options -> unihan_quick_options
quick_unihan_packager -> unihan_quick_packager
ensure_quick_unihan -> unihan_ensure_quick
mock_zip -> unihan_mock_zip
columns -> unihan_quick_columns
TestPackager fixture has been removed
This fixture was made redundant by unihan_quick_* and unihan_full_* fixtures
Bug fixes (#296)
pytest plugin (unihan_zshrc): Fix skipif condition to run if shell uses zsh(1)
unihan-etl 0.27.0 (2023-07-18)
Breaking: pytest fixtures renamed, data moved (#294)
“quick” fixtures:
Data has been moved from tests/fixtures to src/unihan_etl/data_files/quick
Fixtures prefixed by sample_ in the name have been renamed to quick_
“quick” and “full” fixtures: Fixed ability to access data files from outside unihan_etl package
Development
unihan-etl 0.26.0 (2023-07-09)
Features
unihan-etl 0.25.2 (2023-07-08)
Bug Fixes
unihan-etl 0.25.1 (2023-07-08)
Rolled back
Bug Fixes
unihan-etl 0.25.0 (2023-07-01)
Maintenance only, no bug fixes or new features
Internal changes
unihan-etl 0.24.0 (2023-06-24)
Maintenance only, no bug fixes or new features
Internal changes
unihan-etl 0.23.0 (2023-06-24)
Maintenance only, no bug fixes or new features
Internal changes
unihan_etl._internal.app_dirs improvements (#287)
Breaking: app_dirs moved
Before 0.23.x: unihan_etl.app_dirs
After 0.23.x: unihan_etl._internal.app_dirs
New feature: Override directories on a one-off basis
New feature: Template replacement of environment variables via os.path.expandvars() + os.path.expanduser()
doctests: See the above in action thanks to doctests
Dedicated tests via pytest
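A minimal sketch of the kind of template replacement described above, using only the two standard-library calls named in the notes. The expand_path helper and the EXAMPLE_CACHE_HOME variable are made up for this illustration; the actual app_dirs implementation may differ.

```python
import os


def expand_path(template: str) -> str:
    """Expand environment variables first, then a leading ~ (hypothetical helper)."""
    return os.path.expanduser(os.path.expandvars(template))


# EXAMPLE_CACHE_HOME is an invented variable used only for this demonstration.
os.environ["EXAMPLE_CACHE_HOME"] = "/tmp/example-cache"
print(expand_path("${EXAMPLE_CACHE_HOME}/unihan_etl"))
print(expand_path("~/.cache/unihan_etl"))
```

Running expandvars before expanduser means a variable whose value starts with ~ is still resolved to the home directory.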
Documentation
unihan-etl 0.22.1 (2023-06-18)
Bug fixes
unihan-etl 0.22.0 (2023-06-17)
Breaking changes
unihan_etl.process -> unihan_etl.core (#284)
This module has been renamed.
Configuration (#280)
Before 0.22.x, unihan_etl’s configuration was done through a dict object.
In 0.22.0 and after, settings are configurable via a dataclasses.dataclass object: unihan_etl.options.Options
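To show the general shape of moving from a dict to a dataclass for configuration, here is a small self-contained sketch. ExampleOptions and its fields are invented for illustration and are not the real fields of unihan_etl.options.Options.

```python
import dataclasses


@dataclasses.dataclass
class ExampleOptions:
    """Stand-in for a settings dataclass; field names are hypothetical."""

    destination: str = "unihan.csv"
    format: str = "csv"
    fields: list[str] = dataclasses.field(default_factory=list)


# Dict-style settings, as a pre-dataclass configuration might look.
legacy_config = {"destination": "out.json", "format": "json"}

# The same settings as a typed object; unknown keys raise TypeError at construction.
options = ExampleOptions(**legacy_config)
print(options)

# Derive a modified copy without mutating the original.
yaml_options = dataclasses.replace(options, format="yaml")
print(dataclasses.asdict(yaml_options))
```

Compared with a plain dict, a dataclass gives attribute access, defaults in one place, and early failure on misspelled keys.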
Documentation
unihan-etl 0.21.1 (2023-06-18)
Bug fixes
unihan-etl 0.21.0 (2023-06-12)
Maintenance only, no bug fixes or features
Internal improvements
unihan-etl 0.20.0 (2023-06-11)
Maintenance only, no bug fixes or features
Breaking changes
Python 3.7 Dropped
Python 3.7 support has been dropped (#272)
Its end-of-life is June 27th, 2023, and Python 3.8 adds support for typing.TypedDict and typing.Protocol out of the box, without needing typing_extensions.
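For readers unfamiliar with the two typing features mentioned, the sketch below imports both directly from typing, which works on Python 3.8 and later. The CharRecord, SupportsExport, and CsvExporter names are invented for this example and do not come from unihan-etl.

```python
from typing import List, Protocol, TypedDict


class CharRecord(TypedDict):
    """A dict with a fixed set of typed keys (hypothetical record shape)."""

    char: str
    ucn: str


class SupportsExport(Protocol):
    """Structural type: any object with a matching export() method qualifies."""

    def export(self, records: List[CharRecord]) -> str: ...


class CsvExporter:
    # No inheritance from SupportsExport is needed; the method signature is enough.
    def export(self, records: List[CharRecord]) -> str:
        return "\n".join(f"{record['ucn']},{record['char']}" for record in records)


def run(exporter: SupportsExport, records: List[CharRecord]) -> str:
    return exporter.export(records)


print(run(CsvExporter(), [{"char": "好", "ucn": "U+597D"}]))
```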
Internal improvements
unihan-etl 0.19.1 (2023-05-28)
Maintenance only, no bug fixes or features
Development
Add back black for formatting
This is still necessary to accompany ruff, until it replaces black.
unihan-etl 0.19.0 (2023-05-27)
Maintenance only, no bug fixes or features
Internal improvements
Move formatting, import sorting, and linting to ruff.
This rust-based checker has dramatically improved performance. Linting and formatting can be done almost instantly.
This change replaces black, isort, flake8 and flake8 plugins.
poetry: 1.4.0 -> 1.5.0
See also: https://github.com/python-poetry/poetry/releases/tag/1.5.0
pytest: Fix invalid escape sequence warning from zhon
Development
unihan-etl 0.18.1 (2022-10-01)
Packaging
Add PyYAML dependency
Infrastructure
Packaging
unihan-etl 0.18.0 (2022-09-11)
Development
Documentation
Render changelog in linkify_issues (#261, #265)
Fix Table of contents rendering with sphinx autodoc with sphinx_toctree_autodoc_fix (#265)
Test doctests in our docs via pytest_doctest_docutils (built on doctest_docutils) (#265)
unihan-etl 0.17.2 (2022-08-21)
Documentation
unihan-etl 0.17.1 (2022-08-21)
Fixes
merged_dict(): Fix merging edge case where destination key was missing
download(): Fix edge case when “downloading” file from local path
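The missing-destination-key case in a recursive merge can be illustrated as follows. This is a generic sketch of the technique, written for this example; merge_dicts is a hypothetical function and not the library's merged_dict() implementation.

```python
def merge_dicts(dest: dict, src: dict) -> dict:
    """Recursively merge src into dest in place and return dest (illustrative)."""
    for key, value in src.items():
        if isinstance(value, dict) and isinstance(dest.get(key, {}), dict):
            # setdefault creates the nested dict when the destination key is
            # missing, which is the edge case a naive dest[key] lookup trips over.
            merge_dicts(dest.setdefault(key, {}), value)
        else:
            dest[key] = value
    return dest


merged = merge_dicts({"a": {"x": 1}}, {"a": {"y": 2}, "b": {"z": 3}})
print(merged)
```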
unihan-etl 0.17.0 (2022-08-21)
Features
unihan-etl 0.16.0 (2022-08-20)
Features
Development
Add python 3.8 and 3.9 to CI
This is to make way for strict type annotations, as the typings and generic behavior vary dramatically between 3.7 - 3.11.
unihan-etl 0.15.0 (2022-08-29)
Breaking changes
unihan-etl 0.14.0 (2022-08-16)
Improvements
load_data: Accept list of pathlib.Path in addition to list of str
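Accepting both str and pathlib.Path entries usually comes down to normalizing each item with pathlib.Path(). The sketch below shows that pattern; read_lines is a hypothetical function and does not reproduce load_data's real signature or behavior.

```python
import pathlib
from typing import List, Sequence, Union

StrPath = Union[str, pathlib.Path]


def read_lines(paths: Sequence[StrPath]) -> List[str]:
    """Read every file and return all lines in order (hypothetical helper)."""
    lines: List[str] = []
    for path in paths:
        # pathlib.Path() accepts a str or an existing Path, so both work here.
        with pathlib.Path(path).open(encoding="utf-8") as handle:
            lines.extend(handle.read().splitlines())
    return lines
```

A caller can then mix both kinds in one list, for example read_lines([pathlib.Path("a.txt"), "b.txt"]).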
Compatibility
Development
Infrastructure updates for static type checking and doctest examples.
Documentation
Move to furo theme
Add quickstart page
Link to cihai’s developer documentation: https://cihai.git-pull.com/contributing/
unihan-etl 0.13.0 (2021-06-16)
#236: Convert to markdown
unihan-etl 0.12.0 (2021-06-15)
Update black to 21.6b0
Update trove classifiers to 3.9
#235: Drop python 2.7, 3.5. Remove python 2 modesets and __future__
unihan-etl 0.11.0 (2020-08-09)
unihan-etl 0.10.4 (2020-08-05)
Update CHANGES headings to produce working links
Relax appdirs version constraint
#228 Move from Pipfile to poetry
unihan-etl 0.10.3 (2019-08-18)
Fix flicker in download progress bar
unihan-etl 0.10.2 (2019-08-17)
Add project_urls to setup.py
Use plain reStructuredText for CHANGES
Use collections that’s compatible with python 2 and 3
PEP8 tweaks
unihan-etl 0.10.1 (2017-09-08)
Add code links in API
Add __version__ to unihan_etl
unihan-etl 0.10.0 (2017-08-29)
#91 New fields from UNIHAN Revision 25.
kJinmeiyoKanji
kJoyoKanji
kKoreanEducationHanja
kKoreanName
kTGH
UNIHAN Revision 25 was released 2018-05-18 and issued for Unicode 11.0:
Add tests and example corpus for kCCCII
Add configuration / make tests for isort, flake8
Switch tmuxp config to use pipenv
Add Pipfile
Add make sync_pipfile task to sync requirements/*.txt files with Pipfile
Update and sync Pipfile
Developer package updates (linting / docs / testing)
isort 4.2.15 to 4.3.4
flake8 3.3.0 to 3.5.0
vulture 0.14 to 0.27
sphinx 1.6.2 to 1.7.6
alagitpull 0.0.12 to 0.0.21
releases 1.3.1 to 1.6.1
sphinx-argparse 0.2.1 to 1.6.2
pytest 3.1.2 to 3.6.4
Move documentation over to numpy-style
Add sphinxcontrib-napoleon 0.6.1
Update LICENSE New BSD to MIT
All future commits and contributions are licensed to the cihai software foundation. This includes commits by Tony Narlock (creator).
unihan-etl 0.9.5 (2017-06-26)
Enhance support for locations on kHDZRadBreak fields.
unihan-etl 0.9.4 (2017-06-05)
Fix kIRG_GSource without location
Fix kFenn output
Fix kHanyuPinlu support output for n diacritics
unihan-etl 0.9.3 (2017-05-31)
Add expansion for kIRGKangXi
unihan-etl 0.9.2 (2017-05-31)
Normalize Radical-Stroke expansion for kRSUnicode
Migrate more fields to regular expressions
Normalize character field for kDaeJaweon, kHanyuPinyin, kCheungBauer, kFennIndex, kCheungBauerIndex, kIICore, and kIRGHanyuDaZidian
unihan-etl 0.9.1 (2017-05-27)
Support for expanding kGSR
Convert some field expansions to use regexes
unihan-etl 0.9.0 (2017-05-26)
Fix bug where destination file was made into directory on first run
Rename from unihan-tabular to unihan-etl
Support for expanding multi-value fields
Support for pruning empty fields
Improve help dialog
Added a page about UNIHAN and the project to documentation
Split constant values into their own module
Split functionality for expanding unstructured values into its own module
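The multi-value expansion and empty-field pruning listed for 0.9.0 can be pictured with a small sketch. The function names, the MULTI_VALUE_FIELDS set, and the sample record are assumptions for illustration only; the project's own expansion logic is more involved.

```python
# Hypothetical set of fields whose raw value is a space-separated list.
MULTI_VALUE_FIELDS = {"kCantonese"}


def expand_record(record: dict) -> dict:
    """Split space-separated values into lists for the configured fields."""
    expanded = {}
    for field, value in record.items():
        if field in MULTI_VALUE_FIELDS and isinstance(value, str):
            expanded[field] = value.split()
        else:
            expanded[field] = value
    return expanded


def prune_empty(record: dict) -> dict:
    """Drop fields whose value is an empty string, None, or an empty list."""
    return {
        field: value
        for field, value in record.items()
        if value not in ("", None, [])
    }


raw = {"char": "好", "kCantonese": "hou2 hou3", "kDefinition": ""}
print(prune_empty(expand_record(raw)))
```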
unihan-etl 0.8.1 (2017-05-20)
Update to add kJa and adjust source file of kCompatibilityVariant per Unicode 8.0.0.
unihan-etl 0.8.0 (2017-05-17)
Support for configuring logging via options and CLI
Convert all print statements to use logger
unihan-etl 0.7.4 (2017-05-14)
Allow for local / file system sources for Unihan.zip
Only extract zip if unextracted
unihan-etl 0.7.3 (2017-05-13)
Update package classifiers
unihan-etl 0.7.2 (2017-05-13)
Add back datapackage
unihan-etl 0.7.1 (2017-05-12)
Fix python 2 CSV output
Default to CSV output
unihan-etl 0.7.0 (2017-05-12)
Move unicodecsv module to dependency package
Support for XDG directory specification
Support for custom destination output, including replacing template variable {ext}
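One simple way to picture the {ext} template variable is plain str.format substitution, sketched below. The resolve_destination helper is hypothetical, and the tool itself may resolve the template differently.

```python
def resolve_destination(template: str, ext: str) -> str:
    """Fill the {ext} placeholder in a destination template (hypothetical helper)."""
    return template.format(ext=ext)


print(resolve_destination("/tmp/unihan.{ext}", "json"))
```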
unihan-etl 0.6.3 (2017-05-11)
Move about.py to module level
unihan-etl 0.6.2 (2017-05-11)
Fix python package import
unihan-etl 0.6.1 (2017-05-10)
Fix readme bug on pypi
unihan-etl 0.6.0 (2017-05-10)
Support for exporting in YAML and JSON
More internal factoring and simplification
Return data as list
unihan-etl 0.5.1 (2017-05-08)
Drop python 3.3 and 3.4 support
unihan-etl 0.5.0 (2017-05-08)
Rename from cihaidata_unihan to unihan_tabular
Drop datapackages in favor of a universal JSON, YAML and CSV export.
Only use UnicodeWriter in Python 2; fixes an issue where Python would encode b in front of values
unihan-etl 0.4.2 (2017-05-07)
Rename scripts/ to cihaidata_unihan/
unihan-etl 0.4.1 (2017-05-07)
Enable invoking tool via $ cihaidata_unihan
unihan-etl 0.4.0 (2017-05-07)
Major internal refactor and simplification
Convert to pytest assert statements
Convert full test suite to pytest functions and fixtures
Get CLI documentation up again
Improve test coverage
Lint code, remove unused imports
Switch license BSD -> MIT
unihan-etl 0.3.0 (2017-04-17)
Rebooted
Modernize Makefile in docs
Add Makefile to main project
Modernize package metadata to use about.py
Update requirements to use requirements/ folder for base, testing and doc dependencies.
Update sphinx theme to alabaster with new logo.
Update travis to use coverall
Update links on README to use https
Update travis to test up to python 3.6
Add support for pypy (why not)
Lock base dependencies
Add dev dependencies for isort, vulture and flake8