Expansion - unihan_etl.expansion
Functions to uncompact details inside field values.
Notes
re.compile() operations are kept inside the expand functions for three reasons:
- readability
- module-level function bytecode is cached in Python
- the most recently used compiled regexes are cached
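The notes above can be illustrated with a minimal sketch. The helper name parse_page_position and its pattern are hypothetical and not part of unihan_etl. The point is only that the pattern is compiled inside the function, next to the code that uses it, and that Python's re module caches recently compiled patterns, so repeated calls stay cheap.

```python
import re
from typing import Dict


def parse_page_position(value: str) -> Dict[str, int]:
    """Hypothetical helper: split 'PAGE.POSITION' (e.g. '26.07') into ints."""
    # Compiled inside the function so the pattern sits beside its use.
    # The re module caches recently compiled patterns, so calling this
    # function repeatedly does not recompile the pattern from scratch.
    pattern = re.compile(r"(?P<page>\d+)\.(?P<position>\d+)")
    match = pattern.fullmatch(value)
    if match is None:
        raise ValueError(f"unexpected value: {value!r}")
    return {"page": int(match["page"]), "position": int(match["position"])}


print(parse_page_position("26.07"))
```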
- unihan_etl.expansion.N_DIACRITICS = 'ńňǹ'¶
diacritics from kHanyuPinlu
- unihan_etl.expansion.expand_kMandarin(value)[source]¶
Expand kMandarin field.
- Return type:
- Parameters:
- unihan_etl.expansion.expand_kTotalStrokes(value)[source]¶
Expand kTotalStrokes field.
- Return type:
- Parameters:
- class unihan_etl.expansion.kAlternateTotalStrokesDict[source]¶
Bases:
TypedDict
kAlternateTotalStrokes mapping.
- unihan_etl.expansion.is_valid_kAlternateTotalStrokes_irg_source(value)[source]¶
Return True and upcast if valid kAlternateTotalStrokes source.
- Return type:
TypeGuard[kAlternateTotalStrokesLiteral]
- Parameters:
value (Any) –
- unihan_etl.expansion.expand_kAlternateTotalStrokes(value)[source]¶
Expand kAlternateTotalStrokes field.
- Return type:
- Parameters:
Examples
>>> expand_kAlternateTotalStrokes(['3:J'])
[{'strokes': 3, 'sources': ['J']}]
>>> expand_kAlternateTotalStrokes(['12:JK'])
[{'strokes': 12, 'sources': ['J', 'K']}]
>>> expand_kAlternateTotalStrokes(['-'])
[{'strokes': None, 'sources': ['-']}]
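As a rough illustration of the mapping shown in these examples, the following standalone sketch reproduces the same input and output shapes. It is not the library's implementation, and the function name sketch_expand_alternate_total_strokes is made up for this example.

```python
from typing import Any, Dict, List


def sketch_expand_alternate_total_strokes(values: List[str]) -> List[Dict[str, Any]]:
    """Hypothetical re-creation of the documented input/output shape."""
    expanded = []
    for item in values:
        if item == "-":
            # A lone dash carries no stroke count.
            expanded.append({"strokes": None, "sources": ["-"]})
            continue
        strokes, _, sources = item.partition(":")
        # Each source is a single letter, so the string splits per character.
        expanded.append({"strokes": int(strokes), "sources": list(sources)})
    return expanded


print(sketch_expand_alternate_total_strokes(["12:JK"]))
```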
- unihan_etl.expansion.expand_kUnihanCore2020(value)[source]¶
Expand kUnihanCore2020 field.
Examples
>>> expand_kUnihanCore2020('GHJ')
['G', 'H', 'J']
- unihan_etl.expansion.expand_kIRGHanyuDaZidian(value)[source]¶
Expand kIRGHanyuDaZidian field.
- Return type:
- Parameters:
- class unihan_etl.expansion.kTGHZ2013LocationDict[source]¶
Bases:
TypedDict
kTGHZ2013 location mapping.
- class unihan_etl.expansion.kTGHZ2013Dict[source]¶
Bases:
TypedDict
kTGHZ2013 mapping.
- locations: Sequence[kTGHZ2013LocationDict]
- unihan_etl.expansion.expand_kTGHZ2013(value)[source]¶
Expand kTGHZ2013 field.
- Return type:
- Parameters:
Examples
>>> expand_kTGHZ2013(['097.110,097.120:fēng'])
[{'reading': 'fēng', 'locations': [{'page': 97, 'position': 11, 'entry_type': 0}, {'page': 97, 'position': 12, 'entry_type': 0}]}]
>>> expand_kTGHZ2013(['482.140:zhòu'])
[{'reading': 'zhòu', 'locations': [{'page': 482, 'position': 14, 'entry_type': 0}]}]
>>> expand_kTGHZ2013(['256.090:mò', '379.160:wàn'])
[{'reading': 'mò', 'locations': [{'page': 256, 'position': 9, 'entry_type': 0}]}, {'reading': 'wàn', 'locations': [{'page': 379, 'position': 16, 'entry_type': 0}]}]
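The examples suggest that each location packs a three-digit page, a two-digit position, and a one-digit entry type. The sketch below follows that reading. It is an assumption drawn from the examples, not the library's code, and sketch_expand_ktghz2013 is a hypothetical name.

```python
import re
from typing import Any, Dict, List


def sketch_expand_ktghz2013(values: List[str]) -> List[Dict[str, Any]]:
    """Hypothetical re-creation: 'PPP.SSE[,PPP.SSE...]:reading'."""
    # Assumed layout: page (3 digits), position (2 digits), entry type (1 digit).
    location_re = re.compile(
        r"(?P<page>\d{3})\.(?P<position>\d{2})(?P<entry_type>\d)"
    )
    expanded = []
    for item in values:
        locations, _, reading = item.partition(":")
        parsed = []
        for location in locations.split(","):
            match = location_re.fullmatch(location)
            if match is None:
                raise ValueError(f"unexpected location: {location!r}")
            parsed.append({key: int(text) for key, text in match.groupdict().items()})
        expanded.append({"reading": reading, "locations": parsed})
    return expanded


print(sketch_expand_ktghz2013(["482.140:zhòu"]))
```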
- class unihan_etl.expansion.kSMSZD2003IndexDict[source]¶
Bases:
TypedDict
kSMSZD2003Index location mapping.
- unihan_etl.expansion.expand_kSMSZD2003Index(value)[source]¶
Expand kSMSZD2003Index Soengmou San Zidin (商務新字典) field.
- Return type:
- Parameters:
Examples
>>> expand_kSMSZD2003Index(['26.07'])
[{'page': 26, 'position': 7}]
>>> expand_kSMSZD2003Index(['769.05', '15.17', '291.20', '493.13'])
[{'page': 769, 'position': 5}, {'page': 15, 'position': 17}, {'page': 291, 'position': 20}, {'page': 493, 'position': 13}]
Bibliography¶
Wong Gongsang 黃港生, ed. Shangwu Xin Zidian / Soengmou San Zidin 商務新字典 (New Commercial Press Character Dictionary). Hong Kong: 商務印書館(香港)有限公司 (Commercial Press [Hong Kong], Ltd.), 2003. ISBN 962-07-0140-2.
- class unihan_etl.expansion.kSMSZD2003ReadingsDict[source]¶
Bases:
TypedDict
kSMSZD2003Readings location mapping.
- unihan_etl.expansion.expand_kSMSZD2003Readings(value)[source]¶
Expand kSMSZD2003Readings Soengmou San Zidin (商務新字典) field.
- Return type:
- Parameters:
Examples
>>> expand_kSMSZD2003Readings(['tà粵taat3'])
[{'mandarin': ['tà'], 'cantonese': ['taat3']}]
>>> expand_kSMSZD2003Readings(['ma粵maa1,maa3', 'má粵maa1', 'mǎ粵maa1'])
[{'mandarin': ['ma'], 'cantonese': ['maa1', 'maa3']}, {'mandarin': ['má'], 'cantonese': ['maa1']}, {'mandarin': ['mǎ'], 'cantonese': ['maa1']}]
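In these examples the character 粵 separates the Mandarin readings from the Cantonese ones, and commas separate alternatives on either side. A minimal sketch under that assumption follows. The name sketch_expand_smszd_readings is hypothetical and the code is not taken from unihan_etl.

```python
from typing import Dict, List


def sketch_expand_smszd_readings(values: List[str]) -> List[Dict[str, List[str]]]:
    """Hypothetical re-creation: 'mandarin[,..]粵cantonese[,..]'."""
    expanded = []
    for item in values:
        # Text before 粵 is Mandarin, text after it is Cantonese.
        mandarin, _, cantonese = item.partition("粵")
        expanded.append(
            {"mandarin": mandarin.split(","), "cantonese": cantonese.split(",")}
        )
    return expanded


print(sketch_expand_smszd_readings(["ma粵maa1,maa3"]))
```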
Bibliography¶
Wong Gongsang 黃港生, ed. Shangwu Xin Zidian / Soengmou San Zidin 商務新字典 (New Commercial Press Character Dictionary). Hong Kong: 商務印書館(香港)有限公司 (Commercial Press [Hong Kong], Ltd.), 2003. ISBN 962-07-0140-2.
- class unihan_etl.expansion.kHanyuPinyinPreDict[source]¶
Bases:
TypedDict
kHanyuPinyin predicate mapping.
- locations: Sequence[Union[str, kLocationDict]]
- class unihan_etl.expansion.kHanyuPinyinDict[source]¶
Bases:
TypedDict
kHanyuPinyin mapping.
- locations: kLocationDict
- unihan_etl.expansion.expand_kHanyuPinyin(value)[source]¶
Expand kHanyuPinyin field.
- Return type:
- Parameters:
- class unihan_etl.expansion.kXHC1983LocationDict[source]¶
Bases:
TypedDict
kXHC1983 location mapping.
- class unihan_etl.expansion.kXHC1983Dict[source]¶
Bases:
TypedDict
kXHC1983 mapping.
- locations: kXHC1983LocationDict
- class unihan_etl.expansion.kXHC1983PreDict[source]¶
Bases:
TypedDict
kXHC1983 predicate mapping.
- locations: Union[List[str], kXHC1983LocationDict]
- unihan_etl.expansion.expand_kXHC1983(value)[source]¶
Expand kXHC1983 field.
- Return type:
- Parameters:
- unihan_etl.expansion.expand_kCheungBauer(value)[source]¶
Expand kCheungBauer field.
- Return type:
- Parameters:
- unihan_etl.expansion.expand_kRSAdobe_Japan1_6(value)[source]¶
Expand kRSAdobe_Japan1_6 field.
- Return type:
- Parameters:
- unihan_etl.expansion.expand_kDaeJaweon(value)[source]¶
Expand kDaeJaweon field.
- Return type:
- Parameters:
value (str) –
- unihan_etl.expansion.expand_kIRGKangXi(value)[source]¶
Expand kIRGKangXi field.
- Return type:
- Parameters:
- unihan_etl.expansion.expand_kIRGDaeJaweon(value)[source]¶
Expand kIRGDaeJaweon field.
- Return type:
- Parameters:
- unihan_etl.expansion.expand_kHanyuPinlu(value)[source]¶
Expand kHanyuPinlu field.
- Return type:
- Parameters:
- class unihan_etl.expansion.kHDZRadBreakDict[source]¶
Bases:
TypedDict
kHDZRadBreak mapping.
- location: LocationDict
- unihan_etl.expansion.expand_kHDZRadBreak(value)[source]¶
Expand kHDZRadBreak field.
- Return type:
- Parameters:
value (str) –
- unihan_etl.expansion._expand_kRSGeneric(value)[source]¶
Expand kRSGeneric field.
- Return type:
- Parameters:
Examples
>>> _expand_kRSGeneric(['5.10', "213''.0"])
[{'radical': 5, 'strokes': 10, 'simplified': False}, {'radical': 213, 'strokes': 0, 'simplified': False}]
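The sketch below shows one way a radical-stroke value of this shape could be split. It assumes that exactly one apostrophe after the radical marks a simplified radical. The documented example only confirms that two apostrophes yield simplified False, so treat that rule as an assumption. The function name sketch_expand_radical_strokes is hypothetical.

```python
import re
from typing import Any, Dict, List


def sketch_expand_radical_strokes(values: List[str]) -> List[Dict[str, Any]]:
    """Hypothetical re-creation: "RADICAL[']*.STROKES"."""
    pattern = re.compile(r"(?P<radical>\d{1,3})(?P<marks>'*)\.(?P<strokes>-?\d+)")
    expanded = []
    for item in values:
        match = pattern.fullmatch(item)
        if match is None:
            raise ValueError(f"unexpected value: {item!r}")
        expanded.append(
            {
                "radical": int(match["radical"]),
                "strokes": int(match["strokes"]),
                # Assumption: only a single apostrophe means "simplified".
                "simplified": match["marks"] == "'",
            }
        )
    return expanded


print(sketch_expand_radical_strokes(["5.10", "213''.0"]))
```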
- unihan_etl.expansion.expand_kRSUnicode(value)[source]¶
Expand kRSGeneric field.
- Return type:
- Parameters:
Examples
>>> _expand_kRSGeneric(['5.10', "213''.0"])
[{'radical': 5, 'strokes': 10, 'simplified': False}, {'radical': 213, 'strokes': 0, 'simplified': False}]
- unihan_etl.expansion._expand_kIRG_GenericSource(value)[source]¶
Expand kIRG_GenericSource field.
- Return type:
- Parameters:
value (str) –
Examples
>>> _expand_kIRG_GenericSource('JMJ-056876')
{'source': 'JMJ', 'location': '056876'}
>>> _expand_kIRG_GenericSource('SAT-02570')
{'source': 'SAT', 'location': '02570'}
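Judging from the examples, the value splits at the first hyphen into a source prefix and a location. A minimal sketch of that split follows. The name sketch_expand_irg_source is hypothetical, and the real function may validate its input differently.

```python
from typing import Dict


def sketch_expand_irg_source(value: str) -> Dict[str, str]:
    """Hypothetical re-creation: 'SOURCE-LOCATION' split at the first hyphen."""
    source, _, location = value.partition("-")
    # The location stays a string so leading zeros are preserved.
    return {"source": source, "location": location}


print(sketch_expand_irg_source("JMJ-056876"))
```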
- unihan_etl.expansion.expand_kIRG_GSource(value)[source]¶
Expand kIRG_GenericSource field.
- Return type:
- Parameters:
value (str) –
Examples
>>> _expand_kIRG_GenericSource('JMJ-056876')
{'source': 'JMJ', 'location': '056876'}
>>> _expand_kIRG_GenericSource('SAT-02570')
{'source': 'SAT', 'location': '02570'}
- unihan_etl.expansion.expand_kIRG_HSource(value)[source]¶
Expand kIRG_GenericSource field.
- Return type:
- Parameters:
value (str) –
Examples
>>> _expand_kIRG_GenericSource('JMJ-056876')
{'source': 'JMJ', 'location': '056876'}
>>> _expand_kIRG_GenericSource('SAT-02570')
{'source': 'SAT', 'location': '02570'}
- unihan_etl.expansion.expand_kIRG_JSource(value)[source]¶
Expand kIRG_GenericSource field.
- Return type:
- Parameters:
value (str) –
Examples
>>> _expand_kIRG_GenericSource('JMJ-056876')
{'source': 'JMJ', 'location': '056876'}
>>> _expand_kIRG_GenericSource('SAT-02570')
{'source': 'SAT', 'location': '02570'}
- unihan_etl.expansion.expand_kIRG_KPSource(value)[source]¶
Expand kIRG_GenericSource field.
- Return type:
- Parameters:
value (str) –
Examples
>>> _expand_kIRG_GenericSource('JMJ-056876')
{'source': 'JMJ', 'location': '056876'}
>>> _expand_kIRG_GenericSource('SAT-02570')
{'source': 'SAT', 'location': '02570'}
- unihan_etl.expansion.expand_kIRG_KSource(value)[source]¶
Expand kIRG_GenericSource field.
- Return type:
- Parameters:
value (str) –
Examples
>>> _expand_kIRG_GenericSource('JMJ-056876')
{'source': 'JMJ', 'location': '056876'}
>>> _expand_kIRG_GenericSource('SAT-02570')
{'source': 'SAT', 'location': '02570'}
- unihan_etl.expansion.expand_kIRG_MSource(value)[source]¶
Expand kIRG_GenericSource field.
- Return type:
- Parameters:
value (str) –
Examples
>>> _expand_kIRG_GenericSource('JMJ-056876')
{'source': 'JMJ', 'location': '056876'}
>>> _expand_kIRG_GenericSource('SAT-02570')
{'source': 'SAT', 'location': '02570'}
- unihan_etl.expansion.expand_kIRG_SSource(value)[source]¶
Expand kIRG_GenericSource field.
- Return type:
- Parameters:
value (str) –
Examples
>>> _expand_kIRG_GenericSource('JMJ-056876')
{'source': 'JMJ', 'location': '056876'}
>>> _expand_kIRG_GenericSource('SAT-02570')
{'source': 'SAT', 'location': '02570'}
- unihan_etl.expansion.expand_kIRG_TSource(value)[source]¶
Expand kIRG_GenericSource field.
- Return type:
- Parameters:
value (str) –
Examples
>>> _expand_kIRG_GenericSource('JMJ-056876')
{'source': 'JMJ', 'location': '056876'}
>>> _expand_kIRG_GenericSource('SAT-02570')
{'source': 'SAT', 'location': '02570'}
- unihan_etl.expansion.expand_kIRG_USource(value)[source]¶
Expand kIRG_GenericSource field.
- Return type:
- Parameters:
value (str) –
Examples
>>> _expand_kIRG_GenericSource('JMJ-056876')
{'source': 'JMJ', 'location': '056876'}
>>> _expand_kIRG_GenericSource('SAT-02570')
{'source': 'SAT', 'location': '02570'}
- unihan_etl.expansion.expand_kIRG_UKSource(value)[source]¶
Expand kIRG_GenericSource field.
- Return type:
- Parameters:
value (str) –
Examples
>>> _expand_kIRG_GenericSource('JMJ-056876')
{'source': 'JMJ', 'location': '056876'}
>>> _expand_kIRG_GenericSource('SAT-02570')
{'source': 'SAT', 'location': '02570'}
- unihan_etl.expansion.expand_kIRG_VSource(value)[source]¶
Expand kIRG_GenericSource field.
- Return type:
- Parameters:
value (str) –
Examples
>>> _expand_kIRG_GenericSource('JMJ-056876')
{'source': 'JMJ', 'location': '056876'}
>>> _expand_kIRG_GenericSource('SAT-02570')
{'source': 'SAT', 'location': '02570'}
- unihan_etl.expansion.is_valid_kstrange_property(value)[source]¶
Return True and upcast if valid kStrange property type.
- Return type:
TypeGuard[kStrangeLiteral]
- Parameters:
value (Any) –
- unihan_etl.expansion.expand_kStrange(value)[source]¶
Expand kStrange field.
- Return type:
- Parameters:
Examples
>>> expand_kStrange(['B:U+310D', 'I:U+5DDB'])
[{'property_type': 'B', 'characters': ['U+310D']}, {'property_type': 'I', 'characters': ['U+5DDB']}]
>>> expand_kStrange(['K:U+30A6:U+30C4:U+30DB'])
[{'property_type': 'K', 'characters': ['U+30A6', 'U+30C4', 'U+30DB']}]
>>> expand_kStrange(['U'])
[{'property_type': 'U', 'characters': []}]
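The examples show a colon-separated value whose first part is the property type and whose remaining parts, if any, are code points. The following sketch reproduces that shape. It is not the library's code, and sketch_expand_kstrange is a hypothetical name.

```python
from typing import Any, Dict, List


def sketch_expand_kstrange(values: List[str]) -> List[Dict[str, Any]]:
    """Hypothetical re-creation: 'TYPE[:U+XXXX...]'."""
    expanded = []
    for item in values:
        # The first part is the property type; anything after it is a code point.
        property_type, *characters = item.split(":")
        expanded.append({"property_type": property_type, "characters": characters})
    return expanded


print(sketch_expand_kstrange(["K:U+30A6:U+30C4:U+30DB"]))
```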
- class unihan_etl.expansion.kMojiJohoVariationDict[source]¶
Bases:
TypedDict
Variation sequence of Moji Jōhō Kiban entry.
- class unihan_etl.expansion.kMojiJohoDict[source]¶
Bases:
TypedDict
kMojiJoho mapping.
- variants: List[kMojiJohoVariationDict]
- unihan_etl.expansion.expand_kMojiJoho(value)[source]¶
Expand kMojiJoho (Moji Jōhō Kiban) field.
- Return type:
- Parameters:
value (str) –
Examples
>>> expand_kMojiJoho('MJ000004')
{'serial_number': 'MJ000004', 'variants': []}
>>> expand_kMojiJoho('MJ000022 MJ000023:E0101 MJ000022:E0103')
{'serial_number': 'MJ000022', 'variants': [{'serial_number': 'MJ000023', 'variation_sequence': 'E0101', 'standard': False}, {'serial_number': 'MJ000022', 'variation_sequence': 'E0103', 'standard': True}]}
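In the second example the first space-separated token is the default serial number, and each following token pairs a serial number with a variation sequence. The examples are consistent with standard being True when a variant's serial number equals the default one. The sketch below encodes that reading as an assumption, and sketch_expand_moji_joho is a hypothetical name.

```python
from typing import Any, Dict


def sketch_expand_moji_joho(value: str) -> Dict[str, Any]:
    """Hypothetical re-creation: 'DEFAULT [SERIAL:SEQUENCE ...]'."""
    default_serial, *entries = value.split(" ")
    variants = []
    for entry in entries:
        serial, _, sequence = entry.partition(":")
        variants.append(
            {
                "serial_number": serial,
                "variation_sequence": sequence,
                # Assumption: a variant is "standard" when it shares
                # the default serial number.
                "standard": serial == default_serial,
            }
        )
    return {"serial_number": default_serial, "variants": variants}


print(sketch_expand_moji_joho("MJ000022 MJ000023:E0101 MJ000022:E0103"))
```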
See also
Assume U+342A kMojiJoho MJ000022 MJ000023:E0101 MJ000022:E0103:
Database