src/myspellchecker/core/constants/myanmar_constants.py and src/myspellchecker/core/constants/core_constants.py
Unicode Ranges
Main Myanmar Block
Extended Blocks
Extended-A Character Ranges (U+AA60–U+AA7F)
| Range | Usage |
|---|---|
| U+AA60–U+AA6F | Shan consonants |
| U+AA70–U+AA76 | Shan vowels and tones |
| U+AA77–U+AA79 | Shan symbols |
Extended-B Character Ranges (U+A9E0–U+A9FF)
| Range | Usage |
|---|---|
| U+A9E0–U+A9E4 | Shan letters |
| U+A9E5 | Shan sign |
| U+A9E6 | Reduplication mark |
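As a minimal sketch of how these block ranges can be used, the snippet below classifies a character by Unicode block. The names `MYANMAR_MAIN`, `MYANMAR_EXTENDED_A`, `MYANMAR_EXTENDED_B`, and `myanmar_block` are hypothetical and chosen for illustration; they are not taken from the library. The main block range (U+1000–U+109F) comes from the Unicode Standard, and the two extended ranges are the ones given in the headings above.

```python
from typing import Optional

# Hypothetical names for illustration; the library's own constants may differ.
MYANMAR_MAIN = range(0x1000, 0x10A0)        # U+1000–U+109F
MYANMAR_EXTENDED_A = range(0xAA60, 0xAA80)  # U+AA60–U+AA7F
MYANMAR_EXTENDED_B = range(0xA9E0, 0xAA00)  # U+A9E0–U+A9FF


def myanmar_block(ch: str) -> Optional[str]:
    """Return the Myanmar block a single character belongs to, or None."""
    cp = ord(ch)
    if cp in MYANMAR_MAIN:
        return "main"
    if cp in MYANMAR_EXTENDED_A:
        return "extended_a"
    if cp in MYANMAR_EXTENDED_B:
        return "extended_b"
    return None
```

Membership tests on `range` objects are constant-time, so this stays cheap even when called once per character.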
Character Set Constants
Helper Functions
Character Sets
Consonants
| Char | Code Point | Name | Romanization |
|---|---|---|---|
| က | U+1000 | KA | ka |
| ခ | U+1001 | KHA | kha |
| ဂ | U+1002 | GA | ga |
| ဃ | U+1003 | GHA | gha |
| င | U+1004 | NGA | nga |
| စ | U+1005 | CA | sa |
| ဆ | U+1006 | CHA | hsa |
| ဇ | U+1007 | JA | za |
| ဈ | U+1008 | JHA | zha |
| ဉ | U+1009 | NYA (archaic) | nya |
| ည | U+100A | NNYA | nya |
| ဋ | U+100B | TTA | tta |
| ဌ | U+100C | TTHA | ttha |
| ဍ | U+100D | DDA | dda |
| ဎ | U+100E | DDHA | ddha |
| ဏ | U+100F | NNA | nna |
| တ | U+1010 | TA | ta |
| ထ | U+1011 | THA | hta |
| ဒ | U+1012 | DA | da |
| ဓ | U+1013 | DHA | dha |
| န | U+1014 | NA | na |
| ပ | U+1015 | PA | pa |
| ဖ | U+1016 | PHA | pha |
| ဗ | U+1017 | BA | ba |
| ဘ | U+1018 | BHA | bha |
| မ | U+1019 | MA | ma |
| ယ | U+101A | YA | ya |
| ရ | U+101B | RA | ra |
| လ | U+101C | LA | la |
| ဝ | U+101D | WA | wa |
| သ | U+101E | SA | tha |
| ဟ | U+101F | HA | ha |
| ဠ | U+1020 | LLA | lla |
| ဿ | U+103F | GREAT SA (added separately) | ssa |
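The table can be reproduced as a set, assuming the consonants are the contiguous range U+1000–U+1020 with GREAT SA (U+103F) added separately, as the last row notes. This is a sketch only: `SKETCH_CONSONANTS` and `is_consonant` are hypothetical names, not the library's constants. The standard-library `unicodedata` module is used to look up official Unicode character names for cross-checking the Name column.

```python
import unicodedata

# Hypothetical local set mirroring the table: U+1000–U+1020 plus U+103F.
SKETCH_CONSONANTS = frozenset(chr(cp) for cp in range(0x1000, 0x1021)) | {"\u103F"}


def is_consonant(ch: str) -> bool:
    """True if ch is one of the consonants listed in the table."""
    return ch in SKETCH_CONSONANTS


def unicode_name(ch: str) -> str:
    """Official Unicode name, e.g. for checking the table's Name column."""
    return unicodedata.name(ch)
```

For example, `unicode_name("\u1000")` returns `"MYANMAR LETTER KA"`, and `len(SKETCH_CONSONANTS)` is 34 (33 letters in the range plus GREAT SA).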
Non-Standard Characters
Independent Vowels
Use INDEPENDENT_VOWELS_STRICT for standard Burmese-only validation.
Vowel Signs (Dependent Vowels)
| Char | Code Point | Name |
|---|---|---|
| ါ | U+102B | TALL AA |
| ာ | U+102C | AA |
| ိ | U+102D | I |
| ီ | U+102E | II |
| ု | U+102F | U |
| ူ | U+1030 | UU |
| ေ | U+1031 | E (left-side, stored after consonant) |
| ဲ | U+1032 | AI |
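The note on U+1031 matters in practice: the sign is drawn to the left of its consonant but stored after it, so text typed in visual order puts it in the wrong place. The sketch below flags such cases. It is a rough heuristic under stated assumptions, not the library's validator: `find_misplaced_e`, `BASES`, and `MEDIALS` are hypothetical names, the base set is assumed to be U+1000–U+1021 plus U+103F, and the medial signs are assumed to be U+103B–U+103E.

```python
from typing import List

VOWEL_SIGN_E = "\u1031"
# Assumed base letters for this sketch: U+1000–U+1021 plus GREAT SA (U+103F).
BASES = frozenset(chr(cp) for cp in range(0x1000, 0x1022)) | {"\u103F"}
# Medial signs YA, RA, WA, HA (U+103B–U+103E) may sit between the base and E.
MEDIALS = frozenset(chr(cp) for cp in range(0x103B, 0x103F))


def find_misplaced_e(text: str) -> List[int]:
    """Return indexes where U+1031 is not preceded by a base letter or medial."""
    hits = []
    for i, ch in enumerate(text):
        if ch != VOWEL_SIGN_E:
            continue
        prev = text[i - 1] if i > 0 else ""
        if prev not in BASES and prev not in MEDIALS:
            hits.append(i)
    return hits
```

With this check, `"\u1000\u1031"` (consonant then E, the storage order) yields no hits, while `"\u1031\u1000"` (E typed first, the visual order) is flagged at index 0.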
Vowel Classification
Medials
Signs and Marks
Tone Mark Rules
Myanmar Numerals
Punctuation
Section Marks and Logographic Particles
Validation Sets
Valid Medial Sequences
Normalization Order Weights
Zero-Width Characters
Medial Compatibility Sets
Defines which consonants can validly combine with each medial.
Phonetic Character Sets
Stacking and Kinzi
Part-of-Speech Tag Constants
Granular particle tag constants defined in core_constants.py:
Skipped Context Words
High-frequency particles skipped by the Context Validator:
Morphology Constants
Suffix sets for OOV POS guessing (from core_constants.py):
Enums
Algorithm Defaults
Usage in Code
Importing Constants
Character Classification
Validation Helper
Canonical Character Ordering (UTN #11)
Correct canonical order for Myanmar syllable characters (Unicode storage order):
မြန်), interleaving with vowels. This is distinct from its use in Kinzi patterns where it appears earlier.
See Also
- Syllable Validation - Structure rules
- Error Codes - Validation error codes