src/myspellchecker/rules/ and can be customized for specific use cases.
Rule Files Overview
| File | Purpose | Entries |
|---|---|---|
| particles.yaml | Linguistic particles with POS tags | 91 |
| typo_corrections.yaml | Common typo patterns | 68 |
| morphology.yaml | Suffix/prefix patterns | 100 |
| grammar_rules.yaml | Grammar validation rules | 45 |
| aspects.yaml | Verb aspect markers | 48 |
| compounds.yaml | Compound word patterns | 63 |
| classifiers.yaml | Numeral classifiers | 74 |
| negation.yaml | Negation patterns | 55 |
| register.yaml | Formal/colloquial mappings | 38 |
| tone_rules.yaml | Tone mark rules | 57 |
| homophones.yaml | Homophone pairs | 115 |
| ambiguous_words.yaml | Multi-POS words | 54 |
| pos_inference.yaml | POS inference patterns | 94 |
| pronouns.yaml | Pronoun definitions | 30 |
File Structure
All rule files follow a common structure.
Particles (particles.yaml)
Defines Myanmar linguistic particles organized by syntactic function.
Structure
POS Tags
| Tag | Description | Example |
|---|---|---|
| P_PAST | Past tense | ခဲ့ |
| P_FUT | Future tense | မယ်, မည် |
| P_PROG | Progressive | နေ |
| P_PERF | Perfective | ပြီ |
| P_SUBJ | Subject marker | က |
| P_OBJ | Object marker | ကို |
| P_LOC | Locative | မှာ, တွင် |
| P_SENT | Sentence ending | တယ်, သည် |
| P_MOD | Modifier | ရဲ့, ၏ |
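The tag table can be turned into a reverse lookup from particle to tag. The following is a minimal sketch built only from the examples in the table; `POS_TAG_EXAMPLES`, `build_particle_index`, and `tags_for` are hypothetical names for illustration and are not part of the library's API.

```python
# Illustrative data copied from the POS Tags table; the real particles.yaml
# holds more entries and may be organized differently.
POS_TAG_EXAMPLES = {
    "P_PAST": ["ခဲ့"],
    "P_FUT": ["မယ်", "မည်"],
    "P_PROG": ["နေ"],
    "P_PERF": ["ပြီ"],
    "P_SUBJ": ["က"],
    "P_OBJ": ["ကို"],
    "P_LOC": ["မှာ", "တွင်"],
    "P_SENT": ["တယ်", "သည်"],
    "P_MOD": ["ရဲ့", "၏"],
}


def build_particle_index(tag_examples):
    """Invert a tag -> particles mapping into particle -> list of tags."""
    index = {}
    for tag, particles in tag_examples.items():
        for particle in particles:
            index.setdefault(particle, []).append(tag)
    return index


PARTICLE_INDEX = build_particle_index(POS_TAG_EXAMPLES)


def tags_for(particle):
    """Return the POS tags recorded for a particle, or an empty list."""
    return PARTICLE_INDEX.get(particle, [])
```

Storing a list of tags per particle leaves room for particles that serve more than one function, even though none of the examples above do.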
Formality Levels
- colloquial - Spoken/informal
- neutral - Both formal and informal
- formal - Written/formal
- polite - Respectful register
- literary - Literary style
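When reading or writing rule entries in your own tooling, it can help to treat the formality level as a closed set. The sketch below assumes the five level names listed above are the only valid values; the `Formality` enum and `parse_formality` helper are hypothetical and not taken from the library.

```python
from enum import Enum


class Formality(Enum):
    """The five formality levels listed in this section."""
    COLLOQUIAL = "colloquial"
    NEUTRAL = "neutral"
    FORMAL = "formal"
    POLITE = "polite"
    LITERARY = "literary"


def parse_formality(value):
    """Return the matching Formality member, ignoring case and outer spaces.

    Raises ValueError that names the accepted levels when the value is unknown.
    """
    try:
        return Formality(value.strip().lower())
    except ValueError:
        valid = ", ".join(level.value for level in Formality)
        raise ValueError(
            f"unknown formality {value!r}; expected one of: {valid}"
        ) from None
```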
Typo Corrections (typo_corrections.yaml)
Defines common Myanmar typo patterns with corrections.
Structure
Error Types
| Type | Description |
|---|---|
| missing_ha_htoe | Missing ှ modifier |
| character_confusion | Similar looking characters |
| ya_pin_ra_yit | ျ vs ြ confusion |
| missing_asat | Missing ် marker |
| tone_mark_error | Wrong or missing tone mark |
| visual_similar | OCR-type errors |
Context Types
- after_noun - Follows a noun
- after_verb - Follows a verb
- context_dependent - Requires context analysis
- standalone - Independent of context
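The error types and context types above form two small vocabularies that a custom entry should stay within. The sketch below checks an entry against both. The field names (`incorrect`, `correct`, `error_type`, `context`) and the sample entry are assumptions made for illustration; they are not confirmed keys or data from typo_corrections.yaml.

```python
from dataclasses import dataclass

# Values copied from the Error Types table and the Context Types list.
ERROR_TYPES = {
    "missing_ha_htoe",
    "character_confusion",
    "ya_pin_ra_yit",
    "missing_asat",
    "tone_mark_error",
    "visual_similar",
}
CONTEXT_TYPES = {"after_noun", "after_verb", "context_dependent", "standalone"}


@dataclass(frozen=True)
class TypoRule:
    """Hypothetical in-memory form of one typo correction entry."""
    incorrect: str
    correct: str
    error_type: str
    context: str = "standalone"

    def __post_init__(self):
        # Reject values outside the documented vocabularies.
        if self.error_type not in ERROR_TYPES:
            raise ValueError(f"unknown error type: {self.error_type!r}")
        if self.context not in CONTEXT_TYPES:
            raise ValueError(f"unknown context type: {self.context!r}")


# Made-up example entry: a sentence-ending particle written without its asat.
example = TypoRule(
    incorrect="တယ",
    correct="တယ်",
    error_type="missing_asat",
    context="after_verb",
)
```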
Morphology (morphology.yaml)
Defines suffix and prefix patterns for POS inference.
Structure
Aspects (aspects.yaml)
Defines verb aspect markers.
Structure
Classifiers (classifiers.yaml)
Defines numeral classifiers for counting.
Structure
Register (register.yaml)
Maps formal and colloquial equivalents.
Structure
Negation (negation.yaml)
Defines negation patterns.
Structure
Homophones (homophones.yaml)
Defines homophone pairs for context checking.
Structure
Compounds (compounds.yaml)
Defines compound word formations.
Structure
Custom Configuration
Loading Custom Rules
Extending Rules
Add custom entries by creating additional YAML files.
Schema Validation
Rule files are validated against JSON schemas in src/myspellchecker/schemas/:
- grammar_rules.schema.json
- morphology.schema.json
- particles.schema.json
- typo_corrections.schema.json
- _common.schema.json
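The schema file names suggest that each schema is named after the rule file it covers. The sketch below resolves a rule file to its schema path under that assumption. The naming convention is inferred from the list above rather than documented, `schema_for` is a hypothetical helper, and `_common.schema.json` is left out on the assumption that it holds shared definitions instead of validating a rule file directly.

```python
from pathlib import PurePosixPath

SCHEMA_DIR = PurePosixPath("src/myspellchecker/schemas")

# Rule-specific schemas from the list above (_common.schema.json excluded).
SCHEMA_FILES = {
    "grammar_rules.schema.json",
    "morphology.schema.json",
    "particles.schema.json",
    "typo_corrections.schema.json",
}


def schema_for(rule_file):
    """Return the schema path for a rule file, or None if none is listed.

    Assumes "<name>.yaml" is validated by "<name>.schema.json".
    """
    stem = PurePosixPath(rule_file).stem  # "particles.yaml" -> "particles"
    candidate = f"{stem}.schema.json"
    if candidate in SCHEMA_FILES:
        return SCHEMA_DIR / candidate
    return None
```

Under this reading, only four of the rule files in the overview table have a dedicated schema; the others return `None` here.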
Best Practices
- Confidence scores: Use 0.9 or higher for high-certainty rules, 0.7 to 0.9 for moderately certain rules, and below 0.7 for context-dependent rules
- Context constraints: Always specify context when rules are position-dependent
- Examples: Include examples for documentation and testing
- Version control: Update metadata.last_updated when modifying rules
- Testing: Test rule changes with representative corpus data
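The confidence guidance in the first bullet can be expressed as a small helper for reviewing custom rules. This is a sketch only: `confidence_tier` is a hypothetical function, it assumes scores lie in the range 0.0 to 1.0, and it treats exactly 0.9 as high certainty.

```python
def confidence_tier(score):
    """Map a confidence score to the tier suggested in the best practices."""
    if not 0.0 <= score <= 1.0:
        # Assumed valid range; adjust if the rule files use another scale.
        raise ValueError(f"confidence must be between 0.0 and 1.0, got {score}")
    if score >= 0.9:
        return "high"
    if score >= 0.7:
        return "moderate"
    return "context-dependent"
```

A review script could flag any rule in the lowest tier that does not also specify a context constraint, which ties the first two bullets together.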
See Also
- Grammar Checkers - Grammar validation
- Morphology Analysis - Word structure
- Configuration - Configuration guide