General Questions
What is mySpellChecker?
mySpellChecker is a high-performance spell checker designed specifically for Myanmar (Burmese) text. Unlike English spell checkers, which rely on whitespace to identify words, mySpellChecker uses a “Syllable-First Architecture”: it validates text at the syllable level first, then progressively validates words and context.
Why was mySpellChecker created?
Myanmar script presents unique challenges for spell checking:
- No spaces between words
- Complex syllable structures
- Multiple valid Unicode representations
- Legacy Zawgyi encoding still in use
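As a small illustration of the first point, the sketch below uses only the Python standard library (it does not call mySpellChecker) to show why whitespace tokenization, which works for English, yields nothing useful for an unspaced Myanmar phrase. The sample strings are illustrative choices, not taken from the library.

```python
# Whitespace splitting finds word boundaries in English text...
english = "Myanmar script is beautiful"
english_tokens = english.split()

# ...but a Myanmar phrase written without spaces comes back as one token,
# so a whitespace-based spell checker has no words to look up.
myanmar = "မြန်မာဘာသာစကား"  # "Burmese language", written without spaces
myanmar_tokens = myanmar.split()

print(len(english_tokens), "English tokens")
print(len(myanmar_tokens), "Myanmar token(s)")
```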
Is mySpellChecker open source?
Yes, mySpellChecker is open source and available under the MIT license.
What Python versions are supported?
mySpellChecker supports Python 3.10 and later.
Installation
Why does installation take a long time?
mySpellChecker includes Cython extensions that are compiled during installation. This requires a C++ compiler and takes extra time. If you don’t have a compiler, the library falls back to pure Python implementations (slower but functional).
Do I need a C++ compiler?
No, it’s optional but recommended. Without a compiler:
- Installation is faster
- Pure Python fallbacks are used
- Performance is lower (but still functional)
With a compiler:
- Installation compiles Cython extensions
- Performance is significantly better
- OpenMP parallel processing is available (Linux/macOS)
How do I install on macOS without compiler errors?
Can I install without the optional features?
Yes, install the base package only:
Usage
How do I check Myanmar text?
What is the difference between validation levels?
| Level | Coverage | Speed | Use Case |
|---|---|---|---|
| syllable | 90% of errors | ~10ms | Quick validation |
| word | 95% of errors | ~50ms | Standard checking |
Context checking is enabled via the use_context_checker=True parameter, not as a separate validation level.
How do I handle Zawgyi text?
Can I use a custom dictionary?
Yes, you can build a custom dictionary from your own corpus:
How do I add words to the dictionary at runtime?
Currently, the dictionary is read-only at runtime. To add words:
- Add them to your corpus file
- Rebuild the dictionary
- Restart your application
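The first step can be scripted. The sketch below is a generic helper, not part of mySpellChecker: `append_words_to_corpus` is a hypothetical name, and it assumes the corpus is a UTF-8 plain-text file where each new word can go on its own line. Rebuilding the dictionary and restarting the application are left to the library's own build tooling.

```python
from pathlib import Path


def append_words_to_corpus(corpus_path, new_words):
    """Append words that are not already present as whole lines.

    Hypothetical helper: assumes a UTF-8 plain-text corpus, one entry per line.
    Returns the list of words that were actually written.
    """
    path = Path(corpus_path)
    content = path.read_text(encoding="utf-8") if path.exists() else ""
    existing = {line.strip() for line in content.splitlines()}

    added = []
    for word in new_words:
        word = word.strip()
        if word and word not in existing:
            existing.add(word)  # also guards against duplicates in new_words
            added.append(word)

    if added:
        with path.open("a", encoding="utf-8") as fh:
            # Avoid gluing the first new word onto an unterminated last line.
            if content and not content.endswith("\n"):
                fh.write("\n")
            for word in added:
                fh.write(word + "\n")
    return added
```

Calling `append_words_to_corpus("corpus.txt", ["..."])` a second time with the same words returns an empty list, so the helper is safe to run repeatedly before a rebuild.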
Performance
How can I make spell checking faster?
- Use syllable-level validation only (fastest)
- Disable context checking
- Use batch processing
- Ensure the Cython extensions are compiled
Why is the first call slow?
The first call initializes the dictionary and loads models into memory. Subsequent calls are much faster. Consider warming up the checker:
How much memory does mySpellChecker use?
| Configuration | Memory Usage |
|---|---|
| Basic (SQLite provider) | ~50MB |
| Memory provider | ~200MB |
| With semantic model | ~500MB |
| With transformer POS | ~1GB |
Can I use mySpellChecker in a multi-threaded application?
The SpellChecker instance is not thread-safe by default. For multi-threaded use:
- Create separate instances per thread
- Or use a connection pool for the SQLite provider
- Or use the Memory provider (which is thread-safe for reads)
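The first option, one instance per thread, can be sketched with `threading.local` from the standard library. `StubChecker` below is a hypothetical stand-in so the example runs on its own; it is not the library's class, and in real code you would pass a factory that constructs your actual checker.

```python
import threading


class StubChecker:
    """Hypothetical stand-in for the real checker; NOT the library's API."""

    def check(self, text):
        return []


# Each thread sees its own independent attributes on this object.
_local = threading.local()


def get_checker(factory=StubChecker):
    """Return this thread's checker, creating it on first use."""
    checker = getattr(_local, "checker", None)
    if checker is None:
        checker = factory()
        _local.checker = checker
    return checker
```

Every thread that calls `get_checker()` pays the construction cost once and then reuses its own instance, so no checker is shared across threads. The trade-off is memory: each thread holds a full instance.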
Accuracy
How accurate is mySpellChecker?
Accuracy depends on the validation level and corpus quality:
| Level | Precision | Recall | F1 |
|---|---|---|---|
| Syllable | 95% | 90% | 92% |
| Word | 92% | 95% | 93% |
| Word + Context Checker | 88% | 98% | 93% |
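The F1 column is the harmonic mean of precision and recall, F1 = 2PR / (P + R). The short sketch below recomputes it from the precision and recall figures in the table; the function name `f1_score` is just an illustrative choice.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs copied from the table above.
levels = {
    "Syllable": (0.95, 0.90),
    "Word": (0.92, 0.95),
    "Word + Context Checker": (0.88, 0.98),
}

for level, (p, r) in levels.items():
    print(f"{level}: F1 = {f1_score(p, r):.0%}")
```

This makes the trade-off in the last row visible: enabling the context checker raises recall but lowers precision, so the rounded F1 stays at 93%.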
Why does it mark valid words as errors?
Common reasons:
- Word not in dictionary: Add to custom dictionary
- Rare spelling variant: Check corpus coverage
- Foreign word: Myanmar text with English/Pali words
- Proper noun: Names often flagged as unknown
Why doesn’t it catch obvious errors?
Common reasons:
- Real-word error: The misspelling is a valid word (enable use_context_checker=True)
- Validation level too low: Increase to word level
- Missing grammar rules: Some patterns not covered
How can I improve accuracy?
- Build from a larger corpus: More data = better suggestions
- Enable context validation: Catches real-word errors
- Use semantic checking: Requires AI extras
- Report false positives/negatives: Help improve the library
Dictionary Building
How do I build a dictionary?
What format should my corpus be?
Plain text with Myanmar content:
How big should my corpus be?
| Corpus Size | Coverage | Recommendation |
|---|---|---|
| < 100KB | Basic | Testing only |
| 100KB - 1MB | Good | Personal use |
| 1MB - 10MB | Very Good | Production |
| > 10MB | Excellent | Professional |
Can I use multiple corpora?
Yes, use incremental building:
Integration
Can I use mySpellChecker in a web application?
Yes, see the Integration Guide for FastAPI, Flask, and Django examples. The library provides check_async() for non-blocking use in async frameworks.
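The signature of check_async() is not shown on this page, so the sketch below does not call it. Instead it shows the general pattern that async frameworks rely on: pushing a blocking call onto a worker thread with the standard library's `asyncio.to_thread` so the event loop stays responsive. `blocking_check` is a hypothetical stand-in for a synchronous spell-check call.

```python
import asyncio


def blocking_check(text):
    """Hypothetical stand-in for a synchronous, CPU- or I/O-bound check."""
    return {"text": text, "errors": []}


async def check_in_thread(text):
    # Run the blocking function in a worker thread; the event loop keeps
    # serving other requests while it runs.
    return await asyncio.to_thread(blocking_check, text)


async def main():
    # Several checks can be awaited concurrently.
    return await asyncio.gather(
        check_in_thread("first request"),
        check_in_thread("second request"),
    )


if __name__ == "__main__":
    print(asyncio.run(main()))
```

If the library's own check_async() fits your framework, prefer it; this pattern is only a fallback for wrapping synchronous code.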
Is there a REST API?
The library doesn’t include a built-in API server, but you can easily create one:
Can I use it with VS Code?
A VS Code extension is planned for future development. Currently, you can:
- Use the CLI for manual checking
- Create a custom script that integrates with your editor
Does it work with Django?
Yes, see the Integration Guide for the Django service pattern and DRF examples.
Troubleshooting
See Also
- Troubleshooting Guide - Common issues and solutions
- Error Codes - Error code reference
- Configuration - Configuration options
Contributing
How can I contribute?
- Report bugs via GitHub issues
- Submit pull requests for fixes
- Improve documentation
- Share your custom dictionaries or corpora
Where is the roadmap?
Check the GitHub repository for planned features and roadmap.
How do I report a bug?
Open a GitHub issue with:
- Python version
- mySpellChecker version
- Minimal reproduction code
- Expected vs actual behavior