Frequently Asked Questions

Answers to the most common questions about mySpellChecker, from setup and configuration to performance optimization and web framework integration.

General Questions

What is mySpellChecker?

mySpellChecker is a high-performance spell checker specifically designed for Myanmar (Burmese) language text. Unlike English spell checkers that rely on whitespace to identify words, mySpellChecker uses a “Syllable-First Architecture” that validates text at the syllable level first, then progressively validates words and context.
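
To make the syllable-first idea concrete, the sketch below splits Myanmar text into syllables with a common regex heuristic: a new syllable starts at a consonant that is neither stacked under a virama nor closed by an asat. It is an illustration only. `split_syllables` is a hypothetical helper, not mySpellChecker's own segmenter, and it does not treat independent vowels, digits, or punctuation specially.

```python
import re

# A syllable starts at a consonant (U+1000..U+1021) that is
#   - not preceded by virama U+1039 (so it is not a stacked consonant), and
#   - not followed by asat U+103A or virama U+1039 (so it is not a final consonant).
_SYLLABLE_START = re.compile(r"(?<!\u1039)[\u1000-\u1021](?![\u103a\u1039])")


def split_syllables(text: str) -> list[str]:
    """Split Myanmar text into syllables using a simplified heuristic."""
    starts = [m.start() for m in _SYLLABLE_START.finditer(text)]
    if not starts or starts[0] != 0:
        starts.insert(0, 0)  # keep any leading text that is not a consonant
    bounds = starts + [len(text)]
    # Slice between consecutive start positions, skipping empty pieces.
    return [text[a:b] for a, b in zip(bounds, bounds[1:]) if a < b]
```

For the word မြန်မာစာ, this heuristic yields three syllables: မြန်, မာ, and စာ. A syllable-level checker can then validate each piece on its own before any word segmentation is attempted.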

Why was mySpellChecker created?

Myanmar script presents unique challenges for spell checking:

No spaces between words
Complex syllable structures
Multiple valid Unicode representations
Legacy Zawgyi encoding still in use

Existing tools either don’t support Myanmar or don’t handle these challenges well.

Is mySpellChecker open source?

Yes, mySpellChecker is open source and available under the MIT license.

What Python versions are supported?

mySpellChecker supports Python 3.10 and later.

Installation

Why does installation take a long time?

mySpellChecker includes Cython extensions that are compiled during installation. Compiling them needs a C++ compiler and adds to the install time. If no compiler is available, the library falls back to pure Python implementations, which are slower but fully functional.

Do I need a C++ compiler?

No, it’s optional but recommended. Without a compiler:

Installation is faster
Pure Python fallbacks are used
Performance is lower (but still functional)

With a compiler:

Installation compiles Cython extensions
Performance is significantly better
OpenMP parallel processing is available (Linux/macOS)

How do I install on macOS without compiler errors?

# Install Xcode command line tools
xcode-select --install

# For OpenMP support (optional)
brew install libomp

# Then install normally
pip install myspellchecker

Can I install without the optional features?

Yes, install the base package only:

pip install myspellchecker

Optional features are installed separately:

# Quote the extras so shells such as zsh do not treat the brackets as a glob pattern
pip install "myspellchecker[ai]"            # Semantic checking
pip install "myspellchecker[transformers]"  # Transformer POS tagger

Usage

How do I check Myanmar text?

from myspellchecker import SpellChecker

checker = SpellChecker()
result = checker.check("မြန်မာစာ")

if result.has_errors:
    for error in result.errors:
        print(error.text, error.suggestions)
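
If you want the errors as display-ready strings rather than printed tuples, a small formatter can help. The sketch below relies only on the attributes used above (`has_errors`, `errors`, `text`, `suggestions`) and assumes `suggestions` is an iterable of strings. `format_errors` is a hypothetical helper, and the demo uses a stand-in object instead of a real result.

```python
from types import SimpleNamespace


def format_errors(result) -> list[str]:
    """Return one line per error in the form '<text> -> <s1>, <s2>, ...'."""
    if not result.has_errors:
        return []
    lines = []
    for error in result.errors:
        suggestions = ", ".join(str(s) for s in error.suggestions)
        lines.append(f"{error.text} -> {suggestions or '(no suggestions)'}")
    return lines


if __name__ == "__main__":
    # Stand-in with the same attribute names as the result object above.
    fake = SimpleNamespace(
        has_errors=True,
        errors=[SimpleNamespace(text="abc", suggestions=["abd", "abe"])],
    )
    print("\n".join(format_errors(fake)))
```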

What is the difference between validation levels?

| Level | Coverage | Speed | Use Case |
|-------|----------|-------|----------|
| `syllable` | 90% of errors | ~10ms | Quick validation |
| `word` | 95% of errors | ~50ms | Standard checking |

Note: Context checking is not a separate validation level. Enable it with the use_context_checker=True parameter.

How do I handle Zawgyi text?

from myspellchecker import SpellChecker
from myspellchecker.core.config import SpellCheckerConfig
from myspellchecker.core.config.validation_configs import ValidationConfig

# Zawgyi detection and conversion are enabled by default
# To explicitly configure:
config = SpellCheckerConfig(
    validation=ValidationConfig(
        use_zawgyi_detection=True,  # Detect Zawgyi encoding
        use_zawgyi_conversion=True,  # Convert to Unicode automatically
    )
)
checker = SpellChecker(config=config)

Can I use a custom dictionary?

Yes, you can build a custom dictionary from your own corpus:

myspellchecker build --input my_corpus.txt --output my_dict.db

Then use it:

from myspellchecker import SpellChecker
from myspellchecker.providers import SQLiteProvider

provider = SQLiteProvider(database_path="my_dict.db")
checker = SpellChecker(provider=provider)
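
Before pointing a provider at a freshly built file, it can be useful to confirm that the build actually produced a non-empty SQLite database. The sketch below uses only the standard library and assumes the built dictionary is an ordinary SQLite file, as the provider name suggests. The table layout is not described here, so `list_tables` (a hypothetical helper) simply reports whatever tables exist.

```python
import sqlite3
from pathlib import Path


def list_tables(database_path: str) -> list[str]:
    """Return the table names in an SQLite file, sorted alphabetically."""
    # Open read-only so the inspection can never modify the dictionary.
    uri = Path(database_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]
```

An empty list from `list_tables("my_dict.db")` would suggest the build did not write any data.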

How do I add words to the dictionary at runtime?

Currently, the dictionary is read-only at runtime. To add words:

Add them to your corpus file
Rebuild the dictionary
Restart your application

Runtime dictionary modification is planned for a future release.
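
Until then, one possible workaround is to keep an allowlist in your own application and drop any reported error whose text is on it. This is application-side filtering, not a library feature. `filter_allowed` is a hypothetical helper, and it assumes each error exposes the flagged string as `text`, as in the usage example.

```python
from types import SimpleNamespace


def filter_allowed(errors, allowlist: set[str]) -> list:
    """Return only the errors whose flagged text is not in the allowlist."""
    return [error for error in errors if error.text not in allowlist]


if __name__ == "__main__":
    # Stand-in error objects; real ones would come from result.errors.
    errors = [SimpleNamespace(text="Yangon"), SimpleNamespace(text="tset")]
    remaining = filter_allowed(errors, allowlist={"Yangon"})
    print([error.text for error in remaining])
```

The allowlist can be updated while the application runs, which avoids a rebuild and restart for names and other terms you know are valid.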

Performance

How can I make spell checking faster?

Use syllable-level validation only (fastest):

from myspellchecker import SpellChecker
from myspellchecker.core.constants import ValidationLevel

checker = SpellChecker()
result = checker.check(text, level=ValidationLevel.SYLLABLE)

Disable context checking:

from myspellchecker.core.config import SpellCheckerConfig

config = SpellCheckerConfig(use_context_checker=False)
checker = SpellChecker(config=config)

Use batch processing:

results = checker.check_batch(texts)  # More efficient than individual calls

Ensure Cython is compiled:
```
python setup.py build_ext --inplace
```
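
For the batch-processing tip above, very large inputs can be fed through `check_batch` in fixed-size chunks so that the whole input never has to sit in one list. The sketch assumes `check_batch` accepts a list of strings and returns one result per input in the same order. `check_in_chunks` and the default chunk size are illustrative choices, not part of the library.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator


def check_in_chunks(
    check_batch: Callable[[list[str]], list],
    texts: Iterable[str],
    chunk_size: int = 256,
) -> Iterator:
    """Yield results for `texts`, calling `check_batch` once per chunk."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    iterator = iter(texts)
    # islice pulls at most chunk_size items; an empty list ends the loop.
    while chunk := list(islice(iterator, chunk_size)):
        yield from check_batch(chunk)
```

With a real checker this would be called as `check_in_chunks(checker.check_batch, texts)`.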

Why is the first call slow?

The first call initializes the dictionary and loads models into memory. Subsequent calls are much faster. Consider warming up the checker:

checker = SpellChecker()
checker.check("test")  # Warm-up call
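
To see how large the first-call cost is in your environment, you can time several consecutive calls. The sketch below takes any callable, so it works with `checker.check`. `time_calls` is a hypothetical helper built on the standard library's `time.perf_counter`.

```python
import time
from typing import Callable


def time_calls(check: Callable[[str], object], text: str, repeats: int = 5) -> list[float]:
    """Return the wall-clock duration in seconds of each of `repeats` calls."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        check(text)
        durations.append(time.perf_counter() - start)
    return durations
```

Calling `time_calls(checker.check, "မြန်မာစာ")` on a fresh instance should show the first entry dominating the rest if initialization is the cause of the delay.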

How much memory does mySpellChecker use?

| Configuration | Memory Usage |
|---------------|--------------|
| Basic (SQLite provider) | ~50MB |
| Memory provider | ~200MB |
| With semantic model | ~500MB |
| With transformer POS | ~1GB |

Can I use mySpellChecker in a multi-threaded application?

The SpellChecker instance is not thread-safe by default. For multi-threaded use:

Create separate instances per thread
Or use a connection pool for the SQLite provider
Or use the Memory provider (which is thread-safe for reads)
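
The first option, one instance per thread, can be sketched with `threading.local`. `PerThreadChecker` is a hypothetical wrapper, not part of the library. It takes a factory so that each thread lazily builds its own instance the first time it asks for one.

```python
import threading
from typing import Callable


class PerThreadChecker:
    """Lazily create one checker per thread via a caller-supplied factory."""

    def __init__(self, factory: Callable[[], object]):
        self._factory = factory
        self._local = threading.local()  # attributes are private to each thread

    def get(self):
        checker = getattr(self._local, "checker", None)
        if checker is None:
            checker = self._factory()
            self._local.checker = checker
        return checker
```

In an application this would look like `checkers = PerThreadChecker(SpellChecker)` at startup and `checkers.get().check(text)` inside each worker. Memory use grows with the number of threads, since every thread holds its own instance.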

Accuracy

How accurate is mySpellChecker?

Accuracy depends on the validation level and corpus quality:

| Level | Precision | Recall | F1 |
|-------|-----------|--------|----|
| Syllable | 95% | 90% | 92% |
| Word | 92% | 95% | 93% |
| Word + Context Checker | 88% | 98% | 93% |
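
The F1 column is the harmonic mean of precision and recall, F1 = 2PR / (P + R). The short sketch below recomputes it from the precision and recall figures in the table. The unrounded values come out at about 92.4%, 93.5%, and 92.7%, which round to the listed 92%, 93%, and 93%.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    rows = {
        "Syllable": (0.95, 0.90),
        "Word": (0.92, 0.95),
        "Word + Context Checker": (0.88, 0.98),
    }
    for level, (precision, recall) in rows.items():
        print(f"{level}: F1 = {f1_score(precision, recall):.1%}")
```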

Why does it mark valid words as errors?

Common reasons:

Word not in dictionary: Add to custom dictionary
Rare spelling variant: Check corpus coverage
Foreign word: Myanmar text with English/Pali words
Proper noun: Names often flagged as unknown

Why doesn’t it catch obvious errors?

Common reasons:

Real-word error: The misspelling is a valid word (enable use_context_checker=True)
Validation level too low: Increase to word level
Missing grammar rules: Some patterns not covered

How can I improve accuracy?

Build from a larger corpus: More data = better suggestions
Enable context validation: Catches real-word errors
Use semantic checking: Requires AI extras
Report false positives/negatives: Help improve the library

Dictionary Building

How do I build a dictionary?

# From a text file
myspellchecker build --input corpus.txt --output mydict.db

# With POS tagging
myspellchecker build --input corpus.txt --pos-tagger viterbi

What format should my corpus be?

Plain text with Myanmar content:

မြန်မာနိုင်ငံ
ကျေးဇူးတင်ပါသည်
...

Or structured formats (CSV, JSON) with specific columns.
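
If your raw data mixes Myanmar text with other material, you may want to filter it into a plain-text corpus first. The sketch below keeps only lines containing at least one character from the Unicode Myanmar block (U+1000 to U+109F). Whether the builder needs such pre-filtering is not stated here, so treat `myanmar_lines` as an optional, hypothetical preprocessing step.

```python
from typing import Iterable, Iterator


def myanmar_lines(lines: Iterable[str]) -> Iterator[str]:
    """Yield stripped lines that contain at least one Myanmar-block character."""
    for line in lines:
        line = line.strip()
        if any("\u1000" <= ch <= "\u109f" for ch in line):
            yield line
```

Because it accepts any iterable of strings, it can be applied directly to an open file object and the output written one line at a time.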

How big should my corpus be?

| Corpus Size | Coverage | Recommendation |
|-------------|----------|----------------|
| < 100KB | Basic | Testing only |
| 100KB - 1MB | Good | Personal use |
| 1MB - 10MB | Very Good | Production |
| > 10MB | Excellent | Professional |

Can I use multiple corpora?

Yes, use incremental building:

myspellchecker build --input corpus1.txt --output dict.db
myspellchecker build --input corpus2.txt --output dict.db --incremental
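
With more than two corpora, the same pattern can be scripted. The sketch below only assembles the command lines, using the flags shown above: the first run creates the database and every later run adds `--incremental`. `build_commands` is a hypothetical helper, and it does not execute anything itself.

```python
import shlex


def build_commands(corpora: list[str], output: str) -> list[list[str]]:
    """Return one `myspellchecker build` argument list per corpus file."""
    commands = []
    for index, corpus in enumerate(corpora):
        command = ["myspellchecker", "build", "--input", corpus, "--output", output]
        if index > 0:
            # Every run after the first targets the existing database.
            command.append("--incremental")
        commands.append(command)
    return commands


if __name__ == "__main__":
    for command in build_commands(["corpus1.txt", "corpus2.txt"], "dict.db"):
        print(shlex.join(command))
```

Each argument list can be passed to `subprocess.run(command, check=True)` to run the builds in order and stop at the first failure.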

Integration

Can I use mySpellChecker in a web application?

Yes, see the Integration Guide for FastAPI, Flask, and Django examples. The library provides check_async() for non-blocking use in async frameworks.

Is there a REST API?

The library doesn’t include a built-in API server, but you can easily create one:

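from flask import Flask, request, jsonify
from myspellchecker import SpellChecker

app = Flask(__name__)
# Shared instance; see the multi-threading question above if your server uses threads
checker = SpellChecker()

@app.route('/check', methods=['POST'])
def check():
    # Expect a JSON object such as {"text": "..."}; reject anything else with 400
    payload = request.get_json(silent=True)
    text = payload.get('text') if isinstance(payload, dict) else None
    if not isinstance(text, str):
        return jsonify({'error': 'Expected a JSON object with a string "text" field'}), 400
    result = checker.check(text)
    return jsonify(result.to_dict())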

Can I use it with VS Code?

A VS Code extension is planned but not yet available. In the meantime, you can:

Use the CLI for manual checking
Create a custom script that integrates with your editor

Does it work with Django?

Yes, see the Integration Guide for Django service pattern and DRF examples.

Troubleshooting

Contributing

How can I contribute?

Report bugs via GitHub issues
Submit pull requests for fixes
Improve documentation
Share your custom dictionaries or corpora

Where is the roadmap?

Check the GitHub repository for planned features and roadmap.

How do I report a bug?

Open a GitHub issue with:

Python version
mySpellChecker version
Minimal reproduction code
Expected vs actual behavior
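
The first two items can be collected with the standard library. The sketch below assumes the installed distribution is named `myspellchecker`, matching the `pip install` command. `bug_report_header` is a hypothetical helper for pasting version details into an issue.

```python
import platform
from importlib import metadata


def bug_report_header(package: str = "myspellchecker") -> str:
    """Return Python, platform, and package version lines for a bug report."""
    try:
        version = metadata.version(package)
    except metadata.PackageNotFoundError:
        version = "not installed"
    return (
        f"Python {platform.python_version()} on {platform.platform()}\n"
        f"{package} {version}"
    )


if __name__ == "__main__":
    print(bug_report_header())
```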
