Each DictionaryProvider implementation supports a different subset of features depending on its underlying storage mechanism. Use the matrix below to choose the right provider for your use case.

Overview

The DictionaryProvider abstract base class defines the interface for dictionary data storage and retrieval. Every provider listed here implements that interface; they differ only in which optional methods return real data.

Capability Matrix

| Method | SQLiteProvider | MemoryProvider | JSONProvider | CSVProvider |
|---|---|---|---|---|
| **Core Validation** | | | | |
| is_valid_syllable() | ✅ Full | ✅ Full | ✅ Full | ✅ Full |
| is_valid_word() | ✅ Full | ✅ Full | ✅ Full | ✅ Full |
| has_syllable() | ✅ Full | ✅ Full | ✅ Full | ✅ Full |
| has_word() | ✅ Full | ✅ Full | ✅ Full | ✅ Full |
| **Frequency Data** | | | | |
| get_syllable_frequency() | ✅ Full | ✅ Full | ✅ Full | ✅ Full |
| get_word_frequency() | ✅ Full | ✅ Full | ✅ Full | ✅ Full |
| **Part-of-Speech** | | | | |
| get_word_pos() | ✅ Full | ✅ Full | ⚠️ Optional | ⚠️ Optional |
| get_pos_unigram_probabilities() | ✅ Full | ✅ Full | ❌ Not Supported | ❌ Not Supported |
| get_pos_bigram_probabilities() | ✅ Full | ✅ Full | ❌ Not Supported | ❌ Not Supported |
| get_pos_trigram_probabilities() | ✅ Full | ✅ Full | ❌ Not Supported | ❌ Not Supported |
| **N-gram Context** | | | | |
| get_bigram_probability() | ✅ Full | ✅ Full | ❌ Not Supported | ❌ Not Supported |
| get_trigram_probability() | ✅ Full | ✅ Full | ❌ Not Supported | ❌ Not Supported |
| get_top_continuations() | ✅ Full | ✅ Full | ❌ Not Supported | ❌ Not Supported |
| **Morphology** | | | | |
| get_word_syllable_count() | ❌ Not Supported | ❌ Not Supported | ✅ If provided | ✅ If provided |
| **Iteration** | | | | |
| get_all_syllables() | ✅ Full | ✅ Full | ✅ Full | ✅ Full |
| get_all_words() | ✅ Full | ✅ Full | ✅ Full | ✅ Full |
| **Bulk Operations** | | | | |
| is_valid_syllables_bulk() | ✅ Optimized | ✅ Optimized | Default | Default |
| is_valid_words_bulk() | ✅ Optimized | ✅ Optimized | Default | Default |
| get_syllable_frequencies_bulk() | ✅ Optimized | ✅ Optimized | Default | Default |
| get_word_frequencies_bulk() | ✅ Optimized | ✅ Optimized | Default | Default |
| get_word_pos_bulk() | ✅ Optimized | ✅ Optimized | Default | Default |

Legend

  • ✅ Full: Fully implemented with optimized performance
  • ⚠️ Optional: Supported if data is provided during initialization
  • ✅ If provided: Returns a value only when the source data includes it
  • ❌ Not Supported: Returns a safe default value (0.0, None, or an empty list or dict)
  • Default: Uses the default implementation, which loops over the single-item method
  • ✅ Optimized: Uses batch queries for better performance
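
The difference between "Default" and "Optimized" bulk operations is easier to see in code. The sketch below is a self-contained stand-in, not the library's implementation: the class name BulkFallbackSketch is hypothetical, and the dict return type of the bulk method is an assumption made for illustration.

```python
from typing import Dict, Iterable


class BulkFallbackSketch:
    """Hypothetical stand-in that mimics the "Default" bulk strategy."""

    def __init__(self, words: Dict[str, int]) -> None:
        self._words = words

    def is_valid_word(self, word: str) -> bool:
        return word in self._words

    def is_valid_words_bulk(self, words: Iterable[str]) -> Dict[str, bool]:
        # "Default" strategy: one single-item call per input word.
        # An "Optimized" provider would answer the whole batch in one query.
        # The dict return shape is an assumption, not the library's signature.
        return {word: self.is_valid_word(word) for word in words}


provider = BulkFallbackSketch({"မြန်မာ": 800})
result = provider.is_valid_words_bulk(["မြန်မာ", "xyz"])
```

With a loop like this, cost grows with one lookup per item, which is why the optimized providers are the better fit for large batches.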

Provider Selection Guide

SQLiteProvider (Best for Production)

Best for:
  • Production deployments
  • Large dictionaries (100K+ entries)
  • Concurrent access from multiple threads
  • Full spell checking with context validation
Features:
  • Disk-based storage with memory-mapped I/O
  • Connection pooling for thread safety
  • Optimized batch queries
  • Full N-gram and POS support
from myspellchecker.providers import SQLiteProvider

provider = SQLiteProvider(
    database_path="myspell.db",
    cache_size=2048,
    pool_max_size=5,
)

MemoryProvider (Best for Testing/Development)

Best for:
  • Unit testing with controlled data
  • Development and debugging
  • Small dictionaries
  • Maximum performance (no I/O)
Features:
  • In-memory storage
  • Fast initialization
  • Full N-gram and POS support
  • No disk dependencies
from myspellchecker.providers import MemoryProvider

provider = MemoryProvider(
    syllables={"မြန်": 1000, "မာ": 500},
    words={"မြန်မာ": 800},
    bigrams={("မြန်", "မာ"): 0.5},
)

JSONProvider (For Simple Use Cases)

Best for:
  • Simple dictionary files
  • Human-readable configuration
  • Small datasets
  • Testing with external data
Limitations:
  • No N-gram support
  • No POS probability support
  • Not optimized for large datasets
from myspellchecker.providers import JSONProvider

provider = JSONProvider(json_path="dictionary.json")

CSVProvider (For Data Import)

Best for:
  • Importing from spreadsheets
  • Simple frequency lists
  • Data migration
Limitations:
  • Same limitations as JSONProvider (no N-gram or POS probability support, not optimized for large datasets)
  • Column-based format only
from myspellchecker.providers import CSVProvider

provider = CSVProvider(csv_path="dictionary.csv")
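
The selection guidance above can also be expressed as a small lookup. The sketch below transcribes three rows of the capability matrix into a plain dict; the SUPPORT table and the providers_supporting helper are hypothetical names for illustration and are not part of the library.

```python
from typing import Dict, List

# Support levels copied from three rows of the capability matrix.
SUPPORT: Dict[str, Dict[str, str]] = {
    "get_word_pos": {
        "SQLiteProvider": "full",
        "MemoryProvider": "full",
        "JSONProvider": "optional",
        "CSVProvider": "optional",
    },
    "get_bigram_probability": {
        "SQLiteProvider": "full",
        "MemoryProvider": "full",
        "JSONProvider": "none",
        "CSVProvider": "none",
    },
    "get_word_syllable_count": {
        "SQLiteProvider": "none",
        "MemoryProvider": "none",
        "JSONProvider": "if provided",
        "CSVProvider": "if provided",
    },
}


def providers_supporting(method: str) -> List[str]:
    """Return the providers whose matrix entry is anything but "none"."""
    return [name for name, level in SUPPORT[method].items() if level != "none"]


ngram_capable = providers_supporting("get_bigram_probability")
```

Here ngram_capable contains only SQLiteProvider and MemoryProvider, matching the N-gram Context rows of the matrix.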

Method Behavior When Not Supported

When a method is not supported by a provider, it returns a safe default value:
| Method | Default Return Value |
|---|---|
| get_bigram_probability() | 0.0 |
| get_trigram_probability() | 0.0 |
| get_top_continuations() | [] (empty list) |
| get_word_pos() | None |
| get_word_syllable_count() | None |
| get_pos_*_probabilities() | {} (empty dict) |
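
Because these defaults are ordinary values rather than exceptions, calling code can handle a provider without N-gram data in the same branch as a word pair that was never seen. The sketch below uses a hypothetical stub, NoNgramProvider, and assumes that get_bigram_probability() takes the previous word and the current word as arguments; check base.py for the actual signature.

```python
from typing import List, Optional


class NoNgramProvider:
    """Hypothetical stub that returns the documented defaults."""

    def get_bigram_probability(self, prev_word: str, word: str) -> float:
        return 0.0  # documented default when N-grams are not supported

    def get_top_continuations(self, prev_word: str) -> List[str]:
        return []  # documented default: empty list

    def get_word_pos(self, word: str) -> Optional[str]:
        return None  # documented default: no POS tag available


def describe_context(provider, prev_word: str, word: str) -> str:
    """Summarize the context evidence a provider has for a word pair."""
    probability = provider.get_bigram_probability(prev_word, word)
    if probability == 0.0:
        # 0.0 can mean "method not supported" or "pair not in the data";
        # in both cases there is nothing to base a context check on.
        return "no context evidence"
    return f"bigram probability {probability:.2f}"


message = describe_context(NoNgramProvider(), "မြန်", "မာ")
```

The same pattern applies to the other defaults: test for None, an empty list, or an empty dict before using the result.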

Creating Custom Providers

To create a custom provider, extend DictionaryProvider and implement all abstract methods:
from myspellchecker.providers import DictionaryProvider

class MyCustomProvider(DictionaryProvider):
    def is_valid_syllable(self, syllable: str) -> bool:
        # Replace with a lookup against your own syllable store
        raise NotImplementedError

    def is_valid_word(self, word: str) -> bool:
        # Replace with a lookup against your own word store
        raise NotImplementedError

    # ... implement all remaining abstract methods

See src/myspellchecker/providers/base.py for the complete interface definition.
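
The requirement to implement every abstract method is enforced by Python itself when the base class is built on abc.ABC. The sketch below demonstrates this with a hypothetical two-method stand-in, ProviderBaseSketch, so that it runs without the library installed; the real DictionaryProvider declares more abstract methods than shown here.

```python
from abc import ABC, abstractmethod
from typing import Iterable, Optional


class ProviderBaseSketch(ABC):
    """Hypothetical two-method stand-in for DictionaryProvider."""

    @abstractmethod
    def is_valid_syllable(self, syllable: str) -> bool: ...

    @abstractmethod
    def is_valid_word(self, word: str) -> bool: ...


class IncompleteProvider(ProviderBaseSketch):
    # is_valid_word() is missing, so this class stays abstract.
    def is_valid_syllable(self, syllable: str) -> bool:
        return bool(syllable)


class SetBackedProvider(ProviderBaseSketch):
    """Complete provider backed by two in-memory sets."""

    def __init__(self, syllables: Iterable[str], words: Iterable[str]) -> None:
        self._syllables = set(syllables)
        self._words = set(words)

    def is_valid_syllable(self, syllable: str) -> bool:
        return syllable in self._syllables

    def is_valid_word(self, word: str) -> bool:
        return word in self._words


incomplete_error: Optional[TypeError] = None
try:
    IncompleteProvider()  # instantiating an abstract class raises TypeError
except TypeError as exc:
    incomplete_error = exc

provider = SetBackedProvider({"မြန်", "မာ"}, {"မြန်မာ"})
```

Instantiating IncompleteProvider fails immediately, which surfaces a missing method at construction time rather than at the first call.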