Quick Start
StreamingConfig
| Option | Default | Description |
|---|---|---|
| `chunk_size` | 100 | Lines processed per chunk |
| `max_memory_mb` | 100 | Max memory (MB) before backpressure |
| `sentence_boundary_pattern` | `r"[။!?]+"` | Regex for sentence boundaries |
| `enable_cross_sentence_context` | `True` | Enable cross-sentence context |
| `progress_interval` | 1000 | Lines between progress callbacks |
| `timeout_per_chunk` | 30.0 | Async processing timeout per chunk (seconds) |
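The options above map naturally onto a config object. A minimal sketch, assuming a dataclass layout (the field names and defaults come from the table; the dataclass shape and the `boundary_regex` helper are illustrative assumptions):

```python
import re
from dataclasses import dataclass


@dataclass
class StreamingConfig:
    # Defaults mirror the table above; the class layout is an assumption.
    chunk_size: int = 100                        # lines processed per chunk
    max_memory_mb: int = 100                     # memory ceiling before backpressure
    sentence_boundary_pattern: str = r"[။!?]+"   # Myanmar section mark, '!', '?'
    enable_cross_sentence_context: bool = True
    progress_interval: int = 1000                # lines between progress callbacks
    timeout_per_chunk: float = 30.0              # async timeout, seconds

    def boundary_regex(self):
        """Compile the sentence-boundary pattern once for reuse."""
        return re.compile(self.sentence_boundary_pattern)


config = StreamingConfig(chunk_size=500)  # e.g. throughput-oriented preset
```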
Configuration Presets
Synchronous Streaming
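Synchronous streaming boils down to grouping an input iterator into fixed-size chunks and checking one chunk at a time, so memory stays bounded. A stdlib-only sketch of that pattern (the `checker.check_lines` call in the usage comment is a hypothetical name, not a confirmed API):

```python
from typing import Iterable, Iterator, List


def stream_chunks(lines: Iterable[str], chunk_size: int = 100) -> Iterator[List[str]]:
    """Group an iterable of lines into fixed-size chunks without ever
    materializing the whole input in memory."""
    chunk: List[str] = []
    for line in lines:
        chunk.append(line)
        if len(chunk) >= chunk_size:
            yield chunk
            chunk = []
    if chunk:               # flush the final partial chunk
        yield chunk


# Hypothetical usage against a file handle (itself a lazy line iterator):
# with open("input.txt", encoding="utf-8") as handle:
#     for chunk in stream_chunks(handle, chunk_size=100):
#         results = checker.check_lines(chunk)   # assumed method name
```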
Processing Files
Processing Different Input Types
Sentence-by-Sentence Processing
Process text by sentences with context preservation:

Cross-Sentence Context

When `enable_cross_sentence_context=True`, the checker validates using context from the previous sentence:
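A sketch of how sentence splitting with one sentence of trailing context can work, using the default boundary pattern from the options table (the function name and pair-yielding shape are illustrative assumptions):

```python
import re
from typing import Iterator, Optional, Tuple

# Default sentence_boundary_pattern from the options table.
BOUNDARY = re.compile(r"[။!?]+")


def sentences_with_context(text: str) -> Iterator[Tuple[Optional[str], str]]:
    """Yield (previous_sentence, sentence) pairs so each sentence can be
    checked with the preceding sentence as context. The first sentence
    has no predecessor, so its context is None."""
    sentences = [s.strip() for s in BOUNDARY.split(text) if s.strip()]
    prev: Optional[str] = None
    for sent in sentences:
        yield prev, sent
        prev = sent
```

With context disabled, the checker would simply ignore the first element of each pair.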
Async Streaming
Basic Async
Async with Timeout
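Bounding each chunk's processing time maps directly onto `asyncio.wait_for`, mirroring the `timeout_per_chunk` option. A minimal sketch; `check_chunk` is a stand-in for the real async checker:

```python
import asyncio
from typing import List


async def check_chunk(chunk: List[str]) -> List[str]:
    """Stand-in for the real async checker call; assumed for this sketch."""
    await asyncio.sleep(0)       # yield control to the event loop
    return chunk                 # pretend every line passed


async def check_with_timeout(chunk: List[str],
                             timeout_per_chunk: float = 30.0) -> List[str]:
    """Raise asyncio.TimeoutError if one chunk takes longer than the limit."""
    return await asyncio.wait_for(check_chunk(chunk), timeout=timeout_per_chunk)


results = asyncio.run(check_with_timeout(["some line"]))
```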
Cancellation
Graceful Shutdown
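Cancellation and graceful shutdown follow the standard asyncio pattern: drain pending work with `Queue.join()`, cancel the worker task, and have the worker re-raise `CancelledError` after any cleanup. A self-contained sketch (the queue-based worker shape is an assumption, not the library's confirmed design):

```python
import asyncio
from typing import List


async def stream_worker(queue: asyncio.Queue, results: List[int]) -> None:
    """Drain chunks from the queue; on cancellation, re-raise after any
    cleanup so shutdown stays graceful."""
    try:
        while True:
            chunk = await queue.get()
            results.append(len(chunk))   # real checking would happen here
            queue.task_done()
    except asyncio.CancelledError:
        # Flush partial state here if needed, then propagate.
        raise


async def run_demo() -> List[int]:
    results: List[int] = []
    queue: asyncio.Queue = asyncio.Queue()
    for chunk in (["a"], ["b", "c"]):
        queue.put_nowait(chunk)
    worker = asyncio.create_task(stream_worker(queue, results))
    await queue.join()          # wait for all queued chunks to be processed
    worker.cancel()             # then request shutdown
    try:
        await worker
    except asyncio.CancelledError:
        pass                    # expected during graceful shutdown
    return results


processed = asyncio.run(run_demo())
```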
Progress Tracking
Progress Callbacks
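A sketch of the callback mechanism the `progress_interval` option describes: the callback fires once per interval of processed lines. The function and parameter names here are illustrative, not the library's confirmed signatures:

```python
from typing import Callable, Iterable


def process_lines(lines: Iterable[str],
                  on_progress: Callable[[int], None],
                  progress_interval: int = 1000) -> int:
    """Call `on_progress(lines_done)` every `progress_interval` lines
    and return the total number of lines processed."""
    done = 0
    for _line in lines:
        done += 1               # real checking would happen here
        if done % progress_interval == 0:
            on_progress(done)
    return done


ticks = []
total = process_lines((f"line{i}" for i in range(2500)),
                      ticks.append, progress_interval=1000)
```

The same hook slots straight into tqdm or Rich: the callback just updates the bar instead of appending to a list.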
With tqdm Progress Bar
With Rich Progress Bar
StreamingStats
ChunkResult
Each iteration yields a `ChunkResult`:
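The concrete fields of `ChunkResult` are not shown in this extract, so the following is only a plausible sketch; every field name below is an assumption, and only the class name comes from the docs:

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ChunkResult:
    # Field names are assumptions for illustration; the real class may differ.
    chunk_index: int                              # position of this chunk in the stream
    lines: List[str]                              # the lines that were checked
    errors: List[str] = field(default_factory=list)  # flagged problems, if any

    @property
    def has_errors(self) -> bool:
        return bool(self.errors)


result = ChunkResult(chunk_index=0, lines=["မင်္ဂလာပါ"])
```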
Memory Management
Backpressure
When memory exceeds `max_memory_mb`, the streaming checker automatically applies backpressure:
- Sync mode: Triggers garbage collection and adds a small delay
- Async mode: Adds an async sleep to allow cleanup
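The sync-mode behavior above can be sketched as a small guard function. Measuring memory via `tracemalloc` is an assumption of this sketch (the library may use a different mechanism); the async variant would `await asyncio.sleep(delay_s)` instead of blocking:

```python
import gc
import time
import tracemalloc


def apply_backpressure_if_needed(max_memory_mb: int = 100,
                                 delay_s: float = 0.05) -> bool:
    """Sync-mode backpressure: when traced memory exceeds the ceiling,
    force a garbage-collection pass and pause briefly. Returns True if
    backpressure was applied."""
    current_bytes, _peak = tracemalloc.get_traced_memory()
    if current_bytes / (1024 * 1024) > max_memory_mb:
        gc.collect()            # reclaim unreachable chunk data
        time.sleep(delay_s)     # small delay so consumers can catch up
        return True
    return False


tracemalloc.start()
triggered = apply_backpressure_if_needed(max_memory_mb=100)
```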
Bounded Memory Usage
Error Handling
Sync Error Recovery
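The recovery pattern for sync streaming is to isolate failures per chunk: catch the exception, record which chunk failed, and continue with the rest of the stream. A self-contained sketch (the "bad line" condition is a stand-in for a real checker failure):

```python
from typing import Iterable, List, Tuple


def check_chunks_with_recovery(
    chunks: Iterable[List[str]],
) -> Tuple[List[List[str]], List[int]]:
    """Process each chunk independently so one bad chunk does not abort
    the whole stream; return (successful chunks, failed chunk indices)."""
    results: List[List[str]] = []
    failed: List[int] = []
    for i, chunk in enumerate(chunks):
        try:
            if any(line is None for line in chunk):   # stand-in for a real failure
                raise ValueError(f"bad line in chunk {i}")
            results.append(chunk)
        except ValueError:
            failed.append(i)     # record the failure and move on
    return results, failed


ok, bad = check_chunks_with_recovery([["ok"], [None], ["also ok"]])
```

The async timeout variant is structurally identical, with `except asyncio.TimeoutError` around each awaited chunk instead.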
Async Timeout Recovery
Best Practices
- Reuse `StreamingChecker` instances — creating a new `SpellChecker` per file is expensive
- Reset context between documents — call `streaming.reset_context()` between unrelated files
- Choose chunk size by use case — 500 for throughput, 1 for real-time
- Use async for I/O-bound workloads — network sources, file I/O with aiofiles
- Set appropriate validation level — `SYLLABLE` for speed, `WORD` for thoroughness
Integration Examples
WebSocket Streaming
FastAPI File Upload
Batch File Processing
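Batch processing over a directory is a thin loop around the streaming pattern: glob the files, stream each one line by line, and collect a per-file report. A stdlib-only sketch; the blank-line heuristic is a stand-in for the real checker, and `check_directory` is a hypothetical name:

```python
from pathlib import Path
from typing import Dict, List


def check_directory(root: Path, pattern: str = "*.txt") -> Dict[str, List[str]]:
    """Stream every matching file through a line-based check and return
    {filename: [flagged issues]}."""
    report: Dict[str, List[str]] = {}
    for path in sorted(root.glob(pattern)):
        flagged: List[str] = []
        with path.open(encoding="utf-8") as handle:   # lazy, line-by-line read
            for lineno, line in enumerate(handle, start=1):
                if not line.strip():                  # stand-in check: flag blank lines
                    flagged.append(f"{path.name}:{lineno} empty line")
        report[path.name] = flagged
    return report
```

In a real batch run, the per-line check would be the spell checker's streaming call, and the checker instance would be reused across all files as the best practices above recommend.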
See Also
- Batch Processing - `check_batch()` for multiple texts
- Async API - `check_async()` for web frameworks
- Performance Tuning - Optimization strategies