Docker lets you run mySpellChecker without installing Python dependencies locally. The multi-stage Dockerfile provides optimized images for the production API, development, and CLI-only use cases.

Quick Start

# Start the API server (production)
docker compose up api

# Start development server with hot reload
docker compose --profile dev up dev

# Run CLI commands
docker compose --profile cli run --rm cli check "မြန်မာစာ"

# Run tests
docker compose --profile test run --rm test

Using Docker Directly

# Build the image
docker build -t myspellchecker:latest .

# Run API server (use an absolute path for the bind mount; older Docker
# versions reject relative paths with docker run)
docker run -p 8000:8000 -v "$(pwd)/data:/app/data:ro" myspellchecker:latest

# Run CLI
docker run --rm -v "$(pwd)/data:/app/data:ro" myspellchecker:cli --help

Available Docker Images

The Dockerfile uses multi-stage builds to create optimized images for different use cases:
Target       Image Tag              Purpose                              Size
runtime      myspellchecker:latest  Production API server                ~200MB
development  myspellchecker:dev     Development with hot reload + tests  ~350MB
cli          myspellchecker:cli     CLI-only (no web server)             ~150MB
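The stages in the table above could be laid out roughly as follows. This is a hypothetical sketch, not the project's actual Dockerfile; the base image, paths, and dev tooling are assumptions:

```dockerfile
# Shared base: install runtime dependencies once (base image is an assumption)
FROM python:3.11-slim AS base
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# CLI-only image: source code but no web server
FROM base AS cli
COPY src/ /app/src/

# Production API server
FROM base AS runtime
COPY src/ /app/src/
EXPOSE 8000

# Development image: runtime plus test/dev tooling
FROM runtime AS development
RUN pip install --no-cache-dir pytest pytest-cov
```

Building with `--target cli` stops at the `cli` stage, which is why that image skips the web server layers and stays smaller.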

Building Specific Targets

# Build production image
docker build --target runtime -t myspellchecker:latest .

# Build development image
docker build --target development -t myspellchecker:dev .

# Build CLI-only image
docker build --target cli -t myspellchecker:cli .

Docker Compose Services

Production API (api)

docker compose up api
  • Port: 8000
  • Health check: http://localhost:8000/health
  • Resource limits: 2 CPUs, 1GB memory
  • Restart policy: unless-stopped
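The bullets above map onto service settings along these lines in docker-compose.yml (a sketch of how such a service is typically declared; the project's actual file may differ):

```yaml
services:
  api:
    image: myspellchecker:latest
    ports:
      - "8000:8000"
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
      interval: 30s
      timeout: 5s
      retries: 3
    deploy:
      resources:
        limits:
          cpus: "2"
          memory: 1G
```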

Development Server (dev)

docker compose --profile dev up dev
  • Port: 8000
  • Hot reload: Enabled (watches /app/src)
  • Log level: DEBUG
  • Profile: dev (requires --profile dev)
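Hot reload works by bind-mounting the host source tree into the container so edits are picked up without a rebuild. A sketch of how the dev service could be wired (service and mount names follow this document; the rest is assumed):

```yaml
services:
  dev:
    build:
      context: .
      target: development
    ports:
      - "8000:8000"
    volumes:
      - ./src:/app/src   # host code mounted live; the reloader watches this path
    environment:
      - LOG_LEVEL=DEBUG
    profiles: ["dev"]    # hidden unless --profile dev is passed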

GPU-Enabled API (api-gpu)

For transformer-based POS tagging and AI features:
docker compose --profile gpu up api-gpu
  • Requires: NVIDIA GPU with Docker GPU support
  • Resource limits: 4 CPUs, 4GB memory, 1 GPU
  • Environment: CUDA_VISIBLE_DEVICES=0
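GPU access in Compose is declared through a device reservation. A sketch matching the limits listed above (the surrounding service definition is assumed):

```yaml
services:
  api-gpu:
    profiles: ["gpu"]
    environment:
      - CUDA_VISIBLE_DEVICES=0
    deploy:
      resources:
        limits:
          cpus: "4"
          memory: 4G
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
```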

CLI Tool (cli)

# Check text
docker compose --profile cli run --rm cli check "သွားပါမယ်"

# Build dictionary
docker compose --profile cli run --rm cli build --input /app/input/corpus.txt --output /app/output/dictionary.db

# Get help
docker compose --profile cli run --rm cli --help

Test Runner (test)

docker compose --profile test run --rm test
Runs pytest with coverage reporting.

Volume Mounts

Host Path  Container Path  Purpose
./data     /app/data       Dictionary database files
./config   /app/config     Custom configuration
./input    /app/input      Input files for CLI
./output   /app/output     Output files from CLI
./src      /app/src        Source code (dev only)
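In docker-compose.yml these mounts would appear as volume entries, read-only where the container never writes (a sketch; which service carries which mount is an assumption based on the table):

```yaml
services:
  api:
    volumes:
      - ./data:/app/data:ro       # dictionary database (read-only)
      - ./config:/app/config:ro   # custom configuration
  cli:
    volumes:
      - ./input:/app/input:ro     # CLI input files
      - ./output:/app/output      # CLI output; must stay writable
```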

Example: Using Custom Dictionary

# Place your dictionary in ./data/
cp my_dictionary.db ./data/dictionary.db

# Start API with custom dictionary
docker compose up api

Environment Variables

Variable              Default   Description
PYTHONPATH            /app/src  Python module path
LOG_LEVEL             INFO      Logging level (DEBUG, INFO, WARNING, ERROR)
CUDA_VISIBLE_DEVICES  0         GPU device ID (gpu service only)

Custom Environment

# Override environment variables
LOG_LEVEL=DEBUG docker compose up api

# Or use .env file
echo "LOG_LEVEL=DEBUG" > .env
docker compose up api

Production Deployment

Basic Deployment

# Build production image
docker compose build api

# Run in detached mode
docker compose up -d api

# Check status
docker compose ps
docker compose logs api

With Reverse Proxy (nginx)

# docker-compose.override.yml
services:
  nginx:
    image: nginx:alpine
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf:ro
      - ./certs:/etc/nginx/certs:ro
    depends_on:
      - api
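The override above expects an nginx.conf on the host. A minimal sketch that forwards traffic to the API container (the `api` upstream name matches the Compose service, which resolves via Docker's internal DNS; TLS setup is omitted):

```nginx
events {}
http {
  server {
    listen 80;
    location / {
      proxy_pass http://api:8000;   # Compose service name as upstream
      proxy_set_header Host $host;
      proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
  }
}
```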

Health Monitoring

The API includes a health endpoint:
# Check health
curl http://localhost:8000/health

# Docker health check runs automatically every 30s
docker inspect --format='{{.State.Health.Status}}' myspellchecker-api
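In deployment scripts it can be useful to block until the API answers its health check before routing traffic. A small self-contained sketch using only the standard library (the /health route is from this document; the URL, timeout, and interval defaults are assumptions):

```python
# health_probe.py - poll a health endpoint until it responds with HTTP 200
import time
import urllib.error
import urllib.request


def wait_for_healthy(url: str = "http://localhost:8000/health",
                     timeout: float = 60.0, interval: float = 2.0) -> bool:
    """Return True once the endpoint answers 200, False if timeout elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=5) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # server not up yet; fall through and retry
        time.sleep(interval)
    return False
```

A deploy script could call `wait_for_healthy()` right after `docker compose up -d api` and abort the rollout if it returns False.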

Building Dictionary in Docker

From Corpus File

# Place corpus in input directory
cp my_corpus.txt ./input/

# Build dictionary
docker compose --profile cli run --rm cli build \
  --input /app/input/my_corpus.txt \
  --output /app/output/dictionary.db

# Copy output to data directory
cp ./output/dictionary.db ./data/

Sample Dictionary

# Build sample dictionary for testing
docker compose --profile cli run --rm cli build --sample

# Output is saved to /app/data/dictionary.db

GPU Support

Prerequisites

  1. NVIDIA GPU with CUDA support
  2. NVIDIA Container Toolkit

Installation (Ubuntu)

# Add the NVIDIA Container Toolkit repository (apt-key is deprecated on
# current Ubuntu; this uses the keyring-based setup from NVIDIA's docs)
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | \
  sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
  sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
  sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list

# Install
sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit

# Restart Docker
sudo systemctl restart docker

Running with GPU

# Start GPU-enabled service
docker compose --profile gpu up api-gpu

# Verify GPU access
docker compose run --rm api-gpu python -c "import torch; print(torch.cuda.is_available())"

Troubleshooting

Container Won’t Start

# Check logs
docker compose logs api

# Check if port is in use
lsof -i :8000

# Rebuild image
docker compose build --no-cache api

Database Not Found

# Ensure data directory exists and contains dictionary
ls -la ./data/

# Build sample dictionary if needed
docker compose --profile cli run --rm cli build --sample

Permission Denied

# Fix volume permissions (Linux)
sudo chown -R 1000:1000 ./data ./output

# Or run as root (not recommended for production)
docker compose --profile cli run --rm --user root cli build --sample

GPU Not Detected

# Verify NVIDIA runtime (use a current CUDA base tag; old tags such as
# 11.0-base have been removed from Docker Hub)
docker run --rm --gpus all nvidia/cuda:12.4.1-base-ubuntu22.04 nvidia-smi

# Check that the nvidia runtime is registered with Docker
docker info | grep -i runtime

Memory Issues

# Increase memory limit in docker-compose.yml
services:
  api:
    deploy:
      resources:
        limits:
          memory: 2G  # Increase from 1G

Security Considerations

The Docker images follow security best practices:
  1. Non-root user: Containers run as appuser (UID 1000)
  2. Read-only mounts: Data volumes mounted as read-only where possible
  3. Minimal base image: Uses python:slim for smaller attack surface
  4. No secrets in image: Configuration via environment variables or mounted volumes
  5. Health checks: Automatic container health monitoring

Running with Read-Only Filesystem

docker run --read-only \
  --tmpfs /tmp \
  -v "$(pwd)/data:/app/data:ro" \
  myspellchecker:latest
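The same hardening can be expressed in Compose; a sketch combining the practices listed above (the UID matches the appuser noted earlier, the rest is an assumption):

```yaml
services:
  api:
    image: myspellchecker:latest
    read_only: true        # immutable root filesystem
    tmpfs:
      - /tmp               # writable scratch space only
    user: "1000:1000"      # run as the non-root appuser
    volumes:
      - ./data:/app/data:ro
```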

See Also