mscrnt commented Oct 23, 2025

Add Ollama support and content validation/cleaning tools

Summary

This PR adds support for local/self-hosted AI models via Ollama, provides a free OCR alternative with Tesseract.js, and includes tools for validating and cleaning transcribed content.

Key Features

🤖 Ollama Integration

  • Add Ollama as an alternative AI provider for vision-based transcription
  • Support for local/self-hosted vision models (qwen2.5vl, llama3.2-vision, etc.)
  • Configurable concurrency control (1-16 parallel requests)
  • Model warmup to prevent timeouts on first request
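
As a rough illustration of the concurrency limit and warmup described above, here is a minimal sketch and not the code from this PR. It assumes Ollama's documented `/api/generate` endpoint, where a request without a prompt loads the model and `images` carries base64-encoded pages. The names `FetchLike`, `mapWithConcurrency`, `warmUpModel`, and `transcribeImage`, as well as the prompt text, are hypothetical.

```typescript
// Minimal fetch-shaped type so the sketch does not depend on DOM or Node typings.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// Run `task` over `items` with at most `limit` tasks in flight, preserving input order.
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  task: (item: T, index: number) => Promise<R>,
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0;
  const workerCount = Math.max(1, Math.min(limit, items.length));
  const workers = Array.from({ length: workerCount }, async () => {
    // Each worker pulls the next unclaimed index until the list is exhausted.
    while (next < items.length) {
      const index = next++;
      results[index] = await task(items[index], index);
    }
  });
  await Promise.all(workers);
  return results;
}

// Warmup: a /api/generate request with no prompt asks Ollama to load the model.
async function warmUpModel(fetchFn: FetchLike, baseUrl: string, model: string): Promise<void> {
  const res = await fetchFn(`${baseUrl}/api/generate`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ model, stream: false }),
  });
  if (!res.ok) throw new Error(`Ollama warmup failed with HTTP ${res.status}`);
}

// Send one base64-encoded page image to a vision model and return its text.
// The prompt wording is an assumption for illustration only.
async function transcribeImage(
  fetchFn: FetchLike,
  baseUrl: string,
  model: string,
  imageBase64: string,
): Promise<string> {
  const res = await fetchFn(`${baseUrl}/api/generate`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model,
      prompt: "Transcribe the text in this image.",
      images: [imageBase64],
      stream: false,
    }),
  });
  if (!res.ok) throw new Error(`Ollama request failed with HTTP ${res.status}`);
  const data = (await res.json()) as { response?: string };
  return data.response ?? "";
}
```

In use, one would call `warmUpModel` once with the global `fetch`, then pass the page images through `mapWithConcurrency` with the configured limit so that no more than that many requests reach the server at a time.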

📝 Tesseract.js OCR Support

  • Free, local alternative to AI vision models
  • No API costs or rate limits
  • Runs entirely offline with configurable worker concurrency
  • Good accuracy for standard text layouts

✅ Content Validation & Cleaning

  • validate-content.ts: Automatically detect OCR errors
    • Identifies repetitive text patterns
    • Flags excessive punctuation or formatting issues
    • Generates detailed validation reports
  • clean-content-with-ollama.ts: Use Ollama text models to clean formatting
    • Fixes paragraph breaks and spacing
    • Preserves original words (no spelling/grammar changes)
    • Creates backup of original content
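
The two checks listed for validate-content.ts could be approximated with heuristics like the following. This is a sketch under assumptions, not the script's actual logic: the thresholds, the punctuation set, and the names `ValidationIssue`, `findRepeatedWords`, `findPunctuationRuns`, and `validateContent` are all hypothetical.

```typescript
interface ValidationIssue {
  kind: "repetition" | "punctuation";
  detail: string;
}

// Flag any run of `minRepeats` or more identical consecutive words (case-insensitive).
function findRepeatedWords(text: string, minRepeats = 4): ValidationIssue[] {
  const words = text.toLowerCase().split(/\s+/).filter(Boolean);
  const issues: ValidationIssue[] = [];
  let runStart = 0;
  for (let i = 1; i <= words.length; i++) {
    // A run ends at the end of the text or when the word changes.
    if (i === words.length || words[i] !== words[runStart]) {
      const runLength = i - runStart;
      if (runLength >= minRepeats) {
        issues.push({
          kind: "repetition",
          detail: `"${words[runStart]}" repeated ${runLength} times`,
        });
      }
      runStart = i;
    }
  }
  return issues;
}

// Flag runs of `minRun` or more punctuation characters, such as "!!!!!".
// A plain three-dot ellipsis stays below the default threshold.
function findPunctuationRuns(text: string, minRun = 4): ValidationIssue[] {
  const pattern = new RegExp(`[!?.,;:'"\\-_*#~]{${minRun},}`, "g");
  const issues: ValidationIssue[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(text)) !== null) {
    issues.push({
      kind: "punctuation",
      detail: `"${match[0]}" at offset ${match.index}`,
    });
  }
  return issues;
}

// Combine both checks into one list that a report generator could print.
function validateContent(text: string): ValidationIssue[] {
  return findRepeatedWords(text).concat(findPunctuationRuns(text));
}
```

Heuristics of this kind only flag suspicious spans for review. They do not decide whether the text is actually wrong, which is why a report step is useful before any cleaning pass.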

Bug Fixes

  • Fix 2FA/OTP submission timeout by trying multiple button selectors
  • Fix missing last chapter in PDF export
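
The selector-fallback idea behind the OTP fix can be sketched as below. This is only an illustration: the `ClickablePage` interface, the `clickFirstAvailable` helper, and the example selectors are hypothetical and do not reflect the project's browser-automation code.

```typescript
// Hypothetical minimal page interface; a real browser-automation page
// object would be adapted to this shape.
interface ClickablePage {
  click(selector: string, options: { timeoutMs: number }): Promise<void>;
}

// Try each selector in order and return the first one that could be clicked.
// Throws with the collected errors if none of them matched.
async function clickFirstAvailable(
  page: ClickablePage,
  selectors: readonly string[],
  timeoutMs = 2000,
): Promise<string> {
  const errors: string[] = [];
  for (const selector of selectors) {
    try {
      await page.click(selector, { timeoutMs });
      return selector;
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      errors.push(`${selector}: ${message}`);
    }
  }
  throw new Error(`No submit button matched. Tried:\n${errors.join("\n")}`);
}
```

A short per-selector timeout keeps the total wait bounded when the first few candidates are absent, instead of spending the full timeout on a single selector that never appears.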

Configuration Changes

New environment variables:

  • AI_PROVIDER - Choose between openai or ollama
  • OLLAMA_BASE_URL - Ollama server endpoint
  • OLLAMA_VISION_MODEL - Vision model for transcription (e.g., qwen2.5vl:7b)
  • OLLAMA_MODEL - Text model for content cleaning (e.g., llama3.2)
  • OLLAMA_CONCURRENCY - Parallel request limit
  • OCR_CONCURRENCY - Tesseract.js worker count
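
One way these variables might be read and validated is sketched below. The variable names and the 1-16 range for OLLAMA_CONCURRENCY come from this description. Everything else is an assumption: the default values, the same range applied to OCR_CONCURRENCY, and the names `AiConfig`, `clampInt`, and `loadAiConfig`. `http://localhost:11434` is Ollama's standard local address, but whether this PR defaults to it is not stated.

```typescript
type AiProvider = "openai" | "ollama";

interface AiConfig {
  provider: AiProvider;
  ollamaBaseUrl: string;
  ollamaVisionModel: string;
  ollamaModel: string;
  ollamaConcurrency: number;
  ocrConcurrency: number;
}

function isAiProvider(value: string): value is AiProvider {
  return value === "openai" || value === "ollama";
}

// Parse an integer setting, falling back when unset or invalid and
// clamping the result into [min, max].
function clampInt(raw: string | undefined, fallback: number, min: number, max: number): number {
  const parsed = Number.parseInt(raw ?? "", 10);
  if (Number.isNaN(parsed)) return fallback;
  return Math.min(max, Math.max(min, parsed));
}

// `env` is passed in (for example process.env) so the function stays easy to test.
function loadAiConfig(env: Record<string, string | undefined>): AiConfig {
  const provider = env.AI_PROVIDER ?? "openai"; // assumed default
  if (!isAiProvider(provider)) {
    throw new Error(`AI_PROVIDER must be "openai" or "ollama", got "${provider}"`);
  }
  return {
    provider,
    ollamaBaseUrl: env.OLLAMA_BASE_URL ?? "http://localhost:11434", // assumed default
    ollamaVisionModel: env.OLLAMA_VISION_MODEL ?? "qwen2.5vl:7b", // assumed default
    ollamaModel: env.OLLAMA_MODEL ?? "llama3.2", // assumed default
    ollamaConcurrency: clampInt(env.OLLAMA_CONCURRENCY, 4, 1, 16), // fallback 4 is assumed
    ocrConcurrency: clampInt(env.OCR_CONCURRENCY, 4, 1, 16), // range and fallback assumed
  };
}
```

Validating the provider name at startup turns a typo in `.env` into an immediate error rather than a silent fall-through to the wrong backend.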

- Add Ollama as an AI provider option for local/self-hosted vision models
- Add Tesseract.js OCR as a free, local alternative for transcription
- Add validation script to detect OCR errors (repetitions, excessive punctuation)
- Add content cleaning script using Ollama text models for formatting improvements
- Fix OTP submission issue with multiple selector fallbacks
- Replace ky with native fetch for better Node.js compatibility with Ollama
- Update README with comprehensive documentation for all new features
- Add all required environment variables to .env.example

socket-security bot commented Oct 23, 2025

Review the following changes in direct dependencies.

Diff  | Package           | Supply Chain Security | Vulnerability | Quality | Maintenance | License
Added | tesseract.js@5.1.1 | 93                    | 100           | 100     | 84          | 100
