javadoc2md converts Javadoc HTML documentation into clean, context-rich Markdown files. It is designed to provide high-quality, deduplicated documentation context for AI code assistants and LLM-based developer tools.
- Downloads the Javadoc JAR from Maven Central (by groupId, artifactId, and optional version)
- Extracts only class documentation (removes headers/footers/notes/links)
- Converts to Markdown, preserving package structure
- Creates a structured directory with the documentation files
- Purpose-built for AI code assistants and LLMs: output is optimized for ingestion as context
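The download step can be illustrated with a minimal sketch. It relies on the standard Maven Central repository layout and the `maven-metadata.xml` file published for each artifact; whether javadoc2md resolves versions exactly this way is an assumption, and the function names below are hypothetical, not part of the tool.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# Public Maven Central repository root.
MAVEN_CENTRAL = "https://repo1.maven.org/maven2"


def artifact_base_url(group: str, artifact: str) -> str:
    """Return the repository directory for an artifact (dots in the groupId become slashes)."""
    return f"{MAVEN_CENTRAL}/{group.replace('.', '/')}/{artifact}"


def metadata_url(group: str, artifact: str) -> str:
    """URL of the metadata file that lists the published versions."""
    return f"{artifact_base_url(group, artifact)}/maven-metadata.xml"


def latest_version(metadata_xml: str) -> Optional[str]:
    """Pick the <release> (or, failing that, <latest>) version from maven-metadata.xml text."""
    root = ET.fromstring(metadata_xml)
    for tag in ("release", "latest"):
        value = root.findtext(f"versioning/{tag}")
        if value:
            return value.strip()
    return None


def javadoc_jar_url(group: str, artifact: str, version: str) -> str:
    """URL of the Javadoc JAR, which uses the '-javadoc' classifier."""
    return f"{artifact_base_url(group, artifact)}/{version}/{artifact}-{version}-javadoc.jar"
```

When no version is given, a tool along these lines could fetch `metadata_url(...)` with `requests`, pass the response text to `latest_version`, and then download the URL returned by `javadoc_jar_url`.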
Requirements:
- Python 3.8+
- beautifulsoup4
- markdownify
- tqdm
- requests
Clone the repository and synchronize dependencies:
git clone <repository-url>
cd javadoc2md
uv sync
# Download and convert a specific version
javadoc2md --group com.google.guava --artifact guava --version 33.0.0-jre --output ./docs/
# Get the latest version
javadoc2md --group com.google.guava --artifact guava --output ./docs/
This will create a directory structure like:
./docs/guava-33.0.0-jre/
├── com/
│   └── google/
│       └── common/
│           ├── collect/
│           │   ├── ImmutableList.md
│           │   └── Lists.md
│           └── base/
│               ├── Optional.md
│               └── Strings.md
└── ...
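The mapping from entries in the Javadoc JAR to this layout can be sketched as follows. This is an illustration under stated assumptions, not the tool's actual code: the helper names are hypothetical, and the rule for recognizing class pages (an `.html` file inside a package directory whose name starts with an uppercase letter) is a simple heuristic that may not match every Javadoc version.

```python
import io
import zipfile
from pathlib import PurePosixPath
from typing import Dict


def is_class_page(entry: str) -> bool:
    """Heuristic: class pages are .html files in a package directory with a capitalized name."""
    path = PurePosixPath(entry)
    if path.suffix != ".html" or len(path.parts) < 2:
        return False  # skips top-level pages such as index.html and non-HTML entries
    if "class-use" in path.parts or "doc-files" in path.parts:
        return False  # skips usage pages and bundled resources
    return path.stem[:1].isupper()  # skips package-summary.html and similar pages


def plan_output(jar_bytes: bytes, output_root: str) -> Dict[str, str]:
    """Map each class page in the JAR to a Markdown path that mirrors the package structure."""
    root = PurePosixPath(output_root)
    with zipfile.ZipFile(io.BytesIO(jar_bytes)) as jar:
        return {
            name: str(root / PurePosixPath(name).with_suffix(".md"))
            for name in jar.namelist()
            if is_class_page(name)
        }
```

Keeping the package path in the output means a consumer can derive the fully qualified class name from the file location alone.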
This tool is intended to generate context for AI code assistants, LLMs, and developer tools that require high-quality, deduplicated, and easily-parsable documentation.
License: MIT