# Release Notes: LLM-Guard v0.4.1

> **AI Coding Hackathon Release** — Production-ready prompt injection detection with multi-provider LLM enrichment

---

## 🎯 Overview

LLM-Guard v0.4.1 is a **fast, explainable Rust CLI** for detecting prompt injection and jailbreak attempts in LLM applications. This release delivers production-grade multi-provider LLM integration, enhanced detection rules, and comprehensive debug capabilities.

**Developed in ~7 hours** during the [AI Coding Accelerator](https://maven.com/nila/ai-coding-accelerator) hackathon using AI-assisted development (GPT-5 Codex + Claude Code).

---

## ✨ Key Features

### Core Capabilities
- ⚡ **Fast Scanning:** <100ms for typical prompts using Aho-Corasick + compiled regex (see the sketch after this list)
- 📊 **Transparent Risk Scoring:** 0-100 scale with detailed rule attribution and text excerpts
- 🔌 **Multi-Provider LLM Support:** OpenAI, Anthropic, Google Gemini, Azure OpenAI via `rig.rs`
- 🏥 **Provider Health Checks:** Built-in diagnostics for validating connectivity and configuration
- 📁 **Flexible Input Sources:** Files, stdin, streaming logs (tail mode)
- 📤 **Multiple Output Formats:** Human-readable CLI or JSON for CI/CD automation
- 🚦 **Exit Code Integration:** 0=low, 2=medium, 3=high, 1=error
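
For orientation, here is a minimal sketch of the kind of Aho-Corasick keyword pass the fast-scanning bullet refers to, assuming the `aho-corasick` crate (v1.x). The function and variable names are illustrative, not LLM-Guard's actual internals:

```rust
use aho_corasick::{AhoCorasick, MatchKind};

// Illustrative only: match a keyword database against a prompt.
// Building the automaton once makes matching linear in the input length,
// which is what keeps typical scans well under the advertised 100ms.
fn scan_keywords(prompt: &str, keywords: &[&str]) -> Vec<(usize, String)> {
    let ac = AhoCorasick::builder()
        .ascii_case_insensitive(true)
        .match_kind(MatchKind::LeftmostLongest)
        .build(keywords)
        .expect("valid patterns");

    ac.find_iter(prompt)
        .map(|m| (m.start(), keywords[m.pattern().as_usize()].to_string()))
        .collect()
}

fn main() {
    let keywords = ["ignore previous instructions", "system prompt"];
    let hits = scan_keywords("Please ignore previous instructions.", &keywords);
    println!("{hits:?}"); // [(7, "ignore previous instructions")]
}
```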

### Detection Coverage
- **Instruction Override:** `INSTR_IGNORE`, `INSTR_OVERRIDE` patterns
- **Data Exfiltration:** `PROMPT_LEAK` detection with flexible regex
- **Policy Subversion:** `MODEL_OVERRIDE` jailbreak patterns
- **Obfuscation Techniques:** `CODE_INJECTION` payload recognition

---

## 🐛 Critical Fixes

### Gemini Provider Integration
**Problem:** rig.rs failed to deserialize Gemini responses (`missing field generationConfig`), and the API rejected function calling combined with a JSON response MIME type
**Solution:** Bypassed rig entirely; implemented a standalone HTTP client against Gemini's native REST API
**Impact:** Gemini now fully functional with `generationConfig.responseMimeType: "application/json"`
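
A sketch of what such a direct call can look like, using `reqwest` (with the `json` feature) and `serde_json` against Gemini's public `generateContent` REST endpoint; `gemini_generate`, the model name, and the error handling are illustrative, not the project's actual adapter:

```rust
use serde_json::json;

// Illustrative only: a direct Gemini generateContent call that requests JSON
// output via generationConfig, sidestepping function calling entirely.
async fn gemini_generate(api_key: &str, prompt: &str) -> Result<serde_json::Value, reqwest::Error> {
    let url = format!(
        "https://generativelanguage.googleapis.com/v1beta/models/gemini-1.5-flash:generateContent?key={api_key}"
    );
    let body = json!({
        "contents": [{ "parts": [{ "text": prompt }] }],
        "generationConfig": { "responseMimeType": "application/json" }
    });
    reqwest::Client::new()
        .post(&url)
        .json(&body)
        .send()
        .await?
        .error_for_status()?
        .json()
        .await
}
```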

### OpenAI GPT-5 Reasoning Models
**Problem:** Models returned only reasoning traces (no textual content) when the `json_schema` response format was used
**Solution:** Switched from the strict `json_schema` response format to the more flexible `json_object`
**Impact:** Full compatibility with GPT-5 reasoning models; cleaner codebase
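
The change amounts to one field in the Chat Completions request body. A hedged sketch (`chat_request` is a hypothetical helper, not the project's adapter; note that OpenAI's documented `json_object` mode also requires the prompt itself to mention JSON):

```rust
use serde_json::json;

// Illustrative request body showing the response_format switch.
fn chat_request(model: &str, prompt: &str) -> serde_json::Value {
    json!({
        "model": model,
        "messages": [{ "role": "user", "content": prompt }],
        // Before: { "type": "json_schema", "json_schema": { /* strict schema */ } }
        // After: the looser JSON mode, which reasoning models answer with
        // actual text content instead of reasoning-only traces.
        "response_format": { "type": "json_object" }
    })
}
```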

### Detection Rules Gap
**Problem:** Keyword "ignore previous instructions" missed variations like "ignore **your** previous instructions"
**Solution:** Added flexible regex patterns `INSTR_IGNORE` and `PROMPT_LEAK` to `rules/patterns.json`
**Impact:** Scanner now catches attack variations; heuristic and LLM verdicts align

**Example:**
```
Before: Risk Score: 0.0 (Low), No findings
After:  Risk Score: 37.5 (Medium)
        Findings: PROMPT_LEAK [40.0], INSTR_IGNORE [35.0]
```
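
A pattern in the spirit of `INSTR_IGNORE`; the shipped rule lives in `rules/patterns.json` and may differ from this sketch, which assumes the `regex` crate:

```rust
use regex::Regex;

fn main() {
    // `(?:\w+\s+)*` tolerates intervening words such as "your" or "all of your".
    let instr_ignore = Regex::new(r"(?i)ignore\s+(?:\w+\s+)*previous\s+instructions").unwrap();

    for text in [
        "ignore previous instructions",
        "Ignore your previous instructions",
        "please IGNORE all of your previous instructions",
    ] {
        println!("{text:?} -> {}", instr_ignore.is_match(text)); // all true
    }
}
```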

### Debug Logging Enhancement
**Problem:** The `--debug` flag logged only errors, not the raw LLM responses
**Solution:** Added universal debug logging for all providers (raw response + extracted content)
**Impact:** Easier diagnosis of parsing issues and provider behavior quirks

---

## 📦 What's Included

### Binaries
```bash
# Build from source
cargo build --release
./target/release/llm-guard --version  # v0.4.1
```

### Configuration Files
- `llm_providers.example.yaml` — Multi-provider config template
- `rules/keywords.txt` — Exact-match keyword database
- `rules/patterns.json` — Regex patterns for flexible detection

### Documentation
- `README.md` — Complete project overview with hackathon context
- `docs/USAGE.md` — Comprehensive CLI reference
- `docs/TESTING_GUIDE.md` — Testing protocols and provider health checks
- `AGENTS.md` — AI assistant onboarding guide
- `PLAN.md` — Implementation roadmap and phase tracking
- `PROJECT_SUMMARY.md` — Current state snapshot

---

## 🚀 Quick Start

### Installation
```bash
git clone https://github.com/HendrikReh/llm-guard
cd llm-guard
cargo build --release
```

### Basic Usage
```bash
# Scan a file
./target/release/llm-guard scan --file examples/chat.txt

# LLM-enhanced scan with Gemini
export LLM_GUARD_PROVIDER=gemini
export LLM_GUARD_API_KEY=your_key_here
./target/release/llm-guard scan --file examples/chat.txt --with-llm

# Debug mode (dump raw responses)
./target/release/llm-guard scan --file examples/chat.txt --with-llm --debug

# Provider health check
./target/release/llm-guard health --providers-config llm_providers.yaml
```

### CI/CD Integration
```bash
# Generate JSON output; capture the exit code immediately
./target/release/llm-guard scan --file input.txt --json > report.json
rc=$?

# Exit codes: 0=low, 2=medium, 3=high, 1=error
if [ "$rc" -eq 1 ]; then
  echo "Scan failed!" >&2
  exit 1
elif [ "$rc" -ge 2 ]; then
  echo "Security risk detected!"
  exit 1
fi
```

---

## 🔧 Configuration

### Environment Variables
| Variable | Description | Default |
|----------|-------------|---------|
| `LLM_GUARD_PROVIDER` | Provider (`openai`, `anthropic`, `gemini`, `azure`) | `openai` |
| `LLM_GUARD_API_KEY` | API key/token | - |
| `LLM_GUARD_MODEL` | Model name (e.g., `gpt-4o-mini`) | Provider default |
| `LLM_GUARD_ENDPOINT` | Custom endpoint URL | Provider default |
| `LLM_GUARD_TIMEOUT_SECS` | HTTP timeout in seconds | `30` |
| `LLM_GUARD_MAX_RETRIES` | Retry count for failed calls | `2` |

### Provider Profiles (`llm_providers.yaml`)
```yaml
providers:
  - name: "openai"
    api_key: "OPENAI_API_KEY"
    model: "gpt-4o-mini"
  - name: "gemini"
    api_key: "GEMINI_API_KEY"
    model: "gemini-1.5-flash"
  - name: "azure"
    api_key: "AZURE_OPENAI_KEY"
    endpoint: "https://your-resource.openai.azure.com"
    deployment: "gpt-4o-production"
    api_version: "2024-02-15-preview"
```

**Configuration Precedence:** CLI flags → Environment variables → Provider profile
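
As a sketch of that resolution order (a hypothetical helper, not the project's actual code):

```rust
// First Some(_) wins: CLI flag, then environment variable, then profile value.
fn resolve(cli: Option<String>, env: Option<String>, profile: Option<String>) -> Option<String> {
    cli.or(env).or(profile)
}
```
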
---

## 📊 Technical Metrics

| Metric | Value |
|--------|-------|
| **Lines of Code** | ~4,000 (Rust) |
| **Source Files** | 25 `.rs` files |
| **Tests** | 44 (34 passing, 10 ignored) |
| **Dependencies** | Production-grade (tokio, reqwest, rig, clap) |
| **Detection Rules** | 4 patterns + keyword database |
| **Supported Providers** | 4 (OpenAI, Anthropic, Gemini, Azure) |
| **Performance** | <100ms for typical prompts |

---

## 🧪 Testing

```bash
# Run all tests
cargo test

# Run library tests only
cargo test --lib

# Run with ignored tests (requires network)
cargo test -- --include-ignored

# Provider health checks
cargo run --bin llm-guard-cli -- health --providers-config llm_providers.yaml
```

See [`docs/TESTING_GUIDE.md`](./docs/TESTING_GUIDE.md) for comprehensive testing protocols.

---

## 🤖 AI-Assisted Development

This release demonstrates the capabilities of **AI-assisted software development**:

**Workflow:**
- **Primary Agent:** GPT-5 Codex (core logic, LLM adapters, CLI)
- **Review Agent:** Claude Code (code reviews, documentation, debugging)
- **Context Management:** RepoPrompt + Context7 MCP servers

**What Worked:**
- ✅ Functional CLI with 4 LLM providers in under 7 hours
- ✅ Multi-agent collaboration (coding vs. review separation)
- ✅ MCP integration eliminated manual file navigation
- ✅ PRD-driven development prevented scope creep

**Challenges:**
- ⚠️ Provider API quirks (Gemini, OpenAI reasoning models)
- ⚠️ Testing gaps due to time pressure (10 ignored tests)
- ⚠️ rig.rs limitations required the Gemini bypass

---

## 🔮 Known Limitations

- **Rule Coverage:** Only 4 detection patterns (expandable via `rules/patterns.json`)
- **Context Windows:** Limited to 200-char proximity for synergy bonuses
- **Test Coverage:** 10 tests ignored (require network or specific environments)
- **Production Readiness:** Prototype for research/education; not audited for production security workloads

---

## 📚 Resources

- **Main Documentation:** [README.md](./README.md)
- **Usage Reference:** [docs/USAGE.md](./docs/USAGE.md)
- **Testing Guide:** [docs/TESTING_GUIDE.md](./docs/TESTING_GUIDE.md)
- **Implementation Plan:** [PLAN.md](./PLAN.md)
- **AI Onboarding:** [AGENTS.md](./AGENTS.md)
- **Project Summary:** [PROJECT_SUMMARY.md](./PROJECT_SUMMARY.md)

---

## 🙏 Acknowledgments

**Hackathon:** [AI Coding Accelerator](https://maven.com/nila/ai-coding-accelerator) (Maven)
**Instructors:** [Vignesh Mohankumar](https://x.com/vig_xyz), [Jason Liu](https://x.com/jxnlco)

**Built with:**
- [Cursor](https://cursor.sh) + GPT-5 Codex
- [Claude Code](https://claude.ai)
- [RepoPrompt MCP](https://repoprompt.com/)
- [Context7 MCP](https://context7.com/)

---

## 📄 License

Apache-2.0 OR MIT

**Security Disclaimer:** This tool is a prototype for research/education. Use at your own risk.

**AI Development Notice:** Codebase primarily generated via AI assistants (GPT-5 Codex, Claude Code) with human oversight for architecture, testing, and quality validation.

---

## 🔗 Links

- **Repository:** https://github.com/HendrikReh/llm-guard
- **Issues:** https://github.com/HendrikReh/llm-guard/issues
- **Releases:** https://github.com/HendrikReh/llm-guard/releases

---

**Full Changelog:** https://github.com/HendrikReh/llm-guard/compare/v0.4.0...v0.4.1