FEATURES
Version: 1.0
As of: November 2025
Status legend: ✅ Production-Ready | 🔧 Beta | 📋 Planned
ThemisDB is a multi-model database with ACID guarantees that unifies relational, graph, vector, and document data models in a single system. It is built on RocksDB (LSM-tree) and adds an extended security and compliance architecture.
Key features:
- 🔒 ACID transactions with MVCC (Snapshot Isolation)
- 🔍 Multi-Model Support (Relational, Graph, Vector, Document)
- 🚀 High-Performance (45K writes/s, 120K reads/s)
- 🛡️ Enterprise Security (TLS 1.3, RBAC, Encryption, Audit)
- 📊 Advanced Query Language (AQL mit Graph-Traversals, Aggregationen)
- 🌐 Production-Ready (85%+ Test Coverage, Comprehensive Monitoring)
Status: Production-Ready | Docs: docs/architecture/base_entity.md
- Base Entity - Unified JSON/binary blob storage for all data models
- RocksDB TransactionDB - LSM-tree with ACID guarantees
- VelocyPack/Bincode - High-performance serialization
- Multi-Format Support - JSON, binary, custom formats
- Fast Field Extraction - Optimized parsing pipeline
Key Features:
- Atomic updates across all index layers
- Write-optimized (append-only LSM-tree)
- Configurable compression (LZ4, ZSTD, Snappy)
- BlobDB support for large objects
Status: Production-Ready
| Model | Logical Entity | Physical Storage | Key Format |
|---|---|---|---|
| Relational | Row | (PK, Blob) | table:pk |
| Document | JSON Document | (PK, Blob) | collection:pk |
| Graph (Nodes) | Vertex | (PK, Blob) | node:pk |
| Graph (Edges) | Edge | (PK, Blob) | edge:pk |
| Vector | Embedding Object | (PK, Blob) | object:pk |
Status: Production-Ready | Docs: docs/storage/CLOUD_BLOB_BACKENDS.md
- Filesystem Backend - Hierarchical local storage
- WebDAV/ActiveDirectory - SharePoint & enterprise integration
- S3 Compatible - Interface ready (AWS, MinIO, etc.)
- Azure Blob - Interface ready
- Threshold-Based Selection - Automatic backend choice
- SHA256 Content Hashing - Deduplication & integrity
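Threshold-based selection and content hashing can be illustrated with a small in-memory sketch. Everything named here is hypothetical: the 1 MiB threshold, the `inline`/`filesystem` labels, and the `BlobStore` class are for illustration only and are not ThemisDB's API.

```python
import hashlib
from dataclasses import dataclass

# Hypothetical threshold: blobs at or above this size go to the external backend.
INLINE_THRESHOLD_BYTES = 1 * 1024 * 1024  # 1 MiB, assumed for illustration


@dataclass
class BlobRef:
    backend: str  # "inline" or "filesystem"; labels are illustrative
    sha256: str   # hex digest, used as the deduplication key
    size: int


class BlobStore:
    """In-memory stand-in that deduplicates blobs by content hash."""

    def __init__(self, threshold=INLINE_THRESHOLD_BYTES):
        self.threshold = threshold
        self._by_hash = {}

    def put(self, data):
        digest = hashlib.sha256(data).hexdigest()
        # Identical content hashes to the same key, so it is stored once.
        self._by_hash.setdefault(digest, data)
        backend = "inline" if len(data) < self.threshold else "filesystem"
        return BlobRef(backend=backend, sha256=digest, size=len(data))

    def get(self, ref):
        data = self._by_hash[ref.sha256]
        # Integrity check: recompute the hash on read.
        if hashlib.sha256(data).hexdigest() != ref.sha256:
            raise ValueError("blob content does not match its hash")
        return data

    def stored_count(self):
        return len(self._by_hash)
```

Storing the same bytes twice yields two references with the same digest but only one stored copy, which is the deduplication effect the content hash is meant to provide.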
Status: Production-Ready | Docs: docs/features/indexes.md
Index Types:
- ✅ Single-Column - Equality lookups
- ✅ Composite - Multi-column indexes
- ✅ Range - Range queries (>, <, BETWEEN)
- ✅ Sparse - Only for existing values
- ✅ Geo-Spatial - R-tree for spatial search
- ✅ TTL (Time-To-Live) - Automatic expiration
- ✅ Full-Text - Inverted index for text search
Features:
- Automatic index maintenance with MVCC
- Thread-safe operations
- Index statistics & cardinality estimation
- Rebuild & reindex operations
- Performance metrics
API:
POST /index/create
{ "table": "users", "column": "age", "type": "range" }
Status: Production-Ready | Docs: docs/features/recursive_path_queries.md
Index Structures:
- Outdex - Outgoing edges (graph:out:node:edge)
- Indeg - Incoming edges (graph:in:node:edge)
- Type-Aware - Server-side edge-type filtering
- Property Storage - Edge properties with weights
Algorithms:
- ✅ BFS (Breadth-First Search) - Depth-limited traversal
- ✅ Dijkstra - Shortest paths (weighted)
- ✅ A* - Heuristic path search
- ✅ Recursive Path Queries - Variable depth (1-N hops)
- ✅ Temporal Graph Queries - Time-range filters
Path Constraints:
- Last-Edge Constraints
- No-Vertex Repetition
- Type-based Pruning
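A depth-limited BFS with type-based pruning can be sketched over a plain edge list. This is a minimal illustration, not ThemisDB's traversal code: the edge data and function names are hypothetical, and a global visited set stands in for the no-vertex-repetition constraint.

```python
from collections import deque

# Hypothetical edge list: (source, edge_type, target).
EDGES = [
    ("a", "FOLLOWS", "b"),
    ("b", "FOLLOWS", "c"),
    ("c", "FOLLOWS", "a"),  # cycle back to the start
    ("a", "BLOCKS", "d"),
    ("c", "FOLLOWS", "e"),
]


def build_out_index(edges):
    """Group outgoing edges per node, similar in spirit to an out-edge index."""
    out = {}
    for src, etype, dst in edges:
        out.setdefault(src, []).append((etype, dst))
    return out


def bfs(out_index, start, max_depth, edge_type=None):
    """Return {node: depth} for nodes reachable within max_depth hops.

    edge_type, if given, prunes edges of any other type before expansion.
    A node is never visited twice, so cycles terminate.
    """
    depth = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if depth[node] == max_depth:
            continue  # depth limit reached, do not expand further
        for etype, nxt in out_index.get(node, []):
            if edge_type is not None and etype != edge_type:
                continue  # type-based pruning
            if nxt in depth:
                continue  # vertex already visited
            depth[nxt] = depth[node] + 1
            queue.append(nxt)
    return depth
```

With the sample edges, a two-hop `FOLLOWS` traversal from `a` reaches `b` and `c` but neither `d` (wrong edge type) nor `e` (three hops away).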
Status: Production-Ready | Docs: docs/features/vector_ops.md
HNSW Index:
- ✅ Persistent HNSW - Crash-safe, transactional
- ✅ Distance Metrics - L2, Cosine, Dot Product
- ✅ Batch Operations - Insert 500-1000 vectors
- ✅ KNN Search - Approximate Nearest Neighbors
- ✅ Configurable Parameters - M, efConstruction, efSearch
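To make the three distance metrics concrete, the sketch below runs an exact, brute-force k-nearest-neighbour scan. It is deliberately not HNSW: HNSW approximates this result with a graph index, whereas the code here compares the query against every vector. All names are hypothetical.

```python
import math


def l2(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cosine_distance(a, b):
    # Undefined for zero-length vectors; callers must avoid them.
    norm = math.sqrt(dot(a, a)) * math.sqrt(dot(b, b))
    return 1.0 - dot(a, b) / norm


# Each metric maps to a score where smaller means "closer".
METRICS = {
    "l2": l2,
    "cosine": cosine_distance,
    "dot": lambda a, b: -dot(a, b),  # negate so a larger dot product ranks first
}


def knn(vectors, query, k, metric="cosine"):
    """Exact k-nearest-neighbour search by scanning every vector."""
    score = METRICS[metric]
    ranked = sorted(vectors.items(), key=lambda item: score(item[1], query))
    return [key for key, _ in ranked[:k]]
```

The choice of metric changes the ranking: cosine ignores vector length, L2 penalizes it, and dot product rewards it.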
Performance:
- Throughput: 1,800 queries/s (CPU)
- Latency: p50 = 0.55ms, p99 = 2.1ms
- GPU Acceleration planned (50K+ q/s)
API:
POST /vector/search
{ "vector": [0.1, 0.2, ...], "k": 10, "metric": "cosine" }
Status: Production-Ready | Docs: docs/aql/syntax.md
Syntax Constructs:
- ✅ FOR/FILTER/SORT/LIMIT/RETURN - SQL-like semantics
- ✅ Graph Traversals - FOR v,e,p IN 1..3 OUTBOUND start
- ✅ COLLECT/GROUP BY - Aggregations (COUNT, SUM, AVG, MIN, MAX)
- ✅ Subqueries - Nested queries with IN/ALL/ANY
- ✅ Pattern Matching - Graph pattern expressions
- ✅ Temporal Filters - Time-range queries
Query Optimizer:
- ✅ Cost-Based - Index selection, predicate ordering
- ✅ EXPLAIN - Execution plan visualization
- ✅ PROFILE - Runtime metrics & bottleneck analysis
- ✅ Parallelization - Intel TBB task-based execution
Metrics (PROFILE):
- edges_expanded - Graph traversal expansion rate
- prune_last_level - Pruning effectiveness
- index_scan_cost - Index operation costs
Status: Production-Ready (Phase 4) | Docs: docs/apis/hybrid_search_api.md
Pre-Filtering:
- Relational predicate → Candidate bitset
- Vector HNSW search over filtered candidates
- Graph expansion with constraints
Post-Filtering:
- Global vector search → Top-K results
- Relational/graph filters on the result set
Use Cases:
- "Find similar documents (vector) from department X (relational) with tag Y (graph)"
- Fusion of similarity, metadata, and relationships
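The difference between the two filtering strategies shows up when the filter is selective. The sketch below uses hypothetical documents and a brute-force distance scan in place of HNSW; it is an illustration of the ordering of steps, not the engine's implementation.

```python
import math

# Hypothetical documents: id -> (department, embedding).
DOCS = {
    "d1": ("sales", [1.0, 0.0]),
    "d2": ("sales", [0.0, 1.0]),
    "d3": ("legal", [0.9, 0.1]),
    "d4": ("legal", [0.8, 0.2]),
}


def l2(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def top_k(candidates, query, k):
    """Rank candidate ids by distance to the query and keep the k closest."""
    return sorted(candidates, key=lambda doc_id: l2(DOCS[doc_id][1], query))[:k]


def pre_filter_search(query, k, department):
    # 1. The relational predicate yields the candidate set.
    candidates = [d for d, (dept, _) in DOCS.items() if dept == department]
    # 2. The vector search runs only over those candidates.
    return top_k(candidates, query, k)


def post_filter_search(query, k, department):
    # 1. Global vector search over all documents.
    hits = top_k(list(DOCS), query, k)
    # 2. The relational filter is applied to the top-k result set.
    return [d for d in hits if DOCS[d][0] == department]
```

For the query `[1.0, 0.0]` with `k=2` and department `sales`, pre-filtering returns two results, while post-filtering returns only one because a `legal` document occupies a top-k slot before the filter runs.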
Status: Production-Ready (85% Coverage) | Docs: docs/security/implementation_summary.md
- TLS 1.3 default (TLS 1.2 fallback)
- Strong Ciphers - ECDHE-RSA-AES256-GCM-SHA384, ChaCha20-Poly1305
- mTLS - Client certificate verification
- HSTS Headers - max-age=31536000; includeSubDomains
- Certificate Pinning - SHA256 fingerprints for HSM/TSA
- Token Bucket Algorithm - 100 req/min default
- Per-IP & Per-User Limits - Configurable thresholds
- HTTP 429 Responses - Retry-After headers
- Metrics - Real-time monitoring
- JSON Schema Validation - Strict type checking
- AQL Injection Prevention - Parameterized queries
- Path Traversal Protection - Sanitized file paths
- Max Body Size - 10MB default limit
- X-Frame-Options: DENY
- X-Content-Type-Options: nosniff
- X-XSS-Protection: 1; mode=block
- Content-Security-Policy - Configurable
- CORS Whitelisting - Strict origin control
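The token bucket rate limiter listed above (100 req/min default, HTTP 429 with Retry-After) can be sketched as follows. The class and its parameters are hypothetical and only illustrate the algorithm; timestamps are passed in explicitly so the behaviour is deterministic.

```python
class TokenBucket:
    """Token bucket that refills continuously up to a fixed capacity."""

    def __init__(self, capacity=100, refill_per_second=100 / 60, now=0.0):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.tokens = float(capacity)  # start with a full bucket
        self.last = now

    def allow(self, now):
        """Consume one token if available; `now` is a timestamp in seconds."""
        elapsed = now - self.last
        self.last = now
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

    def retry_after(self):
        """Seconds until one token is available, e.g. for a Retry-After header."""
        missing = max(0.0, 1.0 - self.tokens)
        return missing / self.refill_per_second
```

Per-IP or per-user limits would follow from keeping one bucket per client key, for example in a dictionary keyed by IP address.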
Status: Production-Ready | Docs: docs/security/implementation_summary.md
Role Hierarchy:
admin → operator → analyst → readonly
Permissions:
- data:read, data:write, data:delete
- keys:rotate, keys:view
- audit:view, audit:export
- config:modify
- Wildcard support: *:*
Features:
- JSON/YAML configuration
- User-role mapping store
- Resource-based access control
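A permission check over the role hierarchy and wildcard permissions might look like the sketch below. Two points are assumptions rather than documented behaviour: that each role inherits the grants of the roles to its right in the hierarchy, and the concrete role-to-permission mapping, which is invented for illustration.

```python
# Roles ordered from most to least privileged, as in the hierarchy above.
ROLE_ORDER = ["admin", "operator", "analyst", "readonly"]

# Hypothetical direct grants per role; inheritance is applied on top.
ROLE_GRANTS = {
    "admin": {"*:*"},
    "operator": {"keys:rotate", "config:modify", "data:delete"},
    "analyst": {"data:write", "audit:view"},
    "readonly": {"data:read"},
}


def effective_permissions(role):
    """Union of the role's own grants and those of every less privileged role."""
    start = ROLE_ORDER.index(role)
    perms = set()
    for name in ROLE_ORDER[start:]:
        perms |= ROLE_GRANTS[name]
    return perms


def matches(grant, requested):
    """Compare 'resource:action' strings, treating '*' as a wildcard segment."""
    g_res, g_act = grant.split(":")
    r_res, r_act = requested.split(":")
    return g_res in ("*", r_res) and g_act in ("*", r_act)


def is_allowed(role, requested):
    return any(matches(g, requested) for g in effective_permissions(role))
```

Under these assumptions, `analyst` can read data through inheritance from `readonly`, and `admin` passes every check through `*:*`.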
Status: Production-Ready | Docs: docs/security/column_encryption.md
- AES-256-GCM - Authenticated encryption
- Transparent Operations - App-level abstraction
- Schema-Based - Selective field encryption
- Index Compatibility - Encrypted fields can be indexed
Key Management:
- ✅ MockKeyProvider - Development/Testing
- ✅ HSMKeyProvider - PKCS#11 HSM integration
- ✅ VaultKeyProvider - HashiCorp Vault
Key Rotation:
- ✅ Lazy Re-Encryption - Zero-downtime rotation
- ✅ Transparent Migration - Gradual re-encryption
- ✅ Audit Trail - Rotation tracking
API:
PUT /config/encryption-schema
{
"fields": {
"ssn": { "encrypted": true, "algorithm": "AES-256-GCM" }
}
}
- Encrypt-then-Sign - Confidentiality + integrity
- Hash Chain - Tamper detection (Merkle-like)
- PKI Signatures - RSA-SHA256 (eIDAS-compliant)
Status: Production-Ready | Docs: docs/security/implementation_summary.md
HashiCorp Vault Integration:
- ✅ KV v2 Engine - Secret storage
- ✅ AppRole Auth - Service authentication
- ✅ Auto Token Renewal - Lease management
- ✅ Rotation Callbacks - Dynamic secret updates
- ✅ Environment Fallback - Development mode
Status: Production-Ready | Docs: docs/features/audit_logging.md
Event Types (65+):
- LOGIN_FAILED, PRIVILEGE_ESCALATION_ATTEMPT
- DATA_ACCESS, DATA_MODIFIED, DATA_DELETED
- KEY_ROTATED, ENCRYPTION_FAILED
- UNAUTHORIZED_ACCESS, SCHEMA_CHANGED
Features:
- ✅ Severity Levels - HIGH, MEDIUM, LOW
- ✅ SIEM Integration - Syslog RFC 5424, Splunk HEC
- ✅ Tamper-Proof - Hash chain verification
- ✅ Retention Policies - Auto-archival & purging
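Hash chain verification can be sketched in a few lines: each entry stores the hash of its predecessor, so changing any earlier entry invalidates everything after it. The entry layout, the all-zero genesis hash, and the function names below are assumptions for illustration, not the audit log's actual format.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder hash preceding the first entry


def entry_hash(prev_hash, event):
    """Hash the previous hash together with a canonical JSON form of the event."""
    payload = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append(log, event):
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"event": event, "prev": prev_hash, "hash": entry_hash(prev_hash, event)})


def verify(log):
    """Return the index of the first broken entry, or -1 if the chain is intact."""
    prev_hash = GENESIS
    for i, entry in enumerate(log):
        expected = entry_hash(prev_hash, entry["event"])
        if entry["prev"] != prev_hash or entry["hash"] != expected:
            return i
        prev_hash = entry["hash"]
    return -1
```

Editing the severity of a stored event after the fact makes `verify` report the index of that entry, because its stored hash no longer matches the recomputed one.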
API:
GET /audit/logs?severity=HIGH&from=2025-01-01
Status: Production-Ready | Docs: docs/features/compliance.md
GDPR/DSGVO:
- ✅ Right to erasure (Deletion API)
- ✅ Right of access (Data export)
- ✅ Pseudonymization (Field encryption)
- ✅ Data classification (4 levels: offen/vs-nfd/geheim/streng_geheim)
SOC 2 Controls:
- ✅ CC6.1 - Access Control (RBAC)
- ✅ CC6.7 - Audit Logs
- ✅ CC7.2 - Change Management
HIPAA:
- ✅ §164.312(a)(1) - Access Control
- ✅ §164.312(e)(1) - Transmission Security (TLS 1.3)
PII Detection (7 types):
- ✅ Email, Phone, SSN, Credit Card, IBAN, IP, URL
- ✅ Automatic pattern recognition
- ✅ YAML-configurable rules
Status: Production-Ready | Docs: docs/features/multi_tenancy.md
Features:
- ✅ Tenant Lifecycle - Create, Update, Delete, Enable/Disable
- ✅ Tenant Identification - Header-based (X-Tenant-ID), Path-based
- ✅ Resource Quotas - Storage, Documents, Collections, Queries, Connections
- ✅ Rate Limiting - Per-tenant requests/sec with burst control
- ✅ Feature Flags - GPU, Vector, Graph, Timeseries, Geo, Full-Text
- ✅ Encryption - Tenant-specific keys, optional mandatory encryption
- ✅ Usage Tracking - Storage, Documents, Requests, Bandwidth
- ✅ Billing Integration - Prometheus metrics export
- ✅ Data Isolation - Complete tenant separation
Status: Production-Ready | Docs: docs/features/time_series.md
Features:
- ✅ Gorilla Compression - 10-20x compression ratio
- ✅ Continuous Aggregates - Pre-computed rollups (360-3600x speedup)
- ✅ Retention Policies - Auto-expiration
- ✅ Downsampling - Multi-resolution storage
- ✅ Aggregate Scheduler - Automatic background refresh
- ✅ Query Optimizer - Cost-based aggregate rewriting
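Gorilla-style compression stores timestamps as deltas of deltas, which are mostly zero for regularly sampled series and therefore compress well. The sketch below shows only that transformation and its inverse on plain integers; the variable-length bit packing and the XOR encoding of values that a full Gorilla encoder performs are left out, and the function names are hypothetical.

```python
def encode_timestamps(timestamps):
    """Encode as [first_timestamp, first_delta, delta_of_delta, ...]."""
    if not timestamps:
        return []
    out = [timestamps[0]]
    prev_delta = 0
    for prev, cur in zip(timestamps, timestamps[1:]):
        delta = cur - prev
        out.append(delta - prev_delta)  # zero whenever the interval is unchanged
        prev_delta = delta
    return out


def decode_timestamps(encoded):
    """Invert encode_timestamps by accumulating deltas of deltas."""
    if not encoded:
        return []
    out = [encoded[0]]
    delta = 0
    for dod in encoded[1:]:
        delta += dod
        out.append(out[-1] + delta)
    return out
```

A series sampled every 10 seconds with one late sample encodes to mostly zeros, with small non-zero entries only around the irregular point.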
Performance:
- 22/22 tests passing
- Sub-millisecond query latency (with aggregates)
- Efficient storage for metrics/logs
Status: Production-Ready | Docs: docs/features/olap_analytics.md
Features:
- ✅ Aggregations - COUNT, SUM, AVG, MIN, MAX, STDDEV, VARIANCE, MEDIAN, PERCENTILE
- ✅ Grouping Operators - CUBE, ROLLUP, GROUPING SETS
- ✅ Window Functions - PARTITION BY, ORDER BY, ROWS/RANGE frames
- ✅ Columnar Store - Vectorized aggregations
- ✅ Materialized Views - Pre-computed aggregations
Window Functions:
- ROW_NUMBER, RANK, DENSE_RANK
- LAG, LEAD
- FIRST_VALUE, LAST_VALUE
- NTILE
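The three ranking functions differ only in how they treat ties. The sketch below computes them over a single partition ordered by a key in descending order; it illustrates the standard SQL semantics and is not the engine's implementation. The function name and row layout are hypothetical.

```python
def rank_rows(rows, key):
    """Return (row, row_number, rank, dense_rank) tuples, ordered by key descending."""
    ordered = sorted(rows, key=key, reverse=True)
    result = []
    rank = dense = 0
    prev = object()  # sentinel that compares unequal to any real key value
    for row_number, row in enumerate(ordered, start=1):
        value = key(row)
        if value != prev:
            rank = row_number  # RANK jumps to the row position, leaving gaps after ties
            dense += 1         # DENSE_RANK advances by exactly one
            prev = value
        result.append((row, row_number, rank, dense))
    return result
```

For scores 90, 90, 80, 70 the row numbers are 1, 2, 3, 4, the ranks are 1, 1, 3, 4, and the dense ranks are 1, 1, 2, 3.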
Status: Production-Ready | Docs: docs/features/temporal_graphs.md
Features:
- ✅ Temporal Filters - valid_from, valid_to
- ✅ Snapshot Queries - Point-in-time graph state
- ✅ Time-Range Aggregations - Edge property rollups
- ✅ Type-Aware Traversal - Filter by edge type + timestamp
API:
aggregateEdgePropertyInTimeRange(
    "user123", "FOLLOWS", "timestamp",
    from_ts, to_ts, AggregationType::COUNT
)
Status: Production-Ready (27/27 tests) | Docs: docs/architecture/mvcc_design.md
Features:
- ✅ Snapshot Isolation - Consistent reads
- ✅ Write-Write Conflict Detection - Automatic rollbacks
- ✅ Atomic Updates - Across all index layers
- ✅ Optimistic Concurrency - High throughput
Guarantees:
- Atomicity - All-or-nothing commits
- Consistency - Blob + Indexes transactional
- Isolation - Read Committed / Snapshot
- Durability - WAL-based recovery
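Snapshot isolation with optimistic write-write conflict detection can be sketched with a versioned in-memory store. This is a toy model under stated assumptions: a logical clock instead of real timestamps, first-committer-wins conflict handling, and hypothetical class names. It says nothing about how RocksDB TransactionDB implements the same guarantees.

```python
class Conflict(Exception):
    """Raised when another transaction committed a key this one also wrote."""


class Store:
    def __init__(self):
        self.versions = {}  # key -> list of (commit_ts, value), oldest first
        self.clock = 0      # logical commit timestamp

    def begin(self):
        return Txn(self, snapshot_ts=self.clock)


class Txn:
    def __init__(self, store, snapshot_ts):
        self.store = store
        self.snapshot_ts = snapshot_ts
        self.writes = {}  # buffered until commit

    def get(self, key):
        if key in self.writes:
            return self.writes[key]  # read your own writes
        # Newest version committed at or before the snapshot.
        for commit_ts, value in reversed(self.store.versions.get(key, [])):
            if commit_ts <= self.snapshot_ts:
                return value
        return None

    def put(self, key, value):
        self.writes[key] = value

    def commit(self):
        # Write-write conflict: a written key has a version newer than the snapshot.
        for key in self.writes:
            history = self.store.versions.get(key, [])
            if history and history[-1][0] > self.snapshot_ts:
                self.writes.clear()  # roll back by discarding buffered writes
                raise Conflict(key)
        self.store.clock += 1
        for key, value in self.writes.items():
            self.store.versions.setdefault(key, []).append((self.store.clock, value))
```

Two transactions that start from the same snapshot and write the same key cannot both commit: the second one sees a newer version at commit time and is rolled back, while its reads stayed consistent with its snapshot throughout.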
Status: Production-Ready | Docs: docs/features/transactions.md
Features:
- ✅ Session-Based Transactions - Long-lived sessions
- ✅ Multi-Index Support - Secondary, Graph, Vector
- ✅ Isolation Levels - read_committed, snapshot
- ✅ Statistics - Success rate, durations
API:
POST /transaction/begin
POST /transaction/commit
POST /transaction/rollback
GET /transaction/stats
Status: Production-Ready | Docs: docs/features/change_data_capture.md
Features:
- ✅ Append-Only Event Log - All mutations captured
- ✅ Incremental Consumption - Checkpointing
- ✅ SSE Streaming - Real-time event delivery (experimental)
- ✅ Backpressure Handling - Flow control
- ✅ Retention Policies - Configurable TTL
Event Types:
- INSERT, UPDATE, DELETE
- Full entity snapshots
- Metadata (timestamp, user, transaction)
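Incremental consumption with checkpoints can be sketched as an append-only list with sequence numbers. The integer checkpoint, the event fields, and the `ChangeLog` class are assumptions for illustration; the actual checkpoint format used by the CDC endpoint is not specified here.

```python
class ChangeLog:
    """Append-only event log with sequence-number checkpoints."""

    def __init__(self):
        self._events = []

    def append(self, op, key, snapshot):
        seq = len(self._events) + 1  # sequence numbers start at 1
        self._events.append({"seq": seq, "op": op, "key": key, "entity": snapshot})
        return seq

    def read_since(self, checkpoint=0, limit=100):
        """Return events with seq > checkpoint and the checkpoint to resume from."""
        batch = self._events[checkpoint:checkpoint + limit]
        next_checkpoint = batch[-1]["seq"] if batch else checkpoint
        return batch, next_checkpoint
```

A consumer stores the returned checkpoint after processing each batch and passes it to the next call, so a restart resumes where it left off without re-reading earlier events.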
API:
GET /cdc/events?since=checkpoint_123
Status: Production-Ready | Docs: docs/performance/memory_tuning.md
Storage Hierarchy:
- WAL on NVMe - Minimum commit latency
- Memtable in RAM - Fast ingestion
- Block Cache (RAM) - Hot data caching (configurable size)
- Bloom Filters (RAM) - Probabilistic key existence checks
- SSTables on SSD - Persistent storage (LZ4/ZSTD compressed)
Configuration:
storage:
  memtable_size_mb: 256
  block_cache_size_mb: 1024
  compression:
    default: lz4
    bottommost: zstd
Status: Production-Ready | Docs: docs/performance/compression_benchmarks.md
Algorithms:
- LZ4 - Balanced (33.8 MB/s write, 2.1x compression)
- ZSTD - Space-optimized (32.3 MB/s write, 2.8x compression)
- Snappy - Alternative option
Strategy:
- LZ4 for upper levels (faster)
- ZSTD for the bottommost level (better compression)
Status: Production-Ready | Docs: docs/performance/TBB_INTEGRATION.md
Intel TBB Integration:
- ✅ Task-Based Execution - Work-stealing scheduler
- ✅ Batch Processing - Parallel entity loading (batch size: 50)
- ✅ Index Scans - Parallel predicate evaluation
- ✅ Throughput - 3.5x speedup on 8-core systems
Status: Production-Ready | Docs: docs/performance/GPU_ACCELERATION_PLAN.md
CUDA Backend:
- ✅ Faiss GPU Integration
- ✅ Vector distance computation (10-50x speedup)
- ✅ Batch queries (50K-100K q/s)
Vulkan Backend:
- ✅ Cross-platform GPU compute
- ✅ Multi-vendor support (NVIDIA, AMD, Intel)
- ✅ Compute shaders for vector operations
Status: Production-Ready | Docs: docs/apis/openapi.md
Core Endpoints:
- ✅ Entities: PUT/GET/DELETE /entities/{key}
- ✅ Indexes: POST /index/create, POST /index/drop
- ✅ Queries: POST /query (relational), POST /query/aql (AQL)
- ✅ Graph: POST /graph/traverse
- ✅ Vector: POST /vector/search
- ✅ Transactions: POST /transaction/*
- ✅ Admin: POST /admin/backup, GET /admin/stats
- ✅ Monitoring: GET /health, GET /stats, GET /metrics
Content-Type:
- application/json (primary)
- application/x-velocypack (optional)
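Calling the JSON endpoints from a client needs nothing beyond an HTTP library. The sketch below builds, but does not send, two requests using the example bodies from this document. The base URL is an assumption (the port matches the Docker example in this document), the three-element vector is a stand-in for the elided example vector, and the response format is not shown because it is not specified here.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8765"  # assumed host; port taken from the Docker example


def json_post(path, body):
    """Build (but do not send) a JSON POST request for the given endpoint path."""
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + path,
        data=data,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


create_index = json_post(
    "/index/create", {"table": "users", "column": "age", "type": "range"}
)
vector_search = json_post(
    "/vector/search", {"vector": [0.1, 0.2, 0.3], "k": 10, "metric": "cosine"}
)

# To send a request against a running server:
#     with urllib.request.urlopen(create_index) as resp:
#         print(resp.status, resp.read().decode("utf-8"))
```

Keeping request construction separate from sending makes it easy to inspect or log the exact payload before it goes over the wire.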
Status: Production-Ready | File: docs/openapi.yaml
- Complete API documentation
- Request/Response schemas
- Authentication schemes
- Error codes
Status: Production-Ready | Docs: docs/apis/graphql.md
- ✅ GraphQL Parser - Query, Mutation, Subscription
- ✅ Schema Introspection - SDL Export
- ✅ Field Resolution - Nested selections
- ✅ Built-in Types - Document, Graph, Vector, Timeseries
- ✅ Error Handling - GraphQL spec compliant
- ✅ HTTP Endpoint - POST /graphql
Status: Alpha → Beta | Docs: clients/
JavaScript/TypeScript SDK:
- 🔧 Basic HTTP wrapper (Alpha)
- 📋 TypeScript definitions
- 📋 Transaction support
- 📋 Comprehensive tests
Python SDK:
- 🔧 Basic HTTP wrapper (Alpha)
- 📋 Type hints
- 📋 Async/await support
- 📋 PyPI package
Go SDK: 📋 Planned
Rust SDK: 📋 Planned
.NET SDK: 📋 Planned
Status: Production-Ready | Docs: docs/architecture/content_architecture.md
Unified Ingestion Pipeline:
- ✅ ContentTypeRegistry - MIME type detection
- ✅ Processor Routing - Domain-specific handlers
- ✅ Metadata Extraction - EXIF, GPS, Tags
- ✅ Chunking - Configurable strategies
Processors:
- ✅ Image Processor - EXIF, thumbnails, 3x3 tile chunking
- ✅ Geo Processor - GeoJSON, GPX parsing & normalization
- 📋 PDF Processor - Text extraction (planned)
- 📋 Office Processor - DOCX, XLSX (planned)
API:
POST /content/import
{
"content": {...},
"chunks": [...],
"edges": [...],
"blob": "..."
}
Status: Production-Ready | Docs: docs/geo/
Capabilities:
- ✅ R-Tree Index - Spatial search
- ✅ Geohash - Location encoding
- ✅ GeoJSON Support - Points, Lines, Polygons
- ✅ GPX Processing - Track/Route parsing
- ✅ Distance Queries - Radius search
- ✅ Relational Schema - Geo tables integration
Status: Production-Ready | Docs: docs/observability/prometheus_metrics.md
Prometheus Metrics:
- ✅ vccdb_requests_total (counter)
- ✅ vccdb_errors_total (counter)
- ✅ vccdb_qps (gauge)
- ✅ rocksdb_block_cache_usage_bytes (gauge)
- ✅ rocksdb_estimate_num_keys (gauge)
- ✅ vccdb_page_fetch_time_ms_* (histogram)
RocksDB Statistics:
- Block cache hit/miss rates
- Compaction metrics
- Memtable sizes
- Files per level (L0-L6)
API:
GET /stats # JSON format
GET /metrics # Prometheus format
Status: Production-Ready
Features:
- ✅ Distributed tracing
- ✅ Span context propagation
- ✅ Performance bottleneck detection
- ✅ OTLP exporter integration
Status: Production-Ready
spdlog Integration:
- ✅ Structured logging
- ✅ Log levels (TRACE, DEBUG, INFO, WARN, ERROR)
- ✅ File rotation
- ✅ Console + file outputs
Status: Production-Ready | Docs: docs/guides/deployment.md
Binary:
themis_server --config /etc/themis/config.yaml
Docker:
docker run -p 8765:8765 \
  -v /data:/data \
  ghcr.io/makr-code/themis:latest
Docker Compose:
docker compose up --build
Configuration Formats:
- ✅ YAML (recommended)
- ✅ JSON
- ✅ Environment variables
Status: Production-Ready
Registries:
- ✅ GHCR: ghcr.io/makr-code/themis
- ✅ Docker Hub: themisdb/themis (optional)
Tags:
- latest - Latest stable
- g<shortsha> - Git commit
- latest-x64-linux, latest-arm64-linux - Arch-specific
Multi-Arch:
- ✅ x86_64 (AMD64)
- ✅ ARM64 (aarch64)
Status: Production-Ready | Docs: docs/guides/deployment.md
Features:
- ✅ RocksDB Checkpoints - Consistent snapshots
- ✅ Point-in-Time Recovery - WAL archiving
- ✅ Incremental Backups - Scripted automation
- ✅ API Endpoint: POST /admin/backup
Scripts:
- scripts/backup.sh (Linux)
- scripts/backup.ps1 (Windows)
Status: Production-Ready | Docs: docs/admin_tools/user_guide.md
Tools (7):
- ✅ Audit Log Viewer - Search, filter, export logs
- ✅ SAGA Verifier - Distributed transaction consistency
- ✅ PII Manager - GDPR data subject requests
- ✅ Key Rotation Dashboard - LEK/KEK/DEK management
- ✅ Retention Manager - Policy-based archival
- ✅ Classification Dashboard - Data classification testing
- ✅ Compliance Reports - Automated reporting
Common Features:
- Unified Themis Design System
- Dark/Light theme
- Export (CSV, PDF, Excel)
- Real-time search & filtering
- Error handling & validation
Publish:
.\publish-all.ps1 # Build all tools to dist/
Status: Production-Ready | Docs: docs/plugins/PLUGIN_MIGRATION.md
Unified Interface:
- ✅ IPlugin - Base interface
- ✅ PluginManager - Discovery & loading
- ✅ Security verification (signature checking)
- ✅ Hot-reload support
Plugin Categories:
- ✅ Blob Storage - Filesystem, WebDAV, S3, Azure
- ✅ Compute - CUDA, Vulkan, DirectX
- 📋 Importers - PostgreSQL, MySQL, CSV
- 📋 Embeddings - Sentence-BERT, OpenAI, CLIP
- 📋 HSM - PKCS#11, Luna, CloudHSM
Benefits:
- Modular binaries (Core < 50 MB)
- On-demand loading
- Third-party extensions
- Reduced dependencies
Status: Production-Ready
Overall Coverage: 85%+
Test Suites:
- ✅ Unit Tests - Core components (269 files tested)
- ✅ Integration Tests - API endpoints, workflows
- ✅ Performance Tests - Benchmarks (Google Benchmark)
- ✅ Security Tests - Encryption, audit, HSM
Test Frameworks:
- Google Test (C++)
- Catch2 (alternative)
- Custom test harnesses
Status: Production-Ready | Docs: docs/development/code_audit_mockups_stubs.md
Static Analysis:
- ✅ clang-tidy - Modern C++ best practices
- ✅ cppcheck - Additional quality checks
- ✅ Gitleaks - Secret scanning
Formatting:
- ✅ clang-format - Consistent style
- ✅ .clang-format config (C++20, 4 spaces)
CI/CD:
- ✅ GitHub Actions (Linux + Windows)
- ✅ Coverage reporting
- ✅ Security scanning
Scripts:
./scripts/run_clang_quality_wsl.sh # Linux/WSL
.\scripts\run_clang_quality.ps1 # Windows
Status: Comprehensive | Location: docs/
Main Docs:
- ✅ GitHub Pages: https://makr-code.github.io/ThemisDB/
- ✅ Wiki: https://github.com/makr-code/ThemisDB/wiki
- ✅ Print View: PDF export available
- ✅ MkDocs: Local preview support
Categories:
- Architecture - Design docs (base_entity, mvcc, content pipeline)
- Features - Feature guides (32+ docs)
- Security - Security architecture (10+ docs)
- APIs - API references (OpenAPI, ContentFS, Hybrid Search)
- Admin Tools - Tool guides & demos
- Performance - Tuning & benchmarks
- Development - Dev guides, audits
Build Docs:
.\build-docs.ps1 # Generate site/
.\sync-wiki.ps1 # Sync to Wiki
Platform: Windows 11, i7-12700K, Release build
| Operation | Throughput | Latency (p50) | Latency (p99) |
|---|---|---|---|
| Entity PUT | 45,000 ops/s | 0.02 ms | 0.15 ms |
| Entity GET | 120,000 ops/s | 0.008 ms | 0.05 ms |
| Indexed Query | 8,500 queries/s | 0.12 ms | 0.85 ms |
| Graph Traverse (depth=3) | 3,200 ops/s | 0.31 ms | 1.2 ms |
| Vector ANN (k=10) | 1,800 queries/s | 0.55 ms | 2.1 ms |
| Index Rebuild (100K) | 12,000 entities/s | - | - |
| Algorithm | Write Throughput | Compression Ratio | Use Case |
|---|---|---|---|
| None | 34.5 MB/s | 1.0x | Development only |
| LZ4 | 33.8 MB/s | 2.1x | Default (balanced) |
| ZSTD | 32.3 MB/s | 2.8x | Bottommost (storage) |
Focus: Ecosystem & SDKs
- ✅ v1.0.0 Production Release - All P0/P1 features complete
- ✅ GPU Acceleration (CUDA/Vulkan) - 10-50x Vector speedup
- ✅ Multi-Tenancy - Complete tenant isolation
- ✅ GraphQL API - Full GraphQL server
- ✅ OLAP Analytics - CUBE, ROLLUP, Window Functions
- 🔧 JavaScript/Python SDK - Production-ready v1.0
- 🔧 Content Processors - PDF, Office support
- 🔧 CI/CD Improvements - Matrix builds, security scanning
Focus: Distributed Systems
- ✅ Distributed Sharding (Phase 1-3) - Auto-Rebalancing with Load Detection
- 📋 Replication - Leader-Follower, Multi-Master
- 📋 Multi-DC Deployment - Geo-distributed clusters
- 📋 Advanced Graph Algorithms - PageRank, Community Detection
- 📋 Streaming Analytics - Complex Event Processing
Focus: Innovation
- 📋 Multi-DC Replication - Geo-distributed
- 📋 Kubernetes Operator - Cloud-native deployment
- 📋 ML Integration - GNNs, in-database training
- 📋 Real-Time Streaming - Complex Event Processing
- 📋 Advanced Analytics - Graph algorithms, forecasting
See also: ROADMAP.md for details
Status: 100% Complete
- ✅ ACID Transactions (MVCC)
- ✅ Multi-Model Support (Relational, Graph, Vector, Document)
- ✅ Secondary Indexes (7 types)
- ✅ HNSW Persistence
- ✅ Graph Traversals (BFS, Dijkstra, A*)
- ✅ AQL Query Language
- ✅ Enterprise Security (TLS, RBAC, Encryption, Audit)
- ✅ Observability (Metrics, Tracing, Logging)
- ✅ Backup & Recovery
Current Status: ~85% Production-Ready
- Core Engine: 100%
- Security Stack: 85%
- API Layer: 95%
- Documentation: 90%
- Client SDKs: 40% (Alpha)
- GPU Acceleration: 10% (Planned)
Storage & Performance:
- RocksDB - LSM-Tree storage
- Intel TBB - Parallelization
- Apache Arrow - Columnar analytics
Serialization & Parsing:
- simdjson - High-performance JSON
- VelocyPack - Binary serialization
- msgpack - Alternative serialization
Vector Search:
- HNSWlib - ANN index
- Faiss - GPU-accelerated search (optional)
Networking:
- Boost.Asio - Async I/O
- Boost.Beast - HTTP server
- libcurl - HTTP client (WebDAV, etc.)
Security:
- OpenSSL - TLS, encryption, PKI
- PKCS#11 - HSM integration
Utilities:
- spdlog - Logging
- yaml-cpp - YAML parsing
- nlohmann/json - JSON library
Testing:
- Google Test - Unit tests
- Google Benchmark - Performance tests
Inspired by:
- ArangoDB (Multi-model architecture)
- CozoDB (Hybrid relational-graph-vector)
- Azure Cosmos DB (Multi-model with ARS format)
- RocksDB (LSM-Tree foundation)
- Faiss (Vector search)
Academic Foundations:
- MVCC (PostgreSQL/Oracle design)
- LSM-Tree (Google Bigtable, LevelDB)
- HNSW (Malkov & Yashunin 2018)
Repository: https://github.com/makr-code/ThemisDB
Issues: https://github.com/makr-code/ThemisDB/issues
Discussions: https://github.com/makr-code/ThemisDB/discussions
Wiki: https://github.com/makr-code/ThemisDB/wiki
Documentation:
- Online: https://makr-code.github.io/ThemisDB/
- PDF: https://makr-code.github.io/ThemisDB/themisdb-docs-complete.pdf
MIT License - See LICENSE file for details
Last updated: 21 November 2025
Updated: 2025-11-30