A modular, multilingual, and multimodal Retrieval-Augmented Generation (RAG) system tailored for the financial analysis of Public Investment Fund (PIF) annual reports.
The framework builds upon the design principles of M3DocRAG, extending it with domain-specific adaptations for financial document understanding in both Arabic and English.
Watch the full demo on YouTube: PIF-Multimodal-RAG Demo
- Modular FastAPI backend (src/main.py), with Celery for background tasks such as indexing reports into the vector DB.
- Vector database integration (Qdrant) for dense retrieval (src/stores/vectordb/providers/qdrant_provider.py).
- PDF ingestion and caching (assets/pif-annual-reports/).
- Web frontend (webapp/README.md) for interactive analysis.
- Prometheus metrics (src/utils/metrics.py).
- Dockerized deployment (docker/docker-compose.yml).
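
To illustrate what dense retrieval means in the feature list above, here is a minimal, dependency-free sketch that ranks stored page embeddings by cosine similarity. It does not use the Qdrant client or this project's provider class. The page IDs, vectors, and function names are made up for illustration, and a real deployment would delegate the search to the vector database.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector has no direction, so treat it as dissimilar
    return dot / (norm_a * norm_b)


def top_k(query, index, k=2):
    """Return the k (page_id, score) pairs most similar to the query."""
    scored = [(page_id, cosine_similarity(query, vec)) for page_id, vec in index.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


# Hypothetical toy index: page IDs mapped to 3-dimensional embeddings.
toy_index = {
    "report-2022-p4": [1.0, 0.0, 0.0],
    "report-2022-p9": [0.0, 1.0, 0.0],
    "report-2023-p2": [0.7, 0.7, 0.0],
}

results = top_k([1.0, 0.1, 0.0], toy_index, k=2)
print(results)
```

In practice the embeddings would come from an encoder model and have hundreds of dimensions; the ranking idea stays the same.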
- Run Kaggle Notebook:
  - This notebook requires your NGROK_AUTHENTICATION and HF TOKEN tokens.
  - After running the notebook, copy the generated ngrok URL.
  - For Docker deployment (recommended), paste the URL into docker/env/.env.app as KAGGLE_NGROK_API_URL. For quick local development, put it in your root .env file.
- Download the reports into assets/pif-annual-reports/.
- Set Environment Variables:
  - Edit docker/env/.env.app (for Docker) or .env (for local dev).
- Start Services:

  docker compose -f docker/docker-compose.yml up -d --build

  Open http://localhost for the UI. The API is proxied at /api/v1/*.
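
Before starting the services, it can help to confirm that the env file actually contains KAGGLE_NGROK_API_URL. The script below is a small sketch, not part of the repository: the helper names `load_env_file` and `missing_keys` are hypothetical, and the parser only handles plain KEY=VALUE lines, which is an assumption about how the env file is written.

```python
from pathlib import Path


def load_env_file(path):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw_line in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(env, required):
    """Return the required keys that are absent or empty."""
    return [key for key in required if not env.get(key)]


if __name__ == "__main__":
    env_path = Path("docker/env/.env.app")  # use ".env" for local development
    if env_path.exists():
        missing = missing_keys(load_env_file(env_path), ["KAGGLE_NGROK_API_URL"])
        print("Missing keys:", missing or "none")
    else:
        print(f"{env_path} not found")
```

Run it from the repository root so the relative path resolves.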
- src/: Backend source code (src/README.md)
- webapp/: React frontend (webapp/README.md)
- assets/: PDF reports and cached images (assets/README.md)
- docker/: Docker configs (docker/README.md)
- tests/: Unit and integration tests
- src/main.py: FastAPI app entrypoint
- src/celery_app.py: Celery worker setup
- src/controllers/rag_controller.py: RAG orchestration
- src/routes/generation.py: Answer and compare endpoints
- src/models/asset_model.py: Asset DB model
- src/stores/vectordb/providers/qdrant_provider.py: Qdrant integration
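
As a rough sketch of how a client might talk to the proxied API, the snippet below builds a JSON POST request under the /api/v1 prefix using only the standard library. The route `generation/answer` and the `question` field are placeholders chosen for illustration, not confirmed endpoints; check src/routes/generation.py for the actual paths and request schema.

```python
import json
from urllib.request import Request

BASE_URL = "http://localhost/api/v1"  # proxy prefix from the setup steps


def build_json_post(route, payload):
    """Build (but do not send) a JSON POST request under the API prefix."""
    url = f"{BASE_URL}/{route.lstrip('/')}"
    # ensure_ascii=False keeps Arabic text readable in the encoded body.
    body = json.dumps(payload, ensure_ascii=False).encode("utf-8")
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical route and payload, for illustration only.
request = build_json_post("generation/answer", {"question": "What were total assets in 2023?"})
print(request.get_method(), request.full_url)
```

Once the services are running, the request could be sent with `urllib.request.urlopen(request)`.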
See LICENSE.
For more details, see the linked READMEs in each subdirectory.
