This project demonstrates different approaches to implementing real-time data integration and transformation in an e-commerce setting, comparing PostgreSQL views, batch computation (cached tables), and Materialize for real-time price updates. It provides real-time visualization of query performance, data freshness, and system behavior.
Real-time price comparison across three implementations (see the sketch below):
- PostgreSQL View (direct query)
- Batch Computation (cached table)
- Materialize (incremental view maintenance)
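The comparison boils down to three ways of answering the same price lookup. Below is a minimal sketch in Python, assuming psycopg and hypothetical relation names (`dynamic_prices`, `dynamic_prices_cache`, `dynamic_prices_mv`); the demo's actual schema may differ:

```python
# Sketch: the same price lookup served three ways. The relation
# names are hypothetical stand-ins for the demo's actual schema.
import psycopg

PG_DSN = "host=localhost dbname=postgres user=postgres password=postgres"
MZ_DSN = "host=localhost port=6875 dbname=materialize user=materialize password=materialize"

def price_via_view(product_id: int) -> float:
    # PostgreSQL view: recomputed from base tables on every query.
    with psycopg.connect(PG_DSN) as conn:
        row = conn.execute(
            "SELECT price FROM dynamic_prices WHERE product_id = %s",
            (product_id,),
        ).fetchone()
        return row[0]

def price_via_cache(product_id: int) -> float:
    # Batch computation: reads a periodically refreshed cache table.
    with psycopg.connect(PG_DSN) as conn:
        row = conn.execute(
            "SELECT price FROM dynamic_prices_cache WHERE product_id = %s",
            (product_id,),
        ).fetchone()
        return row[0]

def price_via_materialize(product_id: int) -> float:
    # Materialize: reads an incrementally maintained view.
    with psycopg.connect(MZ_DSN) as conn:
        row = conn.execute(
            "SELECT price FROM dynamic_prices_mv WHERE product_id = %s",
            (product_id,),
        ).fetchone()
        return row[0]
```

Because Materialize speaks the PostgreSQL wire protocol, the same driver covers all three paths.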
Performance monitoring (see the measurement sketch below):
- Query latency tracking for all implementations
- Query throughput (QPS) measurements
- End-to-end reaction time analysis
- Database size monitoring
- CPU and Memory usage tracking
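A lightweight way to collect the latency and QPS numbers is to time each query and keep a sliding window of samples. This sketch is illustrative, not the demo's actual implementation:

```python
# Sketch: timing a query callable and deriving QPS from a sliding
# window of samples; metric names are illustrative only.
import time
from collections import deque

class QueryMetrics:
    def __init__(self, window_seconds: float = 10.0):
        self.window = window_seconds
        self.samples = deque()  # (wall_clock_time, latency_seconds)

    def timed(self, run_query, *args):
        start = time.perf_counter()
        result = run_query(*args)
        latency = time.perf_counter() - start
        now = time.time()
        self.samples.append((now, latency))
        # Drop samples that have aged out of the sliding window.
        while self.samples and self.samples[0][0] < now - self.window:
            self.samples.popleft()
        return result, latency

    @property
    def qps(self) -> float:
        # Queries completed per second over the sliding window.
        return len(self.samples) / self.window
```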
Data freshness tracking (see the probe sketch below):
- Batch computation refresh age
- Materialize replication lag
- Cache rehydration time statistics
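Freshness can be probed directly in SQL. The sketch below assumes the cache table carries a `refreshed_at` column and the Materialize view exposes an `updated_at` column; both names are hypothetical:

```python
# Sketch: two freshness probes against hypothetical columns.
def batch_refresh_age_seconds(pg_conn) -> float:
    # Age of the cached table: time since its last refresh.
    row = pg_conn.execute(
        "SELECT EXTRACT(EPOCH FROM now() - max(refreshed_at)) "
        "FROM dynamic_prices_cache"
    ).fetchone()
    return float(row[0])

def materialize_lag_seconds(mz_conn) -> float:
    # Rough lag proxy: how far the newest update visible in the
    # view trails the wall clock.
    row = mz_conn.execute(
        "SELECT EXTRACT(EPOCH FROM now() - max(updated_at)) "
        "FROM dynamic_prices_mv"
    ).fetchone()
    return float(row[0])
```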
Interactive scenarios:
- Direct View Queries
- Add Batch Computation
- Add Materialize
- Full Query Offload (CQRS)
Visual analytics:
- Real-time price updates with visual feedback
- Query latency charts
- CPU and Memory usage charts
- Comprehensive statistics table
- Interactive RAG response component:
  - Real-time latency display
  - Dynamic response updates based on OLTP data
  - Scenario-aware latency tracking
  - Smooth transitions between query modes
The demo showcases three key architectural patterns:
Query Offload (see the routing sketch below):
- Offload complex analytical queries from operational database
- Maintain OLTP performance while enabling analytics
- Scale read capacity independently
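In code, query offload amounts to routing: writes stay on the operational PostgreSQL database, reads go to Materialize. A hedged sketch, reusing the hypothetical relations from earlier and pool objects like those in the pooling sketch further down:

```python
# Sketch of the query-offload pattern: OLTP writes on PostgreSQL,
# analytical reads on Materialize. Pool objects are assumed.
def record_order(pg_pool, product_id: int, quantity: int) -> None:
    # Write path stays on the operational database.
    with pg_pool.connection() as conn:
        conn.execute(
            "INSERT INTO orders (product_id, quantity) VALUES (%s, %s)",
            (product_id, quantity),
        )

def read_price(mz_pool, product_id: int):
    # Read path is offloaded to the incrementally maintained view,
    # keeping analytical load off the OLTP database.
    with mz_pool.connection() as conn:
        return conn.execute(
            "SELECT price FROM dynamic_prices_mv WHERE product_id = %s",
            (product_id,),
        ).fetchone()
```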
Integration Hub:
- Create a real-time operational data store
- Integrate data from multiple sources
- Enable real-time data products
Operational Data Mesh:
- Create decentralized, domain-oriented data products
- Maintain real-time consistency
- Enable domain-driven data architecture
The system tracks several performance metrics (see the aggregation sketch after this list):
- Query latency (time to execute the query)
- End-to-end latency (total time including data propagation)
- Queries per second (QPS)
- Statistical aggregates (max, average, p99)
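The aggregates are straightforward to compute from raw samples; this sketch uses a nearest-rank p99 to stay dependency-free (illustrative, not the demo's exact code):

```python
import math

def latency_stats(samples: list[float]) -> dict:
    # Max, average, and nearest-rank p99 over raw latency samples
    # (in seconds). Empty input yields zeros.
    if not samples:
        return {"max": 0.0, "avg": 0.0, "p99": 0.0}
    ordered = sorted(samples)
    # Nearest-rank p99: the smallest sample with at least 99% of
    # all samples at or below it.
    idx = math.ceil(0.99 * len(ordered)) - 1
    return {
        "max": ordered[-1],
        "avg": sum(ordered) / len(ordered),
        "p99": ordered[idx],
    }
```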
CPU Usage:
- Per-container CPU utilization
- Real-time CPU metrics
- Historical CPU trends
Memory Usage (see the container-sampling sketch below):
- Container memory consumption
- Database size tracking
- Memory utilization patterns
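One plausible way to gather per-container CPU and memory figures is the Docker SDK for Python (docker-py); the percentage formula mirrors what `docker stats` reports, so treat the result as an approximation:

```python
# Sketch: per-container CPU and memory sampling via docker-py.
import docker

def container_usage() -> dict:
    client = docker.from_env()
    usage = {}
    for c in client.containers.list():
        s = c.stats(stream=False)  # one-shot snapshot
        cpu, pre = s["cpu_stats"], s["precpu_stats"]
        cpu_delta = (cpu["cpu_usage"]["total_usage"]
                     - pre["cpu_usage"]["total_usage"])
        sys_delta = (cpu.get("system_cpu_usage", 0)
                     - pre.get("system_cpu_usage", 0))
        ncpus = cpu.get("online_cpus", 1)
        # Same shape as the `docker stats` CPU% computation.
        cpu_pct = (cpu_delta / sys_delta) * ncpus * 100.0 if sys_delta > 0 else 0.0
        usage[c.name] = {
            "cpu_percent": round(cpu_pct, 2),
            "memory_bytes": s["memory_stats"].get("usage", 0),
        }
    return usage
```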
The application maintains separate connection pools for:
- PostgreSQL: direct view queries and batch computation
- Materialize: real-time materialized views

Connection handling (sketched below) also provides:
- Graceful degradation on connection issues
- Automatic retry mechanisms
- Visual error feedback in the UI
- Timeout handling for long-running operations
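A minimal pooling-plus-retry sketch using psycopg_pool, with DSNs matching the sample .env further down; the retry policy and timeouts are illustrative:

```python
import time
import psycopg
import psycopg_pool

# One pool per backend; Materialize speaks the PostgreSQL protocol,
# so the same driver and pool implementation cover both.
pg_pool = psycopg_pool.ConnectionPool(
    "host=localhost dbname=postgres user=postgres password=postgres",
    min_size=1, max_size=10,
)
mz_pool = psycopg_pool.ConnectionPool(
    "host=localhost port=6875 dbname=materialize "
    "user=materialize password=materialize",
    min_size=1, max_size=10,
)

def query_with_retry(pool, sql, params=(), attempts=3, backoff=0.5):
    # Bounded retries with linear backoff; the final failure is
    # re-raised so the UI can surface it visually.
    for attempt in range(attempts):
        try:
            with pool.connection(timeout=5) as conn:
                return conn.execute(sql, params).fetchall()
        except (psycopg.OperationalError, psycopg_pool.PoolTimeout):
            if attempt == attempts - 1:
                raise
            time.sleep(backoff * (attempt + 1))
```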
```
.
├── backend/
│   ├── app/
│   │   ├── main.py          # FastAPI application
│   │   └── database.py      # Database operations and metrics
│   └── requirements.txt     # Python dependencies
├── frontend/
│   ├── src/
│   │   ├── App.jsx          # Main React application
│   │   └── components/      # React components
│   └── package.json         # Node.js dependencies
└── scripts/                 # Setup and utility scripts
```
⚠️ Hardware Requirements
- CPU: 8+ cores recommended
- RAM: 12GB+ recommended (ensure Docker has access to at least this much memory)
For systems with limited resources, you can use the reduced dataset option:

```
REDUCED_DATA=true docker compose up
```
The easiest way to run the application is using Docker Compose:

```
docker compose up
```

This will start all required services:
- PostgreSQL database
- Materialize instance
- Backend API server
- Frontend development server
Once all services are running, visit http://localhost:5173 to access the application.
This demo supports two industry verticals, each showcasing real-time data integration for a different domain:
- FreshMart: a retailer that dynamically adjusts pricing based on inventory, demand, and promotions.
- FreshFund: a financial institution that tracks securities pricing, portfolio values, and capital allocation.
Set the DEMO environment variable before running:

```
DEMO=freshfund docker compose up
```

or

```
DEMO=freshmart docker compose up
```

You will always need to destroy volumes before switching demos:

```
docker compose down --volumes
```

See VERTICAL.md for more details on adding a new vertical.
- PostgreSQL database (14+)
- Materialize instance (v2024.01.1+)
- Python 3.9+
- Node.js 18+
Create a .env file in the backend directory:
```
# PostgreSQL configuration
DB_HOST=localhost
DB_NAME=postgres
DB_USER=postgres
DB_PASSWORD=postgres

# Materialize configuration
MZ_HOST=localhost
MZ_PORT=6875
MZ_USER=materialize
MZ_PASSWORD=materialize
MZ_NAME=materialize
```
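For illustration, here is one way the backend could assemble connection strings from this file, assuming python-dotenv; only the variable names above are given by the demo:

```python
# Sketch: building DSNs from the .env variables above.
import os
from dotenv import load_dotenv

load_dotenv()  # reads backend/.env into the process environment

PG_DSN = (
    f"host={os.getenv('DB_HOST', 'localhost')} "
    f"dbname={os.getenv('DB_NAME', 'postgres')} "
    f"user={os.getenv('DB_USER', 'postgres')} "
    f"password={os.getenv('DB_PASSWORD', 'postgres')}"
)
MZ_DSN = (
    f"host={os.getenv('MZ_HOST', 'localhost')} "
    f"port={os.getenv('MZ_PORT', '6875')} "
    f"dbname={os.getenv('MZ_NAME', 'materialize')} "
    f"user={os.getenv('MZ_USER', 'materialize')} "
    f"password={os.getenv('MZ_PASSWORD', 'materialize')}"
)
```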
- Install Python dependencies:

  ```
  cd backend
  pip install -r requirements.txt
  ```

- Start the FastAPI server:

  ```
  uvicorn app.main:app --reload
  ```

- Install Node.js dependencies:

  ```
  cd frontend
  npm install
  ```

- Start the development server:

  ```
  npm run dev
  ```
Access the application at http://localhost:5173
Choose a scenario to explore:
- Direct View Queries: Baseline implementation
- Add Batch Computation: Introduce caching
- Add Materialize: Enable real-time updates
- Full Query Offload: Complete CQRS pattern
Monitor the real-time displays:
- Price comparisons across implementations
- Query performance metrics
- CPU and memory usage
- System statistics
Analyze performance through:
- Query latency charts
- Resource utilization graphs
- Statistical summaries
- Reaction time analysis
The demo illustrates the trade-offs between different implementation approaches:
PostgreSQL View:
- Always fresh data
- Higher latency
- Direct query overhead
- Resource-intensive for complex queries
Batch Computation:
- Low query latency
- Periodic refresh overhead
- Configurable freshness
- Predictable performance
Materialize:
- Real-time updates
- Low query latency
- Incremental maintenance
- Efficient resource utilization
These characteristics make it easier to judge which approach best fits a given real-time data integration use case.
