@jagan-jaya
Collaborator

@jagan-jaya jagan-jaya commented Jul 16, 2025

This pull request introduces a comprehensive analytics infrastructure for the Decision Engine project, leveraging Kafka and ClickHouse for real-time event tracking and data aggregation. It also includes updates to configuration files, Docker setup, and database migrations to support the new analytics system.

Analytics Infrastructure Setup

  • analytics/README.md: Added detailed documentation for the analytics system architecture, components, configuration, database schema, and troubleshooting steps. This includes instructions for enabling analytics and querying data.
  • analytics/migrations/001_routing_events.sql: Created a consolidated SQL migration file for setting up the ClickHouse database schema, Kafka integration, and materialized views for real-time event processing.
  • analytics/run_migrations.sh: Added a shell script to automate the execution of analytics migrations in ClickHouse.

Dependency and Configuration Updates

  • Cargo.toml: Added new dependencies (http-body-util, kafka, and clickhouse) required for analytics integration.
  • config/development.toml and config/docker-configuration.toml: Updated configurations to include analytics settings for Kafka and ClickHouse. Analytics is enabled in development but disabled by default in the Docker configuration.

Docker and Service Updates

  • docker-compose.yaml:
    • Renamed container names for clarity (e.g., open-router-local, postgres-db).
    • Added new services for analytics infrastructure: Zookeeper, Kafka, ClickHouse, and an analytics migrator. Defined health checks and profiles for these services.
    • Introduced a new volume for ClickHouse data storage.

Run ./scripts/test_analytics.sh, and change the configs below to test directly with the application using cargo r.

In Cargo.toml

[features]
default = ["middleware","postgres"]

In config/development.toml

[analytics]
enabled = true

@jagan-jaya jagan-jaya changed the title from "feat(analytics): add support for analytics in DE" to "feat(analytics): add support for analytics" Jul 16, 2025
@jagan-jaya jagan-jaya marked this pull request as ready for review July 16, 2025 13:33
Copilot AI review requested due to automatic review settings July 16, 2025 13:33
@jagan-jaya jagan-jaya self-assigned this Jul 16, 2025
Contributor

Copilot AI left a comment

Pull Request Overview

This pull request introduces a comprehensive analytics infrastructure for the Decision Engine project, integrating Kafka and ClickHouse for real-time event tracking and data aggregation. The implementation provides automated middleware to capture routing events, batch processing for efficient data publishing, and a complete analytics schema for querying routing performance metrics.
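The batch processing mentioned above could look roughly like the following. This is a minimal, standard-library-only sketch and not the PR's implementation: `RoutingEvent`, `EventBatcher`, the field names, and the size and age thresholds are all hypothetical, and the actual Kafka publishing step is left out.

```rust
use std::time::{Duration, Instant};

/// Hypothetical routing event; the fields are illustrative only.
#[derive(Debug, Clone, PartialEq)]
pub struct RoutingEvent {
    pub endpoint: String,
    pub status: u16,
}

/// Buffers events and returns them as one batch once either the size
/// limit or the age limit is reached (both limits are assumptions).
pub struct EventBatcher {
    buffer: Vec<RoutingEvent>,
    max_batch_size: usize,
    max_batch_age: Duration,
    /// Time at which the oldest buffered event was added.
    oldest: Option<Instant>,
}

impl EventBatcher {
    pub fn new(max_batch_size: usize, max_batch_age: Duration) -> Self {
        Self {
            buffer: Vec::with_capacity(max_batch_size),
            max_batch_size,
            max_batch_age,
            oldest: None,
        }
    }

    /// Adds an event and returns the whole batch when a flush is due.
    pub fn push(&mut self, event: RoutingEvent) -> Option<Vec<RoutingEvent>> {
        if self.buffer.is_empty() {
            self.oldest = Some(Instant::now());
        }
        self.buffer.push(event);
        if self.should_flush() {
            Some(self.flush())
        } else {
            None
        }
    }

    fn should_flush(&self) -> bool {
        let too_old = self
            .oldest
            .map_or(false, |t| t.elapsed() >= self.max_batch_age);
        self.buffer.len() >= self.max_batch_size || too_old
    }

    /// Drains the buffer unconditionally, for example on shutdown.
    pub fn flush(&mut self) -> Vec<RoutingEvent> {
        self.oldest = None;
        std::mem::take(&mut self.buffer)
    }
}
```

A caller would hand each batch returned by `push` to the publisher in a single call, and call `flush` once on shutdown so that buffered events are not lost.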

Key Changes:

  • Analytics infrastructure with Kafka producer and ClickHouse integration for real-time event processing
  • Analytics middleware that automatically captures events for routing endpoints (/routing/evaluate, /decide-gateway)
  • Docker configuration updates adding Zookeeper, Kafka, ClickHouse, and analytics migrator services
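As an illustration of how a middleware might decide which requests produce events, here is a small sketch. Only the two endpoint paths come from the summary above; the function name `should_track`, the exact-match rule, and the handling of query strings and trailing slashes are assumptions, not the behavior of the PR's middleware.

```rust
/// Endpoints named in the review summary as being tracked.
const TRACKED_ENDPOINTS: [&str; 2] = ["/routing/evaluate", "/decide-gateway"];

/// Returns true when a request path should produce an analytics event.
/// Assumption: the query string and a trailing slash are ignored, and
/// the remaining path must match a tracked endpoint exactly.
pub fn should_track(path: &str) -> bool {
    // Drop anything after the first '?'.
    let path = path.split('?').next().unwrap_or(path);
    // Normalize "/decide-gateway/" to "/decide-gateway", but keep "/" as is.
    let path = if path.len() > 1 {
        path.trim_end_matches('/')
    } else {
        path
    };
    TRACKED_ENDPOINTS.iter().any(|endpoint| *endpoint == path)
}
```

Exact matching keeps unrelated routes such as health checks out of the event stream; a prefix match would be the alternative if sub-routes should also be captured.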

Reviewed Changes

Copilot reviewed 18 out of 19 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| src/analytics/* | Core analytics implementation including client, event handling, Kafka producer, and middleware |
| docker-compose.yaml | Added analytics services (Zookeeper, Kafka, ClickHouse) with health checks and profiles |
| config/*.toml | Analytics configuration for Kafka and ClickHouse connections |
| analytics/ | Database migrations, documentation, and testing scripts for analytics setup |
| Cargo.toml | Added dependencies for Kafka, ClickHouse, and HTTP body utilities |
Comments suppressed due to low confidence (2)

src/analytics/client.rs:82

  • Using unwrap() on the fallback AnalyticsClient creation could still panic if the disabled config is invalid. The error should be handled more gracefully, possibly by logging and continuing without analytics.
    }
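One way to follow this suggestion is to make client construction return an `Option` and log the failure instead of unwrapping. The sketch below uses stand-in types: `AnalyticsConfig`, this `AnalyticsClient`, `init_analytics`, and the "no brokers" validation are hypothetical and do not mirror the code in src/analytics/client.rs.

```rust
/// Hypothetical configuration; the real config in the PR may differ.
#[derive(Debug, Clone)]
pub struct AnalyticsConfig {
    pub enabled: bool,
    pub kafka_brokers: Vec<String>,
}

/// Hypothetical stand-in for the PR's analytics client.
#[derive(Debug)]
pub struct AnalyticsClient {
    brokers: Vec<String>,
}

impl AnalyticsClient {
    /// Fails on an invalid config instead of panicking.
    pub fn new(config: &AnalyticsConfig) -> Result<Self, String> {
        if config.kafka_brokers.is_empty() {
            return Err("no Kafka brokers configured".to_string());
        }
        Ok(Self {
            brokers: config.kafka_brokers.clone(),
        })
    }

    pub fn broker_count(&self) -> usize {
        self.brokers.len()
    }
}

/// Returns `None` when analytics is disabled or cannot be set up, so
/// the application keeps running without analytics and never unwraps.
pub fn init_analytics(config: &AnalyticsConfig) -> Option<AnalyticsClient> {
    if !config.enabled {
        return None;
    }
    match AnalyticsClient::new(config) {
        Ok(client) => Some(client),
        Err(err) => {
            // Log and continue; eprintln! stands in for a real logger.
            eprintln!("analytics disabled: {}", err);
            None
        }
    }
}
```

Callers then treat a missing client as "analytics off", which removes the need for a fallback client whose construction could itself fail.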

scripts/test_analytics.sh:7

  • The script tests connectivity from the Kafka container to itself using 'kafka' hostname, but the container name is 'open-router-kafka'. This test will likely fail due to hostname resolution issues.
echo "=========================================="

Development

Successfully merging this pull request may close these issues.

Add framework to collect routing events with kafka and clickhouse installed in docker