Augment the Test Harness for Conducting Workloads #103

@vonjackets

Description

Epic: Develop a Harness Framework for the Agents, Actor-Oriented Test Execution, and End-to-End Dhall-Defined Evaluations

This epic covers the design and implementation of a cohesive harness system responsible for orchestrating provenance actors, domain-specific consumers, mock infrastructure, and supporting testcontainers, all driven by declarative specifications written in Dhall. The goal is to bridge static configuration with dynamic test execution, so that an end-to-end execution of various workflows can be defined, instantiated, and verified consistently across environments. The harness must interpret Dhall-encoded test plans, stand up required services (including Neo4j, web services, etc.), spawn actors according to their operational topology, route messages through mock or real dependencies, and emit structured provenance events suitable for inspection. Unlike ad hoc integration scripts or opaque BDD frameworks, the harness is expected to provide deterministic, reproducible execution that remains faithful to the semantics of the agent ecosystem.

The core challenge addressed in this epic is designing the contract between Dhall and code. Dhall expresses types, test plans, expected agent topologies, and their associated “scenarios,” while the Rust controller layer must translate those values into running systems. This epic includes defining a Dhall library that models actor types, consumer roles, environment graphs, and test steps with enough precision that they can be evaluated at compile time and converted into runtime instructions for the harness. The epic also requires augmenting the controller responsible for ingesting these Dhall values, provisioning supporting infrastructure, launching actors according to the provenance system’s supervisor trees, routing messages between actors according to the declared test plan, and capturing outputs for verification. This system must become the canonical orchestration engine for provenance-related testing and should replace the scattered scripts, hand-rolled mocks, or one-off harnesses currently in use.
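As a starting point for discussion, the Rust side of that contract could be sketched as plain data types plus a validation pass run before anything is provisioned. Every name below (ActorKind, ActorSpec, TestStep, TestPlan, PlanError, validate) is hypothetical and not taken from the existing codebase, and the set of step variants is an assumption. How Dhall values are decoded into these types is left open here.

```rust
use std::collections::HashSet;

/// Hypothetical mirror of a Dhall union describing what an actor is.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ActorKind {
    ProvenanceActor,
    Consumer { role: String },
}

/// One declared node in the expected agent topology.
#[derive(Debug, Clone)]
pub struct ActorSpec {
    pub name: String,
    pub kind: ActorKind,
}

/// Hypothetical test steps; the real set of variants is still to be designed.
#[derive(Debug, Clone)]
pub enum TestStep {
    StartActor { actor: String },
    InjectEvent { target: String, payload: String },
    ExpectProvenance { from: String, contains: String },
    StopActor { actor: String },
}

/// A whole test plan as the controller might receive it.
#[derive(Debug, Clone)]
pub struct TestPlan {
    pub name: String,
    pub actors: Vec<ActorSpec>,
    pub steps: Vec<TestStep>,
}

#[derive(Debug, PartialEq, Eq)]
pub enum PlanError {
    DuplicateActor(String),
    UnknownActor { step: usize, actor: String },
}

/// Rejects plans that declare an actor twice or whose steps refer to
/// actors that were never declared.
pub fn validate(plan: &TestPlan) -> Result<(), PlanError> {
    let mut declared: HashSet<&str> = HashSet::new();
    for spec in &plan.actors {
        if !declared.insert(spec.name.as_str()) {
            return Err(PlanError::DuplicateActor(spec.name.clone()));
        }
    }
    for (index, step) in plan.steps.iter().enumerate() {
        // Every step variant names exactly one actor; pick it out.
        let actor = match step {
            TestStep::StartActor { actor } | TestStep::StopActor { actor } => actor,
            TestStep::InjectEvent { target, .. } => target,
            TestStep::ExpectProvenance { from, .. } => from,
        };
        if !declared.contains(actor.as_str()) {
            return Err(PlanError::UnknownActor { step: index, actor: actor.clone() });
        }
    }
    Ok(())
}
```

Checks like these could arguably live in Dhall's type system instead. Repeating them in the controller is a defensive choice for the case where the input did not come from the shared Dhall library.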

This epic also encompasses the authoring of a supporting Rust library encapsulating interaction patterns with the agents themselves. This includes the standardized lifecycle hooks for actor startup, shutdown, subscription behavior, and mock event injection. It must unify interactions across both real consumers and test fakes. The harness will be responsible for sequencing test scenarios—starting specific actors, injecting events into queues or transports, verifying consumer reactions, shutting down subsets of the topology, and asserting provenance output consistency. The harness must follow a predictable execution flow derived from Dhall: parse configuration, materialize an execution graph, start containers, initialize actors, run scenarios, gather results, and tear down. All of this must operate headless and be composable within CI.
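One possible shape for the lifecycle hooks described above is a single trait implemented by both real consumers and test fakes, with the harness driving every actor through the same start, inject, shutdown sequence. This is a minimal synchronous sketch. The names ManagedActor, FakeActor, and run_scenario are hypothetical, and a real implementation would likely be asynchronous and tied to the actor framework already in use.

```rust
/// Hypothetical lifecycle hooks shared by real consumers and test fakes.
pub trait ManagedActor {
    fn start(&mut self) -> Result<(), String>;
    fn inject(&mut self, event: &str) -> Result<(), String>;
    fn shutdown(&mut self) -> Result<(), String>;
}

/// Test fake that records every hook call so a scenario can assert on it.
#[derive(Debug, Default)]
pub struct FakeActor {
    pub name: String,
    pub running: bool,
    pub log: Vec<String>,
}

impl FakeActor {
    pub fn new(name: &str) -> Self {
        FakeActor { name: name.to_string(), ..Default::default() }
    }
}

impl ManagedActor for FakeActor {
    fn start(&mut self) -> Result<(), String> {
        if self.running {
            return Err(format!("{} already started", self.name));
        }
        self.running = true;
        self.log.push("start".to_string());
        Ok(())
    }

    fn inject(&mut self, event: &str) -> Result<(), String> {
        if !self.running {
            return Err(format!("{} is not running", self.name));
        }
        self.log.push(format!("event:{}", event));
        Ok(())
    }

    fn shutdown(&mut self) -> Result<(), String> {
        if !self.running {
            return Err(format!("{} is not running", self.name));
        }
        self.running = false;
        self.log.push("shutdown".to_string());
        Ok(())
    }
}

/// Starts every actor, delivers each event to every actor, then shuts down
/// the actors that were started, in reverse start order. Shutdown is
/// attempted even when an earlier phase failed; the first error is returned.
pub fn run_scenario<A: ManagedActor>(actors: &mut [A], events: &[&str]) -> Result<(), String> {
    let mut started = 0;
    let mut outcome: Result<(), String> = Ok(());

    for actor in actors.iter_mut() {
        if let Err(e) = actor.start() {
            outcome = Err(e);
            break;
        }
        started += 1;
    }

    if outcome.is_ok() {
        'events: for &event in events {
            for actor in actors.iter_mut() {
                if let Err(e) = actor.inject(event) {
                    outcome = Err(e);
                    break 'events;
                }
            }
        }
    }

    for actor in actors[..started].iter_mut().rev() {
        if let Err(e) = actor.shutdown() {
            if outcome.is_ok() {
                outcome = Err(e);
            }
        }
    }
    outcome
}
```

Because run_scenario is generic over the trait, the same sequencing code can drive a real consumer or a FakeActor. Tearing down even after a failure keeps runs repeatable in CI.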

Success for this epic is defined by establishing a full top-to-bottom lifecycle: a Dhall test plan that can be evaluated and validated locally; a harness controller that consumes its compiled JSON; a stable runtime capable of standing up ephemeral test infra; actor orchestration that mirrors the production system as closely as reasonable; and repeatable evaluations producing structured outputs. At completion, engineers should be able to define a test by authoring a Dhall file rather than writing imperative scripts, and the harness should take responsibility for establishing the entire environment and executing the test end-to-end.

Metadata

Assignees

No one assigned

    Labels

    Epic; documentation (Improvements or additions to documentation); enhancement (New feature or request)

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests