Open
Description
Problem Statement
Pipecat’s current TTS services only support JSON-based WebSocket protocols. The Arcana (Rime) TTS model streams raw binary audio frames instead of JSON messages, so the existing TTS service infrastructure cannot process Arcana output. This prevents users from using Arcana for real-time speech generation within the Pipecat pipeline.
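To make the mismatch concrete, here is a minimal sketch of a receive-side helper that handles both kinds of WebSocket message. The JSON envelope shape (a base64 string under an `audio` key) and the function name `extract_audio` are assumptions for illustration only; they are not taken from Pipecat or from Rime's documentation.

```python
import base64
import json
from typing import Optional, Union


def extract_audio(message: Union[str, bytes]) -> Optional[bytes]:
    """Return audio bytes carried by one WebSocket message, or None if there are none."""
    if isinstance(message, (bytes, bytearray)):
        # Binary frame: the payload already is audio, so pass it through unchanged.
        return bytes(message)
    # Text frame: assume a JSON envelope with a hypothetical base64 "audio" field.
    try:
        payload = json.loads(message)
    except json.JSONDecodeError:
        return None
    if not isinstance(payload, dict):
        return None
    encoded = payload.get("audio")
    return base64.b64decode(encoded) if isinstance(encoded, str) else None
```

A receive loop written only around the JSON branch would have no path for binary frames, which is the gap described above.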
Proposed Solution
Implement a new RimeArcanaTTSService that adapts Pipecat’s TTS interface to Arcana’s non-JSON WebSocket protocol.
The service should:
- Establish a WebSocket connection using Arcana’s query-parameter configuration.
- Send plain text input (no JSON envelopes).
- Receive binary audio frames and forward them directly to the audio pipeline.
- Support Arcana control commands: <EOS>, <CLEAR>, <FLUSH>.
- Implement the required lifecycle and utility methods: init, start, stop, _connect, _disconnect, _receive_messages, _get_websocket, run_tts, _handle_interruption, flush_audio, language_to_service_language, can_generate_metrics, _update_settings.
- Reconnect when settings change (since model and voice parameters affect the WebSocket URL).
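A rough sketch of the URL-driven parts of this list, using only the standard library. The endpoint, the query-parameter names, the default values, and the helper names (`ArcanaSettings`, `build_url`, `needs_reconnect`, `outgoing_messages`) are all hypothetical placeholders rather than Rime's or Pipecat's actual API. It also assumes `<FLUSH>` is sent as its own plain-text message.

```python
from dataclasses import dataclass
from typing import List
from urllib.parse import urlencode

# Hypothetical endpoint; substitute the URL from Rime's documentation.
ARCANA_WS_BASE = "wss://example.invalid/ws"

# Control commands named in the proposal, sent as plain text.
EOS, CLEAR, FLUSH = "<EOS>", "<CLEAR>", "<FLUSH>"


@dataclass(frozen=True)
class ArcanaSettings:
    # Hypothetical settings that are assumed to be encoded in the URL.
    model: str = "arcana"
    voice: str = "example-voice"
    sample_rate: int = 24000


def build_url(settings: ArcanaSettings, base: str = ARCANA_WS_BASE) -> str:
    """Encode the settings as query parameters on the WebSocket URL."""
    query = urlencode(
        {
            "model": settings.model,
            "voice": settings.voice,
            "sample_rate": settings.sample_rate,
        }
    )
    return f"{base}?{query}"


def needs_reconnect(old: ArcanaSettings, new: ArcanaSettings) -> bool:
    """A settings change requires a new connection only if it changes the URL."""
    return build_url(old) != build_url(new)


def outgoing_messages(text: str, flush: bool = False) -> List[str]:
    """Return the plain-text messages to send: the text itself, then an optional flush."""
    messages = [text]  # no JSON envelope around the text
    if flush:
        messages.append(FLUSH)
    return messages
```

Under these assumptions, `_update_settings` could compare the old and new URLs and call `_disconnect` followed by `_connect` only when they differ.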
This enables full Arcana compatibility, consistent service behavior, and real-time speech synthesis support across Pipecat.
Alternative Solutions
No response
Additional Context
No response
Would you be willing to help implement this feature?
- Yes, I'd like to contribute
- No, I'm just suggesting