Voice chat with Zephyr/Mistral and Coqui XTTS

This Space demonstrates how to talk to a chatbot using only openly accessible models. It relies on the following models:
Speech to Text: Whisper-large-v2 as the ASR model, to transcribe recorded audio to text. It is called through a Gradio client.
LLM Mistral: Mistral-7b-instruct as the chat model.
LLM Zephyr: Zephyr-7b-beta as the chat model. A GGUF Q5_K_M quantized version from huggingface.co/TheBloke is run locally via llama_cpp.
Text to Speech: Coqui's XTTS V2 as the multilingual TTS model, to generate the spoken chatbot answers. This model is hosted locally.
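The three stages listed above (speech to text, chat model, text to speech) can be chained as in the minimal sketch below. It is not the Space's actual code: the class name `VoiceChat`, the method `turn`, and the three callable signatures are hypothetical and chosen only for illustration. Each stage is passed in as a plain function so the flow can be read, and tried with stand-ins, without loading any model.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

# Hypothetical stage signatures (assumptions, not taken from the demo's source):
Transcriber = Callable[[str], str]                       # audio file path -> transcribed text
Generator = Callable[[List[Tuple[str, str]], str], str]  # (history, user text) -> reply text
Synthesizer = Callable[[str], str]                       # reply text -> path of generated audio


@dataclass
class VoiceChat:
    """Chains ASR -> chat model -> TTS and keeps the conversation history."""

    transcribe: Transcriber
    generate: Generator
    synthesize: Synthesizer
    history: List[Tuple[str, str]] = field(default_factory=list)

    def turn(self, audio_path: str) -> Tuple[str, str, str]:
        """Run one spoken exchange; return (user_text, reply_text, reply_audio_path)."""
        user_text = self.transcribe(audio_path).strip()
        if not user_text:
            # Nothing was recognized, so there is nothing to send to the chat model.
            raise ValueError("transcription was empty")
        # Pass a copy of the history so the generator cannot mutate our state.
        reply_text = self.generate(list(self.history), user_text).strip()
        reply_audio = self.synthesize(reply_text)
        self.history.append((user_text, reply_text))
        return user_text, reply_text, reply_audio


if __name__ == "__main__":
    # Stand-in stages; a real setup would wrap the ASR, LLM, and TTS models here.
    chat = VoiceChat(
        transcribe=lambda path: "Hello there",
        generate=lambda history, text: f"You said: {text}",
        synthesize=lambda text: "reply.wav",
    )
    print(chat.turn("question.wav"))
```

In a setup like the one described, `transcribe` would wrap the remote Whisper call, `generate` the local llama_cpp model, and `synthesize` the local XTTS model. Keeping them as injected callables means any one stage can be swapped (for example Mistral for Zephyr) without touching the others.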
By using this demo, you agree to the terms of the Coqui Public Model License at https://coqui.ai/cpml
Responses generated by the chat model should not be assumed correct or taken seriously, as this is a demonstration only.
iOS devices (iPhone/iPad) may not play the voice output, because autoplay is disabled on these devices by the vendor.