Voice chat with Zephyr/Mistral and Coqui XTTS

This Space demonstrates how to speak to a chatbot, based solely on open accessible models. It relies on following models : Speech to Text : Whisper-large-v2 as an ASR model, to transcribe recorded audio to text. It is called through a gradio client. LLM Mistral : Mistral-7b-instruct as the chat model. LLM Zephyr : Zephyr-7b-beta as the chat model. GGUF Q5_K_M quantized version used locally via llama_cpp from huggingface.co/TheBloke. Text to Speech : Coqui’s XTTS V2 as a Multilingual TTS model, to generate the chatbot answers. This time, the model is hosted locally.
Note:
By using this demo you agree to the terms of the Coqui Public Model License at https://coqui.ai/cpml
Responses generated by chat model should not be assumed correct or taken serious, as this is a demonstration example only
iOS (Iphone/Ipad) devices may not experience voice due to autoplay being disabled on these devices by Vendor