MIRSK VoiceBot

MIRSK VoiceBot enables self-service through a telephone dialogue. It integrates easily with any telephony system and includes an API for interacting with a text-based chatbot or NLP application. VoiceBot can voice-enable an existing text chatbot, or it can be used with a new NLP program or text-based chatbot solution. Application examples range from an advanced IVR to assisting call center agents in real time or providing a full self-service platform.


MIRSK VoiceBot can be integrated with a telephony system and with a chatbot or NLP application.

Telephony Integration

MIRSK VoiceBot includes a built-in SIP-based interface that enables easy integration with any SIP-based telephony exchange. The integration can be implemented as a SIP trunk or via a SIP extension. VoiceBot supports call forwarding, i.e., the chatbot or NLP application can direct VoiceBot to forward calls to internal extensions or to external telephone numbers.

Chatbot or NLP application integration

The chatbot or NLP app interacts with MIRSK VoiceBot over WebSocket, and the data format is JSON. The chatbot / NLP app sends text to VoiceBot, which converts the text to audio and plays it to the caller. The caller's answer is recognized by VoiceBot and sent back to the chatbot / NLP app as text.

The chatbot / NLP app can control which language model or grammar VoiceBot uses for each interaction.
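
The exchange described above can be sketched from the chatbot side as follows. This is only an illustration: the actual message schema is not given here, so the field names ("type", "text", "grammar", "destination") and the message types ("say", "forward", "recognition") are assumptions, not the documented VoiceBot API. The helpers only build and parse JSON strings; sending them over a WebSocket connection is left to whichever client library the chatbot already uses.

```python
import json


def build_say_message(text, grammar=None):
    """Serialize a prompt for VoiceBot to speak (hypothetical message shape)."""
    message = {"type": "say", "text": text}
    if grammar is not None:
        # Hypothetical field: grammar or language model for the caller's reply.
        message["grammar"] = grammar
    return json.dumps(message)


def build_forward_message(destination):
    """Serialize a call-forwarding instruction (hypothetical message shape)."""
    return json.dumps({"type": "forward", "destination": destination})


def parse_recognition(raw):
    """Extract the recognized caller text from an incoming JSON message."""
    message = json.loads(raw)
    if message.get("type") != "recognition":
        raise ValueError("unexpected message type: %r" % message.get("type"))
    return message["text"]
```

A chatbot might, for instance, send build_say_message("Which date suits you?", grammar="date") and then pass the next incoming message to parse_recognition to obtain the caller's answer as text.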

MIRSK VoiceBot contains the following components:



SIP connector

The SIP connector is in fact a full-blown telephone exchange system. For this reason, it is very simple to integrate with any SIP-based telephony system (which in practice means nearly all of them). In its simplest form, it registers as an extension on the customer's telephony system. Alternatively, it can connect as a SIP trunk, which is often the better option for scaling.

The central process

The central process (also called the Orchestrator) handles the interactions between the components, receives traffic from both internal and external interfaces, and directs the traffic to the intended destinations.
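
The routing role can be pictured with a toy dispatcher. This is a conceptual sketch only, not the Orchestrator's real design: the class, its method names, and the component names used in the usage note are all hypothetical.

```python
class Orchestrator:
    """Toy router: passes each payload to the handler registered for its destination."""

    def __init__(self):
        self._handlers = {}

    def register(self, name, handler):
        """Register a callable that handles traffic for the named component."""
        self._handlers[name] = handler

    def route(self, destination, payload):
        """Deliver the payload to the destination component and return its result."""
        if destination not in self._handlers:
            raise KeyError("no component registered as %r" % destination)
        return self._handlers[destination](payload)
```

For example, after orchestrator.register("tts", synthesize), a call to orchestrator.route("tts", "Hello") would hand the text to the (hypothetical) synthesize function and return whatever it produces.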

STT component

The Speech To Text (STT) component is responsible for speech recognition. The speech recognizer typically uses specialized language models (also known as grammars). A grammar can be a simple list of the keywords or phrases to be recognized, or it can be a more complex set of rules, such as a date grammar that always renders a spoken date in a uniform format that is easy for the text chatbot or NLP application to process.
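
To illustrate what "a uniform format" could mean for dates, here is a minimal sketch that maps a few English date phrasings to an ISO 8601 string. It is not the grammar VoiceBot uses: a real grammar constrains the recognizer itself, whereas this hypothetical normalize_date function only post-processes text, and it assumes the recognizer has already written numbers as digits.

```python
import re
from datetime import date

MONTHS = {name: i for i, name in enumerate(
    ["january", "february", "march", "april", "may", "june", "july",
     "august", "september", "october", "november", "december"], start=1)}


def normalize_date(utterance):
    """Return an ISO 8601 date string for a recognized date phrase, or None."""
    # Split into alphabetic words and digit runs, so "5th" becomes "5", "th".
    words = re.findall(r"[a-z]+|\d+", utterance.lower())
    day = month = year = None
    for word in words:
        if word in MONTHS:
            month = MONTHS[word]
        elif word.isdigit():
            number = int(word)
            if number > 31:
                year = number      # too large for a day, treat as the year
            elif day is None:
                day = number       # first small number is the day
    if day is None or month is None or year is None:
        return None
    try:
        return date(year, month, day).isoformat()
    except ValueError:
        return None                # e.g. 30 February
```

With this sketch, "March 5th 2024" and "the 5th of March 2024" both yield the same string, so the chatbot or NLP application only has to handle one date format.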

TTS component

The Text to Speech (TTS) component is responsible for converting text output from the chatbot or NLP application into audio that the VoiceBot user can listen to. The TTS component is typically an external service such as Google Cloud Text-to-Speech or Amazon Polly.

Speech recognition is increasingly used to digitize and add value to many functions, and it can be integrated with most systems.
Contact us if you want to hear more about how speech recognition can add value to your system for the benefit of your customers.
