Twilio Media Streams Overview – Twilio Support

Overview

Twilio Media Streams is a feature that gives you access to the raw audio of your Programmable Voice calls by forking the audio stream in real time. The forked stream can be sent to a destination of your choosing using WebSockets. This makes it possible to process call audio while the call is in progress, for applications such as real-time transcription, voice authentication, sentiment analysis, and speech analytics.

What You Need To Know

Advantages of Twilio Media Streams

Real-Time Processing: Lets you process call audio while the call is in progress, for applications like transcription and sentiment analysis.
Integration Flexibility: You can send audio streams to your own applications or to third-party services such as Google Speech-to-Text, IBM Watson Speech to Text, and Amazon Transcribe.
Bi-Directional Streaming: Supports streaming audio back to Twilio, enabling use cases like conversational IVR and custom Text-to-Speech integrations.
Event Handling: Supports event messages such as Mark and Clear, which enable advanced functionality like barge-in (interrupting audio playback when the caller starts speaking).
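
As a rough illustration of bi-directional streaming and the Mark and Clear events, the sketch below builds the JSON text frames an application might send back to Twilio over the WebSocket. The field names (event, streamSid, media.payload, mark.name) reflect our understanding of the Media Streams message format and should be checked against the current Twilio documentation. The function names are hypothetical and are not part of any Twilio library.

```python
import base64
import json


def media_message(stream_sid: str, audio: bytes) -> str:
    """Build a frame carrying audio to play to the caller.

    Assumption: the payload is base64-encoded audio in the stream's
    format (8 kHz mu-law, as we understand the documented format).
    """
    return json.dumps({
        "event": "media",
        "streamSid": stream_sid,
        "media": {"payload": base64.b64encode(audio).decode("ascii")},
    })


def mark_message(stream_sid: str, name: str) -> str:
    """Build a frame that labels a point in the outgoing audio.

    Assumption: Twilio echoes a mark event with the same name once the
    audio sent before it has finished playing.
    """
    return json.dumps({
        "event": "mark",
        "streamSid": stream_sid,
        "mark": {"name": name},
    })


def clear_message(stream_sid: str) -> str:
    """Build a frame asking Twilio to discard buffered outgoing audio."""
    return json.dumps({"event": "clear", "streamSid": stream_sid})
```

A barge-in flow could send a media frame followed by a mark frame, then send a clear frame if the caller starts speaking before the mark is echoed back.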

How to use Twilio Media Streams

Fork the Audio Stream: Use the Media Streams API to fork the audio stream of your voice calls.
Send to Destination: Stream the audio to a WebSocket or SIPREC endpoint.
Process the Audio: Use the audio data for real-time applications like transcription or sentiment analysis.
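
The steps above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not a definitive implementation: it assumes the stream is started with a TwiML Start verb containing a Stream noun, and that each incoming WebSocket text frame is a JSON object whose media events carry base64-encoded audio in media.payload. The helper names build_stream_twiml and handle_message and the URL wss://example.com/audio are hypothetical placeholders.

```python
import base64
import json
import xml.etree.ElementTree as ET


def build_stream_twiml(websocket_url: str) -> str:
    """Return TwiML that forks the call audio to a WebSocket URL.

    Assumption: a <Start><Stream url="..."/></Start> instruction is used.
    A real response would normally continue with further verbs so the
    call stays connected.
    """
    response = ET.Element("Response")
    start = ET.SubElement(response, "Start")
    ET.SubElement(start, "Stream", {"url": websocket_url})
    return ET.tostring(response, encoding="unicode")


def handle_message(raw: str, audio: bytearray) -> str:
    """Process one JSON text frame received from the stream.

    Decoded audio from media events is appended to the audio buffer.
    The event name is returned so the caller can react to other events
    (for example start or stop).
    """
    message = json.loads(raw)
    event = message.get("event", "")
    if event == "media":
        # Assumption: payload is base64-encoded raw audio bytes.
        audio.extend(base64.b64decode(message["media"]["payload"]))
    return event
```

In a real service, handle_message would be called from the receive loop of whichever WebSocket server library you use, and the accumulated audio would be forwarded to a transcription or analytics service.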

Code Samples

Learn more about how to use Media Streams, and check out our GitHub code samples.

Additional Information