Overview
Twilio Media Streams gives you access to the raw audio of your Programmable Voice calls by forking the audio stream in real time. The forked stream is sent over WebSockets to a destination of your choosing, where it can be processed as it arrives for applications such as live transcription, voice authentication, sentiment analysis, and speech analytics.
What You Need To Know
Advantages of Twilio Media Streams
- Real-Time Processing: Process call audio as it arrives, for applications like transcription and sentiment analysis.
- Integration Flexibility: You can send audio streams to your own applications or to third-party services such as Google Speech-to-Text, IBM Watson Speech to Text, and Amazon Transcribe.
- Bi-Directional Streaming: Supports streaming audio back to Twilio, enabling use cases like conversational IVR and custom Text-to-Speech integrations.
- Event Handling: Supports new event messages such as Mark and Clear, allowing for advanced functionality like barge-in.
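To make the bi-directional and event-handling points concrete, here is a minimal Python sketch of the JSON messages a WebSocket server might send back to Twilio. The helper names and the stream SID value are hypothetical, and the message shapes follow my reading of the Media Streams WebSocket message format, so verify them against the current Twilio documentation before relying on them.

```python
import base64
import json


def media_message(stream_sid: str, mulaw_audio: bytes) -> str:
    """Build a 'media' message that plays audio back into the call.

    Assumes the audio is already mu-law encoded at 8 kHz.
    """
    payload = base64.b64encode(mulaw_audio).decode("ascii")
    return json.dumps(
        {"event": "media", "streamSid": stream_sid, "media": {"payload": payload}}
    )


def mark_message(stream_sid: str, name: str) -> str:
    """Build a 'mark' message, used to label a point in the outbound audio."""
    return json.dumps(
        {"event": "mark", "streamSid": stream_sid, "mark": {"name": name}}
    )


def clear_message(stream_sid: str) -> str:
    """Build a 'clear' message, which asks Twilio to drop buffered audio."""
    return json.dumps({"event": "clear", "streamSid": stream_sid})
```

In a barge-in flow, a server would typically send a media message followed by a mark, then send a clear message as soon as it detects the caller speaking, so that any audio still queued is discarded.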
How to Use Twilio Media Streams
- Fork the Audio Stream: Use the Media Streams API to fork the audio stream of your voice calls.
- Send to Destination: Stream the audio to a WebSocket or SIPREC endpoint.
- Process the Audio: Use the audio data for real-time applications like transcription or sentiment analysis.
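The steps above can be sketched in Python using only the standard library. This is an illustration under stated assumptions, not a definitive implementation: the function names and the wss://example.com/audio URL are hypothetical, the TwiML uses the Start and Stream verbs to fork the audio, and the handler assumes each WebSocket text frame is a JSON message whose "media" events carry base64-encoded mu-law audio at 8 kHz.

```python
import base64
import json
import xml.etree.ElementTree as ET


def build_stream_twiml(ws_url: str) -> str:
    """Return TwiML that forks the call audio to ws_url via <Start><Stream>."""
    response = ET.Element("Response")
    start = ET.SubElement(response, "Start")
    ET.SubElement(start, "Stream", {"url": ws_url})
    return ET.tostring(response, encoding="unicode")


def handle_message(raw: str, audio: bytearray) -> str:
    """Parse one WebSocket text frame and collect audio from 'media' events.

    Returns the event name so the caller can react to other events
    (for example 'start' or 'stop').
    """
    message = json.loads(raw)
    event = message.get("event", "")
    if event == "media":
        # The payload is assumed to be base64-encoded mu-law audio (8 kHz).
        audio.extend(base64.b64decode(message["media"]["payload"]))
    return event
```

A WebSocket server would call handle_message for every incoming frame and pass the accumulated bytes to a transcription or analytics service. The WebSocket server itself is left out here because it depends on the framework you choose.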
Code Samples
Learn more about how to use Media Streams, and check out our GitHub code samples.
Additional Information