Can Twilio tell whether a call was answered by a human or machine?

Update on Answering Machine Detection (AMD) Beta

During the beta, we have received mixed feedback from customers about the performance of Answering Machine Detection (AMD), including:

  • Speed of detection when a human picks up the phone. While our goal is to detect a human in the shortest possible time, we have received feedback that it can take several seconds to reliably detect a human, resulting in silence on the line.
  • Inconsistency in detecting the voicemail beep, causing incomplete or missing voicemail messages.
  • Inconsistency in detecting fax machines.

We have taken this feedback seriously and have started working on improving the performance of AMD, which we hope to make available for you to test by Q3 2018. While we had hoped to make AMD generally available (GA) by Q2, we want to make sure that the product we release meets the quality and standards that our customers expect from us. In the interim, you can continue to use AMD if you find its performance acceptable for your use case, but there are no adjustments we can make to improve the performance of AMD in its current state.


Twilio has an Answering Machine Detection system which can detect if an outbound call made with Twilio's API has reached a human, answering machine or fax.

How AMD works

Answering Machine Detection (called AMD for short) listens to the first few seconds of a call and analyzes the audio. There is no consistent signaling difference between a call picked up by a human or a machine, so Twilio relies on analyzing the sound patterns of the first few seconds of a call.

When a human answers a call, the typical pattern is to say “Hello” and then wait for the other party to respond “Hello”. Basically, you can think of it as sound, followed by silence.

However, the typical pattern for voicemail is to continue speaking and say something like “Hi, you’ve reached so-and-so…”. This is constant sound with no silences. It is this pattern that AMD is listening for. Sound followed by silence means human, constant sound means voicemail.
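To make the "sound followed by silence" idea concrete, here is a toy sketch in Python. It is not Twilio's detection algorithm. It assumes the audio has already been reduced to a list of per-frame flags (True for sound, False for silence), and the function name and the min_silence_frames threshold are hypothetical.

```python
from typing import Sequence


def classify_greeting(frames: Sequence[bool], min_silence_frames: int = 5) -> str:
    """Toy classifier over per-frame voice activity flags (True = sound).

    Returns "human" if sound is followed by a long enough pause,
    "machine" if sound was heard but no such pause occurred,
    and "unknown" if no sound was heard at all.
    """
    heard_sound = False
    silent_run = 0
    for is_sound in frames:
        if is_sound:
            heard_sound = True
            silent_run = 0  # any sound resets the pause counter
        elif heard_sound:
            silent_run += 1
            if silent_run >= min_silence_frames:
                return "human"  # sound, then a pause: "Hello" ... (waiting)
    return "machine" if heard_sound else "unknown"
```

For example, a short burst of sound followed by a pause is classified as "human", while uninterrupted sound is classified as "machine".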

AMD relies on measuring a greeting against typical speech patterns and is powered by a machine learning algorithm trained on thousands of call samples. The new AMD is tuned for speed and accuracy and offers a 94% accuracy rate across a large sample set of calls from the US and Canada.

How to use AMD

To use AMD, add the MachineDetection parameter to the POST request when you make an outgoing call.

MachineDetection has two possible values:

  • "Enable". Returns results as soon as recognition is complete.
  • "DetectMessageEnd". If an answering machine is detected, waits until the greeting has finished before returning results.

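Here is a sample POST request that places a call with MachineDetection set to Enable, so results are returned as soon as recognition is complete. To wait until the end of an answering machine's greeting (for example, the "beep") before continuing with the call flow, set MachineDetection to DetectMessageEnd instead.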

curl 'https://api.twilio.com/2010-04-01/Accounts/ACXXXXXXXXXXXXXXXX123456789/Calls.json' -X POST \
--data-urlencode 'To=+1562300000' \
--data-urlencode 'From=+18180000000' \
--data-urlencode 'MachineDetection=Enable' \
--data-urlencode 'Url=https://handler.twilio.com/twiml/EH8ccdbd7f0b8fe34357da8ce87ebe5a16' \
-u ACXXXXXXXXXXXXXXXX123456789:[AuthToken]
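If you are not using curl, the same request can be assembled in other languages. The following is a minimal Python sketch using only the standard library. The function name build_call_request and its arguments are hypothetical; the endpoint, form parameters, and basic authentication mirror the curl example above. The function only builds the request and does not send it.

```python
import base64
import urllib.parse
import urllib.request


def build_call_request(account_sid, auth_token, to, from_, url,
                       machine_detection="Enable"):
    """Build (but do not send) a POST request that places a call with AMD."""
    if machine_detection not in ("Enable", "DetectMessageEnd"):
        raise ValueError("MachineDetection must be Enable or DetectMessageEnd")

    endpoint = (
        "https://api.twilio.com/2010-04-01/Accounts/"
        f"{account_sid}/Calls.json"
    )
    # Form-encode the same parameters the curl example passes with --data-urlencode.
    body = urllib.parse.urlencode({
        "To": to,
        "From": from_,
        "MachineDetection": machine_detection,
        "Url": url,
    }).encode("ascii")

    # Equivalent of curl's -u AccountSid:AuthToken (HTTP basic authentication).
    credentials = base64.b64encode(
        f"{account_sid}:{auth_token}".encode("utf-8")
    ).decode("ascii")

    request = urllib.request.Request(endpoint, data=body, method="POST")
    request.add_header("Authorization", f"Basic {credentials}")
    return request
```

Passing the returned object to urllib.request.urlopen() would send the request and place a real call, so only do that with valid credentials and phone numbers.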

When an outbound call is made with the MachineDetection parameter as part of the request, Twilio's request to your application to retrieve your TwiML will contain an additional parameter called AnsweredBy.

AnsweredBy can have a few values. If Enable was specified, results can be: machine_start, human, fax, unknown. If DetectMessageEnd was specified, results can be: machine_end_beep, machine_end_silence, machine_end_other, human, fax, unknown.

You can use these values to dictate the behavior of your app, like so:

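<?php
// Respond with TwiML (XML) based on who answered the call.
header('Content-type: text/xml');
echo '<?xml version="1.0" encoding="UTF-8"?>';
echo '<Response>';
switch ($_POST['AnsweredBy']) {
    case 'machine_start':
        // An answering machine picked up.
        echo "<Say>We're sorry you're not there. We will leave a message.</Say>";
        break;
    case 'human':
        // A person picked up.
        echo "<Say>So nice to talk to a real human!</Say>";
        break;
}
echo '</Response>';
?>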
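The same branching can cover the values returned when DetectMessageEnd is specified. Below is a hedged Python sketch that maps an AnsweredBy value to a TwiML document. The function name twiml_for is hypothetical, and hanging up on fax or unknown results is an assumption made for illustration, not a Twilio recommendation.

```python
from xml.etree import ElementTree as ET

# AnsweredBy values that indicate an answering machine, for both
# MachineDetection=Enable and MachineDetection=DetectMessageEnd.
MACHINE_VALUES = {
    "machine_start",
    "machine_end_beep",
    "machine_end_silence",
    "machine_end_other",
}


def twiml_for(answered_by: str) -> str:
    """Return a TwiML document for the given AnsweredBy value."""
    response = ET.Element("Response")
    if answered_by == "human":
        ET.SubElement(response, "Say").text = "So nice to talk to a real human!"
    elif answered_by in MACHINE_VALUES:
        ET.SubElement(response, "Say").text = (
            "We're sorry you're not there. We will leave a message."
        )
    else:
        # Assumption: end the call for fax, unknown, or missing values.
        ET.SubElement(response, "Hangup")
    body = ET.tostring(response, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>' + body
```

In a web application, the AnsweredBy value would come from the parameters of Twilio's request to your TwiML URL.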

Alternatives to AMD

AMD is not suited to all use cases, but there are alternatives which may suit your application even if AMD doesn't.

"Human Detection"

One alternative to AMD is Call Screening, aka "Human Detection".

This approach works by asking a human to respond by pressing a key, and assumes that a voicemail system has been reached if no key is pressed. In that case, you can have your app retry the call later to hopefully reach a human on the second try.

"Human Detection" is very reliable if you want to only deliver your message to humans. However, if you want to leave a voicemail, we do not recommend using "Human Detection". This is because voicemail greetings are not a consistent length, so you are more likely to have your call flow begin before the voicemail begins recording than you are with AMD.

Here is a quick example of call screening:

index.xml

<?xml version="1.0" encoding="UTF-8"?>
<Response>
    <Gather action="message.xml" method="get">
        <Say>Press any key to hear an important message about your appointment.</Say>
    </Gather>
</Response>

message.xml

<?xml version="1.0" encoding="UTF-8"?>
<Response>
    <Say>This is a message that you only want a person to hear.</Say>
</Response>
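The retry decision described above could be sketched as follows in Python. This is only an illustration: the function name next_action, the attempts_made counter, and the max_attempts limit are all hypothetical, and it assumes your application records whether a key press was received for each call.

```python
from typing import Optional


def next_action(digits: Optional[str], attempts_made: int,
                max_attempts: int = 3) -> str:
    """Decide what to do after a call screening attempt.

    digits is the key press received from the callee, or None/empty
    if no key was pressed (assumed to mean voicemail was reached).
    """
    if digits:
        return "deliver_message"  # a human pressed a key
    if attempts_made < max_attempts:
        return "retry_later"      # assume voicemail and try again later
    return "give_up"              # stop after too many attempts
```

For example, next_action("5", 1) returns "deliver_message", while next_action(None, 1) returns "retry_later".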

Looping the Message

If your message is short, for example "This is a message from Owl Elementary School. School is cancelled on account of snow.", you might just want to loop the message. This ensures that the message gets delivered, regardless of whether the call is answered by a human or a machine. You can do this by adding the "loop" attribute to the <Say> verb.
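As a rough illustration, the Python sketch below generates TwiML that repeats a message using the loop attribute. The function name looped_say and the default of three repetitions are assumptions chosen for this example.

```python
from xml.etree import ElementTree as ET


def looped_say(message: str, loop: int = 3) -> str:
    """Return TwiML that speaks the message `loop` times."""
    response = ET.Element("Response")
    say = ET.SubElement(response, "Say", {"loop": str(loop)})
    say.text = message
    body = ET.tostring(response, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>' + body


print(looped_say("School is cancelled on account of snow."))
```

Choose a loop count high enough that the message is likely to play at least once in full after a voicemail system starts recording.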
