With the U.S. retail holidays Black Friday and Cyber Monday along with the 2022 Midterm Elections, Twilio anticipates a significant spike in short code, 10DLC, and toll-free SMS and MMS messaging, from Tuesday, November 1 through Monday, November 7 (elections) and again from Tuesday, November 22 through Monday, November 28 (Cyber Week). Specifically, due to supply constraints on the MMS network in the United States and Canada, customers that send large bursts of MMS traffic may experience additional delays in deliverability, particularly from toll-free numbers.
If downstream carrier networks become congested, short code, 10DLC, and toll-free MMS traffic may queue for several minutes longer than at off-peak sending times. If your MMS traffic is queuing for approximately 20 minutes or more, please follow status.twilio.com for the latest updates on significant degradation. Any temporary measures that Twilio may take in response to traffic spikes and capacity constraints will also be posted on the Status Page.
This guide covers the following topics:
- Queue mitigation strategies
- Does throughput apply to entire messages, or message segments?
- If I have subaccounts, do account-based rate limits apply at the subaccount level?
- What happens if I exceed my configured throughput?
- Is this throughput guaranteed?
- If my account is subject to account-based rate limiting, will my 10DLC traffic still be compliant with A2P requirements in the United States?
Queue mitigation strategies
While Twilio will partner closely with the carriers to deliver every message as quickly as possible, there are several strategies that you can leverage during Cyber Week to mitigate increased queueing times.
1. Adhere to your provisioned throughput for SMS and MMS
The most effective strategy for reducing queue length during Cyber Week is to send short code, 10DLC, and toll-free messages to Twilio at or below your provisioned rate limits. Rate limits, also known as throughput, for SMS/MMS messages are provisioned per number or at the account level. The configured throughput dictates the speed at which Twilio sends your messages downstream to the carriers.
By rate limiting traffic on your application to match your configured throughput, you can avoid building up a backlog of messages that are sent to Twilio faster than they can be processed.
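As an illustration of application-side rate limiting, the following is a minimal sketch of a pacer that spaces out sends so that their average rate stays at or below a configured messages-per-second value. The `Pacer` class, the `send_all` helper, and the `send` callable are hypothetical names introduced for this example; they are not part of any Twilio library. The `units` argument is assumed to be the number of SMS segments in a message (or 1 for an MMS).

```python
import time
from typing import Callable, Iterable, Tuple


class Pacer:
    """Spaces out sends so the average rate stays at or below `rate_per_second`."""

    def __init__(
        self,
        rate_per_second: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        if rate_per_second <= 0:
            raise ValueError("rate_per_second must be positive")
        self.interval = 1.0 / rate_per_second  # seconds per unit
        self._clock = clock
        self._sleep = sleep
        self._next_slot = clock()  # earliest time the next send may start

    def wait(self, units: int = 1) -> None:
        """Block until the next free slot, then reserve `units` slots."""
        now = self._clock()
        start = max(now, self._next_slot)
        if start > now:
            self._sleep(start - now)
        # A multi-segment SMS pushes the next send out by one interval per segment.
        self._next_slot = start + units * self.interval


def send_all(
    messages: Iterable[Tuple[str, int]],
    send: Callable[[str], None],
    pacer: Pacer,
) -> int:
    """Send (payload, units) pairs through `send`, paced by `pacer`."""
    sent = 0
    for payload, units in messages:
        pacer.wait(units)
        send(payload)  # hypothetical callable that submits one message
        sent += 1
    return sent
```

For example, with a configured throughput of 10 MPS you might create `Pacer(10)` and pass a function that submits one message as `send`. If several processes send from the same account, each would need a share of the rate rather than the full value.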
2. Use a shorter Validity Period for your time-sensitive messages
If you use the same number to send messages for mixed use cases, such as one-time passwords (OTPs) and promotional campaigns, time-sensitive messages that reach Twilio behind a large burst of traffic may effectively expire while still in the queue.
To prevent latency-sensitive messages from becoming stale due to increased queueing times, configure a Validity Period on each message or Messaging Service that reflects the expected delivery window of the message’s use case. For example, an OTP notification should be configured with a low Validity Period of 30 seconds to a few minutes, while a higher Validity Period of 30 minutes to a few hours is more appropriate for promotional traffic that is targeted to a large audience.
The Validity Period applies only to queueing within the Twilio platform. If the congestion is downstream, at the carriers, the Validity Period does not apply.
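One way to apply this is to keep a small table of delivery windows per use case and attach the matching value to every outgoing message. The sketch below only builds the request parameters; it does not call Twilio. The use-case names, the chosen durations, and the `message_params` helper are illustrative assumptions. The keyword names (`to`, `from_`, `body`, `validity_period`) are meant to mirror the arguments of `client.messages.create(...)` in the Twilio Python helper library, and should be checked against the current Message resource reference before use.

```python
from typing import Dict

# 4 hours, the default maximum queue time described in this guide.
MAX_VALIDITY_SECONDS = 4 * 60 * 60

# Illustrative delivery windows in seconds; tune these to your own traffic.
VALIDITY_BY_USE_CASE: Dict[str, int] = {
    "otp": 60,                   # within "30 seconds to a few minutes"
    "promotional": 2 * 60 * 60,  # within "30 minutes to a few hours"
}


def message_params(to: str, from_: str, body: str, use_case: str) -> dict:
    """Build send parameters with a Validity Period that matches the use case."""
    validity = VALIDITY_BY_USE_CASE.get(use_case, MAX_VALIDITY_SECONDS)
    # Clamp to the range 1 second .. 4 hours.
    validity = max(1, min(validity, MAX_VALIDITY_SECONDS))
    return {
        "to": to,
        "from_": from_,
        "body": body,
        "validity_period": validity,
    }
```

Assuming a configured Twilio client, the result could then be passed on as `client.messages.create(**message_params(...))`. Unknown use cases fall back to the 4-hour maximum rather than being dropped early.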
3. Implement retries with exponential backoff and continuously monitor API usage
By implementing retries with exponential backoff, you can improve the deliverability of your messages during spikes in traffic. For Black Friday and Cyber Monday, spikes are most often associated with launching marketing campaigns or business news. When sending messages to a large audience, carefully monitor your usage, via API response headers, to ensure that the number of concurrent requests you are making is at parity with your configured concurrency.
Concurrency is defined as the number of concurrent requests per second that can be made towards all endpoints. If you exceed your concurrency limit, you may start to receive 429 “Too Many Requests” error responses, indicating that you are sending more API requests than what Twilio has configured for your account.
During times of increased downstream provider congestion, combining exponential backoff logic with continuous usage monitoring will allow you to more effectively slow requests down. By throttling requests in your application with exponential backoff, downstream queues can be drained with reduced message loss. For instructions on how to implement exponential backoff, you can review our REST API best practices and detailed tips for avoiding 429 error responses.
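As a rough illustration of the retry pattern described above, the sketch below retries a request whenever it returns HTTP 429, waiting a randomized ("full jitter") delay whose ceiling doubles on each attempt up to a cap. The `send` callable is a hypothetical stand-in for whatever function issues your API request and returns its HTTP status code; the base delay, cap, and retry count are arbitrary starting points, not Twilio recommendations.

```python
import random
import time
from typing import Callable, List


def backoff_delays(base: float = 0.5, cap: float = 30.0, retries: int = 5) -> List[float]:
    """Return the delay ceiling for each retry: base, 2*base, 4*base, ... capped at `cap`."""
    return [min(cap, base * (2 ** attempt)) for attempt in range(retries)]


def send_with_backoff(
    send: Callable[[], int],
    retries: int = 5,
    base: float = 0.5,
    cap: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
    rng: Callable[[], float] = random.random,
) -> int:
    """Call `send` and retry while it returns 429; return the last status code seen."""
    status = send()
    for ceiling in backoff_delays(base, cap, retries):
        if status != 429:
            break
        # Full jitter: sleep a random fraction of the current ceiling so that
        # many clients retrying at once do not all fire at the same moment.
        sleep(rng() * ceiling)
        status = send()
    return status
```

If the function still returns 429 after the last retry, the caller can decide whether to drop the message, re-queue it locally, or alert an operator.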
Does throughput apply to entire messages, or message segments?
For Twilio’s per-number throughput on SMS, throughput is measured in message segments per second (MPS). For MMS, there is no concept of "segments", so MPS is measured in MMS messages per second.
For account-based rate limiting, a single MPS value is configured for each sender type in the SMS and MMS channels. This means that SMS from short code numbers can have a different MPS than SMS from toll-free numbers. With account-based rate limiting, customers are given an MPS based on their volume of messages, no matter how many (or how few) senders they have. Although there are no guarantees for downstream capacity, this adds predictability for customers managing mixed use case traffic across several number types.
If I have subaccounts, do account-based rate limits apply at the subaccount level?
Limits are applied at the parent account level and include any subaccounts. Subaccounts do not provide any additional throughput.
What happens if I exceed my configured throughput?
If you attempt to send more short code SMS or MMS messages than can be sent out per second based on your configured throughput, the additional messages will be queued in the Twilio platform, and dequeued at the configured MPS. The default maximum queue time is 4 hours. You can specify a shorter queue limit for your messages by setting a different Validity Period value on your Messaging Service or in your API requests.
If you queue up so many messages that you exceed your maximum queue length of 4 hours, the API will start rejecting new requests. If you are not using a Messaging Service, you will observe API errors with HTTP code 429 and error 20429. If you are using a Messaging Service, your message requests will be accepted by the API, but will subsequently fail with Error 30001 - Queue overflow.
For more details on Twilio’s standard sending rate limits and queue behavior, see Understanding Twilio Rate Limits and Message Queues.
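To get a feel for these numbers, the following back-of-the-envelope sketch estimates how long a backlog takes to drain at a configured MPS, and how many units fit within the 4-hour maximum queue time. It is a simplified model under the assumption that messages are dequeued at exactly the configured rate; it ignores shorter Validity Periods and any carrier-side queueing. Both function names are hypothetical.

```python
MAX_QUEUE_SECONDS = 4 * 60 * 60  # default maximum queue time of 4 hours


def drain_seconds(queued_units: int, mps: float) -> float:
    """Seconds needed to dequeue `queued_units` (SMS segments or MMS messages) at `mps`."""
    if mps <= 0:
        raise ValueError("mps must be positive")
    return queued_units / mps


def max_burst(mps: float, max_queue_seconds: int = MAX_QUEUE_SECONDS) -> int:
    """Largest backlog, in units, that can drain within the maximum queue time."""
    return int(mps * max_queue_seconds)
```

For example, at 100 MPS a burst of 500,000 SMS segments would take about 5,000 seconds (roughly 83 minutes) to drain, and a backlog of more than 1,440,000 segments would exceed the 4-hour window under this model.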
Is this throughput guaranteed?
No - we cannot guarantee throughput. There are ecosystem-wide limitations around throughput capacity, meaning demand could outweigh supply at peak send times. However, these limits improve our predictability and reduce the risk of provider overflow. Understanding customer limits allows us to plan for and protect our customers' traffic. Because we can predict the maximum volume of messages, we can load test high volumes in advance and plan with carriers to ensure we have enough capacity.
If my account is subject to account-based rate limiting, will my 10DLC traffic still be compliant with A2P requirements in the United States?
Starting in 2022, A2P 10DLC traffic can be included in account-based rate limiting. However, account-based rate limiting acts as a ceiling for an account, so while it won't affect individual A2P campaign throughputs, it could affect an account's collective A2P traffic.