Degraded Email Deliverability

Incident Report for Pantheon Operations

Postmortem

On September 15th at 13:32 UTC, Pantheon declared an incident following sporadic reports of outbound email instability. Subsequent investigation revealed that an unusually high volume of email traffic had placed excessive load on the outgoing mail system and degraded its performance. After performance tuning was applied and the related services were restarted, normal operations resumed. The system was declared stable at 16:26 UTC and has remained stable since the performance improvements were made.

Impact: Between September 14th at 22:03 UTC and September 15th at 16:26 UTC, outbound email messages sent via SMTP were delayed or dropped. Pantheon was able to resend delayed emails. Emails sent via an API connection were unaffected.

We apologize for any disruption this may have caused to your operations.

Posted Oct 06, 2023 - 10:14 PDT

Resolved

This incident has been resolved.
Posted Sep 15, 2023 - 09:26 PDT

Monitoring

We have identified a bottleneck in our outgoing mail system and performed some tuning to alleviate the email deliverability issues. We are now monitoring the results.
Posted Sep 15, 2023 - 08:53 PDT

Update

We are continuing to investigate this issue.
Posted Sep 15, 2023 - 08:23 PDT

Update

Our team is actively investigating the disruption in email sending, and we understand the impact it may have on your operations. We apologize for any inconvenience caused. Resolving this issue is our top priority, and we are working to fix it as quickly as possible.

For urgent issues, please contact support via helpdesk@pantheon.io or by opening a support chat.
Posted Sep 15, 2023 - 07:42 PDT

Update

We are continuing to investigate this issue.
Posted Sep 15, 2023 - 07:08 PDT

Investigating

We are currently investigating reports of issues with sending mail from customer sites.

For urgent issues, please contact support via helpdesk@pantheon.io or by opening a support chat.
Posted Sep 15, 2023 - 06:38 PDT
This incident affected: Customer Sites.