Degraded Performance
Incident Report for Pantheon Operations

On June 18, around 3:35 PT, a small set of customer database servers became heavily degraded for about 50 minutes. The underlying cause for the issue (persistent disk latency) is extremely uncommon and that rarity added to the time to recover. About 0.1% of sites on the platform were affected by this incident.

We will be updating our playbook to cover this rare occurrence in more depth, and will be adding additional monitoring for this specific case, which should result in faster remediation should this occur again.

Posted Jun 19, 2020 - 17:10 PDT

This incident has been resolved.
Posted Jun 18, 2020 - 16:57 PDT
We are continuing to monitor for any further issues.
Posted Jun 18, 2020 - 16:42 PDT
The issue has been identified and a fix is being implemented.
Posted Jun 18, 2020 - 16:22 PDT
We are investigating a failed database endpoint.
Posted Jun 18, 2020 - 16:13 PDT
This incident affected: Customer Sites.