Application endpoint down
Incident Report for Pantheon Operations
Postmortem

On Monday, 20 July 2020, at around 12:50pm UTC, the Pantheon platform experienced a lack of compute capacity in the European Datacenter, causing some servers in the region to become unresponsive. The incident affected 0.5% of customer sites within that region.

We addressed the capacity issue and worked with our upstream provider for the EU to increase our quota in order to provision additional spare capacity.

The incident was resolved at 5:01pm UTC after affected customer sites were moved to the new servers.

Our Engineering team is taking action to prevent a situation where additional capacity is unavailable without involving an upstream provider. We are also reviewing the process by which we evaluate our capacity in non-US regions, to prevent such an incident from happening again.

Posted Jul 29, 2020 - 16:56 PDT

Resolved
This incident has been resolved.
Posted Jul 20, 2020 - 10:01 PDT
Update
We are continuing to monitor for any further issues.
Posted Jul 20, 2020 - 09:46 PDT
Monitoring
A fix has been implemented and we are monitoring the results.
Posted Jul 20, 2020 - 09:15 PDT
Update
Our engineering team has confirmed that this incident affected sites in the EU region only; other regions were not affected.
Posted Jul 20, 2020 - 08:32 PDT
Update
We are continuing to work on a fix for this issue.
Posted Jul 20, 2020 - 08:12 PDT
Identified
The issue has been identified and a fix is being implemented.
Posted Jul 20, 2020 - 07:38 PDT
Update
We are continuing to investigate this issue.
Posted Jul 20, 2020 - 07:20 PDT
Update
We are continuing to investigate this issue.
Posted Jul 20, 2020 - 06:30 PDT
Investigating
We have been alerted to an issue affecting an individual endpoint and are investigating.
Posted Jul 20, 2020 - 05:50 PDT
This incident affected: Customer Sites.