Infrastructure Issue Affecting Customer Sites

Incident Report for Pantheon Operations

Postmortem

On Monday, December 6th, at 18:39 UTC, a maintenance process caused platform instability. At 19:32 UTC we were alerted to zonal degradation and invoked our zone failover procedure. At 04:00 UTC all affected sites had recovered. Approximately 350 sites were affected for more than 15 minutes, approximately 30 sites for more than 41 minutes, and 4 sites for up to 4 hours. As a result of this incident, we have identified and are implementing additional mitigations to better prevent and recover from similar incidents in the future.

Posted Dec 10, 2021 - 12:33 PST

Resolved

This incident has been resolved.

Posted Dec 06, 2021 - 22:23 PST

Update

We are continuing to monitor for any further issues.

Posted Dec 06, 2021 - 21:28 PST

Update

We are continuing to monitor for any further issues.

Posted Dec 06, 2021 - 20:00 PST

Update

We are continuing to monitor for any further issues.

Posted Dec 06, 2021 - 19:00 PST

Update

We are continuing to monitor for any further issues.

Posted Dec 06, 2021 - 18:00 PST

Update

We are continuing to monitor for any further issues.

Posted Dec 06, 2021 - 17:00 PST

Update

We are continuing to monitor for any further issues.

Posted Dec 06, 2021 - 16:01 PST

Update

No new updates.

Posted Dec 06, 2021 - 14:44 PST

Monitoring

A fix has been implemented and we are monitoring the results.

Posted Dec 06, 2021 - 13:30 PST

Update

We are continuing to work on a fix for this issue.

Posted Dec 06, 2021 - 13:24 PST

Identified

The issue has been identified and a fix is being implemented.

Posted Dec 06, 2021 - 12:46 PST

Update

We are continuing to investigate this issue.

Posted Dec 06, 2021 - 12:17 PST

Investigating

We are addressing an infrastructure failure that is affecting a small portion of customer sites.

Posted Dec 06, 2021 - 11:49 PST

This incident affected: Customer Sites.