Database endpoint down

Incident Report for Pantheon Operations

Postmortem

At 07:18 UTC on September 13, 2022, a large number of sites experienced downtime. About six distinct alerts were recorded between 07:18 and 09:09 UTC. The on-call engineers were alerted, but because all sites recovered on their own, no intervention was required on their end. We have identified this incident as a denial-of-service (DoS) event. The only impact on the Pantheon Platform was that sites were temporarily unable to connect to backend services. We have identified some improvements that should help us detect such events faster in the future.

Posted Sep 26, 2022 - 08:33 PDT

Resolved

This incident has been resolved.
Posted Sep 13, 2022 - 03:30 PDT

Monitoring

A fix has been implemented and we are monitoring the results.
Posted Sep 13, 2022 - 02:48 PDT

Identified

The issue has been identified and a fix is being implemented.
Posted Sep 13, 2022 - 02:23 PDT

Update

We are continuing to investigate this issue.
Posted Sep 13, 2022 - 01:48 PDT

Investigating

We are currently investigating this issue.
Posted Sep 13, 2022 - 01:18 PDT

Update

We are continuing to monitor for any further issues.
Posted Sep 13, 2022 - 01:07 PDT

Monitoring

A fix has been implemented and we are monitoring the results.
Posted Sep 13, 2022 - 01:07 PDT

Identified

The issue has been identified and a fix is being implemented.
Posted Sep 13, 2022 - 01:01 PDT

Update

We are continuing to investigate this issue.
Posted Sep 13, 2022 - 00:54 PDT

Update

We are continuing to investigate this issue.
Posted Sep 13, 2022 - 00:51 PDT

Update

We are continuing to investigate this issue.
Posted Sep 13, 2022 - 00:39 PDT

Investigating

We are investigating a failed database endpoint.
Posted Sep 13, 2022 - 00:38 PDT

This incident affected: Customer Sites.