Degraded Dashboard and Terminus Performance
Incident Report for Pantheon Operations
Postmortem

At 16:35 UTC on October 16th, internal telemetry alerted us to Dashboard unavailability and Terminus degradation. Customer reports confirmed the issue externally. The issue was resolved at 16:58 UTC via a self-healing restart. Dashboard and Terminus availability was restored and confirmed healthy both internally and through proactive engagement with customers.

We have identified areas to improve our infrastructure to prevent issues like this from occurring in the future.

Posted Oct 19, 2023 - 10:19 PDT

Resolved
This incident has been resolved. Work is ongoing to add reporting and early detection to prevent similar incidents. We appreciate your patience during this incident and thank you for your understanding.
Posted Oct 16, 2023 - 11:15 PDT
Monitoring
Functionality has been restored to both the Dashboard and Terminus. We apologize for the interruption and will continue monitoring to ensure stability.

If you are still running into issues, please don't hesitate to reach out to support.
Posted Oct 16, 2023 - 10:18 PDT
Investigating
We have detected elevated error rates for the Dashboard and Terminus commands. These may manifest as slow page loads, failed logins, or failed Terminus commands. In particular, we are seeing 502 Bad Gateway responses from the Dashboard.

For urgent issues, please contact support via helpdesk@pantheon.io
Posted Oct 16, 2023 - 09:50 PDT
This incident affected: Dashboard and Terminus Operations.