Degraded QuickSilver Workflows

Incident Report for Pantheon Operations

Postmortem

At 14:58 UTC on October 16th, our Quicksilver service experienced an unexpected disruption, which we traced back to a secret key rotation performed late the previous week as part of regular security maintenance. Although the rotation itself didn't immediately affect production, a subsequent code merge on Monday morning caused Quicksilver jobs to fail to run. Service was restored at 19:05 UTC. We have identified improvements that will help us detect and prevent similar issues in the future.

Posted Oct 20, 2023 - 11:02 PDT

Resolved

Following continued monitoring, we are pleased to confirm the resolution of the QuickSilver service disruption. Our engineering team has identified and addressed the underlying issue, restoring full functionality to all workflows.

We appreciate your patience throughout this resolution process. For any remaining or new concerns, please feel free to reach out to our support team.
Posted Oct 16, 2023 - 14:01 PDT

Monitoring

Our engineering team has successfully addressed the cause of the degraded QuickSilver workflows. We are now monitoring to ensure continued stability, and we appreciate your patience during this work.

If you have any outstanding issues, please don't hesitate to reach out to support.
Posted Oct 16, 2023 - 12:23 PDT

Identified

Our team has identified the issue and is actively working on a solution. We apologize for any inconvenience caused and appreciate your patience as we resolve this matter.
Posted Oct 16, 2023 - 11:58 PDT

Investigating

We have detected and are investigating QuickSilver workflows not completing. If a site has workflows that rely on QuickSilver, those workflows will show as still in progress despite having completed. Any QuickSilver operations will need to be run manually.

For urgent issues, please contact support by opening a ticket or support chat.
Posted Oct 16, 2023 - 11:34 PDT
This incident affected: Workflow Operations.