Between 10:28 a.m. and 12:30 p.m. UTC, we received alerts that multiple customer sites and services were down. We immediately began investigating and traced the root cause to a single database server that had become unreachable after losing its persistent storage.
Our engineering team worked continuously to resolve the issue. As a result, the number of affected sites began to decrease, and all sites and services were restored by 5:00 p.m. UTC.
We have identified some improvements that will help us detect similar issues in the future.