Atlas Cluster Scheduled Operations Delays

Incident Report for MongoDB Cloud

Resolved

This incident has been resolved.

Posted Oct 20, 2023 - 11:19 UTC

Update

An invalid backend configuration caused system maintenance to be queued for Atlas clusters at a much faster rate than expected. This overloaded the Atlas backend, which could not keep up with the requested system and user operations.

The configuration was corrected and system maintenance operations were deprioritized in favor of user operations, but it still took time for all the operations in the queue to complete and for request-processing latencies to return to normal.
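
As a rough illustration of the deprioritization described above, the sketch below serves user operations ahead of queued maintenance work. It is not the Atlas backend's implementation: the OperationQueue class, the USER and MAINTENANCE priority levels, and the operation names are all hypothetical and exist only for this example.

```python
import heapq
import itertools

# Hypothetical priority levels: a lower number is served first.
USER, MAINTENANCE = 0, 1


class OperationQueue:
    """Minimal priority queue; an assumption for illustration only."""

    def __init__(self):
        self._heap = []
        # Monotonic counter keeps FIFO order among equal priorities.
        self._counter = itertools.count()

    def submit(self, priority, name):
        heapq.heappush(self._heap, (priority, next(self._counter), name))

    def drain(self):
        # Yield operation names in the order they would be processed.
        while self._heap:
            _, _, name = heapq.heappop(self._heap)
            yield name


queue = OperationQueue()
# A backlog of maintenance operations is already queued...
for i in range(3):
    queue.submit(MAINTENANCE, f"maintenance-{i}")
# ...when user operations arrive later.
queue.submit(USER, "user-resize")
queue.submit(USER, "user-restore")

order = list(queue.drain())
print(order)
```

In this sketch the user operations are processed first even though they were submitted last, while the maintenance backlog still has to drain afterwards, which is consistent with the recovery taking time once the configuration was corrected.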

As of now, latencies are back to normal.

Posted Oct 20, 2023 - 11:18 UTC

Update

The system continues to recover. However, some user-driven / requested change operations are also still delayed.

Posted Oct 19, 2023 - 23:44 UTC

Monitoring

The issue has been identified, and a fix has been implemented. Some users may see system maintenance taking longer than normal.

Posted Oct 19, 2023 - 21:13 UTC

Investigating

We are experiencing an issue that may result in delays to some scheduled cluster operations such as backups and planned maintenance.

User-driven / requested change operations are not impacted.

Posted Oct 19, 2023 - 17:25 UTC

This incident affected: MongoDB Cloud.