Resolved -
During a configuration update of our DNS and firewall services in several locations, our Sydney location incorrectly received a configuration change that was not meant for that site. The change impacted communication between our recursive resolvers and some authoritative DNS servers, causing significant resolution issues and SERVFAIL responses.
The resolution issues caused by this erroneous configuration update did not trigger an alarm in our alerting system. Because the update was deployed outside of business hours in Australia and we did not initially receive reports, our awareness of the issue was significantly delayed. We are looking into improving our monitoring and alerting to handle this specific case.
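As an illustration of the kind of check that could cover this case, here is a minimal sketch of a per-site SERVFAIL-ratio alert. It does not describe our actual monitoring system: the `SiteStats` type, the `sites_to_alert` function, and both thresholds are hypothetical, and the query counts are assumed to come from whatever resolver metrics are already being collected.

```python
from dataclasses import dataclass


@dataclass
class SiteStats:
    """Response counts for one resolver site over a fixed time window."""
    site: str
    total: int      # all responses sent in the window
    servfail: int   # responses with RCODE SERVFAIL in the window


def sites_to_alert(stats, max_ratio=0.05, min_queries=100):
    """Return (site, ratio) pairs whose SERVFAIL ratio exceeds max_ratio.

    Sites with fewer than min_queries responses are skipped so that a
    handful of failures on a quiet site does not page anyone.
    Both thresholds are illustrative assumptions, not tuned values.
    """
    alerts = []
    for s in stats:
        if s.total < min_queries:
            continue
        ratio = s.servfail / s.total
        if ratio > max_ratio:
            alerts.append((s.site, ratio))
    return alerts
```

Evaluating the ratio per site, rather than across all locations combined, matters here: a failure confined to one location can stay below a global threshold while that location is failing a large share of its queries.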
We have performed a root-cause analysis and corrected the database values that caused Sydney to receive these configuration changes in error.
We sincerely apologize for any inconvenience.
Start: 12:10 UTC - August 26
End: 06:45 UTC - August 27
Aug 27, 09:38 UTC
Resolved -
This incident has been resolved.
Aug 27, 09:21 UTC
Identified -
We've implemented "Phase 1" of performance improvements, increased capacity, and mitigation efforts in our Ashburn PoP.
Phase 2 is planned for later this week or early next week.
We will monitor the next traffic spike and provide an update.
We apologize for this ongoing issue and are making the necessary adjustments to mitigate any performance and reliability issues as soon as possible.
Mar 18, 23:01 UTC
Investigating -
We're experiencing large-scale attacks or traffic spikes once or twice daily, which result in service degradation (timeouts) while they last.
We apologize for the inconvenience and are working on a plan to eliminate the negative effects of these traffic spikes.
We will update this page as soon as we have made the appropriate changes in Ashburn.
Mar 5, 19:32 UTC