Identified - After replacing/updating core software components in our resolver stack yesterday, a few locations started experiencing service degradation.
We have identified the root cause and are working on a resolution.
* Marseille Issue Start: 13:30 UTC, Jan 30th
* Boston Issue Start: 19:30 UTC, Jan 30th
* Sofia Issue Start: 20:40 UTC, Jan 30th --> Sofia Issue Stop: 16:10 UTC, Jan 31st - Taken offline until a server upgrade can be organized.
* Lisbon Issue Start: 23:00 UTC, Jan 30th
* Warsaw Issue Start: 23:30 UTC, Jan 30th
* Manchester Issue Start: 5:00 UTC, Jan 31st
Jan 31, 2025 - 10:59 UTC
Investigating - We've seen evidence of occasional timeouts and slow resolution specific to one Chicago location (ORD). One other Chicago location, QORD5, is not affected.
Resolved -
We experienced some packet loss due to one of our systems going down and not triggering an alert. We are investigating the lack of alerting so this does not occur again.
Issue Start: 13:20 UTC, Feb 1st Issue Stop: 9:50 UTC, Feb 2nd
Feb 2, 09:58 UTC
Resolved -
This has been resolved. A small amount of .CH traffic is still routing to Frankfurt. We will aim to bring our ZRH PoP back online as soon as possible.
Issue Start: 17:10 UTC Issue Stop: 18:00 UTC
Jan 30, 20:46 UTC
Monitoring -
Our service in our ZRH PoP has been shut down, and traffic is being re-routed to ZRH2, Geneva, or Frankfurt.
We are monitoring and investigating.
Jan 30, 18:04 UTC
Investigating -
Some networks in Switzerland which route to our "ZRH" PoP, including Sunrise, currently cannot reach Quad9; the traffic is blackholed.
We have escalated this matter to our network partner as a Priority 1 issue.
This is not related to the Quad9 outages that occurred earlier today.
Resolved -
Quad9 experienced a significant outage in some points of presence today while replacing/updating core software components in our resolver stack.
We had repeatedly performed this update on smaller portions of our network in the preceding days without experiencing any problems, as part of a normal, incremental change cycle. However, when we moved to this larger group of systems, there were issues with automatic health checks, which caused withdrawals of service in some locations. The outage duration varied, but some sites were sporadically or entirely unavailable during the window of 12:50 UTC-13:45 UTC. Most Quad9 locations were not in the change group, but notable cities that were in it include Bogota, Frankfurt, Turin, Miami, and Athens. This issue caused delayed or missed query responses for roughly 12% of our user community.
We apologize to those users who experienced problems during the outage. We have identified and resolved the root issue, so this failure condition should not recur.