Back to entries
Engineering essay / Published
Everything Was Green
Monitoring availability is not the same as monitoring user experience. The incident looked green because the tools watched components instead of paths.
Situation
Users in warehouse operations could not authenticate to the operational portal, but the authentication service, network, and portal all appeared healthy.
Observation
The contradiction mattered: some users worked while others failed. That meant the service was not simply down; the user paths were different.
Lesson
A service is the application plus the dependencies required to reach it. If DNS sends users to the wrong destination, the service is unavailable from their perspective.