You run critical incident management. Your team is responsible for investigating, diagnosing and resolving issues, and for establishing proactive monitoring. You are measured on the number of outages prevented, on Time to Diagnose and on Time to Resolve.
When an incident degrades service, you struggle with two things: getting the right data, and putting that data together into a coherent picture. You may be missing some of the data you need and not realize it until something goes wrong. Perhaps there are logs or metrics that were ignored, or systems outside the main flow of computing, like a DNS server, that are not instrumented. Even when you have all the data you need, it is not easy to piece together what caused a chain of events, or to connect an observed symptom [site is slow] to its cause [DB connections not being released]. The modern cloud-native apps you run are perhaps generating 10x the telemetry of legacy applications.
This is where Intelligeni makes a difference. First, with Semantic Topologies and a Computational QoS Model, Intelligeni helps you identify white spaces in your observability picture: you are missing these metrics, or that server seems to be ignored. Second, Intelligeni's AIOps models continually look at all metrics and logs and detect anomalous patterns, so you don’t need to define thresholds and rules. Very little gets missed by these models. Third, Intelligeni gathers logs, metrics, traces and configurations from every part of the stack and correlates this information. The Computational QoS Model continually evaluates the impact on the QoS of your systems. Is there an increase in usage? Has capacity somewhere dropped? Has a part of the system become more error-prone? Is this cluster partially degraded? These are the kinds of behavior that Intelligeni observes and computes for every single device and system. Intelligeni is essentially looking for any pattern that impacts the QoS of your environment. Proactively.
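To make "no thresholds and rules" a little more concrete, here is a minimal sketch of one common approach: compare each new reading against a rolling baseline learned from the metric's own recent history, rather than against a hand-set limit. This is an illustration only, not Intelligeni's actual models. The function name `detect_anomalies` and the parameters `window` and `z_cutoff` are hypothetical, chosen for this example.

```python
from collections import deque
from statistics import mean, stdev


def detect_anomalies(values, window=20, z_cutoff=3.0):
    """Return the indices of readings that deviate strongly from the recent baseline.

    Assumption: a simple rolling z-score stands in for a real anomaly model.
    """
    history = deque(maxlen=window)  # the most recent `window` readings
    anomalies = []
    for i, value in enumerate(values):
        if len(history) == window:
            baseline = mean(history)
            # Floor the spread so a perfectly flat baseline does not divide by zero.
            spread = max(stdev(history), 1e-9)
            if abs(value - baseline) / spread > z_cutoff:
                anomalies.append(i)
        history.append(value)  # the baseline keeps adapting as new data arrives
    return anomalies


# Hypothetical latency readings (ms) with one spike injected at index 30.
latency_ms = [100 + (i % 5) for i in range(40)]
latency_ms[30] = 400
spikes = detect_anomalies(latency_ms)
```

Nobody has to decide in advance that "400 ms is too slow" for this metric. The only setting is a general sensitivity (`z_cutoff`), and the baseline is learned from the data itself. A production system would also need to handle seasonality and avoid letting a spike distort the baseline that follows it.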
Observability in Intelligeni is as simple as asking the Intelligeni Bot "What has changed?" You get augmented diagnosis and root cause analysis in response. Resolve incidents 10x faster.