The more systems you have, the harder it becomes to keep watch over them all. When your dynamic infrastructure includes tens of thousands of hosts, containers, and services, you can’t always anticipate where an issue will originate or what impact it will have on your organization. Take, for example, a critical order-processing service that has ground to a halt. When time is at a premium, you need to troubleshoot the incident as fast as possible, and the root cause can often come from an unlikely source.
Without the right monitoring solution in place, it can be difficult to identify the patterns that explain why an incident is occurring. While anomaly detection, outlier detection, and composite alerting enable you to reliably alert on known issues, other incidents, such as an increase in latency or a spike in error rates in areas of your application where you haven’t set alerts, can result in significant service unavailability. Fortunately, help is on the way!
The BMC Helix AIOps Service Insights feature helps IT operations teams make sense of overwhelming volumes of data and more precisely identify trends that are difficult to pinpoint. While root cause isolation uncovers events and anomalies associated with a service and provides root cause analysis, Service Insights fits into your existing workflows to make your investigations faster. Service Insights uses a new AI/ML-based auto-detection engine that monitors your applications automatically and continuously analyzes data.
With Service Insights, overworked Service Operations engineers can now identify the precise time of day and day of the week when service performance degraded. With ML pattern recognition, it is easy to see when the degradation started; for example, it may have started at the same time as a scheduled backup. Engineers can then take action, such as rescheduling the backup, which allows them to return to other business-critical projects.
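To make the idea of a recurring time-of-day and day-of-week pattern concrete, here is a minimal Python sketch. It is not how the Service Insights engine works internally; it simply averages health scores per weekly slot and reports the worst one. The function names, the health-score scale (higher is healthier), and the synthetic data are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import mean

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]


def worst_weekly_slot(samples):
    """Return (day_name, hour, mean_score) for the weekly slot
    with the lowest mean health score.

    samples: iterable of (datetime, health_score) pairs, where a
    higher score means a healthier service (an assumed convention).
    """
    buckets = defaultdict(list)
    for ts, score in samples:
        # Bucket each sample by (day of week, hour of day).
        buckets[(ts.weekday(), ts.hour)].append(score)
    if not buckets:
        raise ValueError("no samples provided")
    (weekday, hour), scores = min(buckets.items(),
                                  key=lambda kv: mean(kv[1]))
    return DAY_NAMES[weekday], hour, mean(scores)


def demo_samples():
    """Four weeks of synthetic hourly data: healthy (95.0) except
    for a dip to 40.0 every Sunday at 02:00, standing in for a
    hypothetical scheduled backup."""
    start = datetime(2022, 9, 5)
    samples = []
    for h in range(24 * 28):
        ts = start + timedelta(hours=h)
        dip = ts.weekday() == 6 and ts.hour == 2
        samples.append((ts, 40.0 if dip else 95.0))
    return samples
```

Calling worst_weekly_slot(demo_samples()) points at the Sunday 02:00 slot, which is the kind of clue that would lead an engineer to check what else is scheduled at that time.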
The initial Service Insights feature includes visibility into the periodic performance and health of your business services.
When it detects a pattern or trend, it provides a plain-language summary of what happened and whether the service health has improved or degraded over a given period. It also tells you the state or severity of a service, how long it has been in that state, and whether it needs immediate attention. Service Insights also shows you the health graph of that service so that you can visualize its behavior easily. You can go back in time to discover the behavior, patterns, and trends in your data for the last 15 to 30 days.
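As a rough illustration of what such a summary contains, the following Python sketch builds one sentence from a series of health samples. It is a simple template under stated assumptions, not the product’s implementation: the function name, the tolerance threshold, and the state labels are hypothetical.

```python
from datetime import datetime, timedelta


def summarize_health(service, samples, tolerance=5.0):
    """Build a one-sentence summary from chronologically ordered
    (datetime, health_score, state) samples.

    The trend compares the mean score of the earlier half of the
    period with the mean of the recent half; 'tolerance' is an
    assumed threshold below which the change counts as stable.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    half = len(samples) // 2
    earlier = sum(score for _, score, _ in samples[:half]) / half
    recent = (sum(score for _, score, _ in samples[half:])
              / (len(samples) - half))
    if recent - earlier > tolerance:
        trend = "improved"
    elif earlier - recent > tolerance:
        trend = "degraded"
    else:
        trend = "remained stable"

    # Walk backwards to find when the current state began.
    current_state = samples[-1][2]
    since = samples[-1][0]
    for ts, _, state in reversed(samples):
        if state != current_state:
            break
        since = ts
    hours = (samples[-1][0] - since) / timedelta(hours=1)
    return (f"Health of {service} has {trend} over the period; "
            f"it has been {current_state} for {hours:.0f} hours.")
```

The sentence covers the same three points described above: the direction of the trend, the current state, and how long the service has been in it.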
Natural Language Summary for Situations
Along with Service Insights, BMC Helix AIOps also provides Situations. A Situation comprises events associated with a service, on the same or different hosts, that are aggregated based on their occurrence, message, topology, temporal relationship, or a combination of these factors from across infrastructure, application, and network. Situations use AI/ML-based event processing techniques to identify event patterns from hundreds of raw events, filter out noisy events, and automatically group similar events together.
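The grouping idea can be sketched in a few lines of Python. This is a deliberately simplified stand-in that uses only two of the factors named above, the service and the temporal relationship, plus a severity filter for noise. The Event fields, the five-minute window, and the choice of "INFO" as the noisy severity are assumptions for illustration; the actual Situations engine relies on AI/ML-based processing rather than fixed rules like these.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Event:
    service: str
    host: str
    message: str
    severity: str
    timestamp: datetime


def group_into_situations(events, window=timedelta(minutes=5),
                          noisy=("INFO",)):
    """Group events into situation-like clusters.

    Events with a severity listed in 'noisy' are dropped. The rest
    are grouped per service; a new group starts whenever the gap
    to the previous event of that service exceeds 'window'.
    """
    kept = sorted((e for e in events if e.severity not in noisy),
                  key=lambda e: (e.service, e.timestamp))
    situations = []
    for event in kept:
        last = situations[-1] if situations else None
        if (last
                and last[-1].service == event.service
                and event.timestamp - last[-1].timestamp <= window):
            last.append(event)
        else:
            situations.append([event])
    return situations
```

Events from different hosts end up in the same group as long as they belong to the same service and occur close together in time, which mirrors the description of a Situation spanning the same or different hosts.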
BMC Helix AIOps includes a new feature, “Situation Summary”, which gives a human-readable insight based on natural language processing that describes the problem and why it occurred. It helps the service operator or SRE understand the context of a Situation easily and decide whether it needs immediate action based on the underlying cause and severity of the problem.
BMC Helix AIOps Service Insights speeds up your investigation workflows by surfacing parts of your systems and applications that you may not think to consider while exploring data. It builds on BMC’s established machine learning features, such as anomaly detection, noise reduction, and predictions, which automatically provide clues to help speed up your investigations. To find out more about Service Insights, please check out the BMC Helix AIOps 22.3 release notes and overview video.
To learn more about AIOps and how it can help your organization, be sure to listen to the BMC