A large automotive manufacturer's adoption of the enterprise software created the need to integrate application monitoring into the client's existing monitoring systems, which offered better monitoring and alerting than the built-in functionality. A monitoring API had to be developed quickly during the pilot phase to secure the contract.
Background of the Project
The software vendor had been selected as a candidate for an enterprise-standard solution by one of its largest clients. This would considerably grow its installation base and number of active users, which in turn would raise the number of potential alerts and incidents to several times the monthly average.
The software had good built-in logging, but it was designed to be monitored directly within the application and offered only limited notifications. Because it was designed for troubleshooting after the fact, continuing to rely on the built-in functionality was infeasible.
The end customer asked to have the application’s monitoring integrated into the company’s standard tool, which monitored its entire infrastructure of hardware and applications.
The project started with an analysis of the multi-tiered application’s architecture and, in a couple of brainstorming sessions with the development team, a review of the monitoring queries and KPIs. Further risks and gaps were then identified, which led to the requirements for the necessary interfaces and for extending the built-in functionality so that the logging level, blacklisting and whitelisting, and cool-down time could be customized on the API side.
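The three API-side settings named above (logging level, blacklisting and whitelisting, and cool-down time) can be illustrated with a minimal Python sketch. It is not the vendor's implementation: the class name `AlertFilter`, its fields, the severity table and the default values are all assumptions made for illustration.

```python
from dataclasses import dataclass, field

# Severity order is an assumption for this sketch, not taken from the product.
LEVELS = {"DEBUG": 10, "INFO": 20, "WARNING": 30, "ERROR": 40, "CRITICAL": 50}


@dataclass
class AlertFilter:
    """Hypothetical filter deciding whether an event is forwarded as an alert."""
    min_level: str = "WARNING"            # events below this level are dropped
    allow: frozenset = frozenset()        # whitelist; empty means "all components"
    deny: frozenset = frozenset()         # blacklist; always wins over the whitelist
    cooldown_s: float = 300.0             # minimum seconds between alerts per component
    _last_sent: dict = field(default_factory=dict)

    def should_forward(self, component: str, level: str, now: float) -> bool:
        # 1. Logging level: ignore events that are not severe enough.
        if LEVELS[level] < LEVELS[self.min_level]:
            return False
        # 2. Blacklist and whitelist on the component name.
        if component in self.deny:
            return False
        if self.allow and component not in self.allow:
            return False
        # 3. Cool-down: suppress repeats that follow too soon after the last alert.
        last = self._last_sent.get(component)
        if last is not None and now - last < self.cooldown_s:
            return False
        self._last_sent[component] = now
        return True
```

With `cooldown_s=60`, for example, an `ERROR` on the same component 30 seconds after a forwarded alert would be suppressed, while one arriving after 61 seconds would be forwarded again. The timestamp is passed in explicitly so the behavior is easy to test.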
The project identified additional gaps and potential risks not covered by the original built-in solution.
From there we created custom plugins for the open source monitoring tool to enable direct monitoring and alerting for the vendor’s software. The plugins started with general metrics such as uptime and resource usage and were then expanded to include application-specific metrics such as messages, jobs and queues, which give a better understanding of the health of the running software.
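To make the idea of such a plugin concrete, here is a minimal Python sketch of a check for one application-specific metric. The text does not name the monitoring tool, so the sketch assumes the widely used Nagios-style plugin convention (exit codes 0, 1, 2 and 3 for OK, WARNING, CRITICAL and UNKNOWN, and a status line followed by performance data after a `|`). The function name `check_queue_depth`, the metric and the thresholds are hypothetical.

```python
# Exit codes follow the Nagios-style plugin convention (an assumption here,
# since the monitoring tool is not named in the text).
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
STATUS_NAMES = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def check_queue_depth(depth, warn, crit):
    """Map a queue depth to an exit code and a one-line plugin output.

    `depth` is None when the (hypothetical) monitoring API could not be reached.
    """
    if depth is None:
        return UNKNOWN, "UNKNOWN - queue depth unavailable"
    if depth >= crit:
        code = CRITICAL
    elif depth >= warn:
        code = WARNING
    else:
        code = OK
    # Performance data after the pipe: label=value;warn;crit
    line = (
        f"{STATUS_NAMES[code]} - queue depth is {depth} "
        f"| queue_depth={depth};{warn};{crit}"
    )
    return code, line
```

A wrapper script would fetch the value from the application's monitoring API, print the returned line and exit with the returned code, which is how the monitoring tool would pick up the state under this convention.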
Using existing OSS allowed us to fast-track development and use its front end for consistent dashboards. The team was able to focus on choosing the best possible metrics and on how to expand them.
The client received a robust and extensive monitoring tool in a very short time. The tool allowed flexible configuration of logging levels, the components to monitor, and other variables such as the cool-down time, which ensured that alerts were not triggered by preceding alerts. All of this was accessible through an extensive front end.
Application monitoring led to better insights.
The extensive, dedicated monitoring enabled by the OSS improved the general understanding of the individual instances and helped recognize false positives triggered by network issues, timeouts or similar temporary external factors. This gave the teams on both the vendor’s and the client’s side a better understanding of the application. It also showed them which additional metrics could be added to improve troubleshooting.
OSS led to a reusable solution.
Because the solution was realized as plugins for existing, widely used open source tools, it was easy for the vendor to offer it to other clients. It was an easy sell, as it integrated into the infrastructure landscapes those clients already used while improving the vendor's SLA metrics.