About
As a Data Center Operations Engineer, I work on a team that routinely configures and manages critical infrastructure devices such as RPDUs, PDUs, ATSs, UPSs, CRACs, and NetBotz (environment monitoring sensor) devices. On average, each device takes approximately 3-5 minutes to configure with baseline settings and integrate into our monitoring system. A further 5-10 minutes per device goes to vendor-specific configuration and any firmware upgrades, depending on the manufacturer (APC, Raritan, ServerTech, Eaton, Emerson, or MGE).
For many years, our team relied on Schneider Electric's Data Center Expert for these tasks. However, due to its inherent limitations, this tool failed to meet our evolving operational needs effectively.
It quickly became evident that a streamlined and efficient approach was imperative. Initially, a rudimentary solution was implemented in the form of a BASH script, capable of dynamically generating configuration files from a text file and pushing them to devices. However, this approach was confined to APC devices and lacked firmware update capabilities.
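The idea behind that first script, generating one configuration file per device from a plain text list, can be sketched as follows. This is an illustration in Python rather than the original Bash, and it makes several assumptions: the comma-separated input format, the function names, and the `[SystemID]` section with its keys are placeholders and have not been checked against any vendor's configuration schema. Pushing the generated files to the devices is left out.

```python
from pathlib import Path
from string import Template

# Hypothetical template; section and key names are illustrative only.
CONFIG_TEMPLATE = Template(
    "[SystemID]\n"
    "Name=$hostname\n"
    "Location=$location\n"
)


def parse_device_list(text):
    """Parse 'ip,hostname,location' lines, skipping blanks and # comments."""
    devices = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        ip, hostname, location = [field.strip() for field in line.split(",", 2)]
        devices.append({"ip": ip, "hostname": hostname, "location": location})
    return devices


def render_config(device):
    """Fill the template for one device (extra keys such as 'ip' are ignored)."""
    return CONFIG_TEMPLATE.substitute(device)


def write_configs(devices, out_dir):
    """Write one config file per device and return the created paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for device in devices:
        path = out_dir / f"{device['ip']}_config.ini"
        path.write_text(render_config(device))
        paths.append(path)
    return paths
```

Keeping parsing, rendering, and writing in separate functions means a different vendor's file format would only require a different template, which is the limitation the later tool set out to remove.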
Recognizing the pressing demand for a vendor-agnostic and feature-rich solution, development efforts commenced in September 2021. During the initial planning phases, we established several key objectives:
- User-Friendly GUI: To accommodate team members with varying levels of Linux command line experience.
- Open-Source Frameworks: Development on open-source frameworks, with approvals from IT Legal and ISRM to ensure compliance.
- ServiceNow Request Integration.
- LDAP / Active Directory Integration.
- Extensibility for Future Enhancements.
- Vendor Agnosticism.
- Firmware Upgrade Capabilities.
- Excel Sheet Input Support.
- Integrated Logging and Email Notifications.
- OS-Level Server and Service Monitoring, along with key metric tracking.
To address these goals effectively, I opted to create a web application using a LAMP (Linux, Apache, MySQL, PHP) stack on CentOS. Joomla was selected as the web application framework due to its adherence to Minimum Baseline Security requirements stipulated by ISRM, as well as its compliance with IT Legal's licensing agreement criteria, as approved by our Open-Source Software team.
The development process unfolded in several phases:
- Implementation of an LDAP authentication plugin and Joomla source code modifications to facilitate LDAP-based user authentication and RBAC group assignment.
- Creation of an Excel template for users to input location and device information, including IP address, hostname, and manufacturer.
- Development of the Joomla component (GUI) to serve as the front-end for creating "Configuration Tasks."
- Addition of features such as progress tracking, log downloads, and access to the original input file.
- Implementation of backend services in Python to make the application functional.
- Support for several configuration methods, including deployment of config.ini and config.csf files, remote command execution, Redfish and other API integrations, web scraping, and browser automation.
- Integration with the Prometheus Pushgateway for application metrics collection, coupled with deployment of the Node Exporter for OS-level monitoring.
- Setup of Grafana dashboards and alerting configurations to enhance observability.
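To make the backend flow more concrete, here is a minimal sketch of how a Configuration Task might move from input rows to metrics: rows are validated, each device is dispatched to a handler chosen by manufacturer, and the outcome counts are rendered in the Prometheus text exposition format. Everything named here is an assumption for illustration: the `DeviceRow` fields mirror the template columns mentioned above, the handler registry, the `config_task_devices` metric, and the gateway URL are hypothetical, and reading the actual `.xlsx` file (which would need a spreadsheet library) is not shown.

```python
import urllib.request
from dataclasses import dataclass
from ipaddress import ip_address
from typing import Callable, Dict, Iterable, List


@dataclass(frozen=True)
class DeviceRow:
    ip: str
    hostname: str
    manufacturer: str


def parse_rows(rows: Iterable[dict]) -> List[DeviceRow]:
    """Validate rows as they might arrive from the input template."""
    devices = []
    for row in rows:
        ip = str(ip_address(row["ip"].strip()))  # raises ValueError if malformed
        devices.append(
            DeviceRow(ip, row["hostname"].strip(), row["manufacturer"].strip().lower())
        )
    return devices


Handler = Callable[[DeviceRow], bool]
HANDLERS: Dict[str, Handler] = {}  # hypothetical registry: vendor name -> handler


def handler(vendor: str):
    """Decorator that registers a configuration handler for one vendor."""
    def register(func: Handler) -> Handler:
        HANDLERS[vendor] = func
        return func
    return register


@handler("apc")
def configure_apc(device: DeviceRow) -> bool:
    # Placeholder: a real handler would push a config file or call an API.
    return True


def run_task(devices: Iterable[DeviceRow]) -> Dict[str, int]:
    """Dispatch each device to its vendor handler and tally the outcomes."""
    counts = {"success": 0, "failure": 0, "unsupported": 0}
    for device in devices:
        func = HANDLERS.get(device.manufacturer)
        if func is None:
            counts["unsupported"] += 1
            continue
        try:
            ok = func(device)
        except Exception:
            ok = False  # one failing device should not abort the whole task
        counts["success" if ok else "failure"] += 1
    return counts


def render_metrics(counts: Dict[str, int]) -> str:
    """Render the counts in the Prometheus text exposition format."""
    lines = ["# TYPE config_task_devices gauge"]
    for result in sorted(counts):
        lines.append(f'config_task_devices{{result="{result}"}} {counts[result]}')
    return "\n".join(lines) + "\n"


def push_metrics(body: str, gateway: str = "http://localhost:9091",
                 job: str = "config_task") -> int:
    """PUT the metrics to a Pushgateway; URL and job name are assumptions."""
    req = urllib.request.Request(
        f"{gateway}/metrics/job/{job}", data=body.encode("utf-8"), method="PUT"
    )
    req.add_header("Content-Type", "text/plain; version=0.0.4")
    with urllib.request.urlopen(req, timeout=10) as resp:
        return resp.status
```

With a registry like this, supporting another manufacturer would mean adding one decorated function, which is one way to meet the extensibility and vendor-agnosticism goals without touching the dispatch loop.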
To date, the tool supports adding Prometheus targets during onboarding and has successfully handled configuration, reconfiguration, and firmware upgrades for 15,252 devices. Even after accounting for the cumulative development effort, this has saved the team approximately 4.54 weeks of continuous manual labor.
Below are select screenshots of the tool in action, with sensitive information redacted.
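For a sense of scale, the per-device estimates quoted earlier (3-5 minutes for baseline settings plus 5-10 minutes for vendor-specific work) can be turned into a gross manual effort for 15,252 devices. This is a back-of-the-envelope sketch that assumes each device is handled once and that "continuous" means round-the-clock weeks. It gives the gross manual time only; the net figure above also subtracts development effort, which is not broken out here.

```python
MINUTES_PER_WEEK = 7 * 24 * 60  # continuous, round-the-clock weeks


def manual_weeks(devices, minutes_per_device):
    """Gross manual effort in continuous weeks."""
    return devices * minutes_per_device / MINUTES_PER_WEEK


low = manual_weeks(15252, 3 + 5)    # lower bound: 8 minutes per device
high = manual_weeks(15252, 5 + 10)  # upper bound: 15 minutes per device
print(f"{low:.1f} to {high:.1f} weeks")  # prints "12.1 to 22.7 weeks"
```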