The Order of the Blue Polo - Member 00009

We use OpenNMS at Okmetic Oyj to monitor 56 nodes, 350 services and 100 interfaces. The environment includes 32 servers, a VMware server, SAN switches, AIX, HP-UX, SCO Unix and Windows servers, and HP and Cisco network switches.

The management server is an age-old ProLiant 360 G1. It works, but the load reported by uptime is above 1 almost all the time. The operating system is CentOS 5.3.

We manufacture silicon wafers for the sensor and semiconductor industries, and production runs 24/7. We have plants on both sides of the Atlantic Ocean, which makes it quite a challenge to keep servers running and mail flowing at all times with a staff of only four. OpenNMS is used to collect and monitor system data and to receive traps from servers. When needed, SMS alerts, Windows pop-ups or emails are sent to the system manager so that problems get solved.

We started with the out-of-the-box configuration, which already provides basic pinging, SNMP collection, thresholding and graphing for several operating systems, but we have also created some application-specific data collection and alerting of our own.

Last night I got an SMS alert saying that an email test message sent from the OpenNMS mail transport monitor to Gmail hadn't got through. I found that our spam filter had hung and fixed it by removing a corrupt message. This level of application-specific monitoring really helps us get just the important alerts.

Last summer I got complaints about a slow internet connection. I checked the OpenNMS graphs and found that Squid's traffic was abnormally high compared to previous days. It turned out that users were transferring videos, and I quickly got it fixed.

Because we are a small company, we can't afford to buy enterprise-level monitoring. Just buying a proprietary monitoring system is not enough anyway: normally a lot of tailoring is needed, too. And if not all needs are recognized when the system is set up, you end up buying extra options and consulting later.

- Jarmo Laakso, Okmetic Oyj, Finland