Remote Monitoring Status UI Specification

From OpenNMS

Overview

The remote monitor UI currently has a "drinking from the fire hose" model of display. If you get too many monitors, it becomes very unwieldy, and the current algorithm for determining availability is a time-consuming calculation.

The UI needs to be refactored to make it easier to drill down into specific areas/monitors, while showing the most relevant data at a high level.

Use Cases

  • see "high priority" locations first in the list
  • view location status (monitors started, stopped, disconnected)
  • view service status in a location (number of services down, total number of services, number of monitors reporting errors)
  • view application status in a location (number of services down, total number of services, 24-hour availability)
  • be able to "drill down" the map and view status for a subset of locations

Glossary

Monitor 
A single instance of the remote poller, which is tied to a location.
Location 
A single map point that represents one or more monitors. A location can have geolocation information, and it represents a single place with a common monitoring configuration.
Service 
A unit of functionality monitorable over a network that resides on a single node/interface. These are the finest-grained units that are monitored by the monitors.
Application 
A collection of services defined by the user that work together to provide a business or other high level function the users depend on.

Implementation

It was decided that GWT and Google Maps are the best way to provide a high-level interface to the monitors while giving the ability to arbitrarily drill down into specific areas and view the status of that subset of locations.

  • (C) The Google Maps-based Location Monitor UI will consist of a top-level search control; below it there will be a 'Navigation/Results Panel' on the left with a list of locations, and a Google Maps view of the locations on the right.
  • (B) Additionally, there will be controls in the results panel that allow you to further filter results to a subset of locations based on user criteria. The map will always display the locations that are available in the results panel.
  • (C) When a new search is performed, the map will be zoomed out to show all potential marker locations. If a user then navigates the map, the locations displayed in the results panel will be updated to include only those that are visible on the map.
  • (B) Additional search options will manipulate the map in a way that does not surprise the user.
  • (C) When the Monitoring Status page first loads, it will display a map that contains all of the configured Location Monitors and will also show the list of locations in the results panel, sorted by 'priority'.
  • (B+) The back and forward buttons should handle back and forward properly without breaking navigation. (Be able to go back to a previous filter search, etc.)
  • (A) Style the GWT app to match the rest of OpenNMS UI.
  • (A) If a location marker is clicked on the map, it will select (and scroll to, if necessary) the location in the results panel associated with it.
  • (C) This new UI will be reached by clicking the "Distributed Status" link on the OpenNMS application home page. If possible, the old distributed status will still be reachable as a direct URL, but by default, the link will go to the new GWT app.
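
The rule that the results panel lists only the locations visible on the map can be illustrated with a small bounding-box filter. This is a minimal sketch rather than the GWT implementation: the `Location` and `Bounds` types and the `visible_locations` function are hypothetical names, and it assumes the map reports its viewport as south/west/north/east edges in degrees.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Location:
    """Hypothetical location record; lat/lon are in degrees."""
    name: str
    lat: float
    lon: float


@dataclass(frozen=True)
class Bounds:
    """Hypothetical map viewport, given by its four edges in degrees."""
    south: float
    west: float
    north: float
    east: float

    def contains(self, loc: Location) -> bool:
        if not (self.south <= loc.lat <= self.north):
            return False
        if self.west <= self.east:
            return self.west <= loc.lon <= self.east
        # The viewport crosses the 180th meridian, so the longitude
        # range wraps around instead of running west-to-east.
        return loc.lon >= self.west or loc.lon <= self.east


def visible_locations(locations: List[Location], bounds: Bounds) -> List[Location]:
    """Return the locations that fall inside the current viewport, in input order."""
    return [loc for loc in locations if bounds.contains(loc)]
```

Calling `visible_locations` again after every pan or zoom event would keep the results panel in step with the map.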

Defining Status

Remote Monitor Status

There are 4 essential statuses that need to be tracked.

  • monitor status
    • started: the monitor is started and has responded to the heartbeat in a timely manner
    • stopped: the monitor has been intentionally stopped by an administrator
    • disconnected: the monitor has not been intentionally stopped by an administrator, but has not responded to the heartbeat in a timely manner
  • location status - the aggregate status of all started/stopped/disconnected monitors
  • location service status - the aggregate status of all services monitored at a location
  • application status - the aggregate status of a configurable selection of services monitored at a location

A service as seen by one monitor can be up, down, or unknown.

  • up: the monitor is started, and the monitor reported that the service was available
  • down: the monitor is started, and the monitor reported that the service was unavailable
  • unknown: the monitor is stopped or disconnected
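
The monitor and per-monitor service statuses above can be written down as two small functions. This is a sketch under stated assumptions: the names are hypothetical, and "in a timely manner" is modelled as a `heartbeat_timeout` parameter whose value the specification does not define.

```python
from enum import Enum


class MonitorStatus(Enum):
    STARTED = "started"
    STOPPED = "stopped"
    DISCONNECTED = "disconnected"


class ServiceStatus(Enum):
    UP = "up"
    DOWN = "down"
    UNKNOWN = "unknown"


def monitor_status(stopped_by_admin: bool, seconds_since_heartbeat: float,
                   heartbeat_timeout: float) -> MonitorStatus:
    """Classify a monitor; heartbeat_timeout is an assumed, configurable limit."""
    if stopped_by_admin:
        return MonitorStatus.STOPPED
    if seconds_since_heartbeat > heartbeat_timeout:
        return MonitorStatus.DISCONNECTED
    return MonitorStatus.STARTED


def service_status(monitor: MonitorStatus, reported_available: bool) -> ServiceStatus:
    """Status of one service as seen by one monitor."""
    if monitor is not MonitorStatus.STARTED:
        # A stopped or disconnected monitor cannot vouch for the service.
        return ServiceStatus.UNKNOWN
    return ServiceStatus.UP if reported_available else ServiceStatus.DOWN
```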

Remote Monitor Status Details

Markers

green/up: If all started monitors report "up" for all services

yellow/marginal: If all but one of the non-stopped monitors are disconnected
yellow/marginal: If some (but not all) started monitors report "down" for the same service

red/down: If all started monitors report "down" for the same service
red/down: If all started monitors report "down" for all services

blue/unknown: If no monitors are started for a location
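
One possible reading of the marker rules above is sketched below. The `Monitor` type and `location_marker` function are hypothetical. Two points are assumptions, since the rules do not settle them: the more severe colour wins when several rules match (red, then yellow, then green), and the "all but one disconnected" rule only applies when at least one monitor is actually disconnected.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Monitor:
    """Hypothetical monitor record."""
    status: str  # "started", "stopped", or "disconnected"
    # Service name -> True if this monitor reported the service as up.
    services: Dict[str, bool] = field(default_factory=dict)


def location_marker(monitors: List[Monitor]) -> str:
    """Return "green", "yellow", "red", or "blue" for one location."""
    started = [m for m in monitors if m.status == "started"]
    if not started:
        return "blue"  # no started monitors: status unknown

    marker = "green"
    service_names = {name for m in started for name in m.services}
    for name in service_names:
        reports = [m.services[name] for m in started if name in m.services]
        if not any(reports):
            return "red"  # every started monitor reports this service down
        if not all(reports):
            marker = "yellow"  # some, but not all, report it down

    disconnected = [m for m in monitors if m.status == "disconnected"]
    if len(started) == 1 and disconnected:
        marker = "yellow"  # all but one non-stopped monitor is disconnected
    return marker
```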

Results Panel

  • same coloration as above
  • additionally, the most critical of the above continuum of conditions will be listed
[Icon] RDU Southeast
  All started monitors report down for service 'HTTP'

Info View

Once we've gotten these implemented, we'll determine which information is most useful to have in a marker's info view. At the least, it will contain a link to the detail view for the location.

Application Status Details

Application-Specific Markers

green/up: If all started monitors report "up" for all services in an application

yellow/marginal: if each service in an application is reported "up" by at least one monitor

red/down: if at least one service in an application is reported "down" by all monitors

blue/unknown: If no monitors are started for a location
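
The application-specific rules can be sketched in the same way. The `application_marker` function and its argument shapes are hypothetical. As an assumption, a service that no started monitor polls is ignored rather than treated as down.

```python
from typing import Dict, Iterable


def application_marker(app_services: Iterable[str],
                       started_reports: Dict[str, Dict[str, bool]]) -> str:
    """Marker colour for one application at one location.

    started_reports maps a started monitor's id to its per-service results
    (True means the monitor reported the service as up).
    """
    if not started_reports:
        return "blue"  # no monitors are started for the location

    marker = "green"
    for service in app_services:
        results = [r[service] for r in started_reports.values() if service in r]
        if not results:
            continue  # assumption: unpolled services do not affect the marker
        if not any(results):
            return "red"  # this service is down according to every monitor
        if not all(results):
            marker = "yellow"  # up for at least one monitor, down for another
    return marker
```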


Results Panel Markers

green/up: if all applications in a location report green/up status

yellow/marginal: if all applications in a location report green/up or yellow/marginal status

red/down: if any application in a location reports red/down status

blue/unknown: if any application in a location reports blue/unknown status
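
Rolling the per-application markers up into a single results panel marker might look like the sketch below. The function name is hypothetical. The rules do not say which colour wins when one application is red and another is blue, nor what to show for a location with no applications; this sketch assumes red takes precedence over blue and treats an empty list as blue.

```python
from typing import Iterable


def results_panel_marker(application_markers: Iterable[str]) -> str:
    """Aggregate the markers of all applications in a location."""
    markers = set(application_markers)
    if "red" in markers:
        return "red"  # assumption: red outranks blue when both occur
    if "blue" in markers or not markers:
        return "blue"  # assumption: no applications means unknown
    if "yellow" in markers:
        return "yellow"
    return "green"
```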

Results Panel Mockup

When an entry in the results panel is not selected, it will show the Results Panel Markers status described above.

For a down application:

[Icon] Human Resources
  All locations are reporting an outage on the following services: HTTP

For a marginal application:


[Icon] Human Resources
  At least one location is reporting an outage on the following services: HTTP, Postgres

Results Panel Details

The navigation panel must be able to show the data in the use cases above. Features include:

  • (C) a list of monitors, sorted by tag
  • (C) a control to choose between monitor status and application status
  • (C) clicking a location monitor result opens its info window with details
  • (B) a control to search for name/tag/etc.

Map Details

  • (C) A marker will be placed for each Location Monitor based on its configured geolocation. If a Location Monitor does not have a configured geolocation, its marker will be placed on OpenNMS World HQ.
  • (C) Each marker will be coloured red, yellow, or green according to its current monitor status or application status, depending on which of the two is chosen in the results panel.
  • (A) The info window for a location monitor will show the application status for each application the monitor is monitoring (from that monitor's perspective)

Mock Up

Remote Monitor Google Map Mockup.png

The left-hand side will be improved to show status and provide some basic controls for sorting/etc.

Tasks

Initial implementation

  • PJSM-20: add support for priorities in the monitoring locations config (preferably as "tags") and sort the locations in the results list by priority
  • PJSM-17: finish making the results panel UI (see: http://www.opennms.org/wiki/Remote_Monitoring_Status_UI_Specification#Results_Panel_Details)
    • list of monitors, sorted by tag
    • toggle to choose between "monitor" status and "application" status
    • clicking a location on the results panel will open and zoom to the info window on the map
  • PJSM-18: load existing remote monitor status on startup
  • PJSM-19: recalculate remote monitor status as monitor events come in
  • PJSM-21: make sure the map zooms to fit the current marker set upon loading the page
  • PJSM-15: handle map manipulation events (zoom, etc.) so that the results panel includes only the subset of locations that are viewable on the map
  • PJSM-22: integrate the app into the main UI (click "Distributed Status" in the home page) -- if possible, the old distributed status will still be reachable as a direct URL, but by default, the link will go to the new GWT app
  • PJSM-16: markers should be coloured as an aggregate status
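
For PJSM-21, zooming to fit the current marker set comes down to computing the smallest box that encloses every marker and handing it to the map. The sketch below shows only that computation; `bounds_to_fit` is a hypothetical helper, and it assumes the marker set does not straddle the 180th meridian.

```python
from typing import Iterable, Optional, Tuple

LatLng = Tuple[float, float]              # (latitude, longitude) in degrees
Box = Tuple[float, float, float, float]   # (south, west, north, east)


def bounds_to_fit(markers: Iterable[LatLng]) -> Optional[Box]:
    """Smallest box containing every marker, or None when there are no markers."""
    points = list(markers)
    if not points:
        return None
    lats = [lat for lat, _ in points]
    lons = [lon for _, lon in points]
    return (min(lats), min(lons), max(lats), max(lons))
```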

Stretch Features

  • add controls for filtering results to a subset of locations based on user criteria. If a user changes the criteria, the map will zoom to show the current set of markers that match the query
  • back and forward buttons should handle navigation properly (back goes to previous search, etc.)
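
One common way to make back and forward work in a single-page UI is to serialise the current search state into a token kept in the URL fragment, then restore the state from that token whenever the browser navigates. The sketch below shows only the encode/decode step; the function names and the state keys used in the usage note are hypothetical.

```python
from typing import Dict
from urllib.parse import parse_qsl, urlencode


def to_history_token(state: Dict[str, str]) -> str:
    """Serialise the search state; keys are sorted so equal states give equal tokens."""
    return urlencode(sorted(state.items()))


def from_history_token(token: str) -> Dict[str, str]:
    """Rebuild the search state from a token produced by to_history_token."""
    return dict(parse_qsl(token, keep_blank_values=True))
```

For example, a state of `{"search": "RDU Southeast", "view": "application"}` round-trips through the token unchanged, so going back to an earlier token restores the earlier filter.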

Nice-to-Have Features

  • If a location marker is clicked on the map, it will select (and scroll to, if necessary) the location in the results panel associated with it.
  • style the GWT app to match the rest of the OpenNMS UI
  • info window for a location monitor should show the application status for each application the monitor is monitoring (from that monitor's perspective)