DevProjects/Drift

From OpenNMS

Introduction

OpenNMS Drift is an OpenNMS-sponsored project whose goal is to support Streaming Telemetry and Forensics via Flows.

Architecture Overview

Drift Architecture Overview.jpg

Components

Flow Collector

The Flow Collector listens on a UDP or TCP port for Flow Packages, parses the incoming data, enriches it with OpenNMS knowledge (e.g. location, exporter address, categories, tags), and then persists it to the Flow Persistence Storage.

Requirements

  • Listen for NetFlow 5, NetFlow 9, IPFIX, and sFlow packages and parse them accordingly
  • Enrich the protocol data with useful OpenNMS knowledge, e.g.
    • The location the NetFlow package is coming from
    • The address of the exporter
    • Node ID
  • Save the data in the Flow Persistence Storage
  • Initially, the collector should be able to collect and persist ~3000 flow packages or ~90000 flows per second. To achieve this, the Flow Collector should be deployable and scalable independently of OpenNMS.
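
As an illustration of the parsing step, the following is a minimal sketch of decoding a NetFlow 5 package (a 24-byte header followed by fixed 48-byte records). The field layout follows the published NetFlow v5 export format, but all names here are illustrative — the actual Drift implementation is part of OpenNMS (Java), not this Python code.

```python
import struct

# NetFlow v5 header: version, count, sys_uptime, unix_secs, unix_nsecs,
# flow_sequence, engine_type, engine_id, sampling_interval (24 bytes).
V5_HEADER = struct.Struct(">HHIIIIBBH")
# Each v5 record is 48 bytes; only a few of its fields are extracted below.
V5_RECORD = struct.Struct(">IIIHHIIIIHHBBBBHHBBH")

def parse_netflow5(datagram: bytes):
    """Parse a NetFlow v5 datagram into (header, records) dicts."""
    version, count, *_rest = V5_HEADER.unpack_from(datagram, 0)
    if version != 5:
        raise ValueError(f"not a NetFlow v5 package: version={version}")
    records = []
    for i in range(count):
        fields = V5_RECORD.unpack_from(datagram, V5_HEADER.size + i * V5_RECORD.size)
        (src, dst, nexthop, in_if, out_if, pkts, octets,
         first, last, src_port, dst_port, *_tail) = fields
        records.append({
            "src_addr": src, "dst_addr": dst,
            "src_port": src_port, "dst_port": dst_port,
            "packets": pkts, "bytes": octets,
        })
    return {"version": version, "count": count}, records
```

NetFlow 9, IPFIX, and sFlow are template- or sample-based and require more machinery (see the Flow Parser section below).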

Details

The detailed architecture of the Flow Collector is as follows:

Flow Collector Details.jpg

Flow Exporter

The Flow Exporter exports Flow Packages via UDP or TCP in various formats (e.g. NetFlow 5, IPFIX). This component does not need to be implemented.

FlowParser

The Flow Parser is responsible for listening for and parsing incoming UDP/TCP Flow Packages, and allows other components of the Flow Collector to work with the data.

The Flow Parser should be able to parse the following protocols:

  • NetFlow 5
  • NetFlow 9
  • IPFIX
  • sFlow

Some Flow Packages define Flow Templates which need to be cached; in some cases it may also be necessary to cache Flow Packages before parsing, if the matching Flow Template has not yet been received. See the corresponding protocol specifications for details.

The Flow Parser should also add the IP Address from the Flow Exporter, as this may not be part of the Flow Package itself.

It may be suitable to run the Flow Parser on a Minion.
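
For the template-based protocols (NetFlow 9, IPFIX), the template caching described above might be sketched as follows. The keying by exporter address and template ID, and the class and method names, are assumptions for illustration only:

```python
from collections import defaultdict

class TemplateCache:
    """Caches Flow Templates per exporter, and buffers data records that
    arrive before their template (which NetFlow 9 / IPFIX allow)."""

    def __init__(self):
        self.templates = {}               # (exporter, template_id) -> field names
        self.pending = defaultdict(list)  # (exporter, template_id) -> raw records

    def add_template(self, exporter, template_id, fields):
        key = (exporter, template_id)
        self.templates[key] = fields
        # Flush any records that were buffered while waiting on this template.
        ready = self.pending.pop(key, [])
        return [self.decode(exporter, template_id, r) for r in ready]

    def add_record(self, exporter, template_id, raw):
        key = (exporter, template_id)
        if key not in self.templates:
            self.pending[key].append(raw)  # template not seen yet: buffer
            return None
        return self.decode(exporter, template_id, raw)

    def decode(self, exporter, template_id, raw):
        # Illustrative decode: zip template field names with raw values.
        fields = self.templates[(exporter, template_id)]
        return dict(zip(fields, raw))
```

A real implementation would also need template expiry and a bound on the pending buffer, so that a never-arriving template cannot exhaust memory.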

Flow Enricher

The Flow Enricher is responsible for adding relevant metadata to the received Flow Package. To achieve this, the Flow Enricher may need access to a running OpenNMS instance.

Enriched data may be:

  • The location the Flow Package was exported from
  • The node the Flows may be associated with
  • A service a Flow may be associated with (e.g. to allow port to service mappings)
  • etc.
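
A sketch of the enrichment step, assuming hypothetical lookup helpers backed by the OpenNMS inventory (the function signature and field names are made up for illustration and are not the actual Drift API):

```python
def enrich(flow_package, exporter_addr, location, node_lookup, service_lookup):
    """Attach OpenNMS metadata to a parsed Flow Package (a dict).

    node_lookup:    maps exporter address -> node ID (hypothetical)
    service_lookup: maps destination port -> service name (hypothetical)
    """
    flow_package["location"] = location
    flow_package["exporter_addr"] = exporter_addr
    flow_package["node_id"] = node_lookup.get(exporter_addr)
    for flow in flow_package.get("flows", []):
        # Port-to-service mapping, e.g. 80 -> "http".
        flow["service"] = service_lookup.get(flow.get("dst_port"))
    return flow_package
```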

Flow Writer

The Flow Writer is responsible for converting the enriched Flow Package into Flow Documents. Each Flow Document represents one concrete Flow, so one Flow Package may result in multiple Flow Documents. The Flow Document may be enriched with further information, such as the protocol (e.g. NetFlow, sFlow) and the vendor (cisco, juniper, etc.). It is then persisted to the Flow Persistence Storage.

With IPFIX and Netflow 9, there are a lot of fields which could be persisted. It may be useful to have some kind of configuration to determine which fields are persisted and which fields are ignored.
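
Flattening one Flow Package into per-flow Flow Documents, with a configurable field whitelist as suggested above, might look like this (an illustrative sketch; the field names and the whitelist mechanism are assumptions, not Drift's actual configuration):

```python
# Hypothetical persistence configuration: only these fields are kept.
PERSISTED_FIELDS = {"src_addr", "dst_addr", "src_port", "dst_port",
                    "bytes", "packets"}

def to_documents(flow_package, protocol, vendor):
    """Turn one enriched Flow Package into one Flow Document per flow."""
    shared = {
        "protocol": protocol,   # e.g. "netflow5", "sflow"
        "vendor": vendor,       # e.g. "cisco", "juniper"
        "location": flow_package.get("location"),
        "node_id": flow_package.get("node_id"),
    }
    docs = []
    for flow in flow_package.get("flows", []):
        doc = dict(shared)
        # Drop any field that is not whitelisted for persistence.
        doc.update({k: v for k, v in flow.items() if k in PERSISTED_FIELDS})
        docs.append(doc)
    return docs
```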

Flow Persistence Storage

The Flow Persistence Storage is responsible for storing Flow Documents received from the Flow Writer.

It is proposed to use Elasticsearch as the Flow Persistence Storage.

Requirements

  • Be able to persist ~3000 flow packages or ~90000 flows per second (with room to grow, should be scalable)
  • Allow running complex queries to analyze already persisted data
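
As an example of such a complex query, a date-histogram aggregation of bytes over time could be used to chart throughput. The sketch below only builds the Elasticsearch query body; the `timestamp` and `bytes` field names are assumptions about the Flow Document mapping:

```python
import json

def bytes_over_time_query(start_ms, end_ms, interval="1m"):
    """Build an Elasticsearch query body summing flow bytes per interval.

    Assumes Flow Documents carry a 'timestamp' date field and a numeric
    'bytes' field; adjust to the actual mapping.
    """
    return {
        "size": 0,  # aggregations only, no hits
        "query": {"range": {"timestamp": {"gte": start_ms, "lte": end_ms}}},
        "aggs": {
            "over_time": {
                "date_histogram": {"field": "timestamp", "interval": interval},
                "aggs": {"total_bytes": {"sum": {"field": "bytes"}}},
            }
        },
    }

body = bytes_over_time_query(0, 60_000)
print(json.dumps(body, indent=2))
```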

Flow API

The Flow API is a REST API provided by OpenNMS. It allows querying the data from the Flow Persistence Storage and may prepare the data for easier usage, e.g. in the Flow UI.

Requirements

  • May work as a proxy to the Flow Persistence Storage
  • May allow querying Flow Documents and prepare them for easier usage in the Flow UI (e.g. Sankey diagram)
  • Provide additional functionality for Forensic Analysis, etc.
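
As an example of preparing data for the Flow UI, aggregating Flow Documents into Sankey links (source → destination byte totals) could look like this (an illustrative sketch, not the actual Flow API implementation):

```python
from collections import defaultdict

def to_sankey_links(flow_documents):
    """Aggregate Flow Documents into Sankey links: one link per
    (src_addr, dst_addr) pair, weighted by total bytes."""
    totals = defaultdict(int)
    for doc in flow_documents:
        totals[(doc["src_addr"], doc["dst_addr"])] += doc.get("bytes", 0)
    return [{"source": s, "target": t, "value": v}
            for (s, t), v in sorted(totals.items())]
```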

Flow UI

The Flow UI makes use of the Flow API to show the persisted Flow Documents in an aggregated fashion to the user.

The Flow UI is to be interpreted as an abstract term. There will probably be no single concrete "Flow UI", but rather multiple UIs making use of the Flow API.

Requirements

3rd Party Tools

3rd Party Tools such as Kibana or Grafana may be used to interact with the Flow Persistence Storage or the Flow API to further work with the collected data.

Performance Testing

We've built a full-stack solution on top of Kubernetes that can be used to deploy a test environment: https://github.com/j-white/drift-e2e

Running on top of GCP, an Elasticsearch cluster with the following specifications was able to handle indexing of 100k flows/second:

  • 3 * Master Nodes
    • 2GB Heap
    • 1 vCPU
  • 4 * Client Nodes
    • 8GB Heap
    • 2 vCPU
  • 9 * Data Nodes
    • 16GB Heap
    • 4 vCPU
    • 80GB SSD
    • Where IOPS = Min(IOPS Per GB * Number of GBs, 30000)
      • = Min(30 * 80, 30000)
      • = Min(2400, 30000)
      • = 2400 IOPS
[jesse@noise ~]$ kubectl -n $(gizmo-ns) get pods    
NAME                         READY     STATUS    RESTARTS   AGE                                          
es-client-3042550706-2fdfx   1/1       Running   0          42m                                          
es-client-3042550706-k5p74   1/1       Running   0          42m                                          
es-client-3042550706-v0wmd   1/1       Running   0          42m                                          
es-client-3042550706-vlxm7   1/1       Running   0          42m                                          
es-data-0                    1/1       Running   0          42m                                          
es-data-1                    1/1       Running   0          41m                                          
es-data-2                    1/1       Running   0          41m                                          
es-data-3                    1/1       Running   0          40m                                          
es-data-4                    1/1       Running   0          40m                                          
es-data-5                    1/1       Running   0          40m                                          
es-data-6                    1/1       Running   0          39m                                          
es-data-7                    1/1       Running   0          39m                                          
es-data-8                    1/1       Running   0          38m                                          
es-master-3104414070-02t0n   1/1       Running   0          42m                                          
es-master-3104414070-0xpv1   1/1       Running   0          42m                                          
es-master-3104414070-p63c0   1/1       Running   0          42m                 
curl http://elasticsearch:9200/_cat/shards
flow-2017-11-21-14 13 p STARTED 6879780 525.5mb 10.8.0.7 es-data-7
flow-2017-11-21-14 13 r STARTED 6879780 525.2mb 10.8.2.7 es-data-2
flow-2017-11-21-14 15 r STARTED 6881308 521.5mb 10.8.3.8 es-data-3
flow-2017-11-21-14 15 p STARTED 6881308 526.3mb 10.8.1.8 es-data-6
flow-2017-11-21-14 4  r STARTED 6880905 522.8mb 10.8.3.8 es-data-3
flow-2017-11-21-14 4  p STARTED 6880905 524.9mb 10.8.0.7 es-data-7
flow-2017-11-21-14 10 p STARTED 6880981 530.3mb 10.8.5.9 es-data-8
flow-2017-11-21-14 10 r STARTED 6880981 526.7mb 10.8.1.8 es-data-6
flow-2017-11-21-14 12 p STARTED 6881542 524.6mb 10.8.0.6 es-data-1
flow-2017-11-21-14 12 r STARTED 6881542 523.3mb 10.8.0.7 es-data-7
flow-2017-11-21-14 5  r STARTED 6879653 528.1mb 10.8.5.9 es-data-8
flow-2017-11-21-14 5  p STARTED 6879653 527.3mb 10.8.2.7 es-data-2
flow-2017-11-21-14 1  p STARTED 6880033 527.3mb 10.8.5.9 es-data-8
flow-2017-11-21-14 1  r STARTED 6880033 526.8mb 10.8.5.8 es-data-4
flow-2017-11-21-14 7  r STARTED 6878374 524.2mb 10.8.0.6 es-data-1
flow-2017-11-21-14 7  p STARTED 6878374 529.1mb 10.8.1.7 es-data-0
flow-2017-11-21-14 6  r STARTED 6883309 521.8mb 10.8.4.8 es-data-5
flow-2017-11-21-14 6  p STARTED 6883309 526.7mb 10.8.1.8 es-data-6
flow-2017-11-21-14 2  p STARTED 6880868   527mb 10.8.4.8 es-data-5
flow-2017-11-21-14 2  r STARTED 6880868 528.5mb 10.8.1.7 es-data-0
flow-2017-11-21-14 3  p STARTED 6882940 526.6mb 10.8.0.6 es-data-1
flow-2017-11-21-14 3  r STARTED 6882940 521.4mb 10.8.5.8 es-data-4
flow-2017-11-21-14 8  r STARTED 6880665   526mb 10.8.0.7 es-data-7
flow-2017-11-21-14 8  p STARTED 6880665   529mb 10.8.5.8 es-data-4
flow-2017-11-21-14 11 p STARTED 6883804 526.9mb 10.8.4.8 es-data-5
flow-2017-11-21-14 11 r STARTED 6883804 526.7mb 10.8.0.6 es-data-1
flow-2017-11-21-14 9  p STARTED 6885797 528.5mb 10.8.3.8 es-data-3
flow-2017-11-21-14 9  r STARTED 6885797 529.2mb 10.8.2.7 es-data-2
flow-2017-11-21-14 14 p STARTED 6883259 526.6mb 10.8.2.7 es-data-2
flow-2017-11-21-14 14 r STARTED 6883259 528.8mb 10.8.1.8 es-data-6
flow-2017-11-21-14 0  p STARTED 6875882 527.2mb 10.8.3.8 es-data-3
flow-2017-11-21-14 0  r STARTED 6875882 528.9mb 10.8.1.7 es-data-0

Using these numbers to estimate the storage requirements, we end up with roughly 80 bytes per flow.

With a rate of 100k flows per second and 1 replica, this requires 100,000 × 80 × 2 = 16,000,000 bytes/s, or roughly 15.26 MiB/s.

For a 30 day retention period, this amounts to roughly 41.5 TB (≈40 TB).
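
The estimate above can be checked with a few lines of arithmetic:

```python
# Back-of-the-envelope storage estimate for the flow index.
BYTES_PER_FLOW = 80          # measured above: ~525 MB shard / ~6.88M docs
FLOWS_PER_SECOND = 100_000
COPIES = 2                   # primary + 1 replica

bytes_per_second = FLOWS_PER_SECOND * BYTES_PER_FLOW * COPIES
mib_per_second = bytes_per_second / 2**20

RETENTION_DAYS = 30
total_bytes = bytes_per_second * RETENTION_DAYS * 24 * 3600
total_tb = total_bytes / 1e12

print(f"{mib_per_second:.2f} MiB/s")   # ~15.26 MiB/s
print(f"{total_tb:.1f} TB")            # ~41.5 TB
```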

Quick Start

(Note, these are just my notes of how I got a working Drift setup. YMMV) - RangerRick (talk) 20:26, 8 March 2018 (UTC)

  1. Check out and build the Elasticsearch Drift plugin from the features/es622 branch (with "mvn install")
  2. Install Elasticsearch 6.2.2
  3. Install the Drift plugin into Elasticsearch. Note that the plugin path is a URL:
    bin/elasticsearch-plugin install file:///path/to/elasticsearch-drift/plugin/target/releases/elasticsearch-drift-plugin-1.0.0-SNAPSHOT.zip
  4. Start Elasticsearch ("bin/elasticsearch")
  5. Build (or install another way) the Drift OpenNMS branch
  6. Edit $OPENNMS_HOME/etc/telemetryd-configuration.xml and enable the listeners you want (Netflow 5 for me) and start OpenNMS
  7. Generate fake telemetry data `docker run -it --rm networkstatic/nflow-generator -t <opennms-ip> -p 8877` (or get real telemetry data from a device ;))
  8. Install Grafana 4.6.x
  9. Check out and build the opennms-helm Grafana plugin from the ranger/drift branch (npm run build)
  10. Symlink the opennms-helm directory into Grafana's plugin directory ("ln -s /path/to/opennms-helm /var/lib/grafana/plugins/opennms-helm-app")

Work in Progress