Monday, October 11, 2010

Monitoring IP Traffic

Monitoring network performance is essential for an operator, both to verify that the network is performing as expected and to identify configuration changes that would optimise it. For example, mobile network operators gather millions of raw statistics around the clock, every day of the year. These statistics are used to calculate Key Performance Indicators (KPIs) that quantify network performance and to plan an improved configuration.

For IP networks, many tools are available for collecting performance data, and that abundance can itself be a problem. An enterprise may develop its own bespoke management system that integrates the tools it wants to use to monitor the network, or purchase off-the-shelf tools from a vendor that may already have integrated some third-party tools. Integrating the tools is one thing; how you visualise the information is another.

Last week, I became aware of one such tool, DangNetworks, which uses SNMP and NetFlow and can, according to its website, "Monitor saturated links with SNMP, and find out who's causing it with NetFlow". By using the two mechanisms together, a network operator can gain more insight into what is going on in the network. After all, what is of primary importance to an operator or enterprise manager is that the services and applications of most value to the business are performing as expected.

While the number of available monitoring tools continues to grow, collecting large volumes of performance data can itself degrade network performance. For the network operator, monitoring is therefore a balancing act.

One aspect of the work that we are undertaking in EFIPSANS is that performance monitoring of the network can be set at a minimal level under normal operating conditions. By watching certain metrics, the monitoring level can be autonomously raised to gather more data and determine the reason for a performance degradation, which can ultimately result in a self-configuration of the network. Normal monitoring can be restored once the criteria for normal operating conditions are met again.
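To make the idea concrete, here is a minimal sketch of such a controller. It is not the EFIPSANS implementation: the class name `AdaptiveMonitor`, the two levels, the single latency metric and the "three consecutive normal samples" recovery rule are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Level(Enum):
    """Hypothetical monitoring levels; a real system may have more."""
    MINIMAL = "minimal"
    DETAILED = "detailed"


@dataclass
class AdaptiveMonitor:
    """Toy controller that switches monitoring level based on one metric."""
    threshold_ms: float            # latency above this counts as degraded
    recovery_samples: int = 3      # consecutive normal samples needed to step down
    level: Level = Level.MINIMAL
    _normal_streak: int = 0

    def observe(self, latency_ms: float) -> Level:
        """Feed one latency sample and return the level to use next."""
        if latency_ms > self.threshold_ms:
            # Degradation detected: gather more data and restart the streak.
            self.level = Level.DETAILED
            self._normal_streak = 0
        elif self.level is Level.DETAILED:
            # Still in detailed mode: count how long things have looked normal.
            self._normal_streak += 1
            if self._normal_streak >= self.recovery_samples:
                self.level = Level.MINIMAL
                self._normal_streak = 0
        return self.level
```

Requiring several normal samples before stepping down is a simple form of hysteresis, so the monitoring level does not flap when a metric hovers around the threshold.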

We developed a tool, called the Path Management Element (PathME), for monitoring the performance of a path. For example, if the PathME detects at an endpoint that the latency for a particular flow has exceeded a predetermined threshold, it can turn on additional monitoring along the path from source to destination to identify where the latency is being incurred.
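As a rough illustration of what "identifying where the latency is being incurred" can mean, the sketch below takes cumulative round-trip times per hop (the kind of figures traceroute reports) and looks for the hop that adds the most delay. The function names and the input format are assumptions of mine, not part of the PathME, and real per-hop measurements are noisy enough that a single sample should not be trusted.

```python
from typing import List, Tuple

Hop = Tuple[str, float]  # (node name, cumulative RTT in milliseconds)


def per_hop_increments(hops: List[Hop]) -> List[Hop]:
    """Convert cumulative RTTs into the delay added at each hop."""
    increments = []
    previous = 0.0
    for node, rtt in hops:
        # Clamp at zero: noise can make a later hop look faster than an earlier one.
        increments.append((node, max(rtt - previous, 0.0)))
        previous = rtt
    return increments


def worst_hop(hops: List[Hop]) -> Hop:
    """Return the hop that contributes the largest share of the latency."""
    return max(per_hop_increments(hops), key=lambda item: item[1])


path = [("r1", 2.0), ("r2", 5.0), ("r3", 48.0), ("dst", 50.0)]
print(worst_hop(path))  # ('r3', 43.0)
```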

The PathME integrates tools such as ping and traceroute to measure the performance of a path. We have also developed an experimental intrinsic monitoring protocol that uses the Router Alert option defined in RFC 2711 to collect performance metrics in-line at each node, hop by hop, as a packet traverses the network along a path. This is currently being integrated into the PathME, as is support for the One-Way Active Measurement Protocol (OWAMP), defined in RFC 4656, and IP Flow Information Export (IPFIX), whose requirements are defined in RFC 3917.
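For readers wondering what integrating a tool like ping involves, one common approach is simply to run the command and parse its summary lines. The sketch below does that; it is my own illustration rather than PathME code, the function names are hypothetical, and it assumes the English-language summary format printed by ping on Linux and macOS.

```python
import re
import subprocess
from typing import Dict, Optional

# Matches e.g. "0% packet loss" and "= 11.336/11.602/11.904/0.236 ms".
_LOSS_RE = re.compile(r"([\d.]+)% packet loss")
_RTT_RE = re.compile(r"= ([\d.]+)/([\d.]+)/([\d.]+)/([\d.]+) ms")


def parse_ping_summary(output: str) -> Dict[str, Optional[float]]:
    """Extract loss and RTT figures from ping output; None where not found."""
    loss = _LOSS_RE.search(output)
    rtt = _RTT_RE.search(output)
    return {
        "loss_pct": float(loss.group(1)) if loss else None,
        "rtt_min_ms": float(rtt.group(1)) if rtt else None,
        "rtt_avg_ms": float(rtt.group(2)) if rtt else None,
        "rtt_max_ms": float(rtt.group(3)) if rtt else None,
    }


def run_ping(host: str, count: int = 4) -> Dict[str, Optional[float]]:
    """Run the system ping and parse its summary.

    "-c" sets the packet count on Linux and macOS; Windows uses "-n" instead.
    """
    completed = subprocess.run(
        ["ping", "-c", str(count), host],
        capture_output=True,
        text=True,
        check=False,  # ping exits non-zero on packet loss; still parse the output
    )
    return parse_ping_summary(completed.stdout)
```

Keeping the parser separate from the function that runs the command means the parsing can be checked against captured output without touching the network.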

The PathME uses a common model that defines the managed objects supported for a path, together with their attributes and relationships. The user of the PathME can customise which performance information is required and how often a measurement report should be generated. A mediation function abstracts the actual tools used to collect the data. The performance report is an XML file that complies with the XML schema describing the common model.
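The sketch below shows one way a mediation layer and an XML report could fit together. The schema is not reproduced in this post, so the element and attribute names (`pathReport`, `metric`, `name`, `tool`), the `MeasurementTool` interface and the `FixedValueTool` stand-in are all invented for illustration.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Iterable, Protocol, Set


class MeasurementTool(Protocol):
    """Hypothetical interface every tool adapter would implement."""
    name: str

    def measure(self, source: str, destination: str) -> Dict[str, float]: ...


class FixedValueTool:
    """Stand-in for a real adapter (ping, OWAMP, ...); returns canned values."""

    def __init__(self, name: str, metrics: Dict[str, float]) -> None:
        self.name = name
        self._metrics = dict(metrics)

    def measure(self, source: str, destination: str) -> Dict[str, float]:
        return dict(self._metrics)


def build_report(source: str, destination: str,
                 tools: Iterable[MeasurementTool], wanted: Set[str]) -> str:
    """Collect the requested metrics from every tool into one XML string."""
    root = ET.Element("pathReport", {"source": source, "destination": destination})
    for tool in tools:
        for metric, value in tool.measure(source, destination).items():
            if metric in wanted:  # the user chooses which metrics to report
                elem = ET.SubElement(root, "metric",
                                     {"name": metric, "tool": tool.name})
                elem.text = f"{value:.3f}"
    return ET.tostring(root, encoding="unicode")


ping = FixedValueTool("ping", {"rtt_ms": 12.5, "loss_pct": 0.0})
print(build_report("host-a", "host-b", [ping], {"rtt_ms"}))
```

Because the report builder only sees the `measure` interface, swapping one measurement tool for another does not change the report format, which is the point of having a mediation function.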

The intrinsic monitoring aspect of this work will be presented at CNSM 2010 later this month in Niagara Falls, Canada.
