Webview Netflow Reporter

Overview

Webview Netflow Reporter is an enterprise-focused Netflow reporter/analyzer tool featuring clickable graphs, powerful categorization that goes beyond simple TCP/UDP port names, automatic exporter discovery, and full access to all aspects of the raw flow data (interface names, millisecond accuracy, QoS settings, TCP flags, etc). Webview installs on a Linux host and is accessed via a web browser.

In a typical setup, Netflow data is constantly categorized and tracked on each router/switch interface. The web user selects one or more interfaces and can view a graph of average and/or peak traffic utilization for each category over the past hour, day, week, etc (see sample screenshots below). The user can also run ad hoc reports over any time period to explore all aspects of the traffic, such as a top talkers IP address report or a raw flow chronology.

Download

Download the latest version
Sourceforge.net project site with release details, mailing list, all files, wiki, etc.
Quick installation scripts
- INSTALL.ubuntu tested with Ubuntu 12.04
- INSTALL.centos tested with CentOS 6.3

New features in 2013 (version wvnetflow-1.07)

Connection reporting. A "connection" is a complete TCP/UDP session, which may span many flows. Connection reporting clearly identifies the client (initiator) and the server, and also reports on bidirectional packets/bytes transmitted/received, etc. By looking at a connection as it crosses multiple routers, it is easy to identify packet loss, asymmetric routing, and other oddities. See Ad hoc query tool for more details.
Built-in Symmetric Multiprocessing (SMP). This improves performance on multi-core CPUs when there are many exporters.
SNMPv3 support
Powerful reporting on the health of flow collection, including out-of-order and duplicate exporter packets.
rrdcached is now fully supported for improved disk I/O on very large deployments

Key features

Custom traffic categorization (classification). Traffic categorization uses a Cisco access-list style syntax. Webview comes with several predefined categories that work fine, but the real power of this tool is when you add custom categories that match your network's unique traffic patterns -- Email, ERP, backup, Internet proxies, etc.
Within an ACL, each line can pass or fail based on any combination of:
- IP address / mask
- port number (TCP/UDP)
- TCP flags
- ICMP type and code
- BGP ASN (if configured as an export option)
- IP Type-of-Service (DSCP, IP precedence or a simple number)
- next-hop IP / mask
- exporter IP / mask
- input or output interface
- Number of packets or bytes
- Duration of flow
- Average bytes per second or packets per second
- Average packet size
- The success of another ACL within the past n seconds
The ACL examples on the project page cover several styles: simple, complex, specific, routing-table derived, flow-derived, heuristic, and acl-map.
Categories are grouped together to create an ordered "view" which is then applied to one or more exporting devices. Multiple views can be used on the same device (e.g., it's common to have summary and detailed views of the same data). The number of categories is unlimited, though the quantity and order of categories does impact performance.
For more info on the ACL syntax, data rendering, and other configuration options, please read the PDF documentation.
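To make the idea of an ordered view concrete, here is a minimal Python sketch. It is not Webview's ACL syntax or code: the Rule fields, the sample networks and ports, and the first-match-wins behavior are all assumptions made for illustration. Consult the PDF documentation for the real syntax.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional


@dataclass
class Rule:
    """One hypothetical category rule; every field that is set must match."""
    category: str
    dst_net: Optional[str] = None    # destination IP / mask in CIDR form
    protocol: Optional[str] = None   # "tcp", "udp", ...
    dst_port: Optional[int] = None
    min_bytes: Optional[int] = None  # minimum byte count of the flow

    def matches(self, flow: dict) -> bool:
        if self.dst_net is not None:
            addr = ipaddress.ip_address(flow["dst_ip"])
            if addr not in ipaddress.ip_network(self.dst_net):
                return False
        if self.protocol is not None and flow["protocol"] != self.protocol:
            return False
        if self.dst_port is not None and flow["dst_port"] != self.dst_port:
            return False
        if self.min_bytes is not None and flow["bytes"] < self.min_bytes:
            return False
        return True


def classify(flow: dict, view: list, default: str = "Other") -> str:
    """Return the category of the first rule in the ordered view that matches."""
    for rule in view:
        if rule.matches(flow):
            return rule.category
    return default


# An ordered view (assumed values): more specific categories come first.
VIEW = [
    Rule("Backup", dst_net="10.20.0.0/16", protocol="tcp", dst_port=1500),
    Rule("Email", protocol="tcp", dst_port=25),
    Rule("Internet_HTTP", protocol="tcp", dst_port=80),
]

flow = {"dst_ip": "10.20.3.4", "protocol": "tcp", "dst_port": 1500, "bytes": 90000}
print(classify(flow, VIEW))
```

Because every flow is tested against the rules in order, a long or badly ordered view costs more per flow, which is consistent with the performance note above.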

Graphs and Table reports. Webview uses Flowscan-style "butterfly" graphs showing categories stacked in order of their volume, with input below the axis and output above. E.g.,

This butterfly view may seem confusing at first, but it's a very practical way to display many traffic categories in both inbound/outbound directions. Naturally, you can also generate simple single-axis graphs and export the raw graph data to csv/Excel for manual manipulation.

Data is aggregated at one-minute resolution, by default. For example, here is the previous graph zoomed in to 8:00 - 10:00 am:

One-minute resolution is ideal for most enterprises since it closely matches the human factor of the end user's tolerance / patience. E.g., most users won't notice or be concerned by occasional short periods of slowness but, if the impact is a minute or more, it's almost guaranteed that they'll be unhappy.

The default aggregation can be lowered to one second and coupled with Flexible Netflow's one-second active flow timeout to produce amazingly precise graphs. However, the precision comes at a significant CPU/disk cost. If your goal is microburst policing, Netflow is really not the right tool for the job.

The graphs above show traffic categories for a router interface. You can also flip this around. E.g., the following graph instead has the single traffic category "TSM" (disk backup) broken out by interfaces named "MPLS/cle" (Cleveland), "MPLS/wau" (Wausau), etc.

In addition to traffic volume, Webview also reports on flows, packets, and concurrently active IPs. For example, you might be interested to know what the average bandwidth need is per active user of a certain application. The following report shows that each user of "Internet_HTTP" needs about 30 Kbps when measured over a business week (8am - 5pm, Monday - Friday):

Date	Interface	Category	Maximum bps In	Maximum bps Out	Average bps In	Average bps Out
Mon Apr 14 2008	LAN/apl	Internet_HTTP	27,049	154,515	3,979	23,660
Tue Apr 15 2008	LAN/apl	Internet_HTTP	62,212	122,788	6,778	18,697
Wed Apr 16 2008	LAN/apl	Internet_HTTP	46,705	235,215	4,974	24,456
Thu Apr 17 2008	LAN/apl	Internet_HTTP	46,584	312,958	5,161	30,668
Fri Apr 18 2008	LAN/apl	Internet_HTTP	23,755	213,472	4,880	45,318

The graphing engine is very flexible (possibly too flexible!):

Clickable graphs. People seem to go ga-ga over this handy feature, and I'm not sure why more commercial packages don't have it. Basically, if you're looking at a graph and see something interesting, you can click on it to find out more.
Leads to this...
This view goes after the raw flow data, which is generally kept for at least 4-8 weeks (depending on available disk space).
Ad hoc query tool. As the previous report shows, there's a nice GUI for exploring and reporting on the raw flow data. It's called the Ad Hoc Query Tool (aka Webview Flow Reporter), and many believe it's the most useful feature of Webview. In reality, it's just a front-end for flow-tools' excellent flow-report utility (and can run standalone without the graphing engine if you like).
The raw flow data available for reporting goes back in time as far as disk space allows. Most small to medium enterprises find that 30 GB is adequate for a month of data. Larger networks can use many terabytes.
Filtering options:
- Protocol: TCP, UDP, ICMP, VPN, Other
- ToS / DSCP / precedence / ECN
- TCP/UDP port number
- IP address or subnet
- User-defined traffic category (using the Cisco ACL syntax described earlier)
- Interfaces (names, descriptions, multiple router interfaces allowed)
- Free-form filters on byte count, packet count, bps, pps, ToS/DSCP, nexthop, tcp_flags
- Logical expressions allowed (OR and AND NOT)
Reporting options:
- Type: Raw - a simple chronological dump of raw flow data. By default, this includes IP addresses, port, protocol, DSCP, TCP flags, start time, duration, and the packet/byte counters.
  - Exporters and interfaces - adds fields showing the exporter and its SNMP interface descriptions
  - Routing - adds nexthop field and routing prefix length
  - ASN - adds BGP ASN fields.
- Type: Connection - passes the flow data through a TCP/UDP connection engine that stitches the flow data back into a bidirectional connection, with the client and server IP's clearly identified.
  - Simple - the basic report displays the client and server IP/port numbers, duration, client-to-server (c2s) and server-to-client (s2c) byte and packet counters.
  - Multihop - the multihop report displays each connection as it is seen by multiple netflow collection points. For example, if a TCP flow goes across a data center router and a branch office router, this will show the connection from both perspectives. This is very handy when looking for packet loss and ensuring end-to-end QoS markings. This is a VERY cool feature! See an Example report.
- Type: Flow - this is the standard flow reporting engine.
  - IP - keys off of every IP (source or destination)
  - Src - keys off the source IP -- this enables the peers counter which shows how many destination IP's each source is sending packets to.
  - Dst - keys off the destination IP -- this enables the peers counter which shows how many source IP's are sending packets to the destination.
  - Port - keys off the source/destination port number.
  - Peers - keys off each source & destination IP pair
  - Flows - keys off each 5-tuple - protocol and source/destination IP&port
  - w/totals - this checkbox will display totals for the entire range, even if only the "top 100" or so items are shown.
- Type: Other - miscellaneous reports
  - Exporters - shows a list of all exporters and the volume of flow data received from each, after applying any of your filters. This also identifies one-way interfaces, which can be useful to identify misconfigured netflow. Note that the exporters report from the webview home page shows more details of a sysadmin nature, but does not have any filtering ability.
  - BGP ASN - generates a flow report by autonomous system numbers (ASN), which aren't often enabled in netflow collection.
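The Connection report type above stitches unidirectional flows back into bidirectional connections. How Webview's connection engine does this internally is not described here, so the following Python sketch is only one plausible approach under stated assumptions: flows come from a single collection point, a connection is identified by protocol plus the two IP/port endpoints, and the endpoint that sent the earliest flow is taken to be the client. The field names are hypothetical.

```python
from collections import defaultdict


def stitch_connections(flows):
    """Group unidirectional flows into bidirectional connections.

    Each flow is a dict with src_ip, src_port, dst_ip, dst_port, protocol,
    start (seconds), bytes and packets. The sender of the earliest flow
    is assumed to be the client (initiator).
    """
    groups = defaultdict(list)
    for flow in flows:
        a = (flow["src_ip"], flow["src_port"])
        b = (flow["dst_ip"], flow["dst_port"])
        # Direction-independent key: both directions sort to the same tuple.
        key = (flow["protocol"],) + tuple(sorted([a, b]))
        groups[key].append(flow)

    connections = []
    for group in groups.values():
        first = min(group, key=lambda f: f["start"])
        client = (first["src_ip"], first["src_port"])
        server = (first["dst_ip"], first["dst_port"])
        conn = {
            "protocol": first["protocol"],
            "client": client,
            "server": server,
            "c2s_bytes": 0, "c2s_packets": 0,
            "s2c_bytes": 0, "s2c_packets": 0,
        }
        for f in group:
            # Flows sent by the client are client-to-server (c2s);
            # everything else in the group is server-to-client (s2c).
            prefix = "c2s" if (f["src_ip"], f["src_port"]) == client else "s2c"
            conn[prefix + "_bytes"] += f["bytes"]
            conn[prefix + "_packets"] += f["packets"]
        connections.append(conn)
    return connections


sample = [
    {"src_ip": "10.1.1.5", "src_port": 51000, "dst_ip": "10.2.2.9", "dst_port": 443,
     "protocol": "tcp", "start": 100.0, "bytes": 1200, "packets": 8},
    {"src_ip": "10.2.2.9", "src_port": 443, "dst_ip": "10.1.1.5", "dst_port": 51000,
     "protocol": "tcp", "start": 100.2, "bytes": 48000, "packets": 40},
    {"src_ip": "10.1.1.5", "src_port": 51000, "dst_ip": "10.2.2.9", "dst_port": 443,
     "protocol": "tcp", "start": 160.0, "bytes": 300, "packets": 3},
]
print(stitch_connections(sample))
```

A multihop report would additionally need to keep the exporter with each flow so the same connection can be compared from router to router; this sketch leaves that out.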
Other options:
- Output: Excel, CSV, ASCII, or HTML table
- Input: one or more time-stamped raw flow input file(s). Multiple flow directories are allowed.
- DNS: fast and non-blocking (never delays a report more than 5 seconds).
- Sorting: bytes/packets/flows/peers/duration/chronology. Peers is a count of an IP's connection peers and is great for finding out who is the chattiest -- e.g., malware attempting to spread, DNS servers being hammered, etc.
Note: The usefulness of raw netflow data cannot be overstated! Many Netflow software packages aggregate the data or drop fields that don't fit their notion of monitoring. The ad hoc query tool provides a powerful interface to the raw data, giving the user unlimited options for investigating it.
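The peers counter used for sorting is easy to picture as a count of distinct partner IPs per source. The Python sketch below illustrates that idea only; it is not how flow-tools or Webview computes it, and the function name and sample addresses are made up.

```python
from collections import defaultdict


def count_peers(flows):
    """Count distinct destination IPs per source IP.

    flows is an iterable of (src_ip, dst_ip) pairs. The result is a list of
    (src_ip, peer_count) sorted with the chattiest source first.
    """
    peers = defaultdict(set)
    for src, dst in flows:
        peers[src].add(dst)
    return sorted(
        ((src, len(dsts)) for src, dsts in peers.items()),
        key=lambda item: (-item[1], item[0]),
    )


flows = [
    ("10.1.1.7", "192.0.2.1"),
    ("10.1.1.7", "192.0.2.2"),
    ("10.1.1.7", "192.0.2.3"),
    ("10.1.1.8", "192.0.2.1"),
    ("10.1.1.8", "192.0.2.1"),  # repeat flows to the same peer count once
]
print(count_peers(flows))
```

A host whose peer count is far above its neighbors' is the kind of outlier (scanning malware, an overloaded DNS server) that this sort order is meant to surface.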

Netflow exporter and interface discovery and normalization. Network management tool maintenance shouldn't be a chore. Adding a new device to Webview is as simple as configuring Netflow export on the device itself. Webview will see it and use SNMP to load the interfaces and descriptions. SNMP ifIndex changes are not a problem, although it's always best to configure Cisco devices with snmp-server ifindex persist. There is no limit to the number of exporters, and some Webview installations have hundreds.

Interfaces can also be aggregated and renamed. For example, say you had four routers:

Rtr1 Ethernet0/0  "Cleveland LAN"
Rtr1 Serial0/0    "Cleveland MPLS T1"

Rtr2 Gig0/0       "Columbus LAN primary"
Rtr2 Gig0/1       "Columbus LAN backup"
Rtr2 Multilink1   "Columbus MPLS 3xT1"

Rtr3 Fast0/0      "Columbus LAN"
Rtr3 Tunnel1      "Columbus MPLS VPN backup"

Rtr4 Fast0/0      "Indianapolis LAN"
Rtr4 Serial0/0.3  "Indianapolis Frame-relay T1"

You could access the graphs by going to a specific router and interface. But, you can also create aliases based on a regular expression. For example, with these aliases:

'(\S+) LAN'						'LAN $1'
'(\S+) (MPLS|Frame-relay|ATM|Direct|Tunnel) (\S+)'	'WAN $1'

... the Webview GUI would display these options for graphing:

Alias displayed in the GUI	Interfaces used in reports
LAN Cleveland	Rtr1 Ethernet0/0
LAN Columbus	Rtr2 Gig0/0, Rtr2 Gig0/1, Rtr3 Fast0/0
LAN Indianapolis	Rtr4 Fast0/0
WAN Cleveland	Rtr1 Serial0/0
WAN Columbus	Rtr2 Multilink1, Rtr3 Tunnel1
WAN Indianapolis	Rtr4 Serial0/0.3

The beauty of this approach is that the GUI is stable and easy to navigate, even though the underlying network routers and interfaces may change frequently. It also lets multiple interfaces be easily aggregated together.
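The alias mechanism can be mimicked in a few lines of Python. This is a sketch rather than Webview's implementation (Webview itself is written in Perl): it assumes the patterns are tried in order, matched anywhere in the interface description, and that the first matching pattern wins. Perl's $1 becomes \1 in Python's template syntax.

```python
import re
from collections import defaultdict

# The two alias rules from the example, translated to Python regex syntax.
ALIASES = [
    (re.compile(r"(\S+) LAN"), r"LAN \1"),
    (re.compile(r"(\S+) (MPLS|Frame-relay|ATM|Direct|Tunnel) (\S+)"), r"WAN \1"),
]

# (router, interface, SNMP description) for the four example routers.
INTERFACES = [
    ("Rtr1", "Ethernet0/0", "Cleveland LAN"),
    ("Rtr1", "Serial0/0", "Cleveland MPLS T1"),
    ("Rtr2", "Gig0/0", "Columbus LAN primary"),
    ("Rtr2", "Gig0/1", "Columbus LAN backup"),
    ("Rtr2", "Multilink1", "Columbus MPLS 3xT1"),
    ("Rtr3", "Fast0/0", "Columbus LAN"),
    ("Rtr3", "Tunnel1", "Columbus MPLS VPN backup"),
    ("Rtr4", "Fast0/0", "Indianapolis LAN"),
    ("Rtr4", "Serial0/0.3", "Indianapolis Frame-relay T1"),
]


def build_aliases(interfaces, aliases):
    """Map each alias name to the list of router interfaces it aggregates."""
    groups = defaultdict(list)
    for router, ifname, descr in interfaces:
        for pattern, template in aliases:
            match = pattern.search(descr)
            if match:
                groups[match.expand(template)].append(f"{router} {ifname}")
                break  # assumed: first matching alias wins
    return dict(groups)


for alias, members in sorted(build_aliases(INTERFACES, ALIASES).items()):
    print(alias, "->", ", ".join(members))
```

Since the grouping depends only on the descriptions, renumbering or replacing a router changes nothing in the GUI as long as the interface descriptions keep following the same naming convention.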

Flexible data aggregation. The simplest and most common way to configure Webview is to define traffic categories and track them by router interface. Then, you begin exporting Netflow from one or more routers in your network that you want visibility into. This approach works well for the vast majority of users of this product.
However, for those who don't fit this mold, Webview has full support for processing and visualizing Netflow data:
- by subnet gleaned from the exporting router's routing table
- by subnet / CIDR block defined by hand
- by source/destination IP address
- by source/destination BGP ASN
- by IP next-hop
- according to an ACL
Subnet tracking is very useful when Netflow data is not available for a portion of the network. For example, perhaps Netflow is collected only from the head-end routers of a full-mesh MPLS WAN. In this case, tracking Netflow stats by subnet can provide an excellent inferred view of each remote site.
Netflow-embedded timestamps. This is a subtle but important point. Many Netflow analysis packages assume that flows are received at the collector in real time and that they each fit within a single sample interval (typically 1, 5, or 15 minutes). Webview can do that, of course. But, if the network is properly configured for Network Time Protocol (NTP), then Webview can instead use the timestamps embedded within each Netflow record. Thus, there are three separate methods for tallying flows:
- tally the flow based on when it was received by the collector
- tally the flow based on the start, end, or middle timestamp of the flow
- distribute the flow evenly over the duration of the flow. This nicely handles flows that are long or straddle a sample interval. E.g., a four-minute flow that starts at 07:58:00 and ends at 08:01:59 is spread across the four one-minute samples it spans.
This last approach has several advantages:
- sample sizes can be smaller than the capture file duration. The Step in webview is the size of each sample interval, and it can be set anywhere from 1 second up to the capture file size (normally 5 minutes). In practice, 60-second samples provide an excellent balance between CPU/storage and being able to see full impact of traffic spikes.
- flows can be delayed. Even in a simple network, a collector won't receive a Netflow packet until several seconds after the flow finished. In a complex network with multiple collectors, it might take minutes or hours for the flow to be gathered at a central point for processing.
- flows can be replayed. E.g., if you change the rendering configuration, it's easy to replay the last month of data through the new configuration.
- an aggressive active flow timeout is unnecessary (i.e., 'ip flow-cache timeout active 1'). In fact, webview can be configured to handle the default timeout of 60 minutes, and the graphs will look just as good as with a 1-minute timeout.
- it guarantees consistent, accurate reporting, even on time-challenged virtual machines. Some VM's seem to have problems with their real-time clocks, leading to "5-minute" flow files that contain anywhere from 3 to 8 minutes of data. That's no problem when the timestamps are used.
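The third tallying method, distributing a flow evenly over its duration, can be sketched in Python as follows. This is an illustration of the arithmetic only, not Webview's code; the function name, the use of seconds-since-midnight timestamps, and the treatment of the end time as exclusive are assumptions.

```python
def distribute_flow(start, end, total_bytes, step=60):
    """Spread a flow's bytes evenly over its duration into step-second samples.

    start and end are timestamps in seconds, with end treated as exclusive.
    Returns a dict mapping each sample's start time to its share of the bytes.
    """
    duration = end - start
    if duration <= 0:
        # Zero-length flow: everything lands in the sample containing start.
        return {start - start % step: float(total_bytes)}
    buckets = {}
    t = start
    while t < end:
        bucket = t - t % step                 # start of the sample holding t
        segment_end = min(bucket + step, end)  # stop at sample edge or flow end
        share = total_bytes * (segment_end - t) / duration
        buckets[bucket] = buckets.get(bucket, 0.0) + share
        t = segment_end
    return buckets


def hms(hours, minutes, seconds=0):
    """Seconds since midnight, to keep the example readable."""
    return hours * 3600 + minutes * 60 + seconds


# The four-minute flow from the text: 07:58:00 through 08:01:59.
shares = distribute_flow(hms(7, 58), hms(8, 2), total_bytes=2_400_000)
for sample_start, nbytes in sorted(shares.items()):
    print(sample_start, nbytes)
```

With a 60-second Step, the 2,400,000-byte flow contributes 600,000 bytes to each of the four samples, regardless of when the collector received the record or which capture file it landed in.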
Industrial strength flow collection, exporter management, and flow processing
Flow collection
Webview uses a modified version of Damien Miller's flowd (http://www.mindrot.org/projects/flowd/) netflow collector.
- Very high flow rates (tested to over 50,000 flows/second)
- Tested up to a thousand concurrent exporters
- Handles out-of-order netflow packets -- something that no known commercial product does
- Supports multicast listening, which is useful for redundancy
- Compatible with samplicator, a utility that can replicate netflow traffic to multiple collectors.
- Supports v1/5/7/9 netflow data. v9 netflow is supported, but only traditional v5 netflow fields are stored to disk. All other flow fields are discarded. Hopefully this will change soon!
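Netflow v5 export packets carry a flow_sequence counter and a flow count in their header, so a collector can tell a late packet from a lost one. The Python sketch below shows one simple way to use those two fields. It is not the logic of flowd or of Webview's modifications; it tracks a single exporter engine, ignores 32-bit counter wrap and exporter restarts, and the statistic names are invented.

```python
def check_sequence(packets):
    """Classify export packets from one exporter by their sequence numbers.

    packets is an iterable of (flow_sequence, count) tuples in arrival order,
    where flow_sequence is the running total of flows exported before this
    packet and count is the number of flows in it.
    """
    stats = {"packets": 0, "late": 0, "duplicate": 0, "missing_flows": 0}
    expected = None   # flow_sequence we expect on the next in-order packet
    seen = set()      # sequence numbers already received (unbounded here)
    for seq, count in packets:
        stats["packets"] += 1
        if seq in seen:
            stats["duplicate"] += 1
            continue
        seen.add(seq)
        if expected is None or seq >= expected:
            if expected is not None and seq > expected:
                # A gap: provisionally treat the skipped flows as missing.
                stats["missing_flows"] += seq - expected
            expected = seq + count
        else:
            # Arrived late: it fills part of a gap counted earlier.
            stats["late"] += 1
            stats["missing_flows"] = max(0, stats["missing_flows"] - count)
    return stats


arrivals = [(0, 30), (30, 30), (90, 30), (60, 30), (90, 30), (120, 30)]
print(check_sequence(arrivals))
```

In this sample nothing was actually lost: one packet arrived late and one was duplicated. A collector that only compares each packet with the previous one would instead report drops.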
Robust exporter management
- Webview's goal is ZERO maintenance regardless of the number of netflow exporters.
- New exporters are auto-discovered using a list of SNMP v1/v2/v3 parameters (optionally bounded by netmasks).
- Router upgrades? IP address changes? New interfaces? No problem! Webview gracefully handles exporter moves/adds/changes.
- Redundant pairs of routers? You can work with them as a single device.
- Comprehensive reporting on the health of all exporters (See sample of an exporter status report showing 600+ exporters):
  - SNMP reachability
  - export version and packet counters, including lost and duplicate packets
  - flow export volume
  - network traffic volume on each exporter interface
  - exporter clock health -- out-of-sync, skewed, or totally fubar.
  - exporter misconfigurations:
    - active flow timeouts
    - one-way interfaces
    - duplicate flows
Soapbox on why managing your flow collection is really important
Here are some real-world examples of Netflow data mismanagement:
These problems are difficult to notice without a good view of collection health. My experience is that commercial Netflow products are weak in this regard. When given out-of-order or duplicate netflow packets, they either don't notice anything is wrong, or they report erroneous findings (billions of drops, for example).
Flow processing
- native SMP (symmetric multiprocessing) for environments with multiple exporters
- processing speeds of 5-50k flows/second per vCPU, depending on configuration complexity
- rrdcached-compatible to optimize disk I/O
- advanced clock synchronization
- Webview has been battle-tested in several demanding multi-national and Fortune 100 environments
Open Source. Not only is Webview open-source software (GPL2 license), it also makes use of several open source components in the backend:
- Flow-collection is handled by Damien Miller's flowd (http://www.mindrot.org/projects/flowd/), with modifications. All modifications have been submitted back to the flowd project and will eventually be merged.
- Flow-reporting and filtering is handled by Mark Fullmer's flow-tools (http://www.splintered.net/sw/flow-tools/). Although a bit old, flow-tools still does some things better and faster than any other package.
- A javascript web calendar comes from DHTMLX (http://www.dhtmlx.com/docs/products/dhtmlxCalendar/).
- Perl! Webview Netflow Reporter is almost entirely written in the Perl scripting language. Perl offers great flexibility, rapid development, easy modification, and high performance.

Uses of this application that may not be immediately apparent

Quality-of-Service (QoS). Netflow does not measure jitter or voice quality, but it's excellent for proving that your network's DSCP markings are consistent and that traffic volumes match expectations. For example, here is a graph showing both properly and improperly tagged G.729 traffic:
Ignore the downward spikes (these were voice probes from NetIQ) and focus on the small ~25 Kbps purple and red rectangles between 19:20 and 20:20. These show that a G.729 flow was being marked in one direction but not the other. A click on the purple untagged area reveals the faulty endpoints. A further raw flow report will show the DSCP's:
In this case, one direction of the VoIP conversation is properly marked as DSCP EF, but the other direction isn't.
Netflow reports can also shine a light on jitter/quality measurement from sources like Cisco's IP Service Level Agreement (IP SLA, formerly SAA or RTR). For example, in this QoS validation analysis, the first two graphs show round-trip time (RTT) and jitter for voice and data traffic on a given WAN link as reported by IP SLA. The third graph shows the Netflow traffic utilization. The big Netflow spikes correlate to increased jitter and RTT's on the data traffic, but the VoIP traffic remains immune. The final table shows that the traffic causing the spike is properly being marked as DSCP CS1, which is used for scavenger "less-than-best-effort" service. This analysis took about an hour to do and it proves that VoIP is not impacted by non-VoIP traffic and that network spikes are properly being marked as scavenger. Are you sure that's true in your network?
Note: the /www/ipsla directory of Webview includes an IP SLA monitoring daemon and reporter. If you don't have any other product for IP SLA (e.g., Cacti, NetIQ), then this one works pretty well.
Security forensics. The ad hoc query tool is a great forensic analysis tool. In particular, the raw flow reports are chronologically accurate and can have millisecond accuracy. It's like a sniffer trace, but it's always running and there are no headaches of setting up monitor ports, setting up a collector, and moving around multi-megabyte capture files. Granted, Netflow doesn't show you payload, but it does let you scan through millions of packets in seconds and quickly figure out approximately what happened and when.
A Webview user at an ISP penned the article Mining Netflow on this topic in Information Security, Jan 2006. He uses it daily to root out zombies and malware on his customers' machines. Talk about great ISP service!
Another Webview user had a SQL worm outbreak that evaded their antivirus systems (the worm was brand new but the attack vector was known and unpatched). They only detected the worm when their network started crashing. At that point, sniffers could only tell you who was currently infected. Luckily, Netflow forensics with Webview was able to show that the infection began over 24 hours earlier when a SQL administrator plugged in his laptop infected with W32.Toxbot.B, which then received orders from a rogue machine in the UK to unleash the worm upon the network.
Netflow forensic analysis with Webview can also reveal a lot about Internet usage behavior. ISP's have used it to answer questions from the domestic (a father wondering just what the #@$! his son was downloading) to the felonious (child enticement, kidnapping, threatening "anonymous" emails, etc). Netflow is not CALEA "compliant", but it is a cost-effective way for network operators who are under the CALEA radar to responsibly address these questions if/when they come up.
Profiling WAN acceleration (WAAS, WAFS, etc). Many enterprises are deploying WAN accelerators to speed their traffic along.
Most WAN accelerators tunnel all optimized traffic, which makes it nearly invisible to Netflow on the WAN router. However, the pre-optimized traffic is available from WAN router Netflow if WCCP is used to redirect traffic to the accelerator (example). The pre-optimized traffic may also be available as Netflow from the WAN accelerator itself.
Cisco and modern Riverbed and Expand products all support transparent acceleration which preserves the IP header of each connection (example). In these cases, Webview is able to report on both pre- and post-optimization traffic per category.
Capacity planning.
<soapbox>
It's amazing that many enterprises still use 5 or 15 minute sample intervals and magic numbers like "60%" to manage their WAN capacity. If you love your network users, please set all your bandwidth monitoring tools to use one-minute samples. I was working for a large retail client whose point-of-sale app seemed to be randomly failing in the stores. The network admin insisted the WAN links were fine. His proof was a graph of 5-minute samples that showed an average utilization of under 25%. We turned on Netflow and, within a few minutes, it became clear that the actual traffic volume was 10%, but that a Microsoft replication job pegged the link to 100% for a couple of minutes every quarter hour. D'oh!
In the well-run enterprise, capacity planning goes far beyond a macro-level view of traffic in/out an interface. Real capacity planning is about understanding how individual applications behave when faced with debilitating congestion, and that requires a micro-level view that takes QoS policies into account. It's also about understanding difficult-to-control sources of congestion, such as unsolicited Internet traffic, flash crowding on full-mesh MPLS WAN's, and VoIP return-to-service after power outages. And, most importantly, it's about understanding human tolerance and monitoring to that level (e.g., how long until the user clicks the refresh button, restarts their Citrix session, or reboots their PC?).
</soapbox>
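The retail anecdote is easy to reproduce with a little arithmetic. The Python sketch below uses made-up numbers in the spirit of the story (a 10% baseline and a job that saturates the link for 2 minutes every 15 minutes) to show how averaging hides the problem. It is an illustration, not data from the actual network.

```python
def average(samples, window):
    """Average consecutive groups of `window` samples."""
    return [
        sum(samples[i:i + window]) / window
        for i in range(0, len(samples), window)
    ]


# One hour of 1-minute utilization samples, in percent of link capacity.
one_minute = [100 if minute % 15 < 2 else 10 for minute in range(60)]
five_minute = average(one_minute, 5)

print("peak at 1-minute samples:", max(one_minute))
print("peak at 5-minute samples:", max(five_minute))
print("hourly average:", sum(one_minute) / len(one_minute))
```

At one-minute resolution the link visibly hits 100%. At five-minute resolution the worst sample is 46%, and the hourly average is 22%, which is the kind of "under 25%" figure that made the link look healthy.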
That said, Webview does a pretty OK job at capacity planning. It can track long periods of both bandwidth:
and active users:
There is also an optional module called Netusage that extracts business-day data for each element and has a simplified, "manager-friendly" web interface, complete with simple line/pie graphs and easy to read reports. It also stores its data in MySQL, making the data more accessible by tools like Excel and Access.
Tracking IT migrations. One non-obvious use of Webview is to generate reports on IT migration projects. For example, tracking users as they convert from one version of Exchange to another, or from one set of network switches to another.
Webview can help empower you to strut into a migration status meeting with an armful of graphs and reports based on real-world data that show percentage complete, stragglers, etc. The project managers will be amazed and wonder how the network guy was able to get such wonderful data.
Route monitoring. When Netflow is exported directly from a router, it includes subnet information gleaned from the routing tables. Webview includes an optional utility called routeMon which maintains a MySQL table containing the exporter, interface, destination route, and byte count from the previous day.
This information could be tremendously powerful and it's just waiting for the right problem to solve. But, unfortunately, the predominant network management paradigm is to ignore the routing table and instead focus on managing elements. Thus far, this collected route data has mainly been useful for examining duplicate routes and checking whether they are intended. E.g., in the below report, the duplicate routes are different paths to the same remote office.

Limitations and comparison with other packages

For those comparison-shopping the open source tools, I'll state up-front some of Webview Netflow Reporter's limitations:

It cannot easily generate ad hoc graphs because Webview's paradigm is to predefine categories and continually graph them. For example, you can't use the web interface to choose some arbitrary filtering criteria and generate a graph of it over the past week. NFSen and FlowViewer both seem to have this capability. Webview can do it, of course, but it currently requires some behind-the-scenes CLI work by an admin.
Limited support for Netflow v9/Flexible Netflow/IPFIX. The flowd collector supports Netflow v9 packets, but it only saves v5-compatible fields. This is because the flow-tools reporting engine can only work on those fields. So while you can use Netflow v9 and take advantage of its improved configuration, compatibility, and one-second precision, you cannot use any of its extra fields (IPv6 or MPLS tags).
It's not pretty. As the quality of this web page can attest, the author is not a web or UI designer. Please don't be turned off by the gray-on-darker gray theme or the lack of "skins." Although it looks clunky, the interface is fast and functional. And any report can be hyperlinked so you're free to build a more user-friendly portal with webview providing the back-end content.
It's not real-time. Some packages display real-time views of flow being received. Webview is always at least 5 or 10 minutes behind.
It's not designed for those mostly interested in peering (institutions or service providers). It certainly can be used in those environments, but Webview's main focus is the enterprise with its powerful categorization. You may want to check out pmacct or FlowScan instead.
The web interface is probably not secure. It has sanity checks and form validation checks, but the code was not written with security in mind. Specifically, the multitenant capabilities of the web interface might not be bulletproof (tenants may be able to view data outside of their zone). It's not advised to expose it to the Internet or untrusted external parties unless it is front-ended by a portal.
It's not for Windows. Sorry. You can certainly install it on a Linux VM running under the free VMware Player for Windows.

System Requirements

Webview runs on a physical or virtual Linux host.

If you will be collecting netflow on fewer than 5 routers with a total of less than 200 Mbps of traffic, then sizing should not be a concern. Go ahead and install Webview on a virtual machine with one vCPU, 512MB RAM, and 40-250GB of disk, and see how it goes.

If you will be collecting much more flow data, then you need to consider other factors:

Flow exporter count -- how many devices will be sending flows?
Flow exporter interfaces -- how many total interfaces will be represented in the flow data?
Flow export rate -- what is the total volume of flow data (measured in flows/second)?
Raw flow storage -- how many days/weeks of raw flow data should be kept online for ad hoc reporting?
Traffic category count -- how many different types of traffic categories will be identified (including both general categories like video, web, and email and any more detailed categories like "untagged g.711 audio" and "fred's development server")
Resolution and history of aggregated data -- how much history is to be stored at 1-minute sampling, 5-minute sampling, 60-minute sampling, and 24-hour sampling.

Here are some general guidelines:

Flow rate depends entirely on the applications in use. Here are some enterprise guidelines:
- WAN router - about 15 flows/second for each Mbps of used bandwidth (e.g., a router with a 10 Mbps WAN circuit that averages 5 Mbps of actual usage might generate 75 flows/second)
- Data center switches - about 3 flows/second per Mbps (e.g., a 10 Gbps inter-data center link carrying about 3 Gbps of traffic might generate 9,000 flows/second).
CPU
- Multi-core processing spreads the processing workload nicely when there are many exporters
- Higher clock-rates will result in linearly better performance.
- An AMD Opteron 1.9 GHz or Intel E7-4850 2.0 GHz CPU has been observed processing:
  - default categories - 25,000 flows/second per core
  - typical custom categories - 10,000 flows/second per core
  - complex custom categories - 7,000 flows/second per core
Memory
- 512MB is fine for typical systems handling less than 5,000 flows/second.
- For larger systems, allocate 512MB per core plus 4GB if you plan to use rrdcached to optimize the disk I/O (see below).
Storage. Webview has two very different storage needs:
- Raw flow warehousing
  - Each raw flow consumes about 18 bytes. E.g., a large network with 50,000 flows/second will require 74 GB for each day of raw flow history, or 2.5TB for a month of history.
  - The flow data is written just once and read during ad hoc reports run by the web user. The speed of those reports depends on the speed of this disk.
  - It's fine to use cheaper/slower SATA drives (tier 2 or tier 3).
- Aggregate storage (RRD file format)
  - Webview's default aggregation is:
    - one day of 1-minute resolution
    - two weeks of 5-minute resolution
    - 90 days of 60-minute resolution
    - 720 days of 1-day resolution
  - With these defaults, each interface + category requires 3MB for aggregate storage. For example, 50 routers with 4 interfaces each and 25 traffic categories would consume about 50 * 4 * 25 * 3MB = 15 GB of disk.
  - This aggregate data is stored in RRD files, which:
    - are read and written constantly in the background (the read/write ratio is 50/50)
    - require very high I/O Operations Per Second (IOPS)
  - These files should be stored on fast 10K/15K RPM SAS drives configured in RAID groups (tier 1 or tier 2). Flash caching would also help.
  - rrdcached is an optional utility that reduces the I/O burden by using a very large in-memory cache. If needed, ensure that the host has 4-8GB of extra memory.
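The sizing guidelines above reduce to a few multiplications, sketched here in Python. The function names are invented, and the inputs are the rules of thumb quoted in this section (15 or 3 flows/second per Mbps, about 18 bytes per raw flow, 3MB per interface + category). Because those figures are approximate, the raw-storage result lands near the 74 GB/day quoted above rather than exactly on it.

```python
SECONDS_PER_DAY = 86_400


def flows_per_second(used_mbps, flows_per_mbps):
    """Estimated flow export rate for a given amount of used bandwidth."""
    return used_mbps * flows_per_mbps


def raw_storage_gb_per_day(flow_rate, bytes_per_flow=18):
    """Raw flow warehouse growth in GB (10^9 bytes) per day."""
    return flow_rate * SECONDS_PER_DAY * bytes_per_flow / 1e9


def aggregate_storage_gb(routers, interfaces_per_router, categories, mb_each=3):
    """RRD storage in GB for the default aggregation settings."""
    return routers * interfaces_per_router * categories * mb_each / 1000


# WAN router: 10 Mbps circuit averaging 5 Mbps of actual usage.
print(flows_per_second(5, 15))
# Data center link: about 3 Gbps of traffic.
print(flows_per_second(3000, 3))
# Large network exporting 50,000 flows/second.
print(round(raw_storage_gb_per_day(50_000), 1))
# 50 routers, 4 interfaces each, 25 traffic categories.
print(aggregate_storage_gb(50, 4, 25))
```

Treat the output as a starting point for a pilot install, then adjust once the real flow rate is visible in the exporter reports.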

History and author

The Webview Netflow package was primarily written by Craig Weinhold (craig.weinhold@cdw.com), a Cisco network engineer and CCIE. This software has been developed to address real-world needs in medium and large-size enterprise networks that the author has worked with over the years. The GPL'ing of this software was supported by the author's employer, CDW, a company that not only moves a lot of boxes through its warehouses, but also provides top-notch IT professional services around Cisco, Microsoft, IBM, NetApp, EMC, and other vendors. Our specialties include unified communications, security, data center, WAN, storage, and systems. Not surprisingly, CDW would be happy to help you with enterprise Netflow design and planning services, including the support of Webview Netflow Reporter. For more information on these commercial services, consult the Contact us page at http://www.cdw.com/content/services/professional-services.aspx.

Sourceforge.net project site
last updated: 16-June-2013