From: Peter Haag <phaag-Rn4VEauK+AKRv+LV9MX5uipxlwaOVQ5f <at> public.gmane.org>
Subject: Re: flow sequence errors and pkt receive errors
Newsgroups: gmane.network.netflow.nfdump.general
Date: Wednesday 23rd March 2011 08:56:46 UTC
Hi Jakub,

On 3/22/11 22:20, Jakub Słociński wrote:
> Hi all,
> I've noticed sequence errors in the nfcapd logs. I have no idea how to
> fix this; I suspect it could be related to the sheer volume of data, but
> the collector should be able to handle it all without problems (there
> are still free resources).
> 

Sequence errors can occur anywhere on the path from the router to the
collector. Either the router drops flows due to a full flow table, or UDP
packets get dropped somewhere along the way. It can be pretty hard to
track down the bottleneck.
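One rough way to see whether the drops happen on the collector host
itself is to let tcpdump count packets for a while; at exit it reports
how many packets the kernel dropped. Not a perfect test, but it points a
finger. Interface and port below are just examples from your setup:

  tcpdump -i eth0 -c 100000 udp port 9000 > /dev/null

If tcpdump reports a non-zero "packets dropped by kernel", the host
cannot keep up with the receive path either, which points away from the
router.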
Your socket buffer is already 1 MB. If you think it could be a disk I/O
problem, i.e. nfcapd loses packets while flushing the buffer, try to
increase the socket buffer. Internally, nfcapd keeps a memory buffer and
stores the incoming processed netflow records there before flushing the
buffer to disk. On a busy system, you should run multiple collectors in
order to prevent a socket bottleneck.
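As a rough sketch of such a setup (ports, directories and buffer sizes
below are only examples, not a recommendation for your exact traffic):

  # raise the kernel cap first: the SO_RCVBUF requested via -B is
  # limited by net.core.rmem_max
  sysctl -w net.core.rmem_max=16777216

  # one collector per exporter/port, each with its own directory,
  # pid file and socket buffer
  nfcapd -w -D -S 1 -z -B 8388608 -l /var/flow/r1 -p 9995 -P /var/run/nfcapd.r1.pid
  nfcapd -w -D -S 1 -z -B 8388608 -l /var/flow/r2 -p 9996 -P /var/run/nfcapd.r2.pid

Each exporter then sends to its own port, so one busy socket no longer
slows down the others.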
We collect around 120 GB of netflow data a day (compressed) and see maybe
10 sequence errors a day. Unfortunately, I do not see the RcvbufErrors
counter on our Debian system, but given how few sequence errors we get,
I believe not many packets are lost.
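For reference, on a Linux box that exports these counters you can read
them with either of the following (which columns appear depends on the
kernel version):

  netstat -su
  grep Udp: /proc/net/snmp

On older kernels the RcvbufErrors column is simply missing, which seems
to be the case on our box.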
Most of the time, I/O is the biggest concern. But if you see the same
behaviour on a memory file system, it must be something else.

I'd be interested in the experience of other users.

	- Peter

> == cut ==
> Mar 22 21:15:00 collector nfcapd[7125]: Total ignored packets: 0
> Mar 22 21:20:00 collector nfcapd[7125]: Ident: 'none' Flows: 28932919, Packets: 219113456, Bytes: 176178612257, Sequence Errors: 164, Bad Packets: 0
> Mar 22 21:20:00 collector nfcapd[7125]: Total ignored packets: 0
> Mar 22 21:25:00 collector nfcapd[7125]: Ident: 'none' Flows: 28932000, Packets: 223251341, Bytes: 180395194650, Sequence Errors: 180, Bad Packets: 0
> Mar 22 21:25:00 collector nfcapd[7125]: Total ignored packets: 0
> Mar 22 21:28:27 collector nfcapd[7125]: Process_v9: Found options flowset: template 256
> Mar 22 21:28:29 collector last message repeated 7 times
> Mar 22 21:30:00 collector nfcapd[7125]: Ident: 'none' Flows: 29034349, Packets: 219364965, Bytes: 176403387172, Sequence Errors: 101, Bad Packets: 0
> Mar 22 21:30:00 collector nfcapd[7125]: Total ignored packets: 0
> Mar 22 21:35:00 collector nfcapd[7125]: Ident: 'none' Flows: 28876943, Packets: 219561859, Bytes: 177397945683, Sequence Errors: 101, Bad Packets: 0
> Mar 22 21:35:00 collector nfcapd[7125]: Total ignored packets: 0
> Mar 22 21:40:00 collector nfcapd[7125]: Ident: 'none' Flows: 28737559, Packets: 219781630, Bytes: 178774087213, Sequence Errors: 120, Bad Packets: 0
> Mar 22 21:40:00 collector nfcapd[7125]: Total ignored packets: 0
> Mar 22 21:45:00 collector nfcapd[7125]: Ident: 'none' Flows: 28456744, Packets: 220060125, Bytes: 178299135875, Sequence Errors: 154, Bad Packets: 0
> == cut ==
> 
> Traffic: ~180 GB and ~222M packets per 5-minute interval, ~4.2 Gbps
> 
> The second thing I've noticed is dropped packets in the netstat output
> (uptime 22h):
> Udp:
>     85374738 packets received
>     6651 packets to unknown port received.
>     191300 packet receive errors
>     47320 packets sent
>     RcvbufErrors: 191300
> 
> No errors on the interface or in any other logs.
> nfcapd stores the data in /dev/shm and another process then moves it to
> disk, but the same problem occurred when the data was stored directly
> on disk.
> Could it be connected to the 5-minute time window rotation done by
> nfcapd, i.e. it cannot handle the high volume of data while saving /
> moving it to another file? I am doing that in RAM, so the delay should
> be minimal.
> 
> I have increased rmem_default and rmem_max to 10 and 20 MB for UDP
> receive. The RX ring parameters are set to the maximum (4096) via
> ethtool.
> It makes no difference whether this runs on 1 Gbit or 10 Gbit Ethernet,
> or how many and how fast the cores are. nfcapd takes approximately
> 2-20% CPU all the time.
> 
> nfdump version 1.6.3 with the IOS XR fix patch, run with the following
> parameters:
> # nfcapd -T +4,+5 -z -w -D -S 1 -B 1000000 -l /dev/shm/flow -p 9000 -P /var/run/pidfile
> No sampling on the router; I prefer getting the full flow information.
> No errors while exporting from the router.
> 
> Do you have any idea what else to check or change? Could dividing the
> traffic across multiple ports/nfcapd processes help?
> I suspect data loss: compared to the switchport counters, netflow does
> not seem to count everything properly.
> 
> Thanks a lot for your time and any help,

--
Be nice to your netflow data
