RE: Iperf logging oddity on Linux with iperf 1.7.0


Would increasing the receive buffers help alleviate the problem?

Thanks for the explanation
Paul


-----Original Message-----
From: Kevin Gibbs [mailto:kgibbs --at-- ncsa.uiuc.edu]
Sent: Sunday, March 14, 2004 3:16 PM
To: iperf-users --at-- dast.nlanr.net
Subject: RE: Iperf logging oddity on Linux with iperf 1.7.0


Actually this is a nasty side effect of smaller reporting intervals 
especially with large number of clients. With 50 connections trying to all 
print out at 1 second intervals you are trying to print to screen one line 
every 20ms. This is a huge strain. Also in 1.7.0 (and all previous 
versions) the reporting is done in the thread that is receiving. Therefore 
it is not waiting on a recv() call and that can lead to TCP not having 
enough buffer space when it needs it and thus causing packets to get 
dropped (by TCP not the network) and then need to be retransmitted. 
Whenever retransmission happens the throughput will be erratic because 
slow start can restart and congestion avoidance mechanisms throttle back 
on the speed. In 2.0 I changed that behavior so that recv threads do as 
little as possible then return to waiting on a recv() call. Then 
one thread is responsible for all the reporting so there are no 
huge synchronization problems when printing. However there is a nasty heap 
or stack corruption bug that is limiting the number of connections to < 6 
on the current beta. Hope that makes sense.

Kevin

On Sat, 13 Mar 2004, paul shields wrote:

> One other item of note. 
> 
> When I run a server with 60 incoming clients and a 10 second report interval the retransmission rate as reported by tcptrace on a tcpdump session is arund 1%. Alsom the overall throughput is very stable.
> 
> If I take the same clients and change the report interval on the iperf server to 1 or 2 seconds, the retransmission rate jumps to almost 20%. The throughput per client and aggregate is very erratic.
> 
> I find this a little confusing because I wouldn't expect the reporting interval to affect performance of the TCP session, but it seems to. Is this to be expected.
> 
> Thanks
> Paul
> 
> 
> -----Original Message-----
> From:	Kevin Gibbs [mailto:kgibbs --at-- ncsa.uiuc.edu]
> Sent:	Sat 3/13/2004 6:24 PM
> To:	iperf-users --at-- dast.nlanr.net
> Cc:	
> Subject:	RE: Iperf logging oddity on Linux with iperf 1.7.0
> 
> Actually the output in both cases is accurate. TCP is easier to explain 
> so I will start with that one. As in my previous message about Paul's 
> strange reports and TCP processing Iperf shows goodput not throughput. As 
> such when a packet is received it flows through several checks. It then 
> checks to see if it is time to output a periodic update. If it determines 
> that it has past a time boundry it reports that data BEFORE adding the 
> current packet, it then clears the temporary values and checks to make 
> sure that it did not pass multiple periodic updates and reports any that 
> have been past (thus these are 0 byte reports). Only after this checking 
> does it add the current packets amount to the temporary reporting numbers. 
> Thus lets look at the output Paul provided in this light:
> 
> > 17:44:13 [ 18] local 172.27.2.28 port 5001 connected with 172.27.2.13
> > 17:44:16 [ 18]  0.0- 2.0 sec  12.8 KBytes  52.5 Kbits/sec
> This report is generated after the first "packet" past the 2 second 
> interval arrives. That packet will be included in the next report.
> 
> > 17:44:18 [ 18]  2.0- 4.0 sec  11.4 KBytes  46.7 Kbits/sec
> This report is likewise generated by the first "packet" past the 4 second 
> interval. That packet will be included in the next report.
> 
> > 17:44:23 [ 18]  4.0- 6.0 sec  2.85 KBytes  11.7 Kbits/sec
> > 17:44:23 [ 18]  6.0- 8.0 sec  0.00 KBytes  0.00 Kbits/sec
> This report was generated by a "packet" arriving at 44:23 resulting in two 
> reports being due. Between 44:18 and 44:20 2.85KB worth of "packets" 
> arrived. Also between 44:20 and 44:22 no "packets" were received. The 
> "packet" arriving at 44:23 is then added to the next report as it should 
> be.
> 
> > 17:44:27 [ 18]  8.0-10.0 sec  1.43 KBytes  5.84 Kbits/sec
> > 17:44:27 [ 18] 10.0-12.0 sec  0.00 KBytes  0.00 Kbits/sec
> Here again a "packet" arriving at 44:27 causes two reports to be generated 
> one of which has no data because none was received during that period. And 
> this packet is added to the 12-14 period because 44:27 - 44:13 = :14 with 
> some rounding in there it is probably closer to say 13.5 seconds into the 
> transfer.
> 
> I quoted all the above packets because these are "packets" as seen by the 
> application which are not exactly equal to TCP packets sent on the 
> network. TCP is allowed to repackage "packets" from the application to 
> the network and actual packets from the network to the application.
> 
> Therefore the reports are accurate to what the application saw. This 
> does not completely equate to what the network saw with all the 
> retransmissions that you say you saw with TCPDump. However the report is 
> of the activity above the TCP layer and is accurate. TCP uses many 
> mechanisms to prevent spurious activity on both the network and 
> application side. One method you could try to "force" TCP to more 
> accurately show you what it has received would be to supply a smaller 
> length on the server side with the (-l) parameter. If you set it to the 
> MSS that the TCP connection uses (often significantly smaller than the 
> network maximum frame size) then anytime an in-order segment arrives it 
> should alert your application. However you will still get stored up 
> out-of-order packets delivered potentially much later than they arrived. 
> To get the MSS used by the connection look at the size of application data 
> in 95% of the actual packets received in TCPDump. Then if that number is 
> say 564 start the server:
> 
> iperf -s -l 564
> 
> Let me know if this provides "smoother" results. One potential problem 
> with shrinking the length of receives is that you increase the amount of 
> processing required for the stream and that may slow down the connection. 
> However if you are doing low bandwidth testing with fast idle machines it 
> should not hurt anything.
> 
> Now UDP on the other hand is a wholely different bag of worms. Jim is 
> correct that Matt's observations can not be attributed to retransmission 
> and TCP mechanisms since UDP provides no such mechanisms. My guess is that 
> for whatever reason the client is not sending any packets during the 0 
> byte periods. This is mostly due to scheduling issues. Iperf performs busy 
> waiting to try and time the packets it sends on the network to be as close 
> to CBR as possible. However it can be interrupted by just about anything 
> for extended periods of time. Unlike most TCP stack implementations, Iperf 
> does not get scheduled by the kernel every 200ms and on every interrupt 
> from the NIC on the receiving of an ACK to create the ability to have 
> control of the CPU when we want it. The same problems can exist on the 
> server side where data is waiting for the socket but the thread does not 
> get control of the CPU immediately to return from the recv() and process 
> the results. I would be interested in a TCPDump along with a utilization 
> on the serverside during a run that has 0 bytes for an interval when the 
> rest are pushing 900Mb/s.
> 
> Kevin
> 
> PS: Before my time with Iperf there were no 0 byte intervals, however the 
> intervals were never exact. And you would have received a report for the 
> Paul's run that looked like:
> 
> 17:44:13 [ 18] local 172.27.2.28 port 5001 connected with 172.27.2.13
> 17:44:16 [ 18]  0.0- 2.0 sec  12.8 KBytes  52.5 Kbits/sec
> 17:44:18 [ 18]  2.0- 4.0 sec  11.4 KBytes  46.7 Kbits/sec
> 17:44:23 [ 18]  4.0- 8.6 sec  2.85 KBytes  5.7 Kbits/sec
> 17:44:27 [ 18]  8.6-12.2 sec  1.43 KBytes  2.84 Kbits/sec
> 
> or something. Now to me this is flat out wrong. Here you not only have 
> fluctuations in the transfer amount but the time intervals are variable! 
> Granted you do get a average on the throughput for the 4 second intervals 
> but if the application received NO data between 6.0-8.0 why not 
> just come out and say it? The current implementation provides much more 
> information as to when data was received.
> 
> PPS: Sorry for the HUGE response but this is probably the "short" version. 
> =-)
> 
> On Thu, 11 Mar 2004, Jim Ferguson wrote:
> 
> > Paul,
> > 
> > This observation is something that we need to look into before pushing out 
> > a 2.0 version.  A fair amount of "code cleanup" was done between the 1.65 
> > and 1.70 versions, and it appears this 'short interval' behavior has been 
> > repeatable.  Thanks for sending the report, along with the tidbit that 
> > tcpdump was showing significant retransmissions.  I'm guessing that that is 
> > *not* the case for Matt's observations, however...?
> > 
> > -Jim
> > 
> > At 08:46 AM 3/11/2004 -0600, you wrote:
> > >TCPdump shows a much more consistent rate *although there are a lot of 
> > >retransmissions happening). The expected rate is about 10-50K depending on 
> > >fluctuations in bw allocations on the wireless system.
> > >
> > >I think this all goes back to iperf reports not being purely time based 
> > >and the code not being able to properly detect when it is going to 
> > >generate multiple reports at the same time and thus produce inaccurate 
> > >reports for some intervals.
> > >
> > >If I change the reporting interval to 10 seconds, this is less likely to 
> > >happen and the TCP connections actually perform better (fewer 
> > >retransmissions and better rate).
> > >
> > >Paul
> > >
> > >
> > >-----Original Message-----
> > >From: Matthew J Zekauskas [mailto:matt --at-- internet2.edu]
> > >Sent: Wednesday, March 10, 2004 7:30 PM
> > >To: iperf-users --at-- dast.nlanr.net
> > >Subject: Re: Iperf logging oddity on Linux with iperf 1.7.0
> > >
> > >
> > >--On Wednesday, March 10, 2004 6:07 PM -0600 paul shields 
> > ><pshields --at-- NAVINI.com> wrote:
> > >
> > > > Below is a snippet of a log from a Linux server with the report 
> > > interval set for 2 seconds and
> > > > timestamps added as to exactly when iperf generated the line of output. 
> > > The handling of time gaps
> > > > seems incorrect. (notice timestamps are the same for some report 
> > > intervals and all of the data
> > > > rate is attributed to the fist interval print at that time. Iperf seems 
> > > to want to "catch up" on
> > > > the reporting process and generates a lot of false 0.0 output lines.
> > > >
> > > > The report should generate output for the combined 4-8 interval instead 
> > > of creating two intervals
> > > > and incorrectly reporting the data rate distribution between them.
> > > >
> > > > Comments?
> > > > Paul
> > > >
> > > > 17:44:13 [ 18] local 172.27.2.28 port 5001 connected with 172.27.2.13 
> > > port 1215
> > > > 17:44:16 [ 18]  0.0- 2.0 sec  12.8 KBytes  52.5 Kbits/sec
> > > > 17:44:18 [ 18]  2.0- 4.0 sec  11.4 KBytes  46.7 Kbits/sec
> > > > 17:44:23 [ 18]  4.0- 6.0 sec  2.85 KBytes  11.7 Kbits/sec
> > > > 17:44:23 [ 18]  6.0- 8.0 sec  0.00 KBytes  0.00 Kbits/sec
> > > > 17:44:27 [ 18]  8.0-10.0 sec  1.43 KBytes  5.84 Kbits/sec
> > > > 17:44:27 [ 18] 10.0-12.0 sec  0.00 KBytes  0.00 Kbits/sec
> > > > 17:44:30 [ 18] 12.0-14.0 sec  4.28 KBytes  17.5 Kbits/sec
> > > > 17:44:30 [ 18] 14.0-16.0 sec  0.00 KBytes  0.00 Kbits/sec
> > > > 17:44:32 [ 18] 16.0-18.0 sec  5.70 KBytes  23.4 Kbits/sec
> > > > 17:44:37 [ 18] 18.0-20.0 sec  1.50 KBytes  6.16 Kbits/sec
> > > > 17:44:37 [ 18] 20.0-22.0 sec  0.00 KBytes  0.00 Kbits/sec
> > > > 17:44:41 [ 18] 22.0-24.0 sec  1.43 KBytes  5.84 Kbits/sec
> > > > 17:44:41 [ 18] 24.0-26.0 sec  0.00 KBytes  0.00 Kbits/sec
> > > > 17:44:41 [ 18] 26.0-28.0 sec  1.43 KBytes  5.84 Kbits/sec
> > > > 17:44:44 [ 18] 28.0-30.0 sec  5.15 KBytes  21.1 Kbits/sec
> > >
> > >I've seen something similar with iperf 1.7 as well as iperf 2 betas
> > >when running under RH9 on gigE-connected systems.  The bitrates
> > >are quite a bit higher, but some runs have intervals that report
> > >0 bits sent!  Other runs are consistent at 980-ish Mb/s.
> > >
> > >A related observation is that when I try to do 900 Mb/s UDP
> > >streams, some runs are consistently right around 900 Mb/s,
> > >and some runs also have intervals that report 0 bits sent.
> > >
> > >In both of these cases, the intervals with data do not try
> > >to "make up" for the missing bits; overall reports show
> > >low bitrate transfers (usually 50 to 200 Mb/s for me :),
> > >and no interval ever spikes above the physical maximum
> > >bitrate.  (What were you expecting in the runs you show?
> > >is 0 bits sent over 2 seconds reasonable for you?  Have
> > >you tried any tcpdump or equivalent passive traces?)
> > >
> > >I believe there may be an issue at the sender... alarms there
> > >look like they are not delivered in time either.  Maybe I've
> > >got a bogus device driver?  I haven't had time to pursue the
> > >matter :(, instead choosing to filter out runs that look to
> > >be completely bogus.  This is not the correct long-term solution...
> > >
> > >--Matt
> > 
> 
> 
> 
> 
> 
> 



Other Mailing lists | Author Index | Date Index | Subject Index | Thread Index