RE: Iperf logging oddity on Linux with iperf 1.7.0


One other item of note. 

When I run a server with 60 incoming clients and a 10 second report interval the retransmission rate as reported by tcptrace on a tcpdump session is arund 1%. Alsom the overall throughput is very stable.

If I take the same clients and change the report interval on the iperf server to 1 or 2 seconds, the retransmission rate jumps to almost 20%. The throughput per client and aggregate is very erratic.

I find this a little confusing because I wouldn't expect the reporting interval to affect performance of the TCP session, but it seems to. Is this to be expected.

Thanks
Paul


-----Original Message-----
From:	Kevin Gibbs [mailto:kgibbs --at-- ncsa.uiuc.edu]
Sent:	Sat 3/13/2004 6:24 PM
To:	iperf-users --at-- dast.nlanr.net
Cc:	
Subject:	RE: Iperf logging oddity on Linux with iperf 1.7.0

Actually the output in both cases is accurate. TCP is easier to explain 
so I will start with that one. As in my previous message about Paul's 
strange reports and TCP processing Iperf shows goodput not throughput. As 
such when a packet is received it flows through several checks. It then 
checks to see if it is time to output a periodic update. If it determines 
that it has past a time boundry it reports that data BEFORE adding the 
current packet, it then clears the temporary values and checks to make 
sure that it did not pass multiple periodic updates and reports any that 
have been past (thus these are 0 byte reports). Only after this checking 
does it add the current packets amount to the temporary reporting numbers. 
Thus lets look at the output Paul provided in this light:

> 17:44:13 [ 18] local 172.27.2.28 port 5001 connected with 172.27.2.13
> 17:44:16 [ 18]  0.0- 2.0 sec  12.8 KBytes  52.5 Kbits/sec
This report is generated after the first "packet" past the 2 second 
interval arrives. That packet will be included in the next report.

> 17:44:18 [ 18]  2.0- 4.0 sec  11.4 KBytes  46.7 Kbits/sec
This report is likewise generated by the first "packet" past the 4 second 
interval. That packet will be included in the next report.

> 17:44:23 [ 18]  4.0- 6.0 sec  2.85 KBytes  11.7 Kbits/sec
> 17:44:23 [ 18]  6.0- 8.0 sec  0.00 KBytes  0.00 Kbits/sec
This report was generated by a "packet" arriving at 44:23 resulting in two 
reports being due. Between 44:18 and 44:20 2.85KB worth of "packets" 
arrived. Also between 44:20 and 44:22 no "packets" were received. The 
"packet" arriving at 44:23 is then added to the next report as it should 
be.

> 17:44:27 [ 18]  8.0-10.0 sec  1.43 KBytes  5.84 Kbits/sec
> 17:44:27 [ 18] 10.0-12.0 sec  0.00 KBytes  0.00 Kbits/sec
Here again a "packet" arriving at 44:27 causes two reports to be generated 
one of which has no data because none was received during that period. And 
this packet is added to the 12-14 period because 44:27 - 44:13 = :14 with 
some rounding in there it is probably closer to say 13.5 seconds into the 
transfer.

I quoted all the above packets because these are "packets" as seen by the 
application which are not exactly equal to TCP packets sent on the 
network. TCP is allowed to repackage "packets" from the application to 
the network and actual packets from the network to the application.

Therefore the reports are accurate to what the application saw. This 
does not completely equate to what the network saw with all the 
retransmissions that you say you saw with TCPDump. However the report is 
of the activity above the TCP layer and is accurate. TCP uses many 
mechanisms to prevent spurious activity on both the network and 
application side. One method you could try to "force" TCP to more 
accurately show you what it has received would be to supply a smaller 
length on the server side with the (-l) parameter. If you set it to the 
MSS that the TCP connection uses (often significantly smaller than the 
network maximum frame size) then anytime an in-order segment arrives it 
should alert your application. However you will still get stored up 
out-of-order packets delivered potentially much later than they arrived. 
To get the MSS used by the connection look at the size of application data 
in 95% of the actual packets received in TCPDump. Then if that number is 
say 564 start the server:

iperf -s -l 564

Let me know if this provides "smoother" results. One potential problem 
with shrinking the length of receives is that you increase the amount of 
processing required for the stream and that may slow down the connection. 
However if you are doing low bandwidth testing with fast idle machines it 
should not hurt anything.

Now UDP on the other hand is a wholely different bag of worms. Jim is 
correct that Matt's observations can not be attributed to retransmission 
and TCP mechanisms since UDP provides no such mechanisms. My guess is that 
for whatever reason the client is not sending any packets during the 0 
byte periods. This is mostly due to scheduling issues. Iperf performs busy 
waiting to try and time the packets it sends on the network to be as close 
to CBR as possible. However it can be interrupted by just about anything 
for extended periods of time. Unlike most TCP stack implementations, Iperf 
does not get scheduled by the kernel every 200ms and on every interrupt 
from the NIC on the receiving of an ACK to create the ability to have 
control of the CPU when we want it. The same problems can exist on the 
server side where data is waiting for the socket but the thread does not 
get control of the CPU immediately to return from the recv() and process 
the results. I would be interested in a TCPDump along with a utilization 
on the serverside during a run that has 0 bytes for an interval when the 
rest are pushing 900Mb/s.

Kevin

PS: Before my time with Iperf there were no 0 byte intervals, however the 
intervals were never exact. And you would have received a report for the 
Paul's run that looked like:

17:44:13 [ 18] local 172.27.2.28 port 5001 connected with 172.27.2.13
17:44:16 [ 18]  0.0- 2.0 sec  12.8 KBytes  52.5 Kbits/sec
17:44:18 [ 18]  2.0- 4.0 sec  11.4 KBytes  46.7 Kbits/sec
17:44:23 [ 18]  4.0- 8.6 sec  2.85 KBytes  5.7 Kbits/sec
17:44:27 [ 18]  8.6-12.2 sec  1.43 KBytes  2.84 Kbits/sec

or something. Now to me this is flat out wrong. Here you not only have 
fluctuations in the transfer amount but the time intervals are variable! 
Granted you do get a average on the throughput for the 4 second intervals 
but if the application received NO data between 6.0-8.0 why not 
just come out and say it? The current implementation provides much more 
information as to when data was received.

PPS: Sorry for the HUGE response but this is probably the "short" version. 
=-)

On Thu, 11 Mar 2004, Jim Ferguson wrote:

> Paul,
> 
> This observation is something that we need to look into before pushing out 
> a 2.0 version.  A fair amount of "code cleanup" was done between the 1.65 
> and 1.70 versions, and it appears this 'short interval' behavior has been 
> repeatable.  Thanks for sending the report, along with the tidbit that 
> tcpdump was showing significant retransmissions.  I'm guessing that that is 
> *not* the case for Matt's observations, however...?
> 
> -Jim
> 
> At 08:46 AM 3/11/2004 -0600, you wrote:
> >TCPdump shows a much more consistent rate *although there are a lot of 
> >retransmissions happening). The expected rate is about 10-50K depending on 
> >fluctuations in bw allocations on the wireless system.
> >
> >I think this all goes back to iperf reports not being purely time based 
> >and the code not being able to properly detect when it is going to 
> >generate multiple reports at the same time and thus produce inaccurate 
> >reports for some intervals.
> >
> >If I change the reporting interval to 10 seconds, this is less likely to 
> >happen and the TCP connections actually perform better (fewer 
> >retransmissions and better rate).
> >
> >Paul
> >
> >
> >-----Original Message-----
> >From: Matthew J Zekauskas [mailto:matt --at-- internet2.edu]
> >Sent: Wednesday, March 10, 2004 7:30 PM
> >To: iperf-users --at-- dast.nlanr.net
> >Subject: Re: Iperf logging oddity on Linux with iperf 1.7.0
> >
> >
> >--On Wednesday, March 10, 2004 6:07 PM -0600 paul shields 
> ><pshields --at-- NAVINI.com> wrote:
> >
> > > Below is a snippet of a log from a Linux server with the report 
> > interval set for 2 seconds and
> > > timestamps added as to exactly when iperf generated the line of output. 
> > The handling of time gaps
> > > seems incorrect. (notice timestamps are the same for some report 
> > intervals and all of the data
> > > rate is attributed to the fist interval print at that time. Iperf seems 
> > to want to "catch up" on
> > > the reporting process and generates a lot of false 0.0 output lines.
> > >
> > > The report should generate output for the combined 4-8 interval instead 
> > of creating two intervals
> > > and incorrectly reporting the data rate distribution between them.
> > >
> > > Comments?
> > > Paul
> > >
> > > 17:44:13 [ 18] local 172.27.2.28 port 5001 connected with 172.27.2.13 
> > port 1215
> > > 17:44:16 [ 18]  0.0- 2.0 sec  12.8 KBytes  52.5 Kbits/sec
> > > 17:44:18 [ 18]  2.0- 4.0 sec  11.4 KBytes  46.7 Kbits/sec
> > > 17:44:23 [ 18]  4.0- 6.0 sec  2.85 KBytes  11.7 Kbits/sec
> > > 17:44:23 [ 18]  6.0- 8.0 sec  0.00 KBytes  0.00 Kbits/sec
> > > 17:44:27 [ 18]  8.0-10.0 sec  1.43 KBytes  5.84 Kbits/sec
> > > 17:44:27 [ 18] 10.0-12.0 sec  0.00 KBytes  0.00 Kbits/sec
> > > 17:44:30 [ 18] 12.0-14.0 sec  4.28 KBytes  17.5 Kbits/sec
> > > 17:44:30 [ 18] 14.0-16.0 sec  0.00 KBytes  0.00 Kbits/sec
> > > 17:44:32 [ 18] 16.0-18.0 sec  5.70 KBytes  23.4 Kbits/sec
> > > 17:44:37 [ 18] 18.0-20.0 sec  1.50 KBytes  6.16 Kbits/sec
> > > 17:44:37 [ 18] 20.0-22.0 sec  0.00 KBytes  0.00 Kbits/sec
> > > 17:44:41 [ 18] 22.0-24.0 sec  1.43 KBytes  5.84 Kbits/sec
> > > 17:44:41 [ 18] 24.0-26.0 sec  0.00 KBytes  0.00 Kbits/sec
> > > 17:44:41 [ 18] 26.0-28.0 sec  1.43 KBytes  5.84 Kbits/sec
> > > 17:44:44 [ 18] 28.0-30.0 sec  5.15 KBytes  21.1 Kbits/sec
> >
> >I've seen something similar with iperf 1.7 as well as iperf 2 betas
> >when running under RH9 on gigE-connected systems.  The bitrates
> >are quite a bit higher, but some runs have intervals that report
> >0 bits sent!  Other runs are consistent at 980-ish Mb/s.
> >
> >A related observation is that when I try to do 900 Mb/s UDP
> >streams, some runs are consistently right around 900 Mb/s,
> >and some runs also have intervals that report 0 bits sent.
> >
> >In both of these cases, the intervals with data do not try
> >to "make up" for the missing bits; overall reports show
> >low bitrate transfers (usually 50 to 200 Mb/s for me :),
> >and no interval ever spikes above the physical maximum
> >bitrate.  (What were you expecting in the runs you show?
> >is 0 bits sent over 2 seconds reasonable for you?  Have
> >you tried any tcpdump or equivalent passive traces?)
> >
> >I believe there may be an issue at the sender... alarms there
> >look like they are not delivered in time either.  Maybe I've
> >got a bogus device driver?  I haven't had time to pursue the
> >matter :(, instead choosing to filter out runs that look to
> >be completely bogus.  This is not the correct long-term solution...
> >
> >--Matt
> 








Other Mailing lists | Author Index | Date Index | Subject Index | Thread Index