Clarification on behaviour of Iperf's -w option
Hi all,
I'm a new user of Iperf and I've been using v2.0.2 on FreeBSD 6.2 for
some TCP research I'm involved with. I was hoping you could help clarify
some confusion I'm having with Iperf's "-w" (TCP window size) option.
Using a new tool we developed named SIFTR
(http://caia.swin.edu.au/urp/newtcp/tools.html), we are able to
investigate the internal state of TCP connections on our testbed,
getting details like congestion window, send window and receive window.
We have been using Iperf to test SIFTR.
We ran an Iperf server on one FreeBSD 6.2 machine, connected via a
crossover cable to an identical FreeBSD 6.2 machine which ran the Iperf
client.
When we run "iperf -s" on the server, we see the message printed to
screen: "TCP window size: 64.0 KByte (default)"
From reading the source code, the keyword "default" indicates that we
have not overridden the system default window size. In FreeBSD, the two
default socket buffer sizes are controlled by the sysctl variables
"net.inet.tcp.recvspace" and "net.inet.tcp.sendspace". The advertised
window should equal the size of our receive buffer
(net.inet.tcp.recvspace), which defaults to 65536 bytes in FreeBSD. The
Iperf server reports a window of 64.0 KByte, and 65536/1024 = 64.0, so
this is as we would expect, i.e. it is correctly reporting the use of
the system's default receive socket buffer size as the advertised
window.
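
As a rough way to cross-check these defaults from user space, the sketch
below reads both buffer sizes from a fresh, unconnected TCP socket. It
is written in Python rather than taken from Iperf, the function name
default_tcp_buffers is made up for illustration, and the values returned
depend on the operating system and its current sysctl settings.

```python
import socket


def default_tcp_buffers():
    """Return (recv_bytes, send_bytes) for a fresh, unconnected TCP socket."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        rcv = s.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
        snd = s.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
    return rcv, snd


if __name__ == "__main__":
    rcv, snd = default_tcp_buffers()
    # Same unit conversion as in the text: bytes / 1024.
    print("receive buffer: %.1f KByte" % (rcv / 1024.0))
    print("send buffer:    %.1f KByte" % (snd / 1024.0))
```

On a stock FreeBSD 6.2 machine one would expect these to line up with
recvspace and sendspace, but that is an expectation and not something
the sketch itself verifies.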
However, when we run the Iperf client using the command "iperf -c
10.0.0.3" (where 10.0.0.3 is the IP address of the machine we ran the
iperf server on), we see the message printed to screen: "TCP window
size: 32.5 KByte (default)".
The value of 32.5 KByte is slightly above the default value of FreeBSD's
net.inet.tcp.sendspace sysctl variable, which is 32768 bytes. I don't
know why the value is not exactly equal to sendspace, but I do know that
in client mode Iperf reports the size of the buffer _after_ a connection
to the server has been established. By some mechanism within the kernel
that is unknown to me, the buffer is sized close to the default but not
exactly equal to it. The comments in tcp_window_size.c seem to support
this, stating that the buffer size is not validated until a connection
has been established, i.e. the buffer size printed by Iperf in server
mode would also differ slightly from 64.0 KByte if it were printed when
a client connected rather than once on startup.
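
To make the size of the discrepancy concrete, here is a small Python
sketch of the arithmetic. It assumes the figures in this post are the
byte count divided by 1024 and printed to one decimal place; that is
inferred from the output quoted here, not taken from Iperf's source. The
helper names kbyte_label and bytes_matching are hypothetical.

```python
def kbyte_label(nbytes):
    """Format a byte count as 'N.N KByte' (assumed display format)."""
    return "%.1f KByte" % (nbytes / 1024.0)


def bytes_matching(label, lo, hi):
    """Return the (min, max) byte counts in [lo, hi] that print as label."""
    hits = [n for n in range(lo, hi + 1) if kbyte_label(n) == label]
    return hits[0], hits[-1]


print(kbyte_label(65536))  # default recvspace
print(kbyte_label(32768))  # default sendspace
print(bytes_matching("32.5 KByte", 32768, 34816))
```

Under that assumption, 32768 bytes would print as 32.0 KByte, and a
displayed 32.5 KByte corresponds to a buffer of between 33229 and 33331
bytes, i.e. 461 to 563 bytes larger than sendspace.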
Because the reported value of 32.5 KByte is based on the send buffer,
Iperf is _apparently_ telling us that we advertise the size of our send
buffer as our TCP window, rather than the size of our receive buffer.
Attempting to manually override the window on the client using the "-w"
option to Iperf does indeed modify the value reported as the TCP window
by Iperf e.g. running "iperf -c 136.186.229.192 -w 100K" will report
"TCP window size: 100 KByte (WARNING: requested 100 KByte)"
However, further investigation using SIFTR to show what the kernel
thought was going on revealed that the client is _actually_ advertising
64.0 KByte as the window regardless of whether the "-w" option is set,
i.e. the network stack is still happily advertising the size of the
receive buffer as the initial window, completely independent of the
value Iperf reports as the TCP window. The server side, on the other
hand, behaves as expected and advertises the size of its receive buffer
correctly.
To put this discussion into context, the iperf man page and help output
describe the "-w" option as "TCP window size (socket buffer size)".
Delving into the Iperf code, we found that in tcp_window_size.c, there
is an if/else statement that goes something like this (in pseudo code):
    if ( !inSend )
    {
        set the socket's receive buffer to the value specified by the user
    }
    else
    {
        set the socket's send buffer to the value specified by the user
    }
inSend is false if Iperf is being run as a server, and true if Iperf is
being run as a client.
The net result of this code is that the "-w" option on an iperf client
modifies the size of the send space buffer, which actually has no effect
on the window advertised by the client to the server.
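
The effect can be reproduced outside Iperf with a few lines of Python.
This is only a sketch of the branch described above, not Iperf's actual
code: set_window and client_side_effect are hypothetical names, and the
exact buffer sizes the kernel reports back vary between operating
systems.

```python
import socket


def set_window(sock, size, in_send):
    # Same branch as the pseudo code: a client (in_send=True) sets the
    # send buffer, a server (in_send=False) sets the receive buffer.
    opt = socket.SO_SNDBUF if in_send else socket.SO_RCVBUF
    sock.setsockopt(socket.SOL_SOCKET, opt, size)


def client_side_effect(size):
    """Return the receive buffer size before and after a client-style -w."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        before = s.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
        set_window(s, size, in_send=True)
        after = s.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
    return before, after


if __name__ == "__main__":
    before, after = client_side_effect(100 * 1024)
    print("receive buffer before: %d bytes, after: %d bytes" % (before, after))
```

If the receive buffer is the same before and after, the client-side
"-w" has not touched the buffer that determines the advertised window,
which matches what SIFTR showed.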
Based on all of this, it would seem that the "-w" option is behaving in
an unexpected manner, given that the documentation describes the option
as adjusting the TCP window size. One could also argue that the Iperf
client reporting the size of the send buffer and claiming it to be the
TCP window size is also unexpected behaviour.
Can anyone shed some light on this behaviour? Would it be worth creating
and submitting a patch so that the -w option always adjusts the socket's
receive buffer, which will in turn cause the advertised window to equal
the value specified in the -w option?
Perhaps there should be a new command line switch added that allows the
user to manipulate both the send and receive buffer size used for the
Iperf socket?
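
To illustrate the kind of behaviour being proposed, here is a minimal
Python sketch in which the two buffers are set independently. It is not
a patch against Iperf: apply_buffers and its parameters are hypothetical
names, and how the two sizes would map onto command line switches is
left open.

```python
import socket


def apply_buffers(sock, recv_size=None, send_size=None):
    """Set either or both socket buffers; return (recv_bytes, send_bytes)."""
    if recv_size is not None:
        # The receive buffer is what bounds the advertised window.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, recv_size)
    if send_size is not None:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, send_size)
    # Read the sizes back, since the kernel may adjust what was requested.
    return (sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF),
            sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF))


if __name__ == "__main__":
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        rcv, snd = apply_buffers(s, recv_size=100 * 1024)
        print("receive buffer: %d bytes, send buffer: %d bytes" % (rcv, snd))
```

Leaving a size as None keeps the system default for that buffer, so the
same call would cover "receive only", "send only" and "both".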
Thanks for your help!
Cheers,
Lawrence
http://caia.swin.edu.au/