We measured the performance impact of our BSD implementation of HYDRANET on a small testbed consisting, for measurement purposes, of two Pentium/120 PCs and one 486/33 SX PC.
We used ttcp to measure the overhead in redirectors and host servers. In our testbed, the 486/33 SX PC is slow enough to act as the bottleneck. By swapping the hosts in the configuration, we could make either the redirector or the receiving host server the bottleneck. We compared the sustained TCP bandwidth for the following three series of measurements. (For these measurements, we turned off the buffering of small segments at the TCP sender, preventing it from batching multiple small writes into a single segment of MTU size.)
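On BSD-derived TCP stacks, this sender-side batching corresponds to Nagle's algorithm, which can be disabled per socket with the TCP_NODELAY option. The following minimal sketch illustrates the setting we disabled; it assumes a standard sockets API and is not code from the HYDRANET implementation.

    #include <netinet/in.h>
    #include <netinet/tcp.h>
    #include <sys/socket.h>

    /* Disable sender-side batching of small segments (Nagle's
     * algorithm) on an already-created TCP socket, so that small
     * writes are not coalesced into MTU-sized segments. */
    int disable_batching(int sock)
    {
        int on = 1;
        return setsockopt(sock, IPPROTO_TCP, TCP_NODELAY,
                          &on, sizeof(on));
    }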
The above comparisons were made for two configurations of the testbed. In the first set of experiments, the 486/33 SX PC was configured as the redirector, making the redirector the bottleneck. Figure 3a illustrates the performance. The results indicate that the overhead on the router/redirector is negligible. We see a small penalty for redirection when the packet size is close to the MTU size (1500 bytes) or slightly larger. This is due to the additional datagram fragmentation needed in these cases to make room for the encapsulation header. For example, a 1600-byte TCP message is fragmented at the sender into two datagrams: one of MTU size, typically 1500 bytes including the TCP and IP headers, and one carrying the rest. When the first datagram reaches a redirector, the redirector cannot accommodate the encapsulation header within the datagram size limit. It must therefore fragment the datagram further, which costs additional computation at the router and additional bandwidth on the links. This problem is common to all approaches based on encapsulation.
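To make the fragmentation penalty concrete, the sketch below reproduces the arithmetic of this example. The 20-byte IP, TCP, and encapsulation header sizes are illustrative assumptions, not values taken from our implementation.

    #include <stdio.h>

    /* Illustrative arithmetic only: header sizes are assumptions. */
    enum { MTU = 1500, IP_HDR = 20, TCP_HDR = 20, ENCAP_HDR = 20 };

    int main(void)
    {
        int msg = 1600;  /* TCP message size from the example above */

        /* Sender-side segmentation: payload carried per full-MTU
         * datagram, and the number of datagrams the sender emits. */
        int per_dgram = MTU - IP_HDR - TCP_HDR;          /* 1460 */
        int dgrams = (msg + per_dgram - 1) / per_dgram;  /* 2 */

        /* At the redirector, each datagram grows by ENCAP_HDR bytes.
         * A datagram already at MTU size then exceeds the MTU and
         * must be split into two fragments. */
        int full = msg / per_dgram;   /* datagrams already at MTU size */
        int on_wire = dgrams + full;  /* each full one becomes two */

        printf("%d-byte message: %d datagrams from the sender, "
               "%d on the wire after encapsulation\n",
               msg, dgrams, on_wire);
        return 0;
    }

For the 1600-byte message this prints 2 datagrams from the sender and 3 on the wire, matching the penalty described above.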
In the second set of experiments, the 486/33 SX PC was configured as the host server, making the host server the bottleneck. Figure 3b illustrates the performance. We notice a significant drop in sustained bandwidth for the case of redirection. Interestingly, the fixed per-packet processing overhead appears to be negligible, as the results for small packets show. This indicates that the overhead must be caused by excessive copying of datagram contents as they are processed by the host server. We are optimistic that these data copies can be eliminated, with a beneficial effect on performance. In addition, the host server in these experiments was severely overloaded; running replicated services on such anemic servers would be a poor choice regardless of the replication scheme used.
Separate measurements of TCP connection-setup latency indicate that connection setups to replicated TCP ports on host servers take only marginally longer (less than 0.1 msec) than setups to traditional ports.
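We do not detail the latency-measurement harness here; the sketch below shows one way a single connection setup can be timed with BSD sockets. The server address and port are placeholders, and the harness is an assumption rather than the code used in our experiments.

    #include <stdio.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <sys/time.h>
    #include <netinet/in.h>
    #include <arpa/inet.h>
    #include <unistd.h>

    /* Time a single TCP connection setup (three-way handshake). */
    int main(void)
    {
        struct sockaddr_in addr;
        struct timeval t0, t1;

        memset(&addr, 0, sizeof(addr));
        addr.sin_family = AF_INET;
        addr.sin_port = htons(9000);                    /* placeholder */
        inet_pton(AF_INET, "10.0.0.2", &addr.sin_addr); /* placeholder */

        int s = socket(AF_INET, SOCK_STREAM, 0);
        if (s < 0) { perror("socket"); return 1; }

        gettimeofday(&t0, NULL);
        if (connect(s, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
            perror("connect");
            return 1;
        }
        gettimeofday(&t1, NULL);

        long usec = (t1.tv_sec - t0.tv_sec) * 1000000L
                  + (t1.tv_usec - t0.tv_usec);
        printf("connection setup took %ld usec\n", usec);
        close(s);
        return 0;
    }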