Performance Testing with Traffic Load Modelling on the Viseum ATM Link
By Duncan Napier
The ACT Cinemage Group & Napier Systems Research
Vancouver, British Columbia, Canada.
Performance Testing of the Viseum ATM Link with a Traffic Load Simulator
By Duncan Napier
Advanced Cultural Technologies Inc & Napier System Research
Vancouver, BC.
The objective of these tests is to obtain quantitative performance benchmarking data for the Viseum ATM network link between Advanced Cultural Technologies' site in Vancouver, Canada, and the site at Birkbeck College in London, UK.
The test configuration comprised two Sun SPARCstations running Solaris 2.5.1. The Vancouver machine, Quadra, is a Sun 4u configuration (SPARC Ultra 1) with 128 MB of main memory. The Birkbeck machine, Vasari, is a dual-processor Sun SPARC 4m with 64 MB of main memory. Both machines use Newbridge VIVID network interface cards and they are linked through the Viseum network with Newbridge's VIVID technology. VIVID uses the ATM Forum's Multiprotocol Over ATM (MPOA) standard for routing IP over ATM. A Route Server Manager at Birkbeck College initiates and establishes traffic flow between the end nodes. Once a connection is established, traffic shapers at the nodes carry out flow control and multiplexing of the traffic signalling and data channels between VIVID Workgroup Switches (Figure 1).

Figure 1 Schematic of Viseum network
Testing for this report was carried out at the TCP and UDP transport levels. Load modeling was carried out with NetSpec (V. 2.0 and 3.0), a programmable TCP/UDP packet source and network traffic simulation tool developed by Roel Jonkman and others at the ITTC at the University of Kansas (http://www.ittc.ukans.edu/netspec).
3 Network Traffic Load Modeling and Terminology
Testing has been subdivided into three categories based on the TCP/IP network traffic load model that was used:
Full-stream testing measures the maximum throughput of the network and its components. This is implemented by writing data from memory to the network interface as quickly as possible. The result is a continuous stream of TCP blocks.
Constant bit rate measurements use bursts of network traffic that are delivered at regular intervals. The burst output is queued in NetSpec to avoid interrupting or overwriting previous bursts. This gives significantly different results from unqueued output when the burst duration is of the order of the burst period.
Variable bit rate measurements use randomly sized blocks of data that are sent in bursts at random intervals. The block sizes and intervals lie within assigned limits and conform to an exponential distribution.
The burst period is the time interval between successive bursts (Figure 2).
The block size is the length or size of the TCP data field.
The arrival rate is the rate at which the transmitting host transfers data to the network interface.
The departure rate is the rate at which data is transmitted (Tx) from the network interface across the network (also referred to as the throughput in this work).
Figure 2. Bursting terminology.
The utilization is the ratio (arrival rate)/(departure rate) (Figure 3).
The average delay for bursty traffic is estimated from Little's formula, (average queue length)/(average arrival rate).
Figure 3. Schematic of data transfer terminology.
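The two derived quantities defined above, utilization and the Little's formula delay estimate, can be sketched in a few lines of Python. This is only an illustration of the definitions and not the analysis code used for this report. The function names and the example numbers are hypothetical, and the queue length is assumed to be expressed in the same unit (blocks) as the arrival rate (blocks per second).

```python
def utilization(arrival_rate, departure_rate):
    """Utilization = (arrival rate) / (departure rate), both in the same units."""
    if departure_rate <= 0:
        raise ValueError("departure rate must be positive")
    return arrival_rate / departure_rate


def average_delay(avg_queue_length, avg_arrival_rate):
    """Little's formula: mean delay = (mean queue length) / (mean arrival rate)."""
    if avg_arrival_rate <= 0:
        raise ValueError("arrival rate must be positive")
    return avg_queue_length / avg_arrival_rate


if __name__ == "__main__":
    # Hypothetical example values, not measurements from the Viseum link.
    rho = utilization(arrival_rate=1.0e6, departure_rate=2.0e6)  # bits per second
    delay_s = average_delay(avg_queue_length=5.0, avg_arrival_rate=10.0)  # blocks, blocks/s
    print(f"utilization = {rho:.2f}, average delay = {delay_s:.2f} s")
```

A utilization below 1 means the network drains data faster than the host supplies it. A value above 1 means the queue in front of the network interface must grow.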
A full stream of TCP blocks with varying data field sizes was transmitted continuously for a fixed period of time. There was no significant degradation in throughput over transmission times spanning 10 to 600 seconds, and all test results shown are from transmissions that were 30 seconds in duration. The TCP window parameter was set to 32 Kb for optimal transmission speeds. Data was collected from a series of 8 or more repetitions of each test transmission, and the results presented are the average of these measurements. Analysis of the multiple measurements showed a 5% deviation in the results of each repetition. Only transmission data is shown, since transmit (Tx) and receive (Rx) results varied by less than 1% for TCP. For UDP traffic, only the Rx rates are shown, since a high number of UDP packets were observed to be discarded, even at low utilization levels.
TCP blocks of fixed data field sizes (12500 bytes each) were transmitted at specific time intervals between 80 and 1000 milliseconds. This was designed to test a wide range of network utilization numbers. Each test was repeated 2-3 times and the reported throughput and mean queue lengths were analyzed. The duration of each test was typically 10 seconds. Only Tx data is shown. The round trip delay for the test configuration was determined to be about 190 ms.
TCP blocks of varying size were transmitted at varying burst intervals. The burst interval times ranged from 20 ms to 200 ms. The size of the TCP blocks was varied from 8 to 1048586 bytes and followed an exponential distribution, with an average block size of 256258 bytes. The statistical scatter for this type of data source model was large, and the results of single runs (Tx data only) ranging in duration from 10 to 120 seconds are shown.
It was found that the network was able to sustain about 1.75 Megabits per second (Mbps) of full-stream TCP traffic over extended periods of time for a single user. The system throughput showed a slight degradation in performance as the size of the TCP data field was increased (Figure 4). This was attributed to buffer management, since the workload on the source and destination machines actually declines as block size is increased (Figure 5). The network shows excellent scalability in handling an increase in the number of simultaneous sessions (Figure 6).
Figure 4 is a graph of measured throughput of a continuous stream of TCP blocks plotted against TCP data field size. It shows that smaller block sizes result in slightly higher throughput rates. As the block size increases to 16 Kbyte and larger, there appears to be a slight slowdown.

Figure 4. Network traffic throughput versus TCP block size.
Figure 5 Networking load on UltraSPARC workstation CPU versus TCP block size
This is attributed to buffer congestion, as Figure 5 shows that the machine activity fell considerably as the block size was increased. A continuous stream of small blocks imposes more processing overhead at the transmitting and receiving ends, whereas larger blocks tend to challenge the buffer system. A very gradual increase in throughput for very large block sizes is also observed in Figure 4. This is mirrored by an increase in system activity in Figure 5. The results reported were recorded over multiple 30-second trials.

Figure 6 Throughput per session based on number of simultaneous sessions.
A graph of the throughput per session versus the number of simultaneous sessions is given in Figure 6. It shows excellent scalability as more sessions are added to the system. The trials were conducted using continuous 30-second streams of 16 Kbyte TCP blocks. The throughput was shown to be steady over a variety of time intervals ranging from 10 seconds to 10 minutes. Figure 6 also shows that total throughput appears to increase as the number of concurrent sessions rises. This was attributed to overhead in TCP resulting from transmission control (receiver acknowledgement and packet sequencing) and flow management (TCP window). Measurements using UDP traffic were used to determine the extent to which the mapping of TCP to lower ATM levels affects throughput. This is justified since UDP is unreliable and connectionless: it does not guarantee delivery, preserve sequence or detect duplication, and in fact does virtually nothing except add port-addressing capability to IP. Figure 7 shows that the peak receiving rate for UDP blocks is about 3.3 Mbps, roughly twice that for a single TCP-based session.

Figure 7 Graph of UDP block size versus throughput through Viseum network.
b) Constant Bit-rate Traffic
Generating timed bursts of TCP traffic controlled the arrival rate of packets to the network.
By programming NetSpec to emit TCP blocks at specific time intervals, the utilization of the network could be varied by precise amounts. Figure 8 shows the throughput through the ATM network as the burst period is changed. The parameters used in Figure 8 were obtained by bursting 12500 byte blocks over various intervals, shown on the horizontal axis. The arrival rate is the block size divided by the burst period and is shown by the dotted line in Figure 8. The data was acquired over 10 second intervals. Figure 8 shows that arrival and departure rates are virtually identical until the burst interval falls below about 80 ms. At this point the arrival rate approaches and then starts to exceed the network transmission rate. The throughput eventually plateaus around 1.7 Mbps, the peak rate for TCP. Beyond this point, the network utilization rises over 1 (that is, the arrival rate begins to exceed the departure rate/ATM throughput).
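As a rough check on these numbers, the arrival rate for each burst period can be computed directly from the block size. The sketch below is illustrative only. It assumes that 1 Mbps means 10^6 bits per second and takes the 1.7 Mbps plateau quoted above as the departure rate. The function names are hypothetical and are not part of NetSpec.

```python
BLOCK_SIZE_BYTES = 12500      # fixed block size used in the constant bit rate tests
PEAK_TCP_RATE_BPS = 1.7e6     # plateau quoted in the text; assumes 1 Mbps = 1e6 bit/s


def arrival_rate_bps(block_size_bytes, burst_period_s):
    """Arrival rate in bits per second: block size divided by burst period."""
    return block_size_bytes * 8 / burst_period_s


def saturation_period_s(block_size_bytes, departure_rate_bps):
    """Burst period at which the arrival rate equals the departure rate."""
    return block_size_bytes * 8 / departure_rate_bps


if __name__ == "__main__":
    for period_ms in (1000, 500, 200, 100, 80, 60, 50):
        rate = arrival_rate_bps(BLOCK_SIZE_BYTES, period_ms / 1000.0)
        rho = rate / PEAK_TCP_RATE_BPS
        print(f"{period_ms:5d} ms  {rate / 1e6:5.2f} Mbps  utilization {rho:4.2f}")
    limit_ms = saturation_period_s(BLOCK_SIZE_BYTES, PEAK_TCP_RATE_BPS) * 1000.0
    print(f"utilization reaches 1 at a burst period of about {limit_ms:.0f} ms")
```

Under these assumptions a 12500 byte block every 80 ms corresponds to 1.25 Mbps, and the arrival rate matches the 1.7 Mbps plateau at a burst period of roughly 59 ms.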
The system delays as utilization of the ATM network increases are shown in Figure 9. There are virtually no delays at the traffic transmission level in the Viseum network until network utilization is very close to 1. Delays increase sharply as burst periods for the fixed-size 12500 byte blocks fall below 100 ms and the utilization goes to 1. Figure 9 also demonstrates the high degree of traffic buffering that the Viseum network can accommodate. Typically, systems in which the network utilization exceeds 1 will eventually become unstable. Figure 9 shows that the Viseum network can accommodate packet delays exceeding 10 seconds over a 10 second test duration (higher arrival rates resulted in fatal errors in the network). For purposes of comparison, a conventional Ethernet network was subjected to utilizations exceeding 1 over 10 second test durations. The Ethernet network produced fatal errors by timing out when the utilization exceeded about 1.3 and the delay times exceeded 1 second. This data demonstrates the ability of the Viseum network to buffer and sustain high-speed transmissions over short periods of time.

Figure 9 . Average delay per block in seconds plotted against utilization.

Programming NetSpec to burst variable TCP block sizes over variable time intervals allowed the simulation of the flow of variable bit rate traffic through the Viseum network. The burst periods varied randomly between 20 and 200 ms, while block sizes varied according to an exponential distribution (i.e. the probability f(b) of transmitting a block of size b is k*exp(-bk), where the average block size is 1/k). The average block size for these tests was 254586 bytes. Figure 10 shows the utilization over a series of transmission times.
Over small time intervals, the arrival rates vary significantly, with the network utilization eventually stabilizing around 11. Once again, the Viseum network shows excellent buffering and delay control.
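The variable bit rate source described above can be imitated with a short Python sketch. This is not NetSpec code and makes several assumptions that the text does not confirm: burst periods are drawn uniformly between 20 and 200 ms, block sizes are drawn from an unclamped exponential distribution with the stated mean, and the departure rate is taken as the 1.7 Mbps TCP plateau. All names are hypothetical.

```python
import random

MEAN_BLOCK_BYTES = 254586          # average block size quoted in the text
MIN_PERIOD_S, MAX_PERIOD_S = 0.020, 0.200
PEAK_TCP_RATE_BPS = 1.7e6          # assumed departure rate (1 Mbps = 1e6 bit/s)


def mean_arrival_rate_bps(duration_s, seed=0):
    """Simulate the source for duration_s seconds; return the mean arrival rate."""
    rng = random.Random(seed)
    k = 1.0 / MEAN_BLOCK_BYTES     # f(b) = k * exp(-b * k), mean block size 1/k
    elapsed_s = 0.0
    total_bytes = 0.0
    while elapsed_s < duration_s:
        total_bytes += rng.expovariate(k)                      # block size in bytes
        elapsed_s += rng.uniform(MIN_PERIOD_S, MAX_PERIOD_S)   # assumed uniform
    return total_bytes * 8 / elapsed_s


if __name__ == "__main__":
    for duration_s in (1, 10, 30, 60, 120):
        rate = mean_arrival_rate_bps(duration_s)
        print(f"{duration_s:4d} s  utilization {rate / PEAK_TCP_RATE_BPS:5.1f}")
```

With a mean burst period of 110 ms, the expected arrival rate in this model is 254586 × 8 / 0.110, or about 18.5 Mbps. That is close to 11 times the assumed departure rate, in line with the long-run utilization reported here. Short runs scatter widely around that value because only a few blocks are drawn.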
Figure 11 shows the delay over increasing transmission times. The increase is virtually linear in transmission time and demonstrates that the length of the transmission queue into the Viseum network grows in proportion to time. Buffer congestion results after just over 120 seconds, causing fatal network errors. In Figure 11, a two minute transmission with arrival rates exceeding 10 times the ATM capacity causes delays of almost 60 seconds per block (!). This represents a delay several hundred times the burst duration and round trip delay time, and would cause unacceptable delays in real-time or interactive transmissions. Note that provided the transmission times are not exceedingly long (i.e. less than 2.5 minutes in this case), the Viseum network could still sustain this rate of data transfer, but at a slower, buffered rate.

Figure 11 Delays in transmission versus the transmission time.
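The near-linear growth of delay with transmission time is what a simple fluid approximation would predict. The sketch below is an interpretation and not part of the original analysis. It assumes constant arrival and departure rates, an initially empty queue and no losses, so that the backlog grows as (arrival rate - departure rate) × t. Averaging the backlog over a transmission of length T and applying Little's formula gives an average delay of (1 - 1/utilization) × T / 2. The function name is hypothetical.

```python
def fluid_average_delay_s(utilization, transmission_time_s):
    """Average delay under a fluid model with an initially empty, lossless queue.

    Mean backlog over [0, T] is (arrival - departure) * T / 2. Dividing by the
    arrival rate (Little's formula) gives (1 - 1/utilization) * T / 2.
    """
    if utilization <= 1.0:
        return 0.0  # no sustained backlog builds up in this idealized model
    return (1.0 - 1.0 / utilization) * transmission_time_s / 2.0


if __name__ == "__main__":
    for t_s in (10, 30, 60, 120):
        delay = fluid_average_delay_s(utilization=11.0, transmission_time_s=t_s)
        print(f"T = {t_s:4d} s  estimated average delay = {delay:5.1f} s")
```

For a utilization of 11 and a 120 second transmission, this estimate is about 54.5 seconds, which is of the same order as the delay of almost 60 seconds per block reported above.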
(i) Implementing NetSpec model traffic types, including WWW, CBR voice traffic, teleconference video and MPEG video.