Project 4: Analysis of RTP Packet Delay and Loss

Note: Examples of student project reports will be made available to course instructors upon request (send email).

The goal of this project is to analyze delays and loss of RTP packets during a real-time conference session over a wired and wireless network.

Where to find information about RTP/RTCP:

Textbook, Section 3.4
Wikipedia page for RTP and the Wireshark Wiki for RTP
The most detailed and authoritative source is RFC-3550

1. Experiment Description

This project is based on the data collected in WS Project #3. If such data are not available, the student must first perform Project #3 and collect the data to be analyzed in this project.

2. Captured Data Analysis

The following analysis requires drawing several different histograms of observed data. A key issue is selecting the size δ of bins for each histogram, i.e., the time units for its horizontal axis. Try different bin sizes and select one bin size δ_optimal for each histogram as the most informative for the problem at hand. Include in your report all histograms with different bin sizes δ that you experimented with. Discuss the different histogram shapes that you observe and explain how you selected the most informative bin size.

A particularly useful method to uncover temporal behaviors (such as periodicity) in a time series is a recurrence plot. Also see this introduction: Recurrence Plots At A Glance. Related concepts include “Ruelle-Takens phase portrait”, “phase diagram”, or “phase plot”.
You may start by drawing a phase plot, which is a two-dimensional diagram where the horizontal axis represents the current value of a variable, x(t), and the vertical axis represents the next value of the same variable, x(t+1).
Does your phase plot exhibit any patterns? For example, the points may be aligned on or around the diagonal line x(t+1) = x(t), or the points may be scattered around the plane. Describe your observations.
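The points of such a phase plot can be prepared with a few lines of code. The following is a minimal sketch, assuming the observed values (for example, transit delays in seconds) are already in a list in order of observation; the function name `phase_pairs`, the `lag` parameter, and the sample values are illustrative only.

```python
def phase_pairs(series, lag=1):
    """Return the points (x(t), x(t+lag)) of a phase plot for a sequence."""
    return list(zip(series[:-lag], series[lag:]))


# Made-up sample delays in seconds, for illustration only.
delays = [0.020, 0.021, 0.019, 0.045, 0.020]

# Each pair is one point: horizontal = current value, vertical = next value.
for x_now, x_next in phase_pairs(delays):
    print(f"{x_now:.3f}\t{x_next:.3f}")
```

The resulting pairs can be pasted into a spreadsheet or passed to any scatter-plot tool. Points near the diagonal x(t+1) = x(t) indicate that consecutive values are similar.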

2.1 Calculating the Delay and Delay Jitter from RTP Packets

As this discussion illustrates, extracting the delay jitter from RTP packets is not a straightforward task.

For each media source (such as audio or video, identified by its SSRC number), we will consider: (1) transit delay of individual packets and (2) inter-arrival time between subsequent packets.

2.1.1 Transit Delay of Individual Packets

Wireshark records the arrival time of each packet (find it in the Wireshark packet frame description). However, note that this time may be set relative to the start of the session, rather than the absolute time (known as “wall-clock time”).
Alternatively, the arrival time may be represented as the absolute time but in your geographic location, in which case you need to convert it to the UTC time because NTP timestamps are represented in the UTC time zone.
In addition, we cannot directly use the timestamps found in RTP packets. Recall that the timestamp at the start of each session is initialized randomly and incremented by a fixed amount for each “data segment”, which may be different from a transmitted packet. For audio, a “data segment” usually corresponds to 20 milliseconds of audio recording, which is transmitted in a single packet. For video, a single video frame may be broken and transmitted in several packets, all of which will carry the same timestamp. This situation can be detected by checking the marker (M) bit of the RTP header (see Section 5.1 of RFC-3550). The timestamp clock is incremented by a fixed number for each sampling period. For example, if you transmit audio sampled at the usual τ = 8000 Hertz, the increment unit is 1/8000 of a second. Then, if the sending audio application records, say, 160 sampling periods from a microphone, the timestamp will be increased by 160 units for each such audio segment, regardless of whether the segment is transmitted in a packet or dropped as silent. Therefore, you may observe gaps in timestamps even if none of the packets were lost during transmission.
There may be some issues of aligning the time axes for audio and video, as discussed in “How to calculate effective time offset in RTP” and “here.”

Therefore, before we can start delay calculation, we need to perform the following conversion of RTP timestamps. First, check carefully the format of RTCP Sender Report (SR) packets in Section 6.4.1 of RFC-3550.
Given a stream of RTP/RTCP packets from a source SSRC, you must do this:

Find the first RTCP Sender Report (SR) packet received from this SSRC.
Extract its NTP timestamp (64 bits), which is the “wall-clock time” when this SR packet was sent.
In the same SR packet, after the NTP timestamp, find the corresponding RTP timestamp (32 bits). This timestamp corresponds to the same time as the NTP timestamp (above), but in the same units and with the same random offset as the RTP timestamps in data packets.
For example, if the non-converted RTP timestamp has the value 139680 and the NTP timestamp has the value November 5, 2014 17:59:34.418842000, then you know that both values refer to the same instant in time.
Because RTCP SR packets are sent very rarely compared to RTP data packets, you cannot use this conversion for each RTP packet. Instead, use the reference value computed in the previous step for all RTP packets received until the next RTCP SR packet.
For example, assuming that RTP data is audio sampled at 8000 Hertz, and the timestamp in the next RTP packet is 139840, then the actual timestamp is calculated as:
Incremental unit difference: 139840 – 139680 = 160 units
Time difference: 160 / 8000 = 0.02 seconds
Wall-clock time: 17:59:34.418842000 + 0:0:0.02 = 17:59:34.438842000 (on November 5, 2014)
Each time you receive a subsequent SR packet, re-align your RTP timestamps with the NTP timestamp from that packet.
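The steps above can be sketched in code as follows. This is an illustration under stated assumptions, not a prescribed implementation: the function names are hypothetical, the 64-bit NTP timestamp is assumed to be available as its two 32-bit halves (seconds since January 1, 1900 and a binary fraction of a second), and results are rounded to microseconds.

```python
from datetime import datetime, timedelta, timezone

# NTP timestamps count seconds from January 1, 1900 (UTC).
NTP_EPOCH = datetime(1900, 1, 1, tzinfo=timezone.utc)


def ntp_to_datetime(ntp_msw, ntp_lsw):
    """Convert the two 32-bit halves of an SR NTP timestamp to a UTC datetime.

    ntp_msw: whole seconds since 1900; ntp_lsw: fraction in units of 1/2**32 s.
    """
    microseconds = round(ntp_lsw * 1_000_000 / 2**32)
    return NTP_EPOCH + timedelta(seconds=ntp_msw, microseconds=microseconds)


def rtp_to_wallclock(rtp_ts, sr_rtp_ts, sr_ntp_time, clock_rate):
    """Map an RTP timestamp to wall-clock time using the latest SR as reference."""
    return sr_ntp_time + timedelta(seconds=(rtp_ts - sr_rtp_ts) / clock_rate)


# The worked example from the text: SR pairs RTP 139680 with 17:59:34.418842.
sr_ntp_time = datetime(2014, 11, 5, 17, 59, 34, 418842, tzinfo=timezone.utc)
wall = rtp_to_wallclock(139840, 139680, sr_ntp_time, 8000)
print(wall.time())  # 17:59:34.438842
```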

If you are importing the Wireshark-captured data to an Excel spreadsheet, you must be careful about the time representation in the spreadsheet. Both the NTP timestamp (extracted from RTCP reports) and the packet arrival timestamp (recorded by Wireshark for each packet when captured) are represented as HH:MM:SS.ff (i.e., hours+minutes+seconds and fractions of a second). Depending on how you formatted the spreadsheet cells, fractions of a second may be truncated.
This is a big problem because individual packet timestamps differ only on the order of milliseconds. As a result, the computation of transit delays will be severely flawed if the fractions of a second are ignored.
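One way to avoid losing the fractions of a second is to parse the timestamp text yourself and keep it as an integer count of nanoseconds. The sketch below assumes the time is exported as text in the form HH:MM:SS followed by an optional fraction of up to nine digits; the function name `time_of_day_ns` is illustrative.

```python
def time_of_day_ns(text):
    """Parse 'HH:MM:SS.fffffffff' into integer nanoseconds since midnight."""
    hms, _, frac = text.strip().partition(".")
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    # Pad or cut the fraction to exactly nine digits (nanoseconds).
    frac_ns = int((frac + "000000000")[:9])
    return ((hours * 60 + minutes) * 60 + seconds) * 1_000_000_000 + frac_ns


# Two timestamps 20 ms apart keep their full difference.
t0 = time_of_day_ns("17:59:34.418842000")
t1 = time_of_day_ns("17:59:34.438842000")
print((t1 - t0) / 1e6, "ms")
```

Integer arithmetic is used here so that no digits are lost to rounding. Note that this sketch ignores the date, so sessions that cross midnight need extra handling.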

Because RTP/RTCP is an unreliable protocol, some RTP or RTCP packets may be lost in transmission. However, the above procedure will work even in the presence of packet loss.
After you have performed the above conversion, you are ready to calculate the transit delays. For each packet p_i, calculate the delay as the difference between the time this packet arrived at the receiver and the packet’s timestamp (in the RTP header):

delay(p_i) = arrival-time(p_i) – timestamp(p_i)

To apply this formula, we need to convert the RTP timestamp of each packet from randomly-initiated incremental units to wall-clock time (in time units, such as seconds), as shown in the above example. Therefore, we introduce the following notation:

Š_i denotes the RTP timestamp (in increment units) from the most recent RTCP Sender Report, corresponding to the timestamp in RTP packet i.
Ñ_i denotes the NTP timestamp (in UTC time) from the most recent RTCP Sender Report, representing the time when RTP packet i was prepared.
Š_i+n denotes the RTP timestamp (in increment units) from the RTP packet i+n (n = 1, 2, 3, …), where all these packets are assumed to carry the same SSRC identifier (i.e., are generated by the same source).
Given an RTP packet p_i, we assume that the packet p_i+1 is the first RTP packet from the same SSRC that arrived at the receiver after p_i. And so on for the subsequent packets p_i+n, n = 2, 3, … Note that the 16-bit sequence numbers of packets p_i and p_i+1 are not necessarily consecutive and ordered, because packets may become reordered or lost during transport. Similarly, it is not necessarily the case that Š_i+n > Š_i.
τ denotes the sampling rate (in Hertz) for the given source SSRC.
R_i+n denotes the time of arrival for packet i+n. Note that the time of arrival on Wireshark may be measured relative to the start of the session, so you may need to convert the arrival times to absolute time. (Or you may need to convert from the absolute time in your time zone to the UTC time zone.)

The timestamp S_i+n (in seconds) is computed as: S_i+n = Ñ_i + (Š_i+n – Š_i) / τ
The transit delay of the packet i+n is computed as: Д_i+n = R_i+n – S_i+n
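The two formulas above can be applied to a whole stream as in the following sketch. It assumes that arrival times and NTP times have already been converted to seconds on the same UTC time axis, that the Sender Reports are sorted by arrival time, and that packets arriving before the first SR are skipped. The function name and tuple layouts are hypothetical. The signed 32-bit difference is an addition of this sketch so that a wrap-around of the RTP timestamp does not produce a huge jump.

```python
from bisect import bisect_right


def transit_delays(packets, sender_reports, clock_rate):
    """Compute (arrival, delay) in seconds for RTP packets of one SSRC.

    packets:        list of (arrival_s, rtp_timestamp)
    sender_reports: list of (arrival_s, ntp_s, rtp_timestamp), sorted by arrival
    clock_rate:     sampling rate tau in Hertz
    """
    sr_arrivals = [sr[0] for sr in sender_reports]
    result = []
    for arrival, rtp_ts in packets:
        k = bisect_right(sr_arrivals, arrival) - 1  # most recent SR
        if k < 0:
            continue  # no reference available yet
        _, ntp_ref, rtp_ref = sender_reports[k]
        # Signed difference modulo 2**32 (handles timestamp wrap-around).
        diff = (rtp_ts - rtp_ref) & 0xFFFFFFFF
        if diff >= 0x80000000:
            diff -= 0x100000000
        send_time = ntp_ref + diff / clock_rate  # S = N + (S' - S'_ref) / tau
        result.append((arrival, arrival - send_time))  # delay = R - S
    return result


# Illustrative values only: one SR and two audio packets at 8000 Hz.
reports = [(100.01, 100.0, 139680)]
print(transit_delays([(100.05, 139840), (100.07, 140000)], reports, 8000))
```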

Generally, although the timestamp conversion is relatively straightforward, it can be tricky because of many small steps and many opportunities to introduce errors.
Keep in mind that network transit delays are random quantities and any regularity (e.g., a repeated pattern such as a seesaw pattern; or a trend such as a linear increase over time; or a sudden large change in values) must be explained. It is hard to believe that such regularities can be anything other than computation errors. Here are some suggestions to check the validity of your computations:

Packets captured from the whole session may number in the thousands. It is hard to analyze the entire session chart at once and spot any issues. Instead, zoom in only on an interval between three or four RTCP packets.
Given that transit-delay = arrival-time – RTP-timestamp
plot the curves for all three parameters. That is, in addition to transit delays for packets in the zoomed-in interval, plot also the corresponding arrival times and the RTP timestamps (after conversion to NTP time). To make the curves visible, ignore the hours and minutes in each timestamp and show only seconds and fractions of seconds.
Because RTP timestamps are incremented by a fixed number for each packet at the sender, the RTP-timestamp curve must be a straight line, growing linearly for subsequent packets. Any random jumps or any shape other than a straight line indicates a computation error. For example, your spreadsheet time format may have truncated the fractions of seconds. Or, you may be using an incorrect sampling rate τ.
Packet arrival times are random quantities because packets experience random delays while traveling through the network. This curve should jump randomly, but will show an upwards and linear trend for consecutive packets. Any regularly repeated pattern is unusual and must be explained.
The transit-delay curve, which is the difference of arrival time and RTP timestamp, must be random and show no long-term trends.

Once you calculate the transit delays of all received packets, do the following separately for each different SSRC identifier:

Draw a timeline of delays, where the horizontal axis shows time from the start to the end of your conferencing session, and the vertical axis shows the delay magnitude of each packet.
Draw a moving average of packet delays and standard deviations on the session timeline.
In addition, draw the cumulative average that averages all the delay values observed until the given time point.
Also draw on the graph the lines for the global average and deviation over the entire dataset at once.
Draw a histogram of packet delays by trying different bin sizes δ; for example, from δ = 10 milliseconds to δ = 1 second per bin. (Drawing phase plots or recurrence plots is considered a plus!)
Histogram Drawing: Suppose that your conferencing session lasted 10 minutes (600 seconds), the minimum observed value of delay is 0.01 seconds and the maximum value is 1.5 seconds. Therefore, the range of values on the horizontal axis of all your histograms is from 0 to 1.5 seconds.
Next, we create the bins. Take, for example, the width of each bin as δ = 0.1 seconds, which means that you have a total of 15 bins (from 0 to 1.5 seconds).
The vertical axis counts the frequency of how many delay values during the entire session (from 0 to 600 seconds) had the value that falls within each bin range. For example, the height of the first bin shows the frequency of delay values within the range of 0 – 0.1 sec. The height of the second bin shows the frequency of delay values within the range of 0.1 – 0.2 sec. The height of the third bin shows the frequency of delay values within the range of 0.2 – 0.3 sec. And so on.
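The binning described above can be sketched as follows. The sketch assumes non-negative delay values in seconds and bins that start at zero; values are converted to whole microseconds before binning so that a value lying exactly on a bin edge is not misplaced by floating-point rounding. The function name `histogram` and the sample data are illustrative.

```python
def histogram(delays_s, bin_width_s):
    """Count how many delay values fall into each bin [k*delta, (k+1)*delta)."""
    if not delays_s:
        return []
    if min(delays_s) < 0:
        raise ValueError("this sketch assumes non-negative delays")
    width_us = round(bin_width_s * 1e6)
    indices = [round(d * 1e6) // width_us for d in delays_s]
    counts = [0] * (max(indices) + 1)
    for i in indices:
        counts[i] += 1
    return counts


# Made-up delays in seconds; try several bin sizes delta and compare shapes.
delays = [0.01, 0.05, 0.12, 0.3, 1.49]
for delta in (0.01, 0.1, 1.0):
    print(delta, histogram(delays, delta))
```

Note that with half-open bins a value equal to the maximum edge (for example, exactly 1.5 seconds with δ = 0.1) falls into one additional bin.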

Indicate the measurement units for all charts, either on the axes or in the figure caption.
Discuss whether and how fast the moving average converges to the global average.
Ideally, you should show separate histograms for all experimental scenarios (depending on traffic intensity and data type rtp.p_type — video/audio). Your histogram should be shown so that delay values are along the horizontal axis and the frequency of measurement is along the vertical axis. The scale of the horizontal axis should be from the smallest delay value to the greatest delay value. Indicate the time units for the horizontal axis. Experiment with different bin sizes δ. The scale of the vertical axis should be from zero to the greatest frequency value. (Here “frequency” means how many times you observed a certain delay value over the entire conferencing session.)
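The moving and cumulative statistics requested above can be computed as in this sketch. The window length is an assumption (the assignment does not prescribe one), and the window here counts packets rather than seconds. The population standard deviation is used. Function names are illustrative.

```python
from statistics import mean, pstdev


def moving_stats(values, window):
    """Mean and standard deviation over the last `window` values at each point."""
    stats = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        stats.append((mean(chunk), pstdev(chunk)))
    return stats


def cumulative_average(values):
    """Average of all values observed up to and including each point."""
    averages, total = [], 0.0
    for n, value in enumerate(values, start=1):
        total += value
        averages.append(total / n)
    return averages


# Made-up delays in seconds.
delays = [0.020, 0.022, 0.019, 0.045, 0.021, 0.020]
print(moving_stats(delays, window=3))
print(cumulative_average(delays))
print("global:", mean(delays), pstdev(delays))
```

Plotting the moving average against the global average on the same timeline shows how quickly (or whether) the two converge.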

Traceroute from one participant to the other should give you an idea about which links seem to have the highest impact on packet delays.

2.1.2 Inter-arrival Time between Subsequent Packets

Inter-arrival time is the time difference between the arrival times of two consecutive packets (no other packets arrived in between). Given that S_i is the timestamp from the packet i and R_i is the time of arrival for packet i, the difference is computed as:

D(i–1, i) = (R_i – R_i–1) – (S_i – S_i–1) = (R_i – S_i) – (R_i–1 – S_i–1)

Given an RTP packet p_i, we assume that the packet p_i–1 is the last RTP packet from the same SSRC that arrived at the receiver before p_i. Note that the 16-bit sequence numbers of packets p_i–1 and p_i are not necessarily consecutive and ordered, because packets may become reordered or lost during transport. (Observe that this notation is similar to the notation used above for transit delay computation.)
Another interesting observation is that the difference D(i–1, i) as defined above may be negative. It is always true that R_i > R_i–1, because the packets are indexed in their order of arrival. However, D(i–1, i) < 0 whenever R_i – R_i–1 < S_i – S_i–1, that is, whenever packet i experienced a shorter transit delay than packet i–1. (If packets are reordered so that S_i < S_i–1, then D(i–1, i) is positive.) For this reason, in delay jitter calculation in the next section we will use the absolute value of D(i–1, i). You may also use the absolute value when plotting the inter-arrival time graphs; however, it may be interesting to consider the negative values as they are.
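The difference D(i–1, i) can be computed for a whole stream as in the sketch below. It assumes a list of (R, S) pairs in order of arrival, both in seconds, with S already converted to wall-clock time. The function name is illustrative.

```python
def transit_differences(packets):
    """Return D(i-1, i) = (R_i - S_i) - (R_{i-1} - S_{i-1}) for consecutive arrivals.

    packets: list of (arrival_s, send_s) in order of arrival.
    """
    return [
        (r1 - s1) - (r0 - s0)
        for (r0, s0), (r1, s1) in zip(packets, packets[1:])
    ]


# Made-up example: the third packet had a shorter transit delay than the
# second one, so the last difference is negative.
packets = [(0.030, 0.000), (0.052, 0.020), (0.071, 0.040)]
print(transit_differences(packets))
```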

Once you calculate the inter-arrival time for consecutively-arrived packet pairs, do the following separately for each different SSRC identifier:

Draw a timeline of inter-arrival times, where the horizontal axis shows time from the start to the end of your conferencing session, and the vertical axis shows the inter-arrival value.
Draw a histogram of inter-arrival times by trying different bin sizes δ. (Drawing phase plots or recurrence plots is considered a plus!)

Indicate the measurement units for all charts, either on the axes or in the figure caption.
Discuss your observations.

2.1.3 Delay Jitter

Formally, jitter is defined as a statistical variance of the RTP data packet inter-arrival time (also see this). In RTP, jitter is measured in timestamp units. For example, if you transmit audio sampled at the usual 8000 Hertz, the unit is 1/8000 of a second.
In RTP, the receiving endpoint computes an estimate using a simplified formula (a first-order estimator), as described in Appendix A.8 of RFC-3550:

J(i) = J(i–1) + ( |D(i–1, i)| – J(i–1) ) / 16

where the value D(i–1, i) is the difference of relative transit times for the two packets.
This page about jitter may help clarify any confusion (particularly see the table that shows an example calculation).
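The estimator above can be written as a short running update. The sketch below works with floating-point values in seconds and starts from J = 0, which is an assumption of this example. To compare the result with the jitter field in RTCP reports, which is expressed in timestamp units, multiply by the sampling rate. Function names are illustrative.

```python
def jitter_estimates(differences, initial=0.0):
    """Apply J(i) = J(i-1) + (|D(i-1, i)| - J(i-1)) / 16 to a list of D values."""
    estimates = []
    j = initial
    for d in differences:
        j += (abs(d) - j) / 16
        estimates.append(j)
    return estimates


def to_timestamp_units(jitter_s, clock_rate):
    """Convert a jitter value in seconds to RTP timestamp units."""
    return jitter_s * clock_rate


# Made-up D values in seconds, for audio sampled at 8000 Hz.
d_values = [0.002, -0.001, 0.004, 0.000]
for j in jitter_estimates(d_values):
    print(f"{j:.6f} s = {to_timestamp_units(j, 8000):.2f} units")
```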

Note that your conferencing application calculates the jitter estimate and sends it to the data sender using RTCP report packets. These report packets are not sent for every estimate but rather the reporting frequency is calculated according to certain rules specified in RFC-3550, which you studied in WS Project #3.
Extract these values from the captured RTCP report packets and use them when drawing the diagrams below.

Once you calculate the jitter over the session timeline, do the following separately for each different SSRC identifier:

Draw a timeline of jitter estimates J(i), where the horizontal axis shows time of each packet arrival i, and the vertical axis shows the jitter estimates J(i).
On the same diagram also show the jitter estimates that you extracted from the captured RTCP report packets.
Draw a histogram of jitter estimates J(i) by trying different bin sizes δ. (Drawing phase plots or recurrence plots is considered a plus!)

Indicate the measurement units for all charts, either on the axes or in the figure caption.
Discuss your observations, particularly how well your jitter estimates agree with those extracted from the captured RTCP report packets.

2.2 Calculating the Fraction of Lost Packets

Analyze the timestamps and sequence numbers of the captured packets. By inspecting the gaps in their sequence numbers, identify the lost packets over the entire session.
To calculate the fraction of lost packets, we need to decide the length of the time interval δ over which the loss is calculated. You should try different interval lengths, say from δ = 500 milliseconds to δ = 1 minute.
Suppose again that your conferencing session lasted 10 minutes (600 seconds) and that you choose δ = 10 seconds, which means that you have 60 such intervals. For each interval, count the number of packets for which the arrival time belongs to this interval. Then examine their sequence numbers and find if there are any gaps (i.e., lost packets). Calculate the fraction of lost packets by dividing the number of missing packets by the total number (arrived and missing) during the given interval.
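One possible way to carry out this per-interval count is sketched below. It makes several assumptions that are not part of the assignment: arrival times are measured in seconds from the start of the session, 16-bit sequence numbers are "unwrapped" relative to the previously arrived packet, the number of packets expected in an interval is the growth of the highest sequence number seen so far, and a late (reordered) packet is simply counted in the interval where it arrives. Function names are illustrative.

```python
def extend(seqs):
    """Unwrap 16-bit sequence numbers (in arrival order) past the 65535 -> 0 wrap."""
    out = []
    for s in seqs:
        if not out:
            out.append(s)
            continue
        delta = (s - out[-1]) & 0xFFFF
        if delta >= 0x8000:      # treat as a step backwards (reordering)
            delta -= 0x10000
        out.append(out[-1] + delta)
    return out


def loss_per_interval(packets, interval_s):
    """packets: list of (arrival_s, seq16) in arrival order.

    Returns one (received, missing, fraction_lost) tuple per interval.
    """
    if not packets:
        return []
    ext = extend([seq for _, seq in packets])
    n_bins = int(packets[-1][0] // interval_s) + 1
    received = [set() for _ in range(n_bins)]   # sets ignore duplicates
    for (arrival, _), e in zip(packets, ext):
        received[int(arrival // interval_s)].add(e)
    results = []
    highest = ext[0] - 1
    for seen in received:
        previous = highest
        if seen:
            highest = max(highest, max(seen))
        got = len(seen)
        missing = max((highest - previous) - got, 0)  # clamp late arrivals
        total = got + missing
        results.append((got, missing, missing / total if total else 0.0))
    return results


# Made-up capture: sequence number 3 never arrives.
capture = [(0.0, 1), (1.0, 2), (2.0, 4), (11.0, 5), (12.0, 6)]
print(loss_per_interval(capture, 10))
```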

Compare your results with the statistics that you can find in RTCP Receiver Reports (RR), which are issued periodically. Each Receiver Report carries both fraction lost since the last RR as well as cumulative number of lost packets (see Section 6.4.2 of RFC-3550). (Recall that Receiver Reports may be combined into the Sender Report packets, instead of being transmitted separately.) Therefore, packet loss rate over the interval between RTCP Receiver Report packets can be directly extracted or derived from the cumulative statistics. The difference in the cumulative number of packets lost gives the number lost during that interval, and the difference in the extended last sequence numbers gives the number of packets expected during the interval. The ratio of these values is the fraction of packets lost. This number should be equal to the fraction lost field in the RR packet if the calculation is done with consecutive RR packets, but the ratio also gives an estimate of the loss fraction if one or more RR packets have been lost, and it can show negative loss when there are duplicate packets. The advantage of the fraction lost field is that it provides loss information from a single RR packet. This is useful in very large sessions, in which the reporting interval is long enough that two RR packets may not have been received.
Discuss any discrepancies between your calculations and the loss statistics found in RR packets.
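The derivation of the loss fraction from two Receiver Reports, as described above, can be sketched as follows. It assumes that the cumulative number of packets lost and the extended highest sequence number received have already been extracted from each report block as integers. The fraction lost field itself is an 8-bit fixed-point number, so its value divided by 256 gives the fraction. Function names and the sample values are illustrative.

```python
def loss_between_reports(prev_rr, curr_rr):
    """Each report is (cumulative_lost, extended_highest_seq).

    Returns (lost, expected, fraction) for the interval between the reports.
    The result can be negative when duplicate packets were received.
    """
    lost = curr_rr[0] - prev_rr[0]
    expected = curr_rr[1] - prev_rr[1]
    fraction = lost / expected if expected > 0 else 0.0
    return lost, expected, fraction


def fraction_lost_field(value):
    """Convert the 8-bit fraction lost field (0..255) to a fraction."""
    return value / 256


# Made-up values from two consecutive report blocks for the same SSRC.
print(loss_between_reports((10, 1000), (15, 1100)))
print(fraction_lost_field(13))
```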

By analyzing trends in the reported statistics that are received by the sender (in RTCP Receiver Reports), determine whether any observed loss is a transient or a long-term effect.
Loss rates can be used to influence the choice of media format and error protection coding used. Check whether during your session you observed any such changes.

Draw a histogram of fractions of lost packets for each synchronization source (SSRC) reported in RTCP Receiver Report packets. (Draw a different histogram for each source.) Each bar of the histogram should show how many times a given fraction of lost packets was observed over intervals of length δ. Suppose, for example, that you observed that the minimum fraction of lost packets was zero and the maximum fraction was 0.1 (i.e., a maximum of 10 % of packets was lost in some intervals). Then show 10 bins on the horizontal axis of your histogram. The height of the first bin should show how frequently a loss rate between 0 and 0.01 was observed over all intervals δ in the entire session. The height of the second bin should show how frequently a loss rate between 0.01 and 0.02 was observed over the entire session. And so on.

For each experimental session and different time unit δ, do the following separately for each different SSRC identifier:

Draw a timeline where the horizontal axis shows time in δ units, and the vertical axis shows:
- The number of successfully received packets during each observation interval δ.
- The number of successfully received bytes during each interval δ
  (this is distinct from the packet view because packets may have varying sizes).
- The number of lost packets during each interval δ.
Because the magnitudes of the different parameters will be very different, the chart may not be legible if the linear scale is used. Try using the logarithmic scale on the vertical axis.
Draw the histogram showing how frequently different numbers of successfully received packets were observed during each interval δ.
Draw the histogram showing how frequently different numbers of received bytes were observed during each interval δ.
Draw the histogram showing how frequently different fractions of lost packets were observed during each interval δ.
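The per-interval packet and byte counts, and the frequency of each observed count, can be prepared as in this sketch. It assumes arrival times in seconds from the start of the session and packet sizes in bytes; the function names and sample data are illustrative.

```python
from collections import Counter


def per_interval_totals(packets, interval_s):
    """packets: list of (arrival_s, size_bytes).

    Returns (packet_counts, byte_counts), one entry per interval of length delta.
    """
    if not packets:
        return [], []
    n_bins = int(max(arrival for arrival, _ in packets) // interval_s) + 1
    packet_counts = [0] * n_bins
    byte_counts = [0] * n_bins
    for arrival, size in packets:
        b = int(arrival // interval_s)
        packet_counts[b] += 1
        byte_counts[b] += size
    return packet_counts, byte_counts


def frequency_of(values):
    """How many intervals showed each distinct value (input for a histogram)."""
    return dict(sorted(Counter(values).items()))


# Made-up capture with delta = 10 seconds.
capture = [(0.5, 100), (1.5, 200), (10.2, 50), (25.0, 120)]
pkts, byts = per_interval_totals(capture, 10)
print(pkts, byts)
print(frequency_of(pkts))
```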

Another interesting chart may show the histogram of packet sizes observed during each experimental session.
Also report the total percentage of lost packets over the entire session.

3. Report Preparation and Submission

Note: This report assumes that you used the data collected in WS Project #3 and that the characteristics of those data are already described in the report for Project #3. If instead this project was based on a new set of data, your report must first describe the statistics of the new dataset as was done in the report for WS Project #3.

As a minimum, include the following information in your report:

Describe the detailed procedure with exact formulas used for calculating delay, inter-arrival times, and delay jitter.
If you used Matlab or some other language, include your source code.
For each synchronization source (SSRC), show all graphs drawn for delay, inter-arrival times, and delay jitter.
Discuss your observations.
For each synchronization source (SSRC), show the histograms of received versus lost packets. Also report the cumulative numbers of lost packets for each SSRC identifier.
...

For all charts provide the measurement units for both horizontal and vertical axes. The units should be shown either in the chart itself or in its caption.
Discuss all your charts, particularly whether you observed any differences for different participants/computers and different observation periods (low-intensity network traffic versus the periods when the network is busy).

The report format is the same as for project 1.

The submission deadline is given on the course syllabus page. (Only PDF format will be accepted!)


Last Modified: Sun Nov 16 11:40:31 EDT 2014
Maintained by: Ivan Marsic