Note: Examples of student project reports will be made available to course instructors upon request (send email).
The goal of this project is to capture and analyze RTP and RTCP packets during a real-time conference session over a wired and wireless network.
Where to find information about RTP/RTCP:
This YouTube video explains how to decode RTP packets. It is not in English, but you can see how the author decodes the packets.
The first step is to install video conferencing software on the computers
of your project team. The most important requirements are that the
software is based on RTP/RTCP and does not encrypt the packet payload.
Google Hangouts, Skype, and Apple Facetime
are the most popular choices, but
Hangouts encrypts information
and Skype does as well, as described in the
TLS and sRTP for Skype Connect Technical Datasheet.
It appears that it is not possible to selectively disable the encryption feature.
Facetime appears not to be using payload encryption as of the time of
this writing (October 2014), but it runs only on Apple computers.
An alternative is to search for open-source video conferencing software that does not encrypt the payload. This Wikipedia page lists the features of various web conferencing software.
It appears that Linphone meets our requirements well, so you may try using Linphone.
You may still run Hangouts or Skype for comparison, because they provide mature and widely used platforms, although you will not be able to accomplish all requirements of this project by using only Hangouts or Skype.
Establish a conferencing session over a wireless/Wi-Fi LAN
(activate both the audio and video options).
Each participant should be in a different geographic location, or at least should try to connect to a different wireless LAN while conferencing. Conference for about 5 – 10 min; longer durations will ensure more meaningful statistics, which is particularly important because these data will be used in WS Project 4.
At the same time all participants should use Wireshark to capture all the IP packets sent from their host and received from other host(s).
For example, knowing that the IP address of your host is 192.168.2.11, you could use these Wireshark filters:
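As an illustration, filters of this kind can be assembled from the standard Wireshark display-filter fields `ip.src`, `ip.dst`, and `udp`. The helper name `host_filters` below is hypothetical and not part of Wireshark; it only builds filter strings that can be pasted into the Wireshark filter bar.

```python
def host_filters(host_ip):
    """Return Wireshark display filters for UDP traffic sent and received by host_ip."""
    return {
        # UDP packets that originate at this host
        "outbound": f"ip.src == {host_ip} && udp",
        # UDP packets that are addressed to this host
        "inbound": f"ip.dst == {host_ip} && udp",
    }


filters = host_filters("192.168.2.11")
print(filters["outbound"])  # ip.src == 192.168.2.11 && udp
print(filters["inbound"])   # ip.dst == 192.168.2.11 && udp
```

Keeping the outbound and inbound filters separate makes it easier to export the two directions into separate capture files for later analysis.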
Because of the complexity of this project, you may wish to first run some preliminary experiments to establish the methods for performing the actual experiments:
Describe these experiments in your report before you describe your actual experiments. Call this section “Methods” and describe the actual experiments in the “Experiments” section. It makes sense to describe the methods first, not last.
Perform your experiment at least two times, once during a period of low network traffic, and once during a busy period, when you expect that many people will be using the same wireless network. (In your report describe the locations of all participants, which Wi-Fi networks they used, and during which times of day.) Note that we are assuming that the wireless link is the “bottleneck.” However, this assumption may need to be tested if you are using Linphone, for which the server is located overseas (in France).
After capturing the packets, use Wireshark filters to partition the traffic to/from
your computer so they can be analyzed separately.
The data preprocessing consists of the following steps:
The first step is to separate the RTP data packets from RTCP control packets.
If you are experimenting with Google Hangouts, note that their documentation states:
“The UDP traffic consists of STUN, RTP, and RTCP packets,
with SRTP encrypted data payloads.”
Because the payloads are encrypted using SRTP,
you will not be able to perform the analysis required for this project.
Note that the conference participants will not be directly connected to each other, but indirectly via the conference servers of your conferencing software (as you may have already discovered using traceroute).
It is not possible to recognize RTCP packets based only on their header.
You must infer RTCP from the UDP port: the UDP port(s) that carry the
majority of packets are the RTP data sessions.
To separate the RTP data packets from RTCP control packets, use
the fact that they are usually transmitted on different ports.
RFC-3550 in Section 11: “RTP over Network and Transport Protocols” describes some guidelines on demultiplexing of RTP data and RTCP control streams. It says that:
“For UDP and similar protocols, RTP SHOULD use an even destination port number and the corresponding RTCP stream SHOULD use the next higher (odd) destination port number. ... For applications in which the RTP and RTCP destination port numbers are specified via explicit, separate parameters (using a signaling protocol or other means), the application MAY disregard the restrictions that the port numbers be even/odd and consecutive although the use of an even/odd port pair is still encouraged.”
However, a particular application may implement something different from the recommended port assignment.
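The even/odd convention can be checked against a capture with a short script. The sketch below is only an illustration: the function name `pair_ports` and the port numbers in the example are made up, and an application that ignores the convention will simply produce no pairs.

```python
from collections import Counter


def pair_ports(ports):
    """Count packets per UDP port and pair each even port with the next odd port.

    ports: iterable of UDP port numbers, one entry per captured packet.
    Returns a list of (rtp_port, rtp_count, rtcp_port, rtcp_count) tuples.
    """
    counts = Counter(ports)
    pairs = []
    for port in sorted(counts):
        # RFC 3550 recommends RTP on an even port and RTCP on the next odd port.
        if port % 2 == 0 and (port + 1) in counts:
            pairs.append((port, counts[port], port + 1, counts[port + 1]))
    return pairs


# Illustrative input: five packets on port 5004, two on 5005, one on 6000.
observed = [5004] * 5 + [5005] * 2 + [6000]
print(pair_ports(observed))  # [(5004, 5, 5005, 2)]
```

If the application follows the recommendation, the even port of each pair should also be the one with the much larger packet count.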
First list the number of packets that one participant’s computer received
on different UDP ports (by using the Wireshark display filter udp.srcport).
For example, you may observe the distribution of packets on different UDP ports as shown in Table I.
|Source Port Number|Number of Packets|Packet Type|SSRC|Total Number of Lost Packets|Mean Time Between|
|…|…|103 (audio data); 99 (video data)|…|…|…|
|16385|47|200 (RTCP SR)|1562797448 (reporter); 3459438840 (first source); 489794638 (second source)|…|…|
| |26|201 (RTCP RR)|1172311862 (reporter); 3459438840 (first source); 489794638 (second source)|…|…|
| |7|202 (RTCP SDES)|…|…|…|
Note that if you are capturing only RTCP Sender Report (SR) packets but no Receiver Report (RR) packets, this may be because receiver reports are piggybacked on SR packets (see the SR packet format in Section 6.4.1 of RFC-3550). This is usually the case when the reporter host is both a source and a receiver at the same time. In other words, this participant is both sending audio/video to other participants and receiving their audio/video.
The remaining columns of Table I are described in the following subsections.
Also show the distribution of packets from the first participant to the second participant (Table II).
|Destination Port Number|Number of Packets|Packet Type|SSRC|Total Number of Lost Packets|Mean Time Between|
|…|…|103 (audio data); 99 (video data)|…|…|…|
|19303|31|200 (RTCP SR)|3459438840 (reporter); 1562797448 (first source); 1172311862 (second source)|…|…|
You should show at least four tables in your report. For example, if
your conferencing session had two participants, show separate tables
for incoming and outgoing packets for each participant.
For each table, indicate the host on which the packets were captured and the direction (Inbound versus Outbound).
Note that the number and properties of RTP packets sent by one participant may not exactly correspond to those of packets received by the other participant. First, some packets may be lost in transmission (recall that UDP is an unreliable protocol). Second, the Google server over which the session is run may transcode audio or video from one compression format to another. As a result, more (smaller) packets or fewer (larger) packets may be received by the receiver than what the sender sent. Also, some of the packets’ parameters (such as SSRC identifiers) may be changed.
Note also that in addition to RTP data packets (audio or video), you may capture “RTP event packets” (shown in Wireshark as “RTP EVE”). These events support telephony-related signaling during the session, such as initiation of ringing tones. Check RFC-4733 for RTP payload format for named telephone events. List them in your tables, but do not mix them up with RTP data packets.
The second step is to determine the encoding schemes used to create the
packet’s payload for audio/video RTP streams.
For example, G.711 is an audio codec.
RTP streams can only carry media from a single source. According to RFC-3550 in Section 2.2, if both audio and video media are used in a conference, they should be transmitted as separate RTP sessions. That is, separate RTP and RTCP packets should be transmitted for each medium using a different UDP-port pair for each.
Again, a particular application may instead implement multiple media streams over the same UDP-port pair.
Using Wireshark packet inspection, determine the codecs used in each RTP stream
and write them down in your report. The RTP header has a
7-bit field named “payload type”,
which indicates the specific encoding scheme used to create the
packet’s payload. For example, in Google Talk types 99 and 126 represent video,
and types 103 and 105 represent audio. See more details in
Google Talk Call Signaling.
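To make the position of the payload type field concrete, the following sketch decodes the 12-byte fixed RTP header defined in Section 5.1 of RFC-3550. The function name `parse_rtp_header` is hypothetical, and the sample header is built by hand from values used elsewhere on this page rather than taken from a real capture.

```python
import struct


def parse_rtp_header(data):
    """Parse the 12-byte fixed RTP header (RFC 3550, Section 5.1)."""
    if len(data) < 12:
        raise ValueError("RTP fixed header needs at least 12 bytes")
    # Network byte order: two single bytes, 16-bit sequence number,
    # 32-bit timestamp, 32-bit SSRC.
    b0, b1, seq, timestamp, ssrc = struct.unpack("!BBHII", data[:12])
    return {
        "version": b0 >> 6,          # top two bits of the first byte
        "marker": b1 >> 7,           # top bit of the second byte
        "payload_type": b1 & 0x7F,   # low seven bits of the second byte
        "sequence": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
    }


# Hand-built header: version 2, payload type 103, sequence 1, timestamp 160.
header = struct.pack("!BBHII", 0x80, 103, 1, 160, 3459438840)
print(parse_rtp_header(header)["payload_type"])  # 103
```

Wireshark shows the same fields when a UDP stream is decoded as RTP, so the script is mainly useful for checking your understanding of the header layout.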
Google Hangouts uses the dynamic payload type range from 96 to 127 and does not specify which payload type is used for audio or video. One approach is to guess the RTP payload type (audio/video) based on the packet length.
A more accurate approach for finding the payload type is to turn the video option ON or OFF and see which RTP packets go missing or reappear. Similarly, you can “mute” audio and check which RTP packets are missing.
Perhaps the best way is to apply both approaches and see whether large-sized packets disappear when you turn off the video.
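The packet-length heuristic can be sketched as follows. This is only an illustration under the assumption that video packets are on average much larger than audio packets; the function name `mean_length_by_payload_type` and the sample lengths are made up.

```python
from collections import defaultdict


def mean_length_by_payload_type(packets):
    """Average packet length per RTP payload type.

    packets: iterable of (payload_type, length_in_bytes) tuples.
    """
    totals = defaultdict(lambda: [0, 0])  # payload type -> [byte sum, packet count]
    for payload_type, length in packets:
        totals[payload_type][0] += length
        totals[payload_type][1] += 1
    return {pt: total / count for pt, (total, count) in totals.items()}


# Made-up sample: small packets on type 103, large packets on type 99.
sample = [(103, 100), (103, 120), (99, 1000), (99, 1200)]
print(mean_length_by_payload_type(sample))  # {103: 110.0, 99: 1100.0}
```

Running the same summary on captures taken with the video turned off should make the large-packet payload type disappear from the result.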
For each synchronization source or SSRC i (described in the next section)
determine the sampling rate $\tau_{\mathrm{SSRC}_i}$.
You can read the sampling rate of the used audio and video codecs directly from the SIP invite message.
Alternatively, the sampling rate can also be found experimentally by using the following calculation on a pcap of sent packets (i.e., departing packets captured at their source, on the sending side). For each pair i, i+1 of consecutive packets, given their RTP timestamps $t^{\mathrm{RTP}}_i$ (from the RTP header) and their Wireshark departure timestamps $t^{\mathrm{ws}}_i$, calculate the instantaneous sampling rate $\tau_{i+1}$ as:
$$\tau_{i+1} = \frac{t^{\mathrm{RTP}}_{i+1} - t^{\mathrm{RTP}}_{i}}{t^{\mathrm{ws}}_{i+1} - t^{\mathrm{ws}}_{i}}$$
The approximate sampling rate $\tau$ is calculated by averaging the instantaneous values $\tau_i$ over a long interval, say 5 minutes. For example, you may calculate 48.12 kHz and 90.22 kHz instead of the actual values 48 kHz and 90 kHz, because of varying delays in the packet transmission process.
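The averaging procedure above can be sketched in a few lines. The function name `estimate_sampling_rate` is hypothetical, and the sketch assumes that the packets of one SSRC are listed in send order with non-decreasing RTP timestamps. Pairs that share the same RTP timestamp (for example, several packets of one video frame) are skipped, which is a simplification and not part of the procedure described above.

```python
def estimate_sampling_rate(samples):
    """Average the instantaneous sampling rates of one RTP stream.

    samples: list of (rtp_timestamp, wireshark_time_in_seconds) tuples,
             in send order, all from the same SSRC.
    Returns the estimated sampling rate in Hz.
    """
    rates = []
    for (rtp_a, ws_a), (rtp_b, ws_b) in zip(samples, samples[1:]):
        dt = ws_b - ws_a
        # RTP timestamps are 32-bit counters, so take the difference modulo 2**32.
        d_rtp = (rtp_b - rtp_a) % 2**32
        if dt > 0 and d_rtp > 0:  # skip pairs with identical timestamps
            rates.append(d_rtp / dt)
    if not rates:
        raise ValueError("need at least two packets with distinct timestamps")
    return sum(rates) / len(rates)


# Made-up audio stream: the timestamp advances by 24000 ticks every 0.5 s.
stream = [(0, 0.0), (24000, 0.5), (48000, 1.0)]
print(estimate_sampling_rate(stream))  # 48000.0
```

Because the capture timestamps include scheduling and transmission delays, the estimate on real data will be close to, but rarely exactly, the nominal codec rate.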
Note that the timestamps found in the RTP header start from a random initial value and must be converted to actual time as described in WS Project #4.
The third step is to determine the synchronization source (SSRC)
identifiers for RTP packets. A synchronization source (SSRC)
is the source of a stream of RTP data packets, such as a microphone or a camera.
Each source must be identified as a different SSRC.
All packets from a synchronization source belong to the same
timing and sequence number space, so a receiver groups packets by their
synchronization source for playback.
A synchronization source may change its data format, e.g., audio encoding, over time. (See more details in Section 3 of RFC-3550, on page 9.)
Note that when showing the SSRC identifiers in your report (such as in Table I and Table II), the SSRC identifiers must be aligned with the packet types, so that it is clear which source generated which type of packets.
All receivers of RTP data packets issue reports about reception quality by
sending RTCP report packets to senders of RTP data packets.
If a receiver is also a source of RTP data packets, then it generates sender reports (SR).
According to RFC-3550 (Section 6.4), the only difference between a sender report (SR) and a receiver report (RR), other than the packet type code (“200” versus “201”), is that the sender report includes a 20-byte sender information section for use by active senders. The rest of a sender report is exactly the same as a receiver report.
Note that if a source is only generating RTP data packets and sending them to receivers, but does not receive any RTP data packets from other sources, then sender report packets from this source do not contain receiver report blocks.
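The relationship between the reporter SSRC and the reported sources can be seen by decoding the report packets directly. The sketch below follows the SR and RR layouts in Sections 6.4.1 and 6.4.2 of RFC-3550; the function name `parse_report` is hypothetical, it handles a single SR or RR packet rather than a compound RTCP packet, and the sample packet is built by hand from values used in the tables above.

```python
import struct

SR, RR = 200, 201  # RTCP packet type codes


def parse_report(data):
    """Parse one RTCP SR or RR packet into reporter SSRC and per-source loss."""
    b0, packet_type, _length, reporter = struct.unpack("!BBHI", data[:8])
    if packet_type not in (SR, RR):
        raise ValueError("not an SR or RR packet")
    count = b0 & 0x1F  # reception report count (low five bits)
    # An SR carries 20 bytes of sender information before the report blocks.
    offset = 8 + (20 if packet_type == SR else 0)
    sources = []
    for _ in range(count):
        ssrc, lost_word = struct.unpack("!II", data[offset:offset + 8])
        cumulative = lost_word & 0xFFFFFF
        if cumulative >= 0x800000:  # 24-bit signed value
            cumulative -= 0x1000000
        sources.append({
            "ssrc": ssrc,
            "fraction_lost": lost_word >> 24,  # in units of 1/256
            "cumulative_lost": cumulative,
        })
        offset += 24  # each report block is 24 bytes long
    return {"packet_type": packet_type, "reporter": reporter, "sources": sources}


# Hand-built RR with one report block: 3 packets lost in total.
block = struct.pack("!II", 3459438840, (5 << 24) | 3) + bytes(16)
packet = struct.pack("!BBHI", 0x81, RR, 7, 1172311862) + block
report = parse_report(packet)
print(report["reporter"], report["sources"][0]["ssrc"])  # 1172311862 3459438840
```

The first SSRC in the packet identifies the reporter, and each report block names one source that the reporter has been receiving from.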
For explanation of the meaning of different types of SSRC identifiers in RTCP
packets (“reporter,” “first source,”
“second source,” …, “n-th source”), check the textbook (Section 3.4) and RFC-3550.
Here is a brief explanation:
There are two main aspects of identifying SSRC identifiers in RTCP packets:
When displaying the statistics of RTCP packets in tables, ensure
that entries in all rows are properly aligned so that the
correspondences between port numbers, payload types, and SSRC’s are clear.
Also, it must be clear which “reporter” SSRC identifier appears with which “i-th source” SSRC identifier.
Compare the empirically observed fractions of packets of different types with the expected fractions calculated by the theoretical algorithm for computing the RTCP reporting interval (see Section 6.3.1 and Appendix A.7 of RFC-3550, as well as the end of Section 3.4.2 in the textbook). The empirical fractions can be obtained from the statistics in Table I and Table II. If you cannot calculate the exact fractions, find the best approximation.
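As a starting point for the theoretical side of this comparison, the sketch below ports the deterministic part of the `rtcp_interval` routine from Appendix A.7 of RFC-3550 to Python. The function name and the example numbers are illustrative. The randomization step of the RFC algorithm (scaling by a random factor between 0.5 and 1.5 and dividing by e − 1.5) is deliberately left out, so the result is the deterministic interval, not the interval an implementation would actually wait.

```python
def rtcp_deterministic_interval(members, senders, rtcp_bw, we_sent,
                                avg_rtcp_size, initial=False):
    """Deterministic RTCP reporting interval in seconds (RFC 3550, Appendix A.7).

    rtcp_bw: total RTCP bandwidth of the session, in octets per second.
    avg_rtcp_size: average size of an RTCP packet, in octets.
    """
    rtcp_min_time = 2.5 if initial else 5.0  # minimum is halved for the first report
    sender_fraction = 0.25                   # share of RTCP bandwidth for senders
    n = members
    if senders <= members * sender_fraction:
        if we_sent:
            rtcp_bw *= sender_fraction
            n = senders
        else:
            rtcp_bw *= 1 - sender_fraction
            n -= senders
    interval = avg_rtcp_size * n / rtcp_bw
    return max(interval, rtcp_min_time)


# Illustrative session: 100 members, 10 senders, 1000 octets/s of RTCP bandwidth.
print(rtcp_deterministic_interval(100, 10, 1000.0, True, 200))   # 8.0
print(rtcp_deterministic_interval(100, 10, 1000.0, False, 200))  # 24.0
```

For a small conference with two or three participants the computed value usually falls below the minimum, so the 5-second floor dominates and the expected number of reports is roughly the session duration divided by the interval.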
NOTE: Keep the Wireshark data collected in this project because they will be used in WS Project 4.
As a minimum, include the following information in your report:
The items listed above form just a minimum requirement for the report and can be satisfied to a different degree. Only the students who have performed the greatest number of experiments and provided the most extensive analysis and discussion of their results shall receive the maximum score (100%).
The report format is the same as for project 1.
The submission deadline is given on the course syllabus page. (Only PDF format will be accepted!)
Last Modified: Wed Oct 8 23:49:30 EDT 2014 Maintained by: Ivan Marsic