Ask Professor Phil: Does RTP Matter for Network Video?

Dear Professor Phil,
We use IPTV in our school to deliver video obtained from a feed provided by our cable company. A technician from that company and I began discussing the format of the transmitted information; he told me that they don’t use RTP (Real-time Transport Protocol) in the core of their network like we do in our school. He said it just adds unnecessary overhead. I thought RTP carried the timing information. Is he right? Is it possible to eliminate the use of RTP?
—Ben, Chicago, IL

Ben,
Yes, it is possible. I assume you are talking about the MPEG transport stream structure, which is common in nearly all IPTV deployments. Here’s a quick review of that mechanism.

In the simplest case, three inputs feed the transmitted stream: audio, video, and stream metadata, as Figure 1 shows. Each input is segmented into 184-byte blocks, and a four-byte header is attached to each block to form a 188-byte transport packet. In the case of audio and video, this header contains

• a sync byte to indicate the beginning of the header
• a PID (packet identifier)
• a continuity counter
• an indication of whether an optional adaptation field follows
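
To make the header fields above concrete, here is a minimal Python sketch that pulls them out of one 188-byte transport packet. The bit positions follow my reading of the MPEG-2 Systems layout; the names `TsHeader` and `parse_ts_header` are hypothetical and are not taken from any particular library.

```python
from typing import NamedTuple

TS_PACKET_SIZE = 188  # 4-byte header plus 184 bytes of adaptation field and/or payload
SYNC_BYTE = 0x47      # value that marks the start of every transport packet


class TsHeader(NamedTuple):
    """Illustrative container for the header fields (hypothetical name)."""
    pid: int
    payload_unit_start: bool
    has_adaptation_field: bool
    has_payload: bool
    continuity_counter: int


def parse_ts_header(packet: bytes) -> TsHeader:
    """Decode the four-byte header of a single transport packet."""
    if len(packet) != TS_PACKET_SIZE:
        raise ValueError("a transport packet must be exactly 188 bytes")
    if packet[0] != SYNC_BYTE:
        raise ValueError("missing sync byte 0x47")

    # The 13-bit PID spans the low 5 bits of byte 1 and all of byte 2.
    pid = ((packet[1] & 0x1F) << 8) | packet[2]

    # Byte 3: 2 bits scrambling, 2 bits adaptation field control, 4 bits counter.
    adaptation_control = (packet[3] >> 4) & 0x03

    return TsHeader(
        pid=pid,
        payload_unit_start=bool(packet[1] & 0x40),
        has_adaptation_field=bool(adaptation_control & 0x02),
        has_payload=bool(adaptation_control & 0x01),
        continuity_counter=packet[3] & 0x0F,
    )


# Hand-built example packet: PID 801, payload only, continuity counter 10.
example = bytes([0x47, 0x43, 0x21, 0x1A]) + bytes(184)
header = parse_ts_header(example)
print(header.pid, header.continuity_counter, header.has_adaptation_field)
```

Because the continuity counter is only four bits wide, a receiver would compare each value with the previous one on the same PID, modulo 16, to notice a gap.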


Figure 1: The MPEG Transport Structure
The PID (packet identifier) labels each packet with the stream it belongs to, so the audio and video of one program can be picked out of a stream that may represent many channels. The continuity counter is simply a four-bit number that increments from 0 to 15 and then repeats, counted separately for each PID. It helps the receiver detect packets that were lost or arrived out of order. It’s the adaptation field that contains the time stamp called the PCR (program clock reference). This optional field must be transmitted at least ten times per second. Keeping in mind that it typically takes hundreds, and sometimes thousands, of 184-byte MPEG blocks to carry a single frame representing 1/30 of a second, the PCR adds very little overhead.

The PAT (program association table) and the PMT (program map table) travel in packets of their own rather than in the adaptation field. The PAT, which is always sent on PID 0, lists the current programs and the PID on which each program’s PMT can be found. That is, the PAT contains a table saying program x has its PMT on PID nnn, program y has its PMT on PID mmm, and so forth. The decoder then reads the corresponding PMT, which shows how the audio and video are identified. As an example, Channel 2 from the cable company might be listed in the PAT as using PID 00800. The corresponding PMT might then show the video to be using PID 00801, left channel audio using PID 00802 and right channel audio using PID 00803.
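
Since the PCR is the piece that carries the timing, it may help to see how a receiver could read it. The following is a minimal sketch, assuming the usual layout in which the PCR occupies six bytes of the adaptation field as a 33-bit base counted at 90 kHz, six reserved bits, and a 9-bit extension counted at 27 MHz. The function names `extract_pcr` and `build_pcr_packet` are hypothetical, and the test packet is built by hand rather than captured from a real feed.

```python
from typing import Optional

PCR_CLOCK_HZ = 27_000_000  # the PCR counts ticks of a 27 MHz system clock


def extract_pcr(packet: bytes) -> Optional[int]:
    """Return the PCR in 27 MHz ticks, or None if the packet carries no PCR."""
    if len(packet) != 188 or packet[0] != 0x47:
        raise ValueError("not a 188-byte transport packet")

    adaptation_control = (packet[3] >> 4) & 0x03
    if not adaptation_control & 0x02:
        return None  # no adaptation field in this packet

    adaptation_length = packet[4]
    if adaptation_length < 7:
        return None  # too short for the flags byte plus a 6-byte PCR
    if not packet[5] & 0x10:
        return None  # PCR flag not set

    raw = int.from_bytes(packet[6:12], "big")  # 48 bits in total
    base = raw >> 15         # top 33 bits, 90 kHz units
    extension = raw & 0x1FF  # bottom 9 bits, 27 MHz units
    return base * 300 + extension


def build_pcr_packet(pid: int, base: int, extension: int) -> bytes:
    """Build an adaptation-field-only packet carrying a PCR (for testing only)."""
    header = bytes([0x47, (pid >> 8) & 0x1F, pid & 0xFF, 0x20])
    raw = (base << 15) | (0x3F << 9) | extension  # reserved bits set to 1
    adaptation = bytes([183, 0x10]) + raw.to_bytes(6, "big")
    stuffing = b"\xff" * (188 - len(header) - len(adaptation))
    return header + adaptation + stuffing


# A base of 90,000 ticks at 90 kHz corresponds to one second of stream time.
packet = build_pcr_packet(pid=801, base=90_000, extension=0)
pcr = extract_pcr(packet)
print(pcr, pcr / PCR_CLOCK_HZ)
```

A decoder would compare successive PCR values against its own 27 MHz clock to stay locked to the sender, which is why the stream can carry its own timing without help from another protocol layer.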

Phil Hippensteel, PhD, teaches at Penn State Harrisburg and regularly contributes to AV Technology. Email your tech questions to pjh15@psu.edu.