Make Room For Video

On the network, some people believe that digitized video, voice, and data are all the same, but the increasing demands of video on a network deserve a closer look.

Think your network is already crowded? Cisco recently published a series of white papers about the increased volume of video traffic that is likely to be seen on networks in the near future. Some of the surprising results included:

  • By 2012, nearly 90 percent of all IP traffic will be video content.
  • Already, video is one-quarter of all consumer traffic and will double every two years through 2012.
  • Video traffic has exceeded P2P file sharing traffic such as music sharing.
  • The amount of video traffic in 2012 will be over one-half of a zettabyte, which equals over 500 billion gigabytes.
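
As a quick check on the last figure, the conversion can be sketched in Python. This assumes decimal (SI) units, where one zettabyte is 10^21 bytes and one gigabyte is 10^9 bytes; the article does not say which unit convention is meant, and the function name is only illustrative.

```python
# Decimal (SI) unit sizes in bytes. This is an assumption: the article does
# not say whether decimal or binary units are meant.
ZETTABYTE = 10**21
GIGABYTE = 10**9


def zettabytes_to_gigabytes(zettabytes: float) -> float:
    """Convert a volume in zettabytes to gigabytes using SI units."""
    return zettabytes * ZETTABYTE / GIGABYTE


half_zb_in_gb = zettabytes_to_gigabytes(0.5)
print(f"{half_zb_in_gb:,.0f} GB")  # 500,000,000,000 GB
```

Under these assumptions, half a zettabyte works out to 500 billion gigabytes, which matches the figure quoted above.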

If these rather amazing projections are correct, will our networks be able to support the traffic? Will new architecture be needed? What will be the impact on the data traffic that has always been business critical? The future's not that far off either. Many laptops are available with factory-installed webcams, or a webcam can easily be purchased for about $30. YouTube, once a "social networking" novelty, is now becoming a bona fide viral marketing vehicle for entrepreneurs and established businesses alike. ComScore, a Reston, VA-based internet measurement company, said internet users in the U.S. viewed 13.5 billion videos online during the month of October 2008.

IT directors have been expressing their concerns about video for some time. Until recently, enterprise users haven't demanded that video be carried on the network. But applications like videoconferencing, training, sales demos and video pitches, and security are becoming mission critical. It's becoming increasingly important to consider the implications of storing and transmitting all of this video.

Video comes in many formats, and there are many ways to categorize the different types. For the purposes of this discussion, it's helpful to distinguish among these three types of video content.

Type 1: Conventional video can be analog, digital, or digital IP. It is characterized by the fact that it is recorded and presented by the standards and traditional methods of the television broadcast industry. It emanates from a single source and is broadcast to many users. In analog form, one television channel occupies 6 MHz of bandwidth. At output it will conform to the NTSC or ATSC standards used by televisions. Examples include television programming as delivered by providers such as Comcast, Verizon FIOS, AT&T U-verse, or Time Warner.

Type 2: Videoconferencing (VC) is usually two-way and point-to-point, although it is occasionally a multipoint mesh network. All parties can see each other. Historically, it has used telco circuits such as T-1 or T-3 circuits. More recently, VC has been embracing IP and MPEG compression using the H.264 standard. Videoconferencing has traditionally used large specialized endpoints, but it can now also be accomplished from the desktop using a conventional PC.

Type 3: Internet streamed video [...] of packets that are provided by TCP.

According to Cisco and other experts, these forms of video are slowly merging together. The future will see nearly all forms of video as MPEG compressed content transported in IP packets.

Because conventional video and VC do not use TCP, they have rather predictable behavior on the network. For example, conventional video carried in IP packets often has a rather precise bandwidth requirement. On the other hand, internet streamed video will download much like a data file transfer and will take as much bandwidth as TCP can negotiate to obtain from the network. Its bandwidth requirement is less predictable.

Table 1 shows some typical bandwidth usage with the various methods of transporting video. Conventional video currently uses 2 to 4 Mbps for standard definition video. Most often it is sent at a fixed bit rate. First, the encoder creates the MPEG stream of bits. The bits are then placed in 188-byte MPEG packets, seven of which are placed in each IP packet. The bit rate is set in the encoder, and that determines how frequently the IP packets are sent. Because the packets are uniform in size, the sending frequency doesn't vary much.
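
The packet pacing described above can be worked through in a short Python sketch. It uses only the figures given in the text (188-byte MPEG packets, seven per IP packet, 2 to 4 Mbps) and counts payload only; IP and transport header overhead is deliberately ignored, and the function name is illustrative rather than part of any real tool.

```python
MPEG_PACKET_BYTES = 188        # size of one MPEG packet, from the text
MPEG_PACKETS_PER_IP_PACKET = 7  # MPEG packets carried in each IP packet, from the text


def ip_packet_pacing(bit_rate_bps: float) -> tuple[float, float]:
    """Return (IP packets per second, milliseconds between packets).

    Counts only the MPEG payload; IP and transport headers are ignored.
    """
    payload_bits = MPEG_PACKET_BYTES * MPEG_PACKETS_PER_IP_PACKET * 8
    packets_per_second = bit_rate_bps / payload_bits
    interval_ms = 1000.0 / packets_per_second
    return packets_per_second, interval_ms


for mbps in (2, 4):
    pps, gap_ms = ip_packet_pacing(mbps * 1_000_000)
    print(f"{mbps} Mbps: {pps:.0f} packets/s, one every {gap_ms:.2f} ms")
```

Each IP packet carries 1,316 bytes of video, so a 4 Mbps stream works out to roughly 380 packets per second, or one packet about every 2.6 ms. That steady cadence is why this kind of traffic is so sensitive to anything that delays individual packets.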

an algorithm that combines the amount of motion the camera sees and the capacity of the TCP packet. Therefore, the traffic load increases and decreases based primarily on the motion in the scene.

Most internet-streamed content is delivered with TCP using the proprietary method invoked by the server. Examples include streams sent to Windows Media Player or to Adobe Flash Player. The server determines what type of video to stream, based on quality levels negotiated or preset by the player. The receiving player usually selects the bandwidth requirement and therefore determines the quality of the video transmitted. Over time, this technology is evolving to H.264 compression, which allows a wide variety of bandwidth/quality combinations. The stream can be created in as little as 64 kbps or use as much as 27 Mbps. Another important feature with this form of video is whether it is buffered and played, or stored and played at a later time. While there is a subtle difference, both involve downloading the video file to memory and playing it out. If the play out begins immediately, then the stream must be delivered in a smooth manner with enough throughput to keep the player busy. If it is to be played later, the transfer is similar to a conventional data file transfer and can use any reasonable period of time to be delivered.
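
The difference between the two cases can be illustrated with a small simulation. This is a minimal sketch, not how any particular player works: it assumes throughput is measured in one-second intervals, and the function name and the `startup_buffer_s` setting are hypothetical.

```python
def playback_stalls(throughput_kbps, stream_kbps, startup_buffer_s=2.0):
    """Return True if a buffered-and-played stream would run out of video.

    throughput_kbps: delivered throughput for each one-second interval.
    stream_kbps: bit rate at which the player consumes the video.
    startup_buffer_s: seconds of video buffered before play out begins
        (a hypothetical player setting).
    """
    buffered_kbits = 0.0
    playing = False
    for received in throughput_kbps:
        buffered_kbits += received
        if not playing:
            # Play out begins once enough video has been buffered.
            playing = buffered_kbits >= startup_buffer_s * stream_kbps
            continue
        if buffered_kbits < stream_kbps:
            return True  # not enough video to cover this second
        buffered_kbits -= stream_kbps
    return False


smooth = [300] * 10                     # steady delivery
bursty = [600, 600, 0, 0, 0, 0, 600]    # similar average, uneven delivery
print(playback_stalls(smooth, stream_kbps=256))  # False
print(playback_stalls(bursty, stream_kbps=256))  # True
```

The bursty trace averages slightly more than the 256 kbps the player needs, yet it still stalls during the gap. If the same file were stored and played later, only the total delivered would matter.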

Conventional video transported across an IP network allows very little network jitter ("burstiness" causing variable delay). It also requires uniform bandwidth. Videoconferencing must have low network latency, but has moderate tolerance for network jitter. Internet streaming video has a high tolerance for network jitter and will accept some latency in the network. It also has highly variable bandwidth requirements. The bottom line: no one network architecture fits all of these requirements.

is probably unnecessary. On the other hand, if the network is to carry primarily conventional video, separating the video traffic from the data traffic is mandatory. DHCP, ARP, and browsing (server browsing, not web browsing) are based on layer two broadcasts. This broadcast traffic is also very bursty in nature and is processed by every device on the network. So, broadcast traffic is disruptive to video streams that are being transmitted. In this case, a separate VLAN will be very helpful if the switches have the bandwidth to pass both the data and the video without introducing latency. Most modern switches are capable of doing this.

Since videoconferencing is usually done across significant distances, WAN circuits are usually involved. The LAN part of the network won't normally present a problem. However, on the WAN circuits, you'll need to implement separate logical circuits or use VLANs. For best results, keep internet streamed video and data transfers separated from videoconference traffic.

Internet streaming video can generally be treated like data, because a stream behaves very much like a large file transfer. Your biggest concern may be the total consumption of bandwidth. So, if it is possible, find out how the video can be throttled at the server. Generally, video servers will allow limiting the video to some fixed bandwidth, such as 256 kbps. The limit affects the creation of the video by adjusting the compression algorithm, frame rate, and resolution. If throttling at the server isn't possible, place rate-limiting devices (or "traffic shapers") in the path of the video. Most of these look at some combination of IP address, port number, and URL to determine the type of video and limit all traffic matching those identifying characteristics. Most of these devices allow traffic limiting on a per-application or per-user basis. As long as the traffic can be identified, the traffic shaper will enforce a maximum bandwidth limitation.
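
The article does not say how a traffic shaper enforces its limit. One common technique is a token bucket, sketched below in Python as an assumption rather than a description of any vendor's device. The class name, the burst size, and the packet sizes are all illustrative, and this sketch only reports whether a packet may pass; a real device might queue or drop the packets that exceed the limit.

```python
class TokenBucket:
    """Simplified token-bucket rate limiter (illustrative, not a vendor algorithm)."""

    def __init__(self, rate_bps: float, burst_bits: float) -> None:
        self.rate_bps = rate_bps      # long-term limit, e.g. 256 kbps
        self.burst_bits = burst_bits  # largest burst allowed through at once
        self.tokens = burst_bits      # start with a full bucket
        self.last_time = 0.0

    def allow(self, packet_bits: int, now: float) -> bool:
        """Return True if a packet arriving at time `now` (seconds) may pass."""
        elapsed = now - self.last_time
        self.last_time = now
        # Tokens accumulate at the configured rate, capped at the burst size.
        self.tokens = min(self.burst_bits, self.tokens + elapsed * self.rate_bps)
        if packet_bits <= self.tokens:
            self.tokens -= packet_bits
            return True
        return False


# Offer 1.2 Mbps (one 1,500-byte packet every 10 ms) to a 256 kbps limit.
shaper = TokenBucket(rate_bps=256_000, burst_bits=32_000)
passed = sum(shaper.allow(12_000, now=i * 0.01) for i in range(100))
print(f"{passed} of 100 packets passed in one second")
```

Over any interval, the traffic passed cannot exceed the burst allowance plus the configured rate multiplied by the elapsed time, which is how the limit holds regardless of how fast the server sends.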

Some network files can be very large. For example, if a feature film is encoded as MPEG-2 in the same format that it might be transmitted over a network, it will need 20 to 30 GB of storage. It will also need to be read from the storage device at 2 to 6 Mbps. While modern disk drives and disk interfaces are normally rated higher than this, the ratings assume that the information on the disk isn't fragmented. If the server has had a lot of reads and writes, the file to be sent may be highly fragmented, slowing the rate at which it can be presented to the encoder that creates the IP packets. Of course, this process is very dependent on the system processor. An older server with limited memory will likely create a significant bottleneck to the smooth flow of video.

Video is unlike audio (voice) or data. Data traffic is highly sporadic. Voice traffic is low bandwidth and packets are sent in a very regular pattern. The way video behaves on the network depends on the type of video you're transporting. You will need to build your network's architecture around the type of video you intend to deliver. If it's conventional video destined to play out on a television, you'll need a network with dependable bandwidth and very little jitter. Videoconferencing and streamed internet content will be much easier to transport. Careful study of the type of video you will transport, combined with care in designing and maintaining your network, is critical to high-quality video presentation.

Phil Hippensteel is an industry consultant and assistant professor at Penn State University.