Understanding the Technology Behind Live Streaming Services

The prevailing technology behind live streaming services is PC-based Internet video, a concept that evolved between the mid-1990s and the 2000s. Video conferencing was highly popular during this period, which drove the development of a good deal of video compression methods aimed at delivering acceptable conferencing quality. These methods were geared toward encoding streams of relatively static images.

Live streaming, in general, refers to the delivery of compressed multimedia data (video and audio) for playback in real time. To achieve this, a client application (historically often a Flash-based player) plays the media back for the end user while it is still being delivered over the network. Live streams differ from pre-recorded media in that the information is transmitted continuously as it is produced, whereas pre-recorded media can be downloaded in full before playback. It is these live streaming technologies that we examine here.

Today we are witnessing the surge of the internet. The recent boom in technology has made the internet part of our daily lives. No longer are we confined to sending simple emails; we now have interfaces that let us talk to our friends through microphones and headsets while seeing them through webcams. This exchange of information has already evolved past that stage and is constantly progressing. One of the most recent developments has been the ability to broadcast live events over the internet: an event hosted in one country can be broadcast to another part of the world. The E3 convention is one example; although it is hosted in America, it can be watched live in countries such as Japan. This is far cheaper than setting up a TV broadcast of the event in other countries, and because of its lower cost and relative simplicity, many individuals and organizations are exploring the possibility of setting up their own live streams for various events.

Streaming Protocols

Streaming protocols govern how media data is transmitted across the internet, with different protocols suited to different types of data. Streaming media generally refers to the situation where a user can watch video or listen to audio before the complete file has been transferred; before streaming technology existed, a user had to download the whole file before playing it.

The traditional model of transfer involves downloading the file in full to create a copy on the user's hard drive. Once the file has been downloaded, it can be opened and played at the full speed offered by the drive. Many new internet users assume this is the fastest and best model, but the evolution of broadband connections has greatly increased the quality and convenience of streaming.

Streaming protocols instead support playback during a continuous transfer of data, allowing a user to play the media file as it is being sent. A middle ground, called progressive download, lets a file begin playing while the download is still in progress; playback can be stopped and resumed, and once the transfer is complete the file can be played again without downloading it a second time.
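To make the contrast concrete, here is a rough back-of-the-envelope sketch (the file size, duration, and connection speed are illustrative assumptions, not measurements) of when a progressive download must buffer before playback can start without stalling:

```python
def startup_delay(file_mb, duration_s, download_mbps):
    """Estimate the minimum buffering delay (seconds) before a
    progressively downloaded file can play through without stalling.

    Playback can start once the remaining download time never
    exceeds the remaining playback time."""
    size_mbit = file_mb * 8                  # file size in megabits
    download_s = size_mbit / download_mbps   # total transfer time
    # If the file downloads faster than it plays, playback can start
    # almost immediately; otherwise we must pre-buffer the difference.
    return max(0.0, download_s - duration_s)

# A hypothetical 100 MB, 10-minute video:
fast = startup_delay(100, 600, 4)  # 4 Mbit/s: transfer finishes before playback would
slow = startup_delay(100, 600, 1)  # 1 Mbit/s: 800 s transfer vs 600 s of video
```

On the faster connection the delay is zero; on the slower one the viewer would wait 200 seconds, which is exactly the gap streaming protocols with adaptive bitrates are designed to avoid.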

HTTP Live Streaming (HLS)

HTTP Live Streaming (HLS) is an adaptive streaming protocol developed by Apple. "Adaptive" means the video is encoded at several different bit rates, and the client can switch between them as it plays. The stream is divided into small "chunks", typically a few seconds long (early versions of the specification recommended 10 seconds). A client starts by downloading an m3u8 playlist file, which lists the chunks, and then fetches each chunk with an ordinary HTTP GET request. Each chunk has its own URL, so it can be referenced independently of any playlist; switching bitrates is simply a matter of selecting a different playlist that references chunks encoded at another rate. Because chunks are delivered with the same HTTP/1.1 GET requests used to download a web page, HLS video can be deployed on any existing HTTP server without extra software, and ordinary proxy servers can cache the chunks and distribute them to clients through a content delivery network. This makes HLS very scalable and cost-efficient.
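The bitrate-switching logic can be sketched in a few lines. This is a minimal illustration rather than a production ABR algorithm: the playlist text mimics the #EXT-X-STREAM-INF syntax of an HLS master playlist, and the variant URIs are invented for the example.

```python
def pick_variant(master_playlist: str, max_bandwidth: int) -> str:
    """Choose the highest-bandwidth variant that fits the client's
    measured throughput -- the core of HLS adaptive switching."""
    variants = []
    lines = master_playlist.strip().splitlines()
    for i, line in enumerate(lines):
        if line.startswith("#EXT-X-STREAM-INF"):
            attrs = line.split(":", 1)[1]
            for attr in attrs.split(","):
                key, _, value = attr.partition("=")
                if key.strip() == "BANDWIDTH":
                    # The next line holds the variant playlist URI.
                    variants.append((int(value), lines[i + 1]))
    fitting = [v for v in variants if v[0] <= max_bandwidth]
    # Fall back to the lowest variant if nothing fits the throughput.
    return max(fitting)[1] if fitting else min(variants)[1]

MASTER = """#EXTM3U
#EXT-X-STREAM-INF:BANDWIDTH=800000,RESOLUTION=640x360
low/index.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=2500000,RESOLUTION=1280x720
mid/index.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=5000000,RESOLUTION=1920x1080
high/index.m3u8"""
```

A client measuring about 3 Mbit/s of throughput would select the 720p variant, then re-evaluate as conditions change.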

Dynamic Adaptive Streaming over HTTP (DASH)

Dynamic Adaptive Streaming over HTTP (DASH) is a newer adaptive bitrate (ABR) technique intended to provide the highest possible streaming quality to the user, and this section discusses what sets it apart from existing ABR methods. DASH competes with HLS and is designed as a direct replacement for the varied HLS implementations. Like HLS, it breaks video content into small fragments and enables fast switching between streams, so the two are functionally equivalent in their core ABR behavior. The main advantage of DASH is that it is an open, international standard, ratified by ISO/IEC as MPEG-DASH (ISO/IEC 23009-1) in 2012. Because it is an MPEG standard, its codec handling and streaming techniques are specified in detail, and the methods for accessing and processing video content can improve over time in a documented way. HLS, by contrast, is controlled by Apple, which can change its video processing guidelines and private streaming methods without notification or detail to the third-party developers who must implement them; changes to the HLS specification around the release of the iPhone 5 (reportedly relocating MPEG-4 audio handling to audio-only streams) caused problems for developers because actual implementation guidance was not provided. With an open standard, DASH avoids this problem and has the potential to establish better-respected video quality guidelines than HLS for future internet video content. DASH is further helped by MPEG video coding support equivalent to that of on-demand video services, with added functionality for live streaming described in the MPEG-DASH specification (a topic yet to be researched).
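For illustration, a minimal DASH manifest (MPD) for a two-rendition live stream might look like the sketch below. The element and attribute names follow the MPEG-DASH schema, but the segment names, durations, and bitrates are invented for this example:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<MPD xmlns="urn:mpeg:dash:schema:mpd:2011" type="dynamic"
     minBufferTime="PT2S" profiles="urn:mpeg:dash:profile:isoff-live:2011">
  <Period id="1">
    <AdaptationSet mimeType="video/mp4" segmentAlignment="true">
      <!-- Each Representation is one rung of the bitrate ladder
           the client may switch to as its throughput changes. -->
      <Representation id="360p" bandwidth="800000" width="640" height="360">
        <SegmentTemplate media="video_360p_$Number$.m4s"
                         initialization="video_360p_init.mp4"
                         duration="4" startNumber="1"/>
      </Representation>
      <Representation id="720p" bandwidth="2500000" width="1280" height="720">
        <SegmentTemplate media="video_720p_$Number$.m4s"
                         initialization="video_720p_init.mp4"
                         duration="4" startNumber="1"/>
      </Representation>
    </AdaptationSet>
  </Period>
</MPD>
```

The client reads the MPD, then requests whichever representation's segments fit its current throughput, exactly as an HLS client selects among variant playlists.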

Encoding and Transcoding

Video codecs can be categorized as either I-frame codecs or block-based transformation codecs. I-frame codecs such as Motion JPEG or the more recent JPEG 2000 are much simpler: each individual frame is compressed as a standalone image. With I-frame codecs it is easy to change a video's bitrate and resolution by discarding or rescaling frames, but the tradeoff is poor performance at low streaming bitrates, since the similarity between successive frames is not exploited.

Block-based transformation codecs, of which the more recent H.264 is an example, divide each frame into macroblocks and predict them from previous frames using motion vectors, minimizing data redundancy from frame to frame. In essence, these codecs optimize video data storage by selecting the most compact way to express the picture information.
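The inter-frame redundancy that block-based codecs exploit can be shown with a deliberately simplified sketch. Real codecs use motion-compensated macroblocks, transforms, and entropy coding; this toy example merely transmits the pixels that changed since the previous frame:

```python
def diff_encode(prev_frame, cur_frame):
    """Return (index, new_value) pairs for pixels that changed.

    A toy stand-in for inter-frame prediction: instead of sending
    the whole frame, send only the differences from the last one."""
    return [(i, cur)
            for i, (prev, cur) in enumerate(zip(prev_frame, cur_frame))
            if prev != cur]

def diff_decode(prev_frame, deltas):
    """Rebuild the current frame from the previous frame plus deltas."""
    frame = list(prev_frame)
    for i, value in deltas:
        frame[i] = value
    return frame

# Two mostly identical 8-pixel "frames": only 2 of 8 values need sending.
f1 = [10, 10, 10, 20, 20, 20, 30, 30]
f2 = [10, 10, 10, 20, 25, 20, 30, 35]
deltas = diff_encode(f1, f2)
```

When frames are mostly static, as in a talking-head conference stream, the delta list is far smaller than the frame itself, which is exactly why these codecs dominate low-bitrate streaming.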

Today’s live video encoding and transcoding technology has its roots in broadcast TV, with the digital transition of the early 2000s being the key turning point for most broadcasters. In those early days, digital video was encoded to MPEG-2, and later H.264 became the dominant codec for the vast majority of internet-based distribution. This made for a relatively straightforward transcoding process: media was encoded at a single bitrate and resolution, then transcoded to a lower-quality stream by reducing the resolution and bitrate of the source.

Video Codecs

Compressing video data can greatly reduce the amount of data needed for a given quality, but it can also be computationally complex, requiring a lot of CPU power. Unlike video-on-demand services, which can spend hours or days compressing a video, live video must usually be compressed in real time. These high compression ratios and real-time constraints have led to codecs designed specifically for live streaming. Such codecs typically use lossy compression, discarding information deemed unnoticeable to the viewer. Live streaming is also closely tied to video over IP, and network-friendly codec designs can minimize the effect of packet loss on the decoded image.

Video codecs provide effective methods for compressing video while preserving as much quality as possible. A codec is made up of two parts: an encoder, which prepares the data for storage or transmission, and a decoder, which reproduces the data for viewing or editing; both ends must use the same codec. By encoding the video information, the codec reduces the amount of data that must be transmitted, which allows higher quality at the receiving end for a given bandwidth. With lossy codecs the decoded image is not identical to the encoded original, only very similar; generally, the more data that is used, the closer the decoded image will be to the original. Codecs can use techniques of varying complexity to achieve different levels of compression.

Audio Codecs

Now let us turn our attention to the other component of what is today considered a “stream”: audio. In analog television, audio was transmitted by modulating a separate carrier within the same broadcast channel as the visual content, a scheme that relied on audio being more tolerant of signal degradation than the picture; this approach was eventually displaced by digital audio delivery. Because audio can enhance the experience of a live event, or partly compensate for reduced quality in the visual content, many stream providers accompany the video with a high-quality stereo audio track. As with video, the variety of audio codecs that can be used for a given audio format is quite large, and choosing one involves the same tradeoffs between compression and quality. Newer audio codecs, such as AAC, claim to achieve transparency at lower bitrates than MP3 and also have improved multi-channel support. As with video, a lossless codec can be used to ensure that the audio after decompression is identical to the original. Live streams with audio content only, though simpler in nature, can also benefit from the same methods of transcoding discussed here.
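The compression tradeoff is easy to quantify with a little arithmetic. The bitrates below are illustrative: 1411 kbit/s is uncompressed CD-quality stereo PCM, and 128 kbit/s is a common AAC setting (whether that rate is truly "transparent" depends on the listener and the material):

```python
def stream_size_mb(bitrate_kbps, seconds):
    """Size in megabytes of an audio stream at a constant bitrate."""
    return bitrate_kbps * 1000 * seconds / 8 / 1_000_000

# One hour of stereo audio:
pcm = stream_size_mb(1411, 3600)  # uncompressed CD-quality PCM
aac = stream_size_mb(128, 3600)   # a typical lossy AAC encode
```

An hour of uncompressed audio is roughly 635 MB, while the AAC encode is under 58 MB, a reduction of more than 90% for what many listeners perceive as equivalent quality.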

Transcoding Techniques

Transcoding is the process of taking media in one format and converting it to a different format that is more suitable for a particular application or device. For live streaming, this typically means converting the captured media into a format suited to delivery, for example over RTMP between server and client. Transcoding can be a very CPU-intensive process, but saving multiple copies of the media at various quality levels allows for more versatility later on when integrating with CDNs. For live streaming it is best to apply on-the-fly transcoding, as this is the most efficient use of resources: effects, filters, user actions, and extra metadata can all be applied as the stream is processed, and combining this with transcoding of the recorded media makes the approach well suited to real-time interactive applications like gaming live streams, since the entire stream does not have to be re-transcoded every time something changes.
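One common way to produce multiple-quality copies is to run an encoder such as ffmpeg once per rendition. The sketch below only assembles the argument lists rather than invoking the encoder; the rendition ladder and the RTMP ingest URL are made-up examples, while the flags themselves (-i, -c:v, -b:v, -s, -c:a, -b:a, -f) are standard ffmpeg options:

```python
# Hypothetical rendition ladder: (name, resolution, video kbps, audio kbps).
RENDITIONS = [
    ("1080p", "1920x1080", 5000, 192),
    ("720p",  "1280x720",  2500, 128),
    ("360p",  "640x360",    800,  96),
]

def ffmpeg_command(source, name, resolution, v_kbps, a_kbps):
    """Assemble an ffmpeg argument list for one output rendition.

    Output is sent to an RTMP ingest URL (the hostname is a placeholder)."""
    return [
        "ffmpeg", "-i", source,
        "-c:v", "libx264", "-b:v", f"{v_kbps}k", "-s", resolution,
        "-c:a", "aac", "-b:a", f"{a_kbps}k",
        "-f", "flv", f"rtmp://ingest.example.com/live/{name}",
    ]

# One encoder invocation per rendition of the same source stream.
commands = [ffmpeg_command("input.mp4", *r) for r in RENDITIONS]
```

In practice the three encodes would run in parallel against the live source, so that every viewer, whatever their bandwidth, has a rendition that fits.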

Content Delivery Networks (CDNs)

A CDN is a collection of web servers distributed across multiple locations to deliver content more efficiently to users. When a user requests content that is part of a CDN, the request is redirected to the server that is geographically closest to that user, and the content is delivered from there. The benefits of using a CDN include offloading origin servers, improving page load times, reducing bandwidth consumption, and increasing content availability. CDN servers also provide benefits to ISPs, including reduced peering traffic, improved reliability, and improved latency. A CDN can be self-operated or outsourced to a service provider offering partial or complete management of the network; the largest providers serve many customers from servers located throughout the world.

A CDN is a large distributed system of servers deployed in multiple data centers across the Internet. The goal of a CDN is to serve content to end-users with high availability and high performance. CDNs serve a large fraction of the Internet content today, including web objects (text, graphics, and scripts), downloadable objects (media files, software, documents), applications (e-commerce, portals), live streaming media, on-demand streaming media, and social networks.

CDN Basics

A CDN is an advanced, distributed system of servers that delivers web content to a consumer based on the geographic location of the consumer, the origin of the web page, and the content delivery servers available. A large-scale CDN has many server nodes, typically organized in a two-layer hierarchy: an upper layer of origin servers, which store the original content, and a lower layer of distributed cache (surrogate) servers, which store copies of the content retrieved from the origin. An edge server sits in the edge network of the CDN, the point closest to the end user. When a client requests content, the request is routed to the server best placed to deliver it to that client with the highest quality.
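Geographic request routing can be sketched in a few lines. Real CDNs route on more than distance (server load, health, and measured network latency all matter), and the edge locations here are made up; this only illustrates the "closest server" idea:

```python
from math import radians, sin, cos, asin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points on Earth, in kilometres."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371 * asin(sqrt(a))  # 6371 km: mean Earth radius

def nearest_edge(client, edges):
    """Pick the edge server geographically closest to the client."""
    return min(edges, key=lambda name: haversine_km(*client, *edges[name]))

# Hypothetical edge locations as (latitude, longitude) pairs.
EDGES = {
    "frankfurt": (50.1, 8.7),
    "tokyo": (35.7, 139.7),
    "virginia": (39.0, -77.5),
}
```

A viewer in Paris would be routed to the Frankfurt edge, shortening the network path the video packets must travel compared with fetching from a distant origin.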

When a consumer follows a web link to access multimedia content, the consumer’s client often makes a request to the content owner’s origin server to access the content. If the content is popular, there may be a lot of demand for the data. If the network path from the origin server to the client traverses a congested area or long delay path, the quality of the delivered content for a real-time application may degrade to an unacceptable level. The mission of a Content Delivery Network is to deliver the requested data to the client with the best possible quality. Since a big fraction of today’s internet traffic is in the form of streaming audio and video content, it is only natural that CDNs are now being used to deliver this type of content.

CDN Architecture

Originally, CDNs were implemented predominantly using private peering and private networks. In a private peering CDN, content is delivered using IP transit from the web content provider’s origin, to the CDN server, to the user’s ISP, and then to the end user. This method proved effective, but there was no guarantee that the content would take an optimal route from server to end user, and no control over end-to-end quality of service. In recent years the focus has shifted toward public peering CDNs, implemented using overlay networks and edge servers, which provide more control over end-to-end quality of service. The CDN server hands the content to the end user’s ISP at an Internet exchange point, and route control and anycast then help ensure the content takes an optimal route to the end user. This method has proved highly effective and is the current trend in CDN implementation.

High availability and high performance are the two most important goals of any CDN. High availability is achieved through redundancy, present in different forms at the DNS, request-routing, load-balancing, and server levels. This creates a fail-over environment in which, if any component fails, another can take its place, often seamlessly to the end user. High performance is achieved by locating servers at the edge of the Internet: with content close to end users, information is retrieved more quickly.

CDN Providers

In recent years, many companies around the world have begun providing CDN services to broadcasters, and these providers typically build servers around the world to form a network serving their customers. Implementing such a network is costly, but it is necessary for quality live streaming, since it allows video packets to travel a shorter distance and therefore reach their destination with lower latency. CDNetworks, an Asia-based company, has built a network spanning more than 70 countries. According to a case study, Riot Games, which chose CDNetworks, was able to stream a League of Legends match in HD to Taiwan with low latency; this is the best-case scenario for broadcasters, as it provides viewers with a better visual experience.

Over the years, the popularity of streaming live video has increased tremendously. It has become easier for streamers of all kinds, from home users and gamers to businesses, to start streaming and engage with their intended viewers. In an article from September 2016, the author stated that “a business and an individual streamer can deploy live video via CDN networks in 1080p resolution at 60 frames per second. The high-end provider will let them stream in HD quality video over a 4G mobile connection.” With the ever-increasing demand for HD and higher-resolution video, CDN providers will keep improving their technology to support these demands.
