Multiplexing/de-multiplexing Dirac Video With AAC Audio Bit Stream
Abstract
With the introduction of High Definition Television (HDTV) for broadcasting digital multimedia and the enormous demand for video streaming over the internet and in Internet Protocol Television (IPTV) applications, the choice of a good compression scheme is vital. A good compression scheme makes the most of limited storage capacity and uses the bandwidth required for broadcasting efficiently. Dirac [31] is a state-of-the-art video codec aimed at applications from HDTV to web streaming [1]. It was developed by the British Broadcasting Corporation (BBC) and is an open technology that involves no licensing fees. Studies have shown that the performance of Dirac compares well with that of the H.264 video codec [3]. At low bitrates, the quality of Dirac-coded video deteriorates due to distortion, and H.264 outperforms it [2]. For HD media, the performance of Dirac is similar to that of H.264, with no large or intolerable variations between the two codecs [2]. Hence, Dirac is chosen as the video codec in this thesis. The right choice of audio codec is also necessary. Advanced Audio Coding (AAC) [4] is a digital audio codec standard defined in Moving Picture Experts Group MPEG-2 and, with a few modifications, in MPEG-4 [4]. It supports audio sampling frequencies from 8 kHz to 96 kHz [5]. The performance of AAC is superior at bitrates greater than 64 kbps and also at lower bitrates (16 kbps), and hence it is adopted in this thesis [22]. The raw video and audio data are encoded using the Dirac video codec and the AAC audio codec, respectively. The resulting video and audio bit-streams need to be multiplexed into a single stream in order to be transmitted over the network. The objective of this thesis is to multiplex the video and audio bit-streams for transmission and to de-multiplex them at the receiver's end while maintaining lip synchronization during playback. The MPEG-2 [19] system is adopted in this thesis to carry out the multiplexing process.
The audio and video bit-streams correspond to the respective frame data. This data is packetized into Packetized Elementary Stream (PES) packets, which are of variable length. The PES packets are further packetized into Transport Stream (TS) packets of a fixed length of 188 bytes [9]. The fixed packet length facilitates transmission. Timestamp information is encapsulated in the PES header in the form of frame numbers, which help in achieving lip synchronization during playback. The presentation time of video and audio is used as a reference in multiplexing the audio and video TS packets, which helps control buffer fullness (i.e., prevents buffer overflow or underflow) at the de-multiplexer end. Sequence Parameter Sets (SPS) and Picture Parameter Sets (PPS) present in the video bit-stream are also transmitted in the form of packets to assist the decoder in decoding the video data. The header information included makes de-multiplexing faster and more efficient. The multiplexing and de-multiplexing algorithm was implemented, and lip sync is maintained during playback. Advanced Television Systems Committee - Mobile/Handheld (ATSC-M/H) has an allocated bandwidth of 19.6 Mbps [13], whereas the transport stream bitrates obtained with the implemented multiplexing algorithm for the inter coding of the sequences used are 102.13 kbps and 96.72 kbps, which can be easily and efficiently accommodated. The highlights of this thesis are encoding video using Dirac and audio using AAC, multiplexing the two coded bit-streams, packetization, de-multiplexing the two coded bit-streams, and decoding the video (Dirac) and audio (AAC) while maintaining lip sync. Advantages and limitations of the proposed method are discussed in detail.
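
The packetization and presentation-time-based interleaving described above can be illustrated with a minimal Python sketch. It is not the implementation from the thesis: the function names, the PID values, and the frame rates in the usage note are hypothetical, and short final packets are simply padded with 0xFF bytes, whereas a conforming MPEG-2 multiplexer would use adaptation-field stuffing. Only the 188-byte packet size and the use of frame numbers as timing references come from the text.

```python
TS_PACKET_SIZE = 188                      # fixed TS packet length from the text
TS_HEADER_SIZE = 4                        # MPEG-2 TS header length
TS_PAYLOAD_SIZE = TS_PACKET_SIZE - TS_HEADER_SIZE
SYNC_BYTE = 0x47                          # MPEG-2 TS sync byte

VIDEO_PID = 0x100                         # hypothetical PID for the video stream
AUDIO_PID = 0x101                         # hypothetical PID for the audio stream


def ts_packetize(pes, pid, counter=0):
    """Split one variable-length PES packet into fixed 188-byte TS packets."""
    packets = []
    for offset in range(0, len(pes), TS_PAYLOAD_SIZE):
        chunk = pes[offset:offset + TS_PAYLOAD_SIZE]
        start = 1 if offset == 0 else 0   # payload_unit_start_indicator
        header = bytes([
            SYNC_BYTE,
            (start << 6) | ((pid >> 8) & 0x1F),
            pid & 0xFF,
            0x10 | (counter & 0x0F),      # payload only + continuity counter
        ])
        # Simplification (assumption): pad the last chunk with 0xFF bytes.
        packets.append(header + chunk.ljust(TS_PAYLOAD_SIZE, b"\xff"))
        counter += 1
    return packets


def multiplex(video_pes, audio_pes, video_rate, audio_rate):
    """Interleave TS packets so the frame with the earlier presentation time
    (frame number / frame rate) is sent first; video wins ties."""
    out = []
    v = a = 0                             # next frame number per stream
    v_count = a_count = 0                 # continuity counters per PID
    while v < len(video_pes) or a < len(audio_pes):
        # v / video_rate <= a / audio_rate, cross-multiplied to avoid division
        video_next = a >= len(audio_pes) or (
            v < len(video_pes) and v * audio_rate <= a * video_rate)
        if video_next:
            packets = ts_packetize(video_pes[v], VIDEO_PID, v_count)
            v_count += len(packets)
            v += 1
        else:
            packets = ts_packetize(audio_pes[a], AUDIO_PID, a_count)
            a_count += len(packets)
            a += 1
        out.extend(packets)
    return out


def packet_pid(packet):
    """Read the 13-bit PID back out of a TS packet header."""
    return ((packet[1] & 0x1F) << 8) | packet[2]
```

As a usage example, a 400-byte PES packet yields three TS packets, since each carries at most 184 payload bytes. For AAC, `audio_rate` would be the sampling frequency divided by 1024 samples per frame (for instance 48000 / 1024 = 46.875 frames per second, an illustrative value). A de-multiplexer can route packets to the audio or video buffer by calling `packet_pid` on each 188-byte packet.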