Low Complexity AVS-M Using Machine Learning Algorithm C4.5

Date

2011-07-14

Publisher

Electrical Engineering

Abstract

AVS China is the latest video coding standard developed by the AVS work group of China. It employs the latest video coding tools to achieve high compression while preserving video quality [15]. The AVS China video standard has been shown to be superior to existing video standards such as MPEG-2 and MPEG-4 Part 2 in terms of reduced complexity and coding efficiency. AVS is an integrated set of standards covering systems, video, audio and media copyright management. AVS-M, the seventh part of the standard, is a video codec targeting mobile devices. AVS-M uses high-efficiency tools such as macroblock partitioning, entropy coding, and motion estimation and compensation to exploit the temporal and spatial redundancies present in a video sequence. This reduces the number of bits needed to encode the sequence, which in turn saves transmission bandwidth. However, all these tools come at the cost of computational complexity. Motion estimation alone consumes 75%-80% of the encoding time. With all these tools enabled, AVS-M simulated on a general-purpose processor can process up to 2 frames per second, whereas live streaming demands at least 15 frames per second. The large gap between the required and the available processing power can be bridged by embedding special-purpose hardware such as FPGAs (field-programmable gate arrays) or ASICs (application-specific integrated circuits). But these solutions are very expensive and also cause problems such as high power consumption and overheating in mobile devices. An alternative is to reduce the complexity of the encoder. Various studies have implemented fast motion estimation algorithms in place of the regular motion estimation block. This thesis applies a machine learning algorithm to predict the macroblock mode decision for encoding.
In the first step, many attributes of the frames are extracted and written to an .arff (attribute-relation file format) file. The .arff file is then used as input to the Weka tool, which implements a collection of data mining algorithms. Using Weka's C4.5 algorithm, a decision tree expressed as if-else statements is obtained. This decision tree requires at most 7 if-else evaluations to decide the macroblock mode, which is far less work than the RDO (rate-distortion optimization) process. Hence a large saving in encoding time is expected with the proposed encoder. As expected, the proposed encoder achieved on average a 75%-80% reduction in encoding time compared to the original AVS-M encoder.
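The pipeline described above (attributes written to an .arff file, then a C4.5 tree applied as if-else statements) can be illustrated with a minimal Python sketch. It is not the thesis implementation: the attribute names (`mean`, `variance`, `residual_sad`), the mode labels, the thresholds and the helper names `write_arff` and `predict_mode` are all hypothetical, chosen only to show the shape of the data and of the resulting tree. In Weka, C4.5 is provided by the J48 classifier, which would be trained on a file like the one written here.

```python
import io

# Hypothetical per-macroblock attributes and mode labels (assumptions,
# not the attribute set or mode list used in the thesis).
ATTRIBUTES = ["mean", "variance", "residual_sad"]
MODES = ["SKIP", "INTER_16x16", "INTER_8x8", "INTRA"]


def write_arff(rows, out):
    """Write (feature..., mode) rows in ARFF so that Weka can load them."""
    out.write("@relation macroblock_modes\n\n")
    for name in ATTRIBUTES:
        out.write(f"@attribute {name} numeric\n")
    # The class attribute is nominal: its allowed values are listed in braces.
    out.write("@attribute mode {" + ",".join(MODES) + "}\n\n")
    out.write("@data\n")
    for *features, mode in rows:
        out.write(",".join(str(v) for v in features) + f",{mode}\n")


def predict_mode(mean, variance, residual_sad):
    """Hand-written stand-in for a C4.5 tree exported as if-else statements.

    The thresholds are invented for illustration. Note that `mean` is not
    used: a learned tree may likewise ignore uninformative attributes.
    """
    if residual_sad <= 200:
        return "SKIP"
    if variance <= 50.0:
        return "INTER_16x16"
    if residual_sad <= 3000:
        return "INTER_8x8"
    return "INTRA"


# Two made-up training rows: three features followed by the chosen mode.
rows = [
    (120.0, 10.5, 150, "SKIP"),
    (98.2, 40.0, 900, "INTER_16x16"),
]
buf = io.StringIO()
write_arff(rows, buf)
arff_text = buf.getvalue()
```

In this sketch the mode is decided after at most three comparisons per macroblock. The saving reported in the abstract comes from the same idea: a few threshold tests replace the evaluation of the rate-distortion cost for every candidate mode.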
