Efficient Multi-view Video Coding Scheme Based On Dynamic Video Object Segmentation

Date

2007-08-23T01:56:06Z

Publisher

Computer Science & Engineering

Abstract

Multi-view video, which simultaneously acquires multiple video sequences from multiple viewpoints or view directions, is poised to become the next-generation video technology. Exploiting redundancy is the hallmark of traditional video coding, but it is even more essential in multi-view video coding (MVC), where the data size is extremely large. Exploiting additional redundancies, however, incurs extra computational overhead, which offsets the gains in coding efficiency. This dissertation proposes an efficient MVC scheme that provides a complete encoding solution with low complexity. The scheme exploits inter- and intra-view redundancies to achieve high coding efficiency, and it exploits coded information from the inter-view and temporal domains to lower the coding complexity. To remain compatible with single-view applications, a multi-view scene typically consists of at least one base view (BV) and multiple enhancement views (EVs). The proposed encoding scheme first segments the pre-encoded BV into objects and background using a proposed fast segmentation technique with low overhead. Next, it generates an initial disparity map (DM) for each object and for the background, as well as an initial inter-view prediction frame for each EV, using an object registration and warping algorithm. With this initial DM, fast EV coding can be realized by using the coded information in the BV. This approach has several advantages. First, the DMs for the EVs are refined from the initial DM within a small search range. Second, disparity vectors (DVs) are differentially encoded with respect to the initial DV, which leads to bit-rate savings and reduced computational complexity. Third, in addition to block-based disparity-compensated prediction, one more inter-view prediction is provided, which enhances accuracy with low encoding overhead. Fourth, guided by the initial DM, the motion vectors (MVs) of the EV are predicted from the MV field of the BV, which leads to lower-complexity motion estimation.
Another contribution of this work is an efficient frame-based segmentation algorithm for MVC. The algorithm combines the intensity of the reconstructed frame with the MVs obtained from the pre-encoded BV to segment objects with fast turnaround. Together, the proposed segmentation, object warping, and registration methodologies provide a complete compression scheme.
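The first two advantages listed in the abstract (refining a DV within a small search range around an initial DV, then coding only the difference) can be illustrated with a minimal sketch. This is not the dissertation's implementation: the function names `sad` and `refine_dv`, the sum-of-absolute-differences cost, the block size, the search radius, and the synthetic frames are all assumptions chosen for illustration.

```python
def sad(cur, ref, bx, by, dx, dy, bs):
    """Sum of absolute differences between a block in cur and a displaced block in ref."""
    total = 0
    for y in range(bs):
        for x in range(bs):
            total += abs(cur[by + y][bx + x] - ref[by + y + dy][bx + x + dx])
    return total


def refine_dv(cur, ref, bx, by, init_dv, bs=4, radius=1):
    """Search only within +/-radius of init_dv (hypothetical helper).

    Returns (best_dv, residual_dv), where residual_dv = best_dv - init_dv
    is the small difference that would be entropy coded.
    """
    h, w = len(ref), len(ref[0])
    best_dv, best_cost = None, None
    for ddy in range(-radius, radius + 1):
        for ddx in range(-radius, radius + 1):
            dx, dy = init_dv[0] + ddx, init_dv[1] + ddy
            # Skip candidates whose reference block falls outside the frame.
            if not (0 <= bx + dx and bx + dx + bs <= w
                    and 0 <= by + dy and by + dy + bs <= h):
                continue
            cost = sad(cur, ref, bx, by, dx, dy, bs)
            if best_cost is None or cost < best_cost:
                best_dv, best_cost = (dx, dy), cost
    if best_dv is None:
        raise ValueError("no valid candidate inside the frame")
    residual = (best_dv[0] - init_dv[0], best_dv[1] - init_dv[1])
    return best_dv, residual


# Synthetic data (assumption): the EV is the BV shifted by 3 pixels horizontally.
bv = [[x * x + 17 * y for x in range(16)] for y in range(8)]
ev = [[bv[y][x + 3] for x in range(13)] for y in range(8)]

# The initial DV (2, 0) stands in for a value taken from the initial DM.
best_dv, residual = refine_dv(ev, bv, bx=4, by=2, init_dv=(2, 0))
print(best_dv, residual)
```

With a search radius of 1, only nine candidates are evaluated per block instead of a full-range search, and the value to be coded is the small residual rather than the full DV. Both effects are the intuition behind the bit-rate and complexity savings described above.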