ABSTRACT OF THE THESIS
Object extraction is a critical operation for many content-based video applications. For these applications, a robust and precise extraction technique is required. This thesis proposes an efficient and accurate method for generating a trimap for video matting. We first segment the foreground using motion information and neighboring pixel coherence via graph cuts.
We then estimate the parameters of Gaussian Mixture Models for the foreground and the background from the segmented foreground and the estimated static background. Next, we classify the pixels of each frame by maximum likelihood under these models and generate a trimap, an image consisting of three regions: foreground, background, and unknown. Finally, we use the trimap as a guide in spectral matting for video matting. Our experimental results show that the proposed method yields accurate and natural object boundaries.
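The classification step can be illustrated with a minimal sketch. It assumes diagonal-covariance mixtures and a log-likelihood margin that decides when a pixel is left as unknown; neither choice is specified above, and the names `gmm_log_likelihood`, `trimap_label`, and `margin` are hypothetical.

```python
import math

FOREGROUND, UNKNOWN, BACKGROUND = 255, 128, 0


def gmm_log_likelihood(x, weights, means, variances):
    """Log-likelihood of color x under a diagonal-covariance GMM (assumed form)."""
    log_terms = []
    for w, mu, var in zip(weights, means, variances):
        log_p = math.log(w)
        for xi, mi, vi in zip(x, mu, var):
            log_p += -0.5 * (math.log(2.0 * math.pi * vi) + (xi - mi) ** 2 / vi)
        log_terms.append(log_p)
    # Log-sum-exp over the mixture components for numerical stability.
    peak = max(log_terms)
    return peak + math.log(sum(math.exp(t - peak) for t in log_terms))


def trimap_label(x, fg_model, bg_model, margin=2.0):
    """Label one pixel; 'margin' is an illustrative threshold, not a tuned value."""
    diff = gmm_log_likelihood(x, *fg_model) - gmm_log_likelihood(x, *bg_model)
    if diff > margin:
        return FOREGROUND
    if diff < -margin:
        return BACKGROUND
    return UNKNOWN  # neither model clearly wins
```

Each model is passed as a `(weights, means, variances)` tuple. A pixel whose color is about equally likely under both models falls into the unknown region, which is the region the matting stage later resolves.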
FOREGROUND OBJECT SEGMENTATION VIA GRAPH CUTS
Motion information alone is insufficient to segment moving objects. As shown in Figure 2.1(b), the motion map is too sparse and inaccurate to identify the moving objects. A denser motion map is needed for accurate identification of moving objects. For the foreground segmentation, we use the graph cuts algorithm to overcome inaccurate and noisy motion measurements.
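To illustrate how a graph cut can compensate for a noisy motion map, the following is a minimal sketch of binary labeling by minimum cut. It assumes per-pixel labeling costs derived from motion and a constant smoothness penalty between 4-connected neighbors. The energy terms used in this thesis are not reproduced here. The function name `min_cut_segment` and the cost values are hypothetical, and the max-flow routine is a plain Edmonds-Karp implementation chosen for brevity, not speed.

```python
from collections import deque


def min_cut_segment(fg_cost, bg_cost, smoothness):
    """Return a 0/1 label grid (1 = foreground) that minimizes
    the sum of data costs plus 'smoothness' per label change between neighbors."""
    rows, cols = len(fg_cost), len(fg_cost[0])
    source, sink = "s", "t"
    cap = {}  # residual capacities: cap[u][v]

    def add_edge(u, v, c):
        cap.setdefault(u, {})
        cap.setdefault(v, {})
        cap[u][v] = cap[u].get(v, 0.0) + c
        cap[v].setdefault(u, 0.0)  # reverse edge for the residual graph

    for r in range(rows):
        for c in range(cols):
            p = (r, c)
            add_edge(source, p, bg_cost[r][c])  # cut if p is labeled background
            add_edge(p, sink, fg_cost[r][c])    # cut if p is labeled foreground
            for q in ((r + 1, c), (r, c + 1)):
                if q[0] < rows and q[1] < cols:
                    add_edge(p, q, smoothness)
                    add_edge(q, p, smoothness)

    def bfs():
        """Breadth-first search over edges with remaining capacity."""
        parent = {source: None}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 1e-12 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        return parent

    while True:
        parent = bfs()
        if sink not in parent:
            break  # no augmenting path left: the flow is maximal
        bottleneck, v = float("inf"), sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, cap[parent[v]][v])
            v = parent[v]
        v = sink
        while parent[v] is not None:
            u = parent[v]
            cap[u][v] -= bottleneck
            cap[v][u] += bottleneck
            v = u

    # Pixels still reachable from the source lie on the foreground side of the cut.
    return [[1 if (r, c) in parent else 0 for c in range(cols)] for r in range(rows)]
```

For a single row with costs `fg_cost = [[0, 1, 0, 3, 3]]` and `bg_cost = [[3, 0, 3, 0, 0]]`, the second pixel weakly prefers the background label on its own. With `smoothness = 1.0` the cut labels it foreground together with its two neighbors, whereas with `smoothness = 0.0` the isolated hole remains.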
TRIMAP GENERATION USING A COLOR-LEARNING METHOD
Figures 3.1(b) and (e) show the foreground color distributions of the Nana (Figure 3.1(a)) and Natan (Figure 3.1(d)) sequences, whereas Figures 3.1(c) and (f) show the corresponding background color distributions. Note that the foreground and background colors occupy separate regions of the color space. For example, in the Nana sequence the skin color and the background color appear similar, but in Lab space the two can be differentiated from each other.
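For reference, a conversion from RGB to Lab can be sketched as follows. The sketch assumes 8-bit sRGB input and the D65 reference white; the color pipeline actually used for the sequences above is not stated here, and the function name `srgb_to_lab` is hypothetical.

```python
def srgb_to_lab(r, g, b):
    """Convert 8-bit sRGB to CIE L*a*b* (D65 white point assumed)."""

    def linearize(c):
        # Undo the sRGB transfer curve.
        c = c / 255.0
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    rl, gl, bl = linearize(r), linearize(g), linearize(b)

    # Linear sRGB to CIE XYZ.
    x = 0.4124564 * rl + 0.3575761 * gl + 0.1804375 * bl
    y = 0.2126729 * rl + 0.7151522 * gl + 0.0721750 * bl
    z = 0.0193339 * rl + 0.1191920 * gl + 0.9503041 * bl

    def f(t):
        delta = 6.0 / 29.0
        return t ** (1.0 / 3.0) if t > delta ** 3 else t / (3.0 * delta ** 2) + 4.0 / 29.0

    # Normalize by the D65 reference white.
    fx, fy, fz = f(x / 0.95047), f(y / 1.0), f(z / 1.08883)
    return 116.0 * fy - 16.0, 500.0 * (fx - fy), 200.0 * (fy - fz)
```

Lab separates lightness (L) from the two chromatic axes (a, b), so two colors that look alike mainly because of similar brightness can still lie apart in the a-b plane.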
TRIMAP-GUIDED SPECTRAL MATTING
Accurate object extraction is an important task in computer vision and video processing because it underlies applications such as image editing and video production. Many matting methods have been proposed to extract a high-quality matte from video sequences. Alpha matting, or digital matting, was first introduced by Porter and Duff (1984). Figure 4.1 shows a result of alpha matting.
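Alpha matting models each observed pixel as a blend of a foreground color F and a background color B, I = alpha * F + (1 - alpha) * B, and the matte is the per-pixel alpha. A minimal sketch of this compositing equation for a single pixel follows; the function name `composite` is hypothetical.

```python
def composite(alpha, foreground, background):
    """Blend one pixel: alpha = 1 gives pure foreground, alpha = 0 pure background."""
    return tuple(alpha * f + (1.0 - alpha) * b
                 for f, b in zip(foreground, background))
```

Matting is the inverse problem: given only the observed color, recover alpha (and ideally F and B). A trimap constrains this problem by fixing alpha to 1 in the foreground region and to 0 in the background region, so that only the unknown region has to be solved.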
In Figure 5.1 we compare our result with a conventional binary segmentation method, SIOX (Friedland et al., 2006). The proposed method yields a more accurate object boundary because spectral matting is well suited to segmenting complex structures such as hair or fur. Another advantage of our method is that it does not require any user interaction, whereas other methods require a user-defined trimap or special settings.
The method also fails in some cases: when the motion of the object is too small to obtain a reasonable motion map, or when the background and foreground are not distinctive. Figure 6.1 shows two examples of false extraction. In the case of Figure 6.1(a), the method failed to generate a correct trimap because the colors present in the jeans are similar to those present in the background.
Figure 6.1(b) shows an appropriate trimap of a person, but the boundaries in the resulting matte are blurry and inaccurate. If we take advantage of video information, such as motion information and temporal coherence, and use it as additional constraints for spectral matting, we can achieve better results. In the future, we plan to extend the proposed method to video with non-static backgrounds using global motion estimation.
Source: University of California
Author: Kyoung-Rok Lee