Learning-based Trimap Generation for Video Matting

ABSTRACT OF THE THESIS

Object extraction is a critical operation for many content-based video applications. For these applications, a robust and precise extraction technique is required. This thesis proposes an efficient and accurate method for generating a trimap for video matting. We first segment the foreground using motion information and neighboring pixel coherence via graph cuts.

We then estimate the parameters of Gaussian Mixture Models for the foreground and background colors, using the segmented foreground and an estimated static background. Next, we classify the pixels of each frame by maximum likelihood against these models and generate a trimap, an image consisting of three regions: foreground, background, and unknown. Finally, we use the trimap to guide spectral matting of the video. Our experimental results show that the proposed method yields accurate and natural object boundaries.
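As a rough illustration of the classification step, the sketch below marks pixels as foreground, background, or unknown by comparing per-pixel log-likelihoods under the two learned color models. The synthetic score maps and the confidence margin are assumptions for illustration; the thesis does not spell out this exact rule here.

```python
# Minimal sketch of a three-region trimap decision from per-pixel
# log-likelihoods (synthetic arrays; the margin is an assumed parameter).
import numpy as np

rng = np.random.default_rng(0)
log_fg = rng.normal(0.0, 1.0, (120, 160))   # stand-in for foreground-model scores
log_bg = rng.normal(0.0, 1.0, (120, 160))   # stand-in for background-model scores

MARGIN = 1.0                                # assumed confidence margin

trimap = np.full(log_fg.shape, 128, dtype=np.uint8)   # 128 = unknown
trimap[log_fg - log_bg > MARGIN] = 255                # confident foreground
trimap[log_bg - log_fg > MARGIN] = 0                  # confident background
```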

FOREGROUND OBJECT SEGMENTATION VIA GRAPH CUTS

Figure 2.1. Foreground segmentation via graph cuts (Nana sequence). (a) original image; (b) motion map; and, (c) binary map of segmented foreground

Motion information alone is insufficient to segment moving objects. As shown in Figure 2.1(b), the motion map is too sparse and inaccurate to identify them reliably; a denser map is needed. For the foreground segmentation, we therefore use the graph cuts algorithm, which combines the noisy motion measurements with neighboring-pixel coherence to produce the binary foreground map.
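A minimal sketch of this idea, assuming the third-party PyMaxflow package: terminal weights come from a synthetic motion map (the data term) and uniform grid edges encode neighboring-pixel coherence (the smoothness term). The motion map and all weights below are illustrative, not the values used in the thesis.

```python
# Graph-cut foreground segmentation sketch (pip install PyMaxflow).
import numpy as np
import maxflow

h, w = 120, 160
rng = np.random.default_rng(0)

# Synthetic, noisy motion map in [0, 1]: high values suggest a moving object.
motion = rng.normal(0.2, 0.1, (h, w))
motion[40:90, 60:110] += 0.6
motion = np.clip(motion, 0.0, 1.0)

g = maxflow.Graph[float]()
nodes = g.add_grid_nodes((h, w))

# Pairwise (smoothness) term: neighboring pixels prefer the same label.
g.add_grid_edges(nodes, weights=2.0)

# Unary (data) term: terminal capacities derived from the motion evidence.
g.add_grid_tedges(nodes, motion * 10.0, (1.0 - motion) * 10.0)

g.maxflow()
seg = g.get_grid_segments(nodes)   # boolean label per pixel

# Interpret the two cut segments by their average motion:
# the segment with stronger motion is taken as foreground.
fg = seg if motion[seg].mean() > motion[~seg].mean() else ~seg
print("foreground pixels:", int(fg.sum()))
```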

TRIMAP GENERATION USING A COLOR-LEARNING METHOD

Figure 3.1. Foreground and background color distributions (Nana and Natan sequence)

Figures 3.1(b) and (e) show the foreground color distributions of the Nana (Figure 3.1(a)) and Natan (Figure 3.1(d)) sequences, whereas Figures 3.1(c) and (f) show the corresponding background color distributions. Note that the foreground and background colors form distinct distributions. For example, in the Nana sequence the skin color and the background color appear similar, but in Lab space the two can be differentiated from each other.
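A minimal sketch of this color-learning step, assuming OpenCV and scikit-learn: pixels of a frame are converted to Lab, and separate Gaussian mixtures are fitted to the foreground and background samples given by the earlier segmentation. The file names and the number of mixture components are placeholder assumptions.

```python
# Learn foreground/background color models in Lab space with Gaussian mixtures.
import cv2
import numpy as np
from sklearn.mixture import GaussianMixture

frame = cv2.imread("frame.png")                 # placeholder: a BGR frame
fg_mask = cv2.imread("fg_mask.png", 0) > 0      # placeholder: graph-cut foreground map

lab = cv2.cvtColor(frame, cv2.COLOR_BGR2LAB).reshape(-1, 3).astype(np.float64)
mask = fg_mask.reshape(-1)

gmm_fg = GaussianMixture(n_components=5, covariance_type="full", random_state=0)
gmm_bg = GaussianMixture(n_components=5, covariance_type="full", random_state=0)
gmm_fg.fit(lab[mask])                           # foreground color samples
gmm_bg.fit(lab[~mask])                          # background color samples

# Per-pixel log-likelihood under each model, usable for the trimap decision.
log_fg = gmm_fg.score_samples(lab).reshape(frame.shape[:2])
log_bg = gmm_bg.score_samples(lab).reshape(frame.shape[:2])
```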

TRIMAP-GUIDED SPECTRAL MATTING

Figure 4.1. Alpha matting method. (a) the original image; (b) the alpha matte; and, (c) the composite image and zoomed-in area of the image

Accurate object extraction is an important task in computer vision and video processing because of its applications in image editing and video production. Many matting methods have been proposed to extract a high-quality matte from video sequences. Alpha matting, or digital matting, was first introduced by Porter and Duff (1984). Figure 4.1 shows a result of alpha matting.
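The underlying compositing model is C = αF + (1 − α)B, where α is the per-pixel opacity estimated by the matte. A minimal compositing sketch with placeholder file names, assuming OpenCV and NumPy:

```python
# Place an extracted foreground over a new background using the alpha matte,
# C = alpha * F + (1 - alpha) * B. All images are assumed to have the same size.
import cv2
import numpy as np

fg = cv2.imread("foreground.png").astype(np.float64)        # placeholder frame
bg = cv2.imread("new_background.png").astype(np.float64)    # placeholder background
alpha = cv2.imread("alpha_matte.png", 0).astype(np.float64) / 255.0
alpha = alpha[..., None]                                    # broadcast over channels

composite = alpha * fg + (1.0 - alpha) * bg
cv2.imwrite("composite.png", composite.astype(np.uint8))
```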

RESULTS

Figure 5.1. Comparison between proposed method and SIOX

In Figure 5.1 we compare our result with a conventional binary segmentation method, SIOX (Friedland et al., 2006). The proposed method yields more accurate object boundaries, since spectral matting is well suited to complex structures such as hair or fur. Another advantage of our method is that it requires no user interaction at all, whereas other methods require a user-defined trimap or special settings.

CONCLUSION

Figure 6.1. Limitations. (a) original frame, trimap, and alpha matte of Natan1 sequence; and, (b) original frame, trimap, and alpha matte of Natan2 sequence

The method fails in some cases: when the motion of the object is too small to obtain a reasonable motion map, or when the background and foreground colors are not distinctive. Figure 6.1 shows two examples of failed extraction. In Figure 6.1(a), the method failed to generate a correct trimap because the colors of the jeans are similar to those of the background.

Figure 6.1(b) shows an appropriate trimap of the person, but the boundaries in the resulting matte are blurry and inaccurate. If we take advantage of video information, such as motion and temporal coherence, and use it as additional constraints in spectral matting, we can achieve better results. In the future, we plan to extend the proposed method to video with a non-static background using global motion estimation.

Source: University of California
Author: Kyoung-Rok Lee
