Graph cuts in computer vision

As applied in the field of computer vision, graph cut optimization can be employed to efficiently solve a wide variety of low-level computer vision problems (early vision), such as image smoothing, the stereo correspondence problem, image segmentation, and many other computer vision problems that can be formulated in terms of energy minimization. Many of these energy minimization problems can be approximated by solving a maximum flow problem in a graph (and thus, by the max-flow min-cut theorem, define a minimal cut of the graph). Under most formulations of such problems in computer vision, the minimum energy solution corresponds to the maximum a posteriori estimate of a solution. Although many computer vision algorithms involve cutting a graph (e.g., normalized cuts), the term 'graph cuts' is applied specifically to those models which employ a max-flow/min-cut optimization (other graph cutting algorithms may be considered as graph partitioning algorithms). As applied in the field of computer vision, graph cut optimization can be employed to efficiently solve a wide variety of low-level computer vision problems (early vision), such as image smoothing, the stereo correspondence problem, image segmentation, and many other computer vision problems that can be formulated in terms of energy minimization. Many of these energy minimization problems can be approximated by solving a maximum flow problem in a graph (and thus, by the max-flow min-cut theorem, define a minimal cut of the graph). Under most formulations of such problems in computer vision, the minimum energy solution corresponds to the maximum a posteriori estimate of a solution. Although many computer vision algorithms involve cutting a graph (e.g., normalized cuts), the term 'graph cuts' is applied specifically to those models which employ a max-flow/min-cut optimization (other graph cutting algorithms may be considered as graph partitioning algorithms). 'Binary' problems (such as denoising a binary image) can be solved exactly using this approach; problems where pixels can be labeled with more than two different labels (such as stereo correspondence, or denoising of a grayscale image) cannot be solved exactly, but solutions produced are usually near the global optimum. The theory of graph cuts used as an optimization method was first applied in computer vision in the seminal paper by Greig, Porteous and Seheult of Durham University. In the Bayesian statistical context of smoothing noisy (or corrupted) images, they showed how the maximum a posteriori estimate of a binary image can be obtained exactly by maximizing the flow through an associated image network, involving the introduction of a source and sink. The problem was therefore shown to be efficiently solvable. Prior to this result, approximate techniques such as simulated annealing (as proposed by the Geman brothers), or iterated conditional modes (a type of greedy algorithm as suggested by Julian Besag) were used to solve such image smoothing problems. Although the general k {displaystyle k} -colour problem remains unsolved for k > 2 , {displaystyle k>2,} the approach of Greig, Porteous and Seheult has turned out to have wide applicability in general computer vision problems. Greig, Porteous and Seheult approaches are often applied iteratively to a sequence of binary problems, usually yielding near optimal solutions. In 2011, C. Couprie et al. proposed a general image segmentation framework, called the 'Power Watershed', that minimized a real-valued indicator function from over a graph, constrained by user seeds (or unary terms) set to 0 or 1, in which the minimization of the indicator function over the graph is optimized with respect to an exponent p {displaystyle p} . When p = 1 {displaystyle p=1} , the Power Watershed is optimized by graph cuts, when p = 0 {displaystyle p=0} the Power Watershed is optimized by shortest paths, p = 2 {displaystyle p=2} is optimized by the Random walker algorithm and p = ∞ {displaystyle p=infty } is optimized by the Watershed (image processing) algorithm. In this way, the Power Watershed may be viewed as a generalization of graph cuts that provides a straightforward connection with other energy optimization segmentation/clustering algorithms. P r ( x | S ) = K {displaystyle Pr(x|S)=K} ( − E ) {displaystyle (-E)} where the energy E {displaystyle E} is composed of 2 different models ( E c o l o r {displaystyle E_{ m {color}}} and E c o h e r e n c e {displaystyle E_{ m {coherence}}} ): E c o l o r {displaystyle E_{ m {color}}} — unary term describing the likelihood of each color. E c o h e r e n c e {displaystyle E_{ m {coherence}}} — binary term describing the coherence between neighborhood pixels. Graph cuts methods have become popular alternatives to the level set-based approaches for optimizing the location of a contour (see for an extensive comparison). However, graph cut approaches have been criticized in the literature for several issues:

Parent Topic

Child Topic

No Parent Topic