Post on 17-Jun-2020
Lecture 1 - !!!
Philipp Krähenbühl!
Q&A of GrabCut
Philipp Krähenbühl
6-May-13
Goal
• Bounding Box – provided
• Segmentation of object within bounding box
Overview
• Boykov & Jolly segmentation model
  – Segmentation using Graph Cuts
• GMM foreground and background model
  – Iterative optimization
    • Graph Cuts
    • GMM estimation
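$$E(\alpha, z) = \sum_n D(\alpha_n, z_n) + \gamma \sum_{n,m} w_{n,m} [\alpha_n \neq \alpha_m]$$
Data term with GMM parameters $\theta$: $D(\alpha_n, \theta, z_n)$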
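To make the two terms of the energy concrete, here is a minimal sketch that evaluates E for a given labelling on a 4-connected grid. The function name, the array layout (`unary`, `w_h`, `w_v`), and the choice to count each neighbouring pair once are assumptions made for illustration; the slide does not specify them.

```python
import numpy as np

def segmentation_energy(alpha, unary, w_h, w_v, gamma):
    """Evaluate E = sum_n D(alpha_n, z_n) + gamma * sum_{n,m} w_{n,m} [alpha_n != alpha_m].

    alpha : (H, W) integer array of labels, 0 = background, 1 = foreground
    unary : (H, W, 2) array, unary[y, x, a] is the data cost D for label a
    w_h   : (H, W-1) weights between horizontal neighbours (assumed layout)
    w_v   : (H-1, W) weights between vertical neighbours (assumed layout)
    """
    rows, cols = np.indices(alpha.shape)
    # Data term: pick the cost of the label actually assigned to each pixel.
    data = unary[rows, cols, alpha].sum()
    # Smoothness term: a pair contributes only where the two labels differ.
    cut_h = alpha[:, :-1] != alpha[:, 1:]
    cut_v = alpha[:-1, :] != alpha[1:, :]
    smooth = (w_h * cut_h).sum() + (w_v * cut_v).sum()
    return data + gamma * smooth

# Usage: a 2x2 image split into a background column and a foreground column.
alpha = np.array([[0, 1], [0, 1]])
unary = np.zeros((2, 2, 2))
unary[..., 0] = 1.0   # cost of labelling any pixel background
unary[..., 1] = 2.0   # cost of labelling any pixel foreground
energy = segmentation_energy(alpha, unary, np.ones((2, 1)), np.ones((1, 2)), gamma=0.5)
```

The Iverson bracket becomes a boolean mask, so only label changes between neighbours are penalised.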
Initialisation
• User initialises trimap $T$ by supplying only $T_B$. The foreground is set to $T_F = \emptyset$; $T_U = \overline{T}_B$, complement of the background.
• Initialise $\alpha_n = 0$ for $n \in T_B$ and $\alpha_n = 1$ for $n \in T_U$.
• Background and foreground GMMs initialised from sets $\alpha_n = 0$ and $\alpha_n = 1$ respectively.
Iterative minimisation
1. Assign GMM components to pixels: for each $n$ in $T_U$,
$$k_n := \arg\min_{k_n} D_n(\alpha_n, k_n, \theta, z_n).$$
2. Learn GMM parameters from data $z$:
$$\theta := \arg\min_{\theta} U(\alpha, k, \theta, z)$$
3. Estimate segmentation: use min cut to solve:
$$\min_{\{\alpha_n :\, n \in T_U\}} \min_k E(\alpha, k, \theta, z).$$
4. Repeat from step 1, until convergence.
5. Apply border matting (section 4).
User editing
• Edit: fix some pixels either to $\alpha_n = 0$ (background brush) or $\alpha_n = 1$ (foreground brush); update trimap $T$ accordingly. Perform step 3 above, just once.
• Refine operation: [optional] perform entire iterative minimisation algorithm.
Figure 3: Iterative image segmentation in GrabCut
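Step 3 of the figure above is the only step that needs a graph algorithm. As a rough illustration, the sketch below labels a 1-D chain of pixels by an s-t min cut, using a small Edmonds-Karp max-flow written from scratch. The function name `min_cut_labels`, the chain topology, and the cost arrays are assumptions for this example; a real implementation would work on the 2-D pixel grid with a dedicated max-flow solver.

```python
from collections import deque

def min_cut_labels(cost_bg, cost_fg, pairwise):
    """Minimise sum of unary costs plus pairwise penalties on a pixel chain.

    cost_bg[i]  : data cost of labelling pixel i background (alpha = 0)
    cost_fg[i]  : data cost of labelling pixel i foreground (alpha = 1)
    pairwise[i] : penalty paid if pixels i and i+1 receive different labels
    Returns a list of labels, 1 for pixels left on the source side.
    """
    n = len(cost_bg)
    s, t = n, n + 1
    cap = [[0.0] * (n + 2) for _ in range(n + 2)]
    for i in range(n):
        cap[s][i] = cost_bg[i]  # cut when i ends up background
        cap[i][t] = cost_fg[i]  # cut when i ends up foreground
    for i in range(n - 1):
        cap[i][i + 1] = cap[i + 1][i] = pairwise[i]

    while True:
        # Breadth-first search for an augmenting path in the residual graph.
        parent = [-1] * (n + 2)
        parent[s] = s
        queue = deque([s])
        while queue and parent[t] == -1:
            u = queue.popleft()
            for v in range(n + 2):
                if parent[v] == -1 and cap[u][v] > 1e-12:
                    parent[v] = u
                    queue.append(v)
        if parent[t] == -1:
            break  # no path left: the flow is maximal
        # Find the bottleneck capacity along the path, then push that much flow.
        bottleneck = float("inf")
        v = t
        while v != s:
            bottleneck = min(bottleneck, cap[parent[v]][v])
            v = parent[v]
        v = t
        while v != s:
            u = parent[v]
            cap[u][v] -= bottleneck
            cap[v][u] += bottleneck
            v = u
    # Pixels still reachable from the source in the last search are foreground.
    return [1 if parent[i] != -1 else 0 for i in range(n)]

labels = min_cut_labels([5, 5, 1, 0], [0, 1, 5, 5], [1, 1, 1])
```

The source side is treated as foreground here, so the edge to the sink carries the foreground cost and the edge from the source carries the background cost.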
[Figure 4 panels: (a) energy E against iteration number, 1 to 12; (b), (c) GMM components plotted in the red-green plane]
Figure 4: Convergence of iterative minimization for the data of fig. 2f. (a) The energy E for the llama example converges over 12 iterations. The GMM in RGB colour space (side-view showing R, G) at initialization (b) and after convergence (c). K = 5 mixture components were used for both background (red) and foreground (blue). Initially (b) both GMMs overlap considerably, but are better separated after convergence (c), as the foreground/background labelling has become accurate.
3.3 User Interaction and incomplete trimaps
Incomplete trimaps. The iterative minimisation algorithm allows increased versatility of user interaction. In particular, incomplete labelling becomes feasible where, in place of the full trimap $T$, the user needs only specify, say, the background region $T_B$, leaving $T_F = \emptyset$. No hard foreground labelling is done at all. Iterative minimisation (fig. 3) deals with this incompleteness by allowing provisional labels on some pixels (in the foreground) which can subsequently be retracted; only the background labels $T_B$ are taken to be firm, guaranteed not to be retracted later. (Of course a complementary scheme, with firm labels for the foreground only, is also a possibility.) In our implementation, the initial $T_B$ is determined by the user as a strip of pixels around the outside of the marked rectangle (marked in red in fig. 2f).
[Figure 5 panels: automatic segmentation before and after user interaction]
Figure 5: User editing. After the initial user interaction and segmentation (top row), further user edits (fig. 3) are necessary. Marking roughly with a foreground brush (white) and a background brush (red) is sufficient to obtain the desired result (bottom row).
Further user editing. The initial, incomplete user-labelling is often sufficient to allow the entire segmentation to be completed automatically, but by no means always. If not, further user editing is needed [Boykov and Jolly 2001], as shown in fig. 5. It takes the form of brushing pixels, constraining them either to be firm foreground or firm background; then the minimisation step 3 in fig. 3 is applied. Note that it is sufficient to brush, roughly, just part of a wrongly labeled area. In addition, the optional “refine” operation of fig. 3 updates the colour models, following user edits. This propagates the effect of edit operations which is frequently beneficial. Note that for efficiency the optimal flow, computed by Graph Cut, can be re-used during user edits.
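A brush stroke, as described above, amounts to moving pixels from the undecided set into a firm set and pinning their labels. The sketch below shows one possible bookkeeping for that; the integer codes `BG`, `FG`, `UNKNOWN` and the function names are hypothetical, chosen only for this example.

```python
import numpy as np

# Hypothetical integer codes for the three trimap regions T_B, T_F, T_U.
BG, FG, UNKNOWN = 0, 1, 2

def apply_brush(trimap, alpha, stroke_mask, foreground):
    """Fix the brushed pixels to firm foreground or firm background.

    trimap      : (H, W) array of BG / FG / UNKNOWN codes
    alpha       : (H, W) array of current 0/1 labels
    stroke_mask : (H, W) boolean array, True where the user brushed
    foreground  : True for the foreground brush, False for the background brush
    Returns updated copies; the inputs are left untouched.
    """
    trimap = trimap.copy()
    alpha = alpha.copy()
    trimap[stroke_mask] = FG if foreground else BG
    alpha[stroke_mask] = 1 if foreground else 0
    return trimap, alpha

def free_pixels(trimap):
    """Pixels whose label the min cut step is still allowed to change."""
    return trimap == UNKNOWN

# Usage: one background stroke on the top-left pixel of a 3x3 image.
trimap = np.full((3, 3), UNKNOWN)
alpha = np.ones((3, 3), dtype=int)
stroke = np.zeros((3, 3), dtype=bool)
stroke[0, 0] = True
new_trimap, new_alpha = apply_brush(trimap, alpha, stroke, foreground=False)
```

After such an update only step 3 needs to be re-run, restricted to the pixels returned by `free_pixels`.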
4 Transparency
Given that a matting tool should be able to produce continuous alpha values, we now describe a mechanism by which hard segmentation, as described above, can be augmented by “border matting”, in which full transparency is allowed in a narrow strip around the hard segmentation boundary. This is sufficient to deal with the problem of matting in the presence of blur and mixed pixels along smooth object boundaries. The technical issues are: estimating an alpha-map for the strip without generating artefacts, and recovering the foreground colour, free of colour bleeding from the background.
4.1 Border Matting
Border matting begins with a closed contour $C$, obtained by fitting a polyline to the segmentation boundary from the iterative hard segmentation of the previous section. A new trimap $\{T_B, T_U, T_F\}$ is computed, in which $T_U$ is the set of pixels in a ribbon of width $\pm w$ pixels either side of $C$ (we use $w = 6$). The goal is to compute the map $\alpha_n$, $n \in T_U$, and in order to do this robustly, a strong model is assumed for the shape of the α-profile within $T_U$. The form of the model is based on [Mortensen and Barrett 1999] but with two important additions: regularisation to enhance the quality of the estimated α-map; and a dynamic programming (DP) algorithm for estimating α throughout $T_U$.
Let $t = 1, \ldots, T$ be a parameterization of contour $C$, periodic with period $T$, as curve $C$ is closed. An index $t(n)$ is assigned to each pixel $n \in T_U$, as in fig. 6(c). The α-profile is taken to be a soft step-function $g$ (fig. 6c): $\alpha_n = g(r_n; \Delta_{t(n)}, \sigma_{t(n)})$, where $r_n$ is a signed distance from pixel $n$ to contour $C$. Parameters $\Delta, \sigma$ determine the centre and width respectively of the transition from 0 to 1 in the
Project 2
Not required
Optional
Notation
• Trimaps
  – Sets of pixels
  – $T_B$: background
  – $T_F$: foreground
  – $T_U$: undecided
• Segmentation
  – $\alpha_i$ for each pixel $i$
Initialization
• Outside
  – Background (fixed)
  – $\alpha_n = 0$
• Inside
  – Initial foreground
  – $\alpha_n = 1$
  – updated
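The initialization above can be sketched in a few lines of NumPy. The function name and the `(top, left, bottom, right)` box convention with exclusive bottom and right edges are assumptions for this example.

```python
import numpy as np

def init_from_box(shape, box):
    """Build the initial labelling from a user-supplied bounding box.

    shape : (H, W) of the image
    box   : (top, left, bottom, right), bottom and right exclusive (assumed convention)
    Returns alpha (1 inside the box, 0 outside) and a boolean mask of the
    fixed background pixels, which later iterations must not relabel.
    """
    top, left, bottom, right = box
    alpha = np.zeros(shape, dtype=np.uint8)
    alpha[top:bottom, left:right] = 1   # inside: initial foreground, may be updated
    fixed_background = alpha == 0       # outside: background, stays fixed
    return alpha, fixed_background

# Usage: a 4x5 image with a 2x3 box.
alpha, fixed_background = init_from_box((4, 5), (1, 1, 3, 4))
```

Only pixels inside the box can change label in later iterations; the mask records which pixels stay pinned to background.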
GMM Initialization
• Find parameters $\theta$
• Standard method
  – random + EM
• Re-estimate $\theta$ in step 2
  – Initialization only needed for step 1
  – Trick: Initialize $k$ instead
GMM Initialization
• Trick: Initialize $k$
  – k-means clustering
  – skip step 1 in first iteration
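One way to realise this trick is a plain Lloyd-style k-means over the pixel colours of one region (foreground or background), whose cluster indices then serve as the initial component assignment k. The sketch below is a minimal version under stated assumptions: the function name, the fixed iteration count, and the deterministic seeding from brightness-sorted pixels are illustrative choices, not taken from the slides.

```python
import numpy as np

def kmeans_assign(pixels, K, n_iter=10):
    """Cluster colours with k-means and return one component index per pixel.

    pixels : (N, 3) array of colours from a single region (FG or BG)
    K      : number of GMM components
    """
    pixels = np.asarray(pixels, dtype=float)
    # Assumed seeding: take K pixels evenly spaced in brightness order.
    order = np.argsort(pixels.sum(axis=1), kind="stable")
    picks = np.linspace(0, len(pixels) - 1, K).round().astype(int)
    centres = pixels[order[picks]].copy()
    for _ in range(n_iter):
        # Squared distance from every pixel to every centre, shape (N, K).
        d2 = ((pixels[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
        k = d2.argmin(axis=1)
        for j in range(K):
            if np.any(k == j):  # leave an empty cluster's centre where it is
                centres[j] = pixels[k == j].mean(axis=0)
    return k

# Usage: three dark and three bright pixels, two components.
colours = [[0, 0, 0], [1, 0, 0], [0, 1, 0],
           [100, 100, 100], [101, 100, 100], [100, 101, 100]]
k = kmeans_assign(colours, K=2)
```

With k initialised this way, the first iteration can go straight to learning the GMM parameters, which is why step 1 is skipped once.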
GMM assignment
• Gaussian log prob.
• Enumerate all $k_n$
• Each pixel already assigned to FG or BG ($\alpha_n$ fixed)
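$$D_n(\alpha_n, k_n, \theta, z_n) = -\log \pi(\alpha_n, k_n) + \tfrac{1}{2} \log\det\Sigma(\alpha_n, k_n) + \tfrac{1}{2}\,[z_n - \mu(\alpha_n, k_n)]^{T}\,\Sigma(\alpha_n, k_n)^{-1}\,[z_n - \mu(\alpha_n, k_n)]$$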
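As a rough illustration of the data term above and of step 1 in Figure 3 (each pixel picks its cheapest component), the following C++ sketch evaluates D_n for a single 3-channel pixel. The names `Component`, `dataTerm` and `assignComponent` are hypothetical and not part of the lecture code, and the covariance is assumed to be non-singular.

```cpp
#include <array>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

using Vec3 = std::array<double, 3>;
using Mat3 = std::array<std::array<double, 3>, 3>;

// One Gaussian component (hypothetical layout): weight pi, mean mu, covariance sigma.
struct Component {
    double pi;
    Vec3 mu;
    Mat3 sigma;
};

// Determinant of a 3x3 matrix by cofactor expansion along the first row.
double det3(const Mat3& m) {
    return m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
         - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
         + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]);
}

// Inverse of a 3x3 matrix via the adjugate; assumes det3(m) != 0.
Mat3 inv3(const Mat3& m) {
    const double d = det3(m);
    Mat3 r;
    r[0][0] =  (m[1][1] * m[2][2] - m[1][2] * m[2][1]) / d;
    r[0][1] = -(m[0][1] * m[2][2] - m[0][2] * m[2][1]) / d;
    r[0][2] =  (m[0][1] * m[1][2] - m[0][2] * m[1][1]) / d;
    r[1][0] = -(m[1][0] * m[2][2] - m[1][2] * m[2][0]) / d;
    r[1][1] =  (m[0][0] * m[2][2] - m[0][2] * m[2][0]) / d;
    r[1][2] = -(m[0][0] * m[1][2] - m[0][2] * m[1][0]) / d;
    r[2][0] =  (m[1][0] * m[2][1] - m[1][1] * m[2][0]) / d;
    r[2][1] = -(m[0][0] * m[2][1] - m[0][1] * m[2][0]) / d;
    r[2][2] =  (m[0][0] * m[1][1] - m[0][1] * m[1][0]) / d;
    return r;
}

// D_n = -log(pi) + 0.5 * log det(Sigma) + 0.5 * (z - mu)^T Sigma^{-1} (z - mu).
double dataTerm(const Component& c, const Vec3& z) {
    const Mat3 inv = inv3(c.sigma);
    const Vec3 d = {z[0] - c.mu[0], z[1] - c.mu[1], z[2] - c.mu[2]};
    double maha = 0.0;  // squared Mahalanobis distance
    for (std::size_t i = 0; i < 3; ++i)
        for (std::size_t j = 0; j < 3; ++j)
            maha += d[i] * inv[i][j] * d[j];
    return -std::log(c.pi) + 0.5 * std::log(det3(c.sigma)) + 0.5 * maha;
}

// k_n := argmin_k D_n(alpha_n, k, theta, z_n) over the components of one GMM.
std::size_t assignComponent(const std::vector<Component>& gmm, const Vec3& z) {
    std::size_t best = 0;
    double bestCost = std::numeric_limits<double>::infinity();
    for (std::size_t k = 0; k < gmm.size(); ++k) {
        const double cost = dataTerm(gmm[k], z);
        if (cost < bestCost) {
            bestCost = cost;
            best = k;
        }
    }
    return best;
}
```

In practice a small multiple of the identity is often added to each covariance before inverting it, so that components with few pixels do not become singular; that step is left out here.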
GMM learning
• Mixture parameters
• F(k): set of foreground (FG) pixels assigned to component k
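$$\pi(\alpha = 1, k) = \frac{|F(k)|}{\sum_{k'} |F(k')|}$$
$$\mu(\alpha = 1, k) = \operatorname{mean}_{n \in F(k)}\, z_n$$
$$\Sigma(\alpha = 1, k) = \operatorname{cov}_{n \in F(k)}\, z_n$$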
Segmentation
• Find the segmentation α that minimizes the energy E
– Reduces to Boykov & Jolly
– Solved using GraphCut
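$$\min_k E(\alpha, k, \theta, z) = \sum_n \underbrace{\min_{k_n} D(\alpha_n, k_n, \theta, z_n)}_{D(\alpha_n, \theta, z_n)} + \gamma \sum_{n,m} w_{n,m}\,[\alpha_n \neq \alpha_m]$$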
Use energy.h and maxflow.cpp
// initialization
std::vector<Energy::Var> vars(N);
Energy e;

// add a node
vars[i] = e.add_variable();

// add the unary term for a node
e.add_term1(vars[i], u0, u1);
// add the pairwise term for an edge
e.add_term2(vars[i], vars[j], p00, p01, p10, p11);

// perform energy minimization
Energy::TotalValue mnE = e.minimize();

// get new labels
if (e.get_var(vars[i])) label[i] = 1;
else label[i] = 0;
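$$\min_k E(\alpha, k, \theta, z) = \sum_n D(\alpha_n, \theta, z_n) + \gamma \sum_{n,m} w_{n,m}\,[\alpha_n \neq \alpha_m]$$
Unary terms for pixel n: u0 = D(0, θ, z_n), u1 = D(1, θ, z_n)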
Use energy.h and maxflow.cpp
// initialization
std::vector<Energy::Var> vars(N);
Energy e;

// add a node
vars[i] = e.add_variable();

// add the unary term for a node
e.add_term1(vars[i], u0, u1);
// add the pairwise term for an edge
e.add_term2(vars[i], vars[j], p00, p01, p10, p11);

// perform energy minimization
Energy::TotalValue mnE = e.minimize();

// get new labels (e is an object, not a pointer, so use '.')
if (e.get_var(vars[i])) label[i] = 1;
else label[i] = 0;
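$$\min_k E(\alpha, k, \theta, z) = \sum_n D(\alpha_n, \theta, z_n) + \gamma \sum_{n,m} w_{n,m}\,[\alpha_n \neq \alpha_m]$$
Pairwise terms for an edge: p00, p01, p10, p11 = ?, ?, ?, ?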
Use energy.h and maxflow.cpp
// initialization
std::vector<Energy::Var> vars(N);
Energy e;

// add a node
vars[i] = e.add_variable();

// add the unary term for a node
e.add_term1(vars[i], u0, u1);
// add the pairwise term for an edge: cost vij only when the two labels differ
e.add_term2(vars[i], vars[j], 0, vij, vij, 0);

// perform energy minimization
Energy::TotalValue mnE = e.minimize();

// get new labels (e is an object, not a pointer, so use '.')
if (e.get_var(vars[i])) label[i] = 1;
else label[i] = 0;
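The snippet above adds one pairwise term per edge (i, j) but does not say which pixel pairs form edges. A minimal sketch, assuming a row-major image and 4-connectivity (an assumption made here for brevity; the GrabCut paper itself uses 8-way connectivity), could enumerate the pairs like this. `gridEdges` is a hypothetical helper, not part of energy.h.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// All 4-connected neighbour pairs (i, j) of a width x height image stored
// row-major, where pixel (x, y) has index i = y * width + x.
// Each undirected pair is listed exactly once, with i < j.
std::vector<std::pair<std::size_t, std::size_t>> gridEdges(std::size_t width,
                                                           std::size_t height) {
    std::vector<std::pair<std::size_t, std::size_t>> edges;
    for (std::size_t y = 0; y < height; ++y) {
        for (std::size_t x = 0; x < width; ++x) {
            const std::size_t i = y * width + x;
            if (x + 1 < width) edges.emplace_back(i, i + 1);       // right neighbour
            if (y + 1 < height) edges.emplace_back(i, i + width);  // bottom neighbour
        }
    }
    return edges;
}
```

Each returned pair would then receive one `add_term2` call with the weight vij for that pair.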
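$$\min_k E(\alpha, k, \theta, z) = \sum_n D(\alpha_n, \theta, z_n) + \gamma \sum_{n,m} w_{n,m}\,[\alpha_n \neq \alpha_m]$$
$$v_{ij} = \gamma \exp\left(-\beta\,\|z_i - z_j\|^2\right)$$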
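The edge weight can be computed as in the following C++ sketch. The slide does not say how β is chosen; the sketch assumes the GrabCut paper's choice, β = 1 / (2⟨‖z_i − z_j‖²⟩), where the average runs over all neighbouring pairs. The names `estimateBeta` and `edgeWeights` are hypothetical, and γ is left as a parameter.

```cpp
#include <array>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

using Vec3 = std::array<double, 3>;
using EdgeList = std::vector<std::pair<std::size_t, std::size_t>>;

// Squared Euclidean distance between two colours.
double sqDist(const Vec3& a, const Vec3& b) {
    double s = 0.0;
    for (std::size_t c = 0; c < 3; ++c) s += (a[c] - b[c]) * (a[c] - b[c]);
    return s;
}

// beta = 1 / (2 * mean squared colour difference over all edges).
// Returns 0 for an empty edge list or a perfectly flat image (assumed fallback).
double estimateBeta(const std::vector<Vec3>& z, const EdgeList& edges) {
    if (edges.empty()) return 0.0;
    double sum = 0.0;
    for (const auto& e : edges) sum += sqDist(z[e.first], z[e.second]);
    const double mean = sum / static_cast<double>(edges.size());
    return mean > 0.0 ? 1.0 / (2.0 * mean) : 0.0;
}

// v_ij = gamma * exp(-beta * ||z_i - z_j||^2), one weight per edge.
std::vector<double> edgeWeights(const std::vector<Vec3>& z, const EdgeList& edges,
                                double gamma, double beta) {
    std::vector<double> v;
    v.reserve(edges.size());
    for (const auto& e : edges)
        v.push_back(gamma * std::exp(-beta * sqDist(z[e.first], z[e.second])));
    return v;
}
```

With this choice, edges across strong colour contrast receive a small weight and are cheap to cut, while edges inside uniform regions cost close to γ.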
Lecture 1 - !!!
Philipp Krähenbühl! 6-‐May-‐13 16
Initialisation• User initialises trimap T by supplying only TB. The fore-ground is set to TF = /0; TU = TB, complement of the back-ground.
• Initialise αn = 0 for n ! TB and αn = 1 for n ! TU .• Background and foreground GMMs initialised from setsαn = 0 and αn = 1 respectively.
Iterative minimisation1. Assign GMM components to pixels: for each n in TU ,
kn := argminkn
Dn(αn,kn,θ ,zn).
2. Learn GMM parameters from data z:θ := argmin
θU(α,k,θ ,z)
3. Estimate segmentation: use min cut to solve:min
{αn: n!TU}minkE(α,k,θ ,z).
4. Repeat from step 1, until convergence.
5. Apply border matting (section 4).
User editing• Edit: fix some pixels either to αn = 0 (background brush)or αn = 1 (foreground brush); update trimap T accord-ingly. Perform step 3 above, just once.
• Refine operation: [optional] perform entire iterative min-imisation algorithm.
Figure 3: Iterative image segmentation in GrabCut
1 4 8 12
Energy E
RED
GREE
N
RED
GREE
N
(a) (b) (c)Figure 4: Convergence of iterative minimization for the data offig. 2f. (a) The energy E for the llama example converges over12 iterations. The GMM in RGB colour space (side-view showingR,G) at initialization (b) and after convergence (c). K = 5 mixturecomponents were used for both background (red) and foreground(blue). Initially (b) both GMMs overlap considerably, but are bet-ter separated after convergence (c), as the foreground/backgroundlabelling has become accurate.
3.3 User Interaction and incomplete trimaps
Incomplete trimaps. The iterative minimisation algorithm al-lows increased versatility of user interaction. In particular, incom-plete labelling becomes feasible where, in place of the full trimapT , the user needs only specify, say, the background region TB, leav-ing TF = 0. No hard foreground labelling is done at all. Iterativeminimisation (fig. 3) deals with this incompleteness by allowingprovisional labels on some pixels (in the foreground) which cansubsequently be retracted; only the background labels TB are takento be firm— guaranteed not to be retracted later. (Of course a com-plementary scheme, with firm labels for the foreground only, is alsoa possibility.) In our implementation, the initial TB is determined bythe user as a strip of pixels around the outside of the marked rect-angle (marked in red in fig. 2f).
Automatic�
Segmentation�
Automatic�
Segmentation�
U�s�e�r�
I�n�t�e�r�a�c�t�i�o�n�
Figure 5: User editing. After the initial user interaction and seg-mentation (top row), further user edits (fig. 3) are necessary. Mark-ing roughly with a foreground brush (white) and a backgroundbrush (red) is sufficient to obtain the desired result (bottom row).
Further user editing. The initial, incomplete user-labelling is of-ten sufficient to allow the entire segmentation to be completed au-tomatically, but by no means always. If not, further user editingis needed [Boykov and Jolly 2001], as shown in fig.5. It takes theform of brushing pixels, constraining them either to be firm fore-ground or firm background; then the minimisation step 3. in fig. 3is applied. Note that it is sufficient to brush, roughly, just part of awrongly labeled area. In addition, the optional “refine” operation offig. 3 updates the colour models, following user edits. This prop-agates the effect of edit operations which is frequently beneficial.Note that for efficiency the optimal flow, computed by Graph Cut,can be re-used during user edits.
4 Transparency
Given that a matting tool should be able to produce continuous al-pha values, we now describe a mechanism by which hard segmenta-tion, as described above, can be augmented by “border matting”, inwhich full transparency is allowed in a narrow strip around the hardsegmentation boundary. This is sufficient to deal with the problemof matting in the presence of blur and mixed pixels along smoothobject boundaries. The technical issues are: Estimating an alpha-map for the strip without generating artefacts, and recovering theforeground colour, free of colour bleeding from the background.
4.1 Border Matting
Border matting begins with a closed contour $C$, obtained by fitting a polyline to the segmentation boundary from the iterative hard segmentation of the previous section. A new trimap $\{T_B, T_U, T_F\}$ is computed, in which $T_U$ is the set of pixels in a ribbon of width $\pm w$ pixels either side of $C$ (we use $w = 6$). The goal is to compute the map $\alpha_n$, $n \in T_U$, and in order to do this robustly, a strong model is assumed for the shape of the $\alpha$-profile within $T_U$. The form of the model is based on [Mortensen and Barrett 1999] but with two important additions: regularisation to enhance the quality of the estimated $\alpha$-map; and a dynamic programming (DP) algorithm for estimating $\alpha$ throughout $T_U$.
Let $t = 1, \ldots, T$ be a parameterization of contour $C$, periodic with period $T$, as curve $C$ is closed. An index $t(n)$ is assigned to each pixel $n \in T_U$, as in fig. 6(c). The $\alpha$-profile is taken to be a soft step-function $g$ (fig. 6c): $\alpha_n = g\left(r_n; \Delta_{t(n)}, \sigma_{t(n)}\right)$, where $r_n$ is a signed distance from pixel $n$ to contour $C$. Parameters $\Delta, \sigma$ determine the centre and width respectively of the transition from 0 to 1 in the $\alpha$-profile.
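The excerpt does not give a closed form for the soft step-function. One common choice is a logistic sigmoid, which the Python sketch below uses purely as an assumption; `soft_step`, `delta` and `sigma` are illustrative names for the profile function and its centre and width parameters.

```python
import math

def soft_step(r, delta, sigma):
    """Soft step from 0 to 1 centred at delta, with width controlled by sigma.

    r is the signed distance of a pixel from the contour C.
    The logistic form is an assumed stand-in for the profile g.
    """
    if sigma <= 0:
        # Degenerate case: zero width gives a hard step at r = delta.
        return 1.0 if r >= delta else 0.0
    return 1.0 / (1.0 + math.exp(-(r - delta) / sigma))

# Alpha profile across the ribbon |r| <= w with w = 6, as in the text.
w = 6
profile = [soft_step(r, delta=0.0, sigma=1.5) for r in range(-w, w + 1)]
```

Shifting `delta` moves the transition along the signed distance axis, and a larger `sigma` spreads it over more of the ribbon.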
$v_{ij} = \gamma \exp\left(-\beta \,\|z_i - z_j\|^2\right)$
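To make the pairwise weight concrete, here is a small Python sketch that evaluates it for two RGB pixels. The function name and the numeric values of `gamma` and `beta` are placeholders chosen for illustration, not values taken from the slides.

```python
import math

def pairwise_weight(z_i, z_j, gamma, beta):
    """Return gamma * exp(-beta * ||z_i - z_j||^2) for two colour vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(z_i, z_j))
    return gamma * math.exp(-beta * sq_dist)

# Illustrative parameter values (assumptions for this example only).
gamma, beta = 50.0, 0.001

same = pairwise_weight((120, 80, 40), (120, 80, 40), gamma, beta)     # identical colours
similar = pairwise_weight((120, 80, 40), (125, 82, 41), gamma, beta)  # small difference
edge = pairwise_weight((120, 80, 40), (20, 200, 220), gamma, beta)    # strong contrast
```

Identical colours give the full weight `gamma`, and the weight decays towards zero as contrast grows, so cutting between strongly contrasting neighbours is cheap.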
Optimizations
• Use MEX files for Graph Cut
  – Call C/C++ code from MATLAB
  – Compile: “mex a.cpp b.cpp c.cpp …”
• Vectorize
  – f: N×3 matrix of RGB color values
  – a: N-dimensional binary vector (segmentation)
  – f(a==1,:): foreground features
  – f(a==0,:): background features
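The two indexing expressions above select rows of `f` by the labels in `a`. For readers working outside MATLAB, a rough Python analogue is sketched below with plain lists (NumPy boolean indexing, `f[a == 1]`, would be the closer equivalent). The helper name `split_features` is hypothetical.

```python
def split_features(f, a):
    """Split N colour rows into foreground (a == 1) and background (a == 0)."""
    if len(f) != len(a):
        raise ValueError("f and a must have the same number of rows")
    foreground = [row for row, label in zip(f, a) if label == 1]
    background = [row for row, label in zip(f, a) if label == 0]
    return foreground, background

# f: N x 3 list of RGB values, a: binary segmentation with one entry per row.
f = [(200, 30, 30), (10, 10, 10), (190, 40, 35), (12, 9, 14)]
a = [1, 0, 1, 0]
fg, bg = split_features(f, a)
```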
Implementation Questions?
Extensions
• User interaction or border matting
• Play with GMMs
  – Vary number of components
  – Different initialization
  – Different color space (Lab)
• Different color model
  – Histogram-based model
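As one way to picture the histogram-based alternative to GMMs, the Python sketch below quantises RGB values into coarse bins and uses the negative log of the smoothed bin frequency as a data term. The bin count, the add-one smoothing and all function names are assumptions for illustration.

```python
import math
from collections import Counter

def bin_index(rgb, bins_per_channel=8):
    """Map an 8-bit RGB triple to a coarse histogram bin."""
    step = 256 // bins_per_channel
    return tuple(c // step for c in rgb)

def fit_histogram(pixels, bins_per_channel=8):
    """Count how many pixels fall into each colour bin."""
    return Counter(bin_index(p, bins_per_channel) for p in pixels)

def data_term(rgb, hist, bins_per_channel=8):
    """Negative log-probability of a colour under a smoothed histogram."""
    total_bins = bins_per_channel ** 3
    total = sum(hist.values())
    # Add-one smoothing keeps unseen colours at a finite cost.
    prob = (hist[bin_index(rgb, bins_per_channel)] + 1) / (total + total_bins)
    return -math.log(prob)

fg_hist = fit_histogram([(200, 30, 30), (190, 40, 35), (210, 25, 20)])
bg_hist = fit_histogram([(10, 10, 10), (12, 9, 14), (8, 15, 11)])

reddish = (205, 28, 20)
cost_fg = data_term(reddish, fg_hist)  # cost of labelling it foreground
cost_bg = data_term(reddish, bg_hist)  # cost of labelling it background
```

With these toy samples the reddish pixel is cheaper to label as foreground than as background, which is the role the data term plays in the energy.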
Extensions
• Different Neighborhood System
  – 4-connected
  – 8-connected
  – Fully connected (DenseCRF)
    • Efficient Inference in Fully Connected CRFs with Gaussian Edge Potentials [Krähenbühl and Koltun 2011]
• Different “affinity”
  – Lab color difference
  – Contour detector gPb
    • Contour Detection and Hierarchical Image Segmentation [Arbelaez et al. 2010]
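The 4- and 8-connected options differ only in which pixel pairs receive a pairwise edge. A short Python sketch of the edge enumeration follows; `grid_edges` and its arguments are hypothetical names, and the fully connected case is omitted here.

```python
def grid_edges(height, width, connectivity=4):
    """List undirected edges ((y, x), (y2, x2)) of a pixel grid."""
    if connectivity not in (4, 8):
        raise ValueError("connectivity must be 4 or 8")
    # Offsets to "forward" neighbours only, so each edge is listed once.
    offsets = [(0, 1), (1, 0)]
    if connectivity == 8:
        offsets += [(1, 1), (1, -1)]
    edges = []
    for y in range(height):
        for x in range(width):
            for dy, dx in offsets:
                y2, x2 = y + dy, x + dx
                if 0 <= y2 < height and 0 <= x2 < width:
                    edges.append(((y, x), (y2, x2)))
    return edges

edges4 = grid_edges(3, 3, connectivity=4)
edges8 = grid_edges(3, 3, connectivity=8)
```

The 8-connected list contains every 4-connected edge plus the diagonals, so switching neighbourhood systems only changes the edge list handed to the graph cut.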
Questions?