Graph cut using Matlab


Problem Description




I am trying to apply the graph cut method to my segmentation task. I found some example code at Graph_Cut_Demo.

Part of the code is shown below:

img = im2double( imread([ImageDir 'cat.jpg']) ); 
[ny,nx,nc] = size(img); 
d = reshape( img, ny*nx, nc );  
k = 2; % number of clusters 
[l0 c] = kmeans( d, k ); 
l0 = reshape( l0, ny, nx ); 

% For each class, the data term Dc measures the distance of 
% each pixel value to the class prototype. For simplicity, standard 
% Euclidean distance is used. Mahalanobis distance (weighted by class 
% covariances) might improve the results in some cases.  Note that the 
% image intensity values are in the [0,1] interval, which provides 
% normalization.   

Dc = zeros( ny, nx, k ); 
for i = 1:k 
  dif = d - repmat( c(i,:), ny*nx,1 ); 
  Dc(:,:,i) = reshape( sum(dif.^2,2), ny, nx ); 
end 

It seems that the method uses k-means clustering to initialise the graph and obtain the data term Dc. However, I don't understand how they calculate this data term. Why do they use

dif = d - repmat( c(i,:), ny*nx,1 ); 

In the comments they say the data term Dc measures the distance of each pixel value to the class prototype. What is the class prototype, and why can it be determined by the k-means labels?

In another implementation, Graph_Cut_Demo2, they use

% calculate the data cost per cluster center
Dc = zeros([sz(1:2) k],'single');
for ci=1:k
    % use covariance matrix per cluster
    icv = inv(cov(d(l0==ci,:)));    
    dif = d- repmat(c(ci,:), [size(d,1) 1]);
    % data cost is minus log likelihood of the pixel to belong to each
    % cluster according to its RGB value
    Dc(:,:,ci) = reshape(sum((dif*icv).*dif./2,2),sz(1:2));
end

This confused me a lot. Why do they calculate the covariance matrix, and how do they form the data term using the minus log likelihood? Are there any papers or descriptions available for these implementations?

Thanks a lot.

Solution

Both graph-cut segmentation examples are strongly related. The authors of the book Image Processing, Analysis, and Machine Vision: A MATLAB Companion (the first example) used the graph-cut wrapper code of Shai Bagon (with the author's permission, naturally), which is the second example.

So, what is the data term anyway?
The data term represents how likely each pixel, taken independently, is to belong to each label. This is why log-likelihood terms are used.

More concretely, in these examples you try to segment the image into k segments based on their colors. You assume there are only k dominant colors in the image (not a very practical assumption, but sufficient for educational purposes). Using k-means you try to find these k colors. The output of k-means is k centers in RGB space, that is, k "representative" colors. The likelihood that a pixel belongs to the i-th center is inversely proportional to the distance (in color space) of the pixel from that representative center: the larger the distance, the less likely the pixel belongs to the i-th center, and the higher the unary energy penalty one must "pay" to assign this pixel to the i-th cluster.
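As an illustration, the first demo's Euclidean data term can be sketched in NumPy (Python used here instead of the original MATLAB; `pixels` and `centers` are hypothetical stand-ins for the demo's `d` and `c`):

```python
import numpy as np

def euclidean_data_term(pixels, centers):
    """Unary cost: squared Euclidean distance of each pixel to each center.

    pixels  : (n, 3) array of RGB values in [0, 1]
    centers : (k, 3) array of k-means cluster centers
    returns : (n, k) array; entry (p, i) is the cost of giving pixel p label i
    """
    # Broadcast: (n, 1, 3) - (1, k, 3) -> (n, k, 3)
    dif = pixels[:, None, :] - centers[None, :, :]
    return np.sum(dif ** 2, axis=2)

# Toy example: two pixels, two "dominant colors"
pixels = np.array([[0.9, 0.1, 0.1], [0.1, 0.1, 0.9]])
centers = np.array([[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
Dc = euclidean_data_term(pixels, centers)
# Dc ~ [[0.03, 1.63], [1.63, 0.03]]: each pixel is cheapest at its nearest center
```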
The second example takes this notion one step further and assumes that the k clusters may have different densities in color space, modeling this second-order behavior with a covariance matrix per cluster. The data cost then becomes the (halved, squared) Mahalanobis distance 0.5 * (x - c_i)^T * inv(cov_i) * (x - c_i), which is exactly the minus log likelihood of a Gaussian with mean c_i and covariance cov_i, up to an additive constant.
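The second demo's per-cluster Mahalanobis cost can likewise be sketched in NumPy (again Python rather than the original MATLAB; the function name and arguments are made up for illustration):

```python
import numpy as np

def mahalanobis_data_term(pixels, centers, labels):
    """Unary cost: 0.5 * (x - c_i)^T * inv(cov_i) * (x - c_i) per cluster i,
    mirroring the Dc loop of the second demo (names are illustrative).

    pixels  : (n, 3) array of RGB values
    centers : (k, 3) cluster centers from k-means
    labels  : (n,) initial k-means label per pixel, used to estimate cov_i
    """
    n, k = pixels.shape[0], centers.shape[0]
    Dc = np.zeros((n, k))
    for i in range(k):
        # Covariance of the pixels initially assigned to cluster i
        icv = np.linalg.inv(np.cov(pixels[labels == i], rowvar=False))
        dif = pixels - centers[i]
        # Quadratic form = minus log likelihood of a Gaussian, dropping the
        # constant log-det term (as the original demo does)
        Dc[:, i] = 0.5 * np.sum((dif @ icv) * dif, axis=1)
    return Dc

# Toy usage: two well-separated color clusters
rng = np.random.default_rng(0)
p0 = rng.normal([0.8, 0.1, 0.1], 0.05, size=(20, 3))  # reddish pixels
p1 = rng.normal([0.1, 0.1, 0.8], 0.05, size=(20, 3))  # bluish pixels
pixels = np.vstack([p0, p1])
labels = np.repeat([0, 1], 20)
centers = np.array([p0.mean(axis=0), p1.mean(axis=0)])
Dc = mahalanobis_data_term(pixels, centers, labels)
# Each pixel is cheapest under its own cluster's Gaussian
```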

In practice, one uses a more sophisticated color model for each segment, usually a mixture of Gaussians. You can read about it in the seminal paper of GrabCut (section 3).
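To make the mixture-of-Gaussians idea concrete, a segment's data cost under such a color model is Dc(p) = -log sum_j w_j * N(pixel_p; mean_j, cov_j). Here is a small NumPy sketch of that cost with fixed, hand-picked parameters (a simplified illustration, not GrabCut's actual model-fitting procedure):

```python
import numpy as np

def gaussian_pdf(x, mean, cov):
    """Multivariate normal density, evaluated for each row of x."""
    d = x.shape[1]
    dif = x - mean
    quad = np.sum((dif @ np.linalg.inv(cov)) * dif, axis=1)
    norm = np.sqrt((2 * np.pi) ** d * np.linalg.det(cov))
    return np.exp(-0.5 * quad) / norm

def mixture_data_cost(pixels, weights, means, covs):
    """Minus log likelihood of each pixel under one segment's Gaussian
    mixture color model: -log(sum_j w_j * N(pixel; mean_j, cov_j))."""
    p = np.zeros(pixels.shape[0])
    for w, m, c in zip(weights, means, covs):
        p += w * gaussian_pdf(pixels, m, c)
    return -np.log(p)

# A single standard Gaussian component: the cost at the mean reduces to
# the normalization constant 0.5 * d * log(2*pi), with d = 3
cost = mixture_data_cost(np.zeros((1, 3)), [1.0], [np.zeros(3)], [np.eye(3)])
```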

PS,
Next time you can email Shai Bagon directly and ask.
