Fast way to turn a labeled image into a dictionary of { label : [coordinates] }


Question

Say I've labeled an image with scipy.ndimage.measurements.label, like so:

[[0, 1, 0, 0, 0, 0],
 [0, 1, 0, 0, 0, 0],
 [0, 1, 0, 0, 0, 0],
 [0, 0, 0, 0, 3, 0],
 [2, 2, 0, 0, 0, 0],
 [2, 2, 0, 0, 0, 0]]

What's a fast way to collect the coordinates belonging to each label? I.e. something like:

{ 1: [[0, 1], [1, 1], [2, 1]],
  2: [[4, 0], [4, 1], [5, 0], [5, 1]],
  3: [[3, 4]] }

I'm working with images that are ~15,000 x 5,000 pixels in size, and roughly half of each image's pixels are labeled (i.e. non-zero).

Rather than iterating through the entire image with nditer, would it be faster to do something like np.where(img == label) for each label?
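The per-label approach the question suggests can be sketched as follows (np.argwhere(img == label) returns the same row/column pairs as transposing np.where(img == label); variable names are mine):

```python
import numpy as np

img = np.array([[0, 1, 0, 0, 0, 0],
                [0, 1, 0, 0, 0, 0],
                [0, 1, 0, 0, 0, 0],
                [0, 0, 0, 0, 3, 0],
                [2, 2, 0, 0, 0, 0],
                [2, 2, 0, 0, 0, 0]])

num_labels = 3

# One full comparison pass over the image per label, so the total cost
# grows linearly with the number of labels.
coords = {label: np.argwhere(img == label)
          for label in range(1, num_labels + 1)}
```

Each value is an (n, 2) array of [row, col] pairs in row-major scan order, e.g. coords[2] contains [[4, 0], [4, 1], [5, 0], [5, 1]].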

Edit:

Which algorithm is fastest depends on how big the labeled image is compared to how many labels it has. Warren Weckesser's and Salvador Dali / BHAT IRSHAD's methods (which are based on np.nonzero and np.where) all seem to scale linearly with the number of labels, whereas iterating through each image element with nditer obviously scales linearly with the size of the labeled image.
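For reference, the nditer baseline being timed here would look roughly like this (a sketch; the dict layout and variable names are mine):

```python
import numpy as np

img = np.array([[0, 1, 0, 0, 0, 0],
                [0, 1, 0, 0, 0, 0],
                [0, 1, 0, 0, 0, 0],
                [0, 0, 0, 0, 3, 0],
                [2, 2, 0, 0, 0, 0],
                [2, 2, 0, 0, 0, 0]])

# Single pass over every element, tracking the multi-dimensional index;
# cost is linear in the image size regardless of how many labels exist.
coords = {}
it = np.nditer(img, flags=['multi_index'])
for val in it:
    label = int(val)
    if label != 0:
        coords.setdefault(label, []).append(list(it.multi_index))
```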

Results of a small test:

size: 1000 x 1000, num_labels: 10
weckesser ... 0.214357852936s 
dali ... 0.650229930878s 
nditer ... 6.53645992279s 


size: 1000 x 1000, num_labels: 100
weckesser ... 0.936990022659s 
dali ... 1.33582305908s 
nditer ... 6.81486487389s 


size: 1000 x 1000, num_labels: 1000
weckesser ... 8.43906402588s 
dali ... 9.81333303452s 
nditer ... 7.47897100449s 


size: 1000 x 1000, num_labels: 10000
weckesser ... 100.405524015s 
dali ... 118.17239809s 
nditer ... 9.14583897591s

So the question becomes more specific:

For labeled images in which the number of labels is on the order of sqrt(size(image)), is there an algorithm to gather label coordinates that is faster than iterating through every image element (i.e. with nditer)?
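One idea along these lines (a sketch, not from the accepted answer): instead of scanning the image once per label, sort the nonzero coordinates by label once and split the sorted array at label boundaries. This costs O(n log n) in the number of nonzero pixels, independent of the label count. The function name is mine, and the sketch assumes the image contains at least one nonzero pixel:

```python
import numpy as np

def group_coords_by_label(img):
    # Coordinates and labels of all nonzero pixels, in row-major order.
    nz = np.nonzero(img)
    coords = np.column_stack(nz)
    labels = img[nz]
    # Stable sort keeps row-major order within each label.
    order = np.argsort(labels, kind='stable')
    sorted_labels = labels[order]
    sorted_coords = coords[order]
    # Indices where the label value changes mark the start of each run.
    boundaries = np.flatnonzero(np.diff(sorted_labels)) + 1
    groups = np.split(sorted_coords, boundaries)
    uniq = sorted_labels[np.concatenate(([0], boundaries))]
    return dict(zip(uniq.tolist(), groups))

img = np.array([[0, 1, 0, 0, 0, 0],
                [0, 1, 0, 0, 0, 0],
                [0, 1, 0, 0, 0, 0],
                [0, 0, 0, 0, 3, 0],
                [2, 2, 0, 0, 0, 0],
                [2, 2, 0, 0, 0, 0]])
res = group_coords_by_label(img)
```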

Answer

Here's one possibility:

import numpy as np

a = np.array([[0, 1, 0, 0, 0, 0],
              [0, 1, 0, 0, 0, 0],
              [0, 1, 0, 0, 0, 0],
              [0, 0, 0, 0, 3, 0],
              [2, 2, 0, 0, 0, 0],
              [2, 2, 0, 0, 0, 0]])

# If the array was computed using scipy.ndimage.measurements.label, you
# already know how many labels there are.
num_labels = 3

nz = np.nonzero(a)
coords = np.column_stack(nz)
nzvals = a[nz[0], nz[1]]
res = {k:coords[nzvals == k] for k in range(1, num_labels + 1)}

I called this script get_label_indices.py. Here's a sample run:

In [97]: import pprint

In [98]: run get_label_indices.py

In [99]: pprint.pprint(res)
{1: array([[0, 1],
       [1, 1],
       [2, 1]]),
 2: array([[4, 0],
       [4, 1],
       [5, 0],
       [5, 1]]),
 3: array([[3, 4]])}

