How to digitize (extract data from) a heat map image using Python?

Question

There are several packages available for digitizing line graphs, e.g. GetData Graph Digitizer.
However, I could not find any packages or programs for digitizing heat maps.

I want to digitize a heat map (from PNG or JPG images) using Python. How can I do it?
Do I need to write the entire code from scratch?
Or are there any packages available?

Recommended answer

There are multiple ways to do it. Many machine learning libraries offer custom visualization functions, some easier to use than others.

You need to split the problem in half.

First, using OpenCV for Python or scikit-image, load the image as a matrix. You can set some offsets so that reading starts right at the first cell.

import cv2

# 1 - read color image (3 color channels, stored in BGR order)
image = cv2.imread('test.jpg', cv2.IMREAD_COLOR)

Then, iterate through the cells and read the color inside each one. You can normalize the result if you want. The offsets are needed because the heatmap does not start at the top-left corner (0,0) of the original image. offset_x and offset_y are lists with 2 values each:

  • offset_x[0]: the offset from the left edge of the image to the beginning of the heatmap (i.e. start_of_heatmap_x)
  • offset_x[1]: the offset from the right edge of the image to the end of the heatmap (i.e. image_width - end_of_heatmap_x)
  • offset_y[0]: the offset from the top edge of the image to the beginning of the heatmap (i.e. start_of_heatmap_y)
  • offset_y[1]: the offset from the bottom edge of the image to the end of the heatmap (i.e. image_height - end_of_heatmap_y)
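
To make the offset convention concrete, here is a small sketch that crops a synthetic image to its heatmap region with NumPy slicing. The image size and the offset values are made up for illustration; with a real file you would measure them by hand, for example in an image viewer.

```python
import numpy as np

# Synthetic stand-in for a loaded image: 100 rows (height) x 160 columns (width), 3 channels.
image = np.zeros((100, 160, 3), dtype=np.uint8)

# Hypothetical margins in pixels, measured by hand.
offset_x = [20, 40]  # [left margin, right margin]
offset_y = [10, 30]  # [top margin, bottom margin]

h, w = image.shape[:2]

# NumPy indexes rows (y) first, then columns (x).
heatmap = image[offset_y[0]:h - offset_y[1], offset_x[0]:w - offset_x[1]]

print(heatmap.shape)  # (60, 100, 3)
```

Note that the second value of each pair is a margin measured from the far edge, which is why it is subtracted from the image size.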

Also, the loops stop one cell short of the heatmap's far edge. That's because each iteration starts from a cell's top-left corner (beginning with the "0-th" cell) and adds cell_size/2 to both coordinates to reach the center of the cell.

def read_as_digital(image, cell_size, offset_x, offset_y):
    # grab the image dimensions (rows = height, columns = width)
    h = image.shape[0]
    w = image.shape[1]
    half = cell_size // 2
    results = []
    # loop over the heatmap, cell by cell; the "+ 1" keeps the last full cell,
    # because range() excludes its stop value
    for y in range(offset_y[0], h - offset_y[1] - cell_size + 1, cell_size):
        row = []
        for x in range(offset_x[0], w - offset_x[1] - cell_size + 1, cell_size):
            # append the color at the cell center; NumPy indexes [row, column], i.e. [y, x]
            row.append(image[y + half, x + half])
        results.append(row)

    # return the grid of cell colors (a list of rows)
    return results
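
The function above returns colors, not data values. One possible way to turn them into numbers, assuming the image includes a color bar with a linear scale, is to sample colors along that bar and match each cell to the nearest sample. The sketch below uses a synthetic black-to-white color bar; `colors_to_values`, `vmin`, and `vmax` are illustrative names and not part of the original answer.

```python
import numpy as np

def colors_to_values(cells, colorbar, vmin, vmax):
    """Map each cell color to a value by nearest color on a linear color bar.

    cells:    array of shape (rows, cols, 3), one color per heatmap cell
    colorbar: array of shape (n, 3), colors sampled along the bar from vmin to vmax
    """
    cells = np.asarray(cells, dtype=float)
    colorbar = np.asarray(colorbar, dtype=float)
    # Squared color distance from every cell to every color bar sample.
    diff = cells[:, :, None, :] - colorbar[None, None, :, :]
    dist = (diff ** 2).sum(axis=-1)
    # Index of the closest color bar sample for each cell.
    idx = dist.argmin(axis=-1)
    # Linear scale assumption: sample i corresponds to vmin + i / (n - 1) * (vmax - vmin).
    return vmin + idx / (len(colorbar) - 1) * (vmax - vmin)

# Synthetic color bar: 11 samples fading from black to white, standing for values 0 to 10.
levels = np.linspace(0, 255, 11)
colorbar = np.stack([levels, levels, levels], axis=1)

# Synthetic 2 x 2 grid of cell colors.
cells = np.array([[[0, 0, 0], [255, 255, 255]],
                  [[127, 127, 127], [51, 51, 51]]])

values = colors_to_values(cells, colorbar, vmin=0.0, vmax=10.0)
print(values)
```

Here the four cells map to approximately 0, 10, 5, and 2. The resolution of the result is limited by the number of color bar samples, so sampling the bar densely helps.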

Extracting the legend information is not hard, because the values can be derived from the legend's limits (although this only works for linear scales).

So, for example, we can derive the step size along each legend (x and y).

def generate_legend(length, offset, cell_size, legend_start, legend_end):
    # number of whole cells along this axis
    nr_of_cells = (length - offset[0] - offset[1]) // cell_size
    step_size = (legend_end - legend_start) / nr_of_cells

    # value at the center of each cell; computing it from the cell index
    # avoids the rounding drift of repeatedly adding step_size
    return [legend_start + step_size * (i + 0.5) for i in range(nr_of_cells)]
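
A quick worked example of the same idea, written with NumPy so that it is self-contained. The function name `cell_centers` and the axis numbers are made up for illustration.

```python
import numpy as np

def cell_centers(length, offset, cell_size, legend_start, legend_end):
    """Return the axis value at the center of each heatmap cell (linear scale)."""
    # Number of whole cells between the two margins.
    nr_of_cells = (length - offset[0] - offset[1]) // cell_size
    step_size = (legend_end - legend_start) / nr_of_cells
    # Cell k is centered half a step past its starting edge.
    return legend_start + step_size * (np.arange(nr_of_cells) + 0.5)

# Hypothetical axis: 120 px long, 10 px margins on both sides, 20 px cells, labeled 0 to 5.
print(cell_centers(120, [10, 10], 20, 0.0, 5.0))  # [0.5 1.5 2.5 3.5 4.5]
```

With five cells covering the range 0 to 5, each cell spans one unit, so the centers fall on the half units.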

Then you will want to visualize the results to check that everything was done correctly. For example, this is very easy with seaborn [1]. If you want more control over the details, you can use scikit-learn and matplotlib [2].
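
A minimal sketch of such a check with matplotlib, assuming the extracted values are already in a 2D array. The data and the output filename are placeholders.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen so the script also runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Placeholder for the digitized values (rows x columns).
values = np.array([[0.0, 10.0, 4.0],
                   [5.0, 2.0, 8.0]])

fig, ax = plt.subplots()
im = ax.imshow(values, cmap="viridis")  # draw each value as a colored cell
fig.colorbar(im, ax=ax)                 # add a color bar to compare against the original
fig.savefig("digitized_heatmap.png")
```

Comparing the saved figure with the source image side by side makes swapped axes or a wrong offset easy to spot. `seaborn.heatmap(values)` produces a similar plot in a single call.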
