How is Spark reading my image using the image format?


Question

It might be a silly question, but I can't figure out how Spark reads my image when I use spark.read.format("image").load(....).

After importing my image, I get the following:

>>> image_df.select("image.height","image.width","image.nChannels", "image.mode", "image.data").show()
+------+-----+---------+----+--------------------+
|height|width|nChannels|mode|                data|
+------+-----+---------+----+--------------------+
|   430|  470|        3|  16|[4D 55 4E 4C 54 4...|
+------+-----+---------+----+--------------------+

I conclude that:

  • my image is 430x470 pixels (height x width),
  • my image is colored (RGB, since nChannels = 3), which is an OpenCV-compatible type,
  • my image mode is 16, which corresponds to a particular OpenCV byte order.
    • Does anyone know which website/documentation I could browse to learn more about it?
  • when I run image_df.select("image.data").take(1), I get an output that seems to be a single array (see below).
      >>> image_df.select("image.data").take(1)
      
      # 1/ Here are the last elements of the result
      ....<<One Eternity Later>>....x92\x89\x8a\x8d\x84\x86\x89\x80\x84\x87~'))]
      
      # 2/ I also got several parts of the result that look like:
      .....\x89\x80\x80\x83z|\x7fvz}tpsjqtkrulsvmsvmsvmrulrulrulqtkpsjnqhnqhmpgmpgmpgnqhnqhn
      qhnqhnqhnqhnqhnqhmpgmpgmpgmpgmpgmpgmpgmpgnqhnqhnqhnqhnqhnqhnqhnqhknejmdilcilchkbh
      kbilcilckneloflofmpgnqhorioripsjsvmsvmtwnvypx{ry|sz}t{~ux{ry|sy|sy|sy|sz}tz}tz}tz}
      ty|sy|sy|sy|sz}t{~u|\x7fv|\x7fv}.....
      
      

      What comes next relates to the results displayed above. My confusion might be due to my lack of knowledge of OpenCV (or something else). Nonetheless:

      • 1/ I don't understand why, if I have an RGB image, which should give 3 matrices, the output ends with .......\x84\x87~'))]. I was expecting something more like [(...),(...),(...\x87~')].
      • 2/ Does this part have a special meaning? Is it, for example, a separator between the matrices?

      To be clearer about what I'm trying to achieve: I want to process images in order to compare pixels between them. Therefore, I want to know the pixel values at a given position in my image (I assume that with an RGB image I will have 3 values for a given position).

      Example: let's say that I have a webcam pointing at the sky, only during the day, and I want to know the values of a pixel at a position corresponding to the top-left part of the sky. I find that the combination of those values gives the colour light blue, which tells me the photo was taken on a sunny day. Let's assume that a sunny day always produces light blue.
      Next, I want to compare that combination with the pixel values at the exact same position in a picture taken the next day. If they are not equal, I conclude that the second picture was taken on a cloudy/rainy day. If they are equal, it was a sunny day.

      Any help on this would be highly appreciated. I have simplified my example to make it easier to understand, but my actual goal is much the same. I know that ML models exist for this kind of task, but I would be happy to try this approach first. My first goal is to split this column into 3 columns, one per color: a red matrix, a green matrix and a blue matrix.

Answer

      I think I have the logic. I used the keras.preprocessing.image.img_to_array() function to understand how the values are arranged (since I have an RGB image, I expect 3 matrices: one each for R, G and B). I'm posting this in case someone wonders how it works. I might be wrong, but I think I have something:

      from keras.preprocessing import image
      import numpy as np
      from PIL import Image
      
      # Using Spark's built-in image data source
      # (imageSchema is assumed to be defined beforehand; the path is elided)
      first_img = spark.read.format("image").schema(imageSchema).load(".....")
      raw = first_img.select("image.data").take(1)[0][0]
      >>> np.shape(raw)
      (606300,) # which is 470*430*3
      
      
      
      # Using keras function
      img = image.load_img(".../path/to/img")
      yy = image.img_to_array(img)
      >>> np.shape(yy)
      (430, 470, 3) # the shape is right, but the channel order differs, since:
      
      >>> raw[0], raw[1], raw[2]
      (77, 85, 78)
      >>> yy[0][0]
      array([78., 85., 77.], dtype=float32)
      
      # Therefore I used the numpy reshape function directly on raw
      # to get 430 matrices (one per pixel row) of 470 rows and 3 columns:
      
      array = np.reshape(raw, (430,470,3))
      xx = image.img_to_array(array)     # OPTIONAL and not used here
      
      >>> array[0][0] == (raw[0],raw[1],raw[2])
      array([ True,  True,  True])
      
      >>> array[0][1] == (raw[3],raw[4],raw[5])
      array([ True,  True,  True])
      
      >>> array[0][2] == (raw[6],raw[7],raw[8])
      array([ True,  True,  True])
      
      >>> array[0][3] == (raw[9],raw[10],raw[11])
      array([ True,  True,  True])
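
The first pixel differs between the two approaches only in its order: (77, 85, 78) from Spark versus (78, 85, 77) from Keras. That is consistent with Spark's image source storing 3-channel data in OpenCV's BGR order, while Keras/PIL returns RGB. Below is a minimal sketch of the reshape plus a channel reorder, which also yields the three per-color matrices. It uses a tiny synthetic buffer in place of the real image.data bytes and assumes 8-bit unsigned channels; the helper name spark_bytes_to_rgb is hypothetical, not part of Spark or NumPy.

```python
import numpy as np

def spark_bytes_to_rgb(data, height, width, n_channels=3):
    """Turn a flat, row-wise BGR byte buffer into an (H, W, 3) RGB array.

    Assumes 8-bit unsigned channels (as with mode 16 and 3 channels).
    """
    bgr = np.frombuffer(data, dtype=np.uint8).reshape(height, width, n_channels)
    return bgr[:, :, ::-1]  # reverse the channel axis: BGR -> RGB

# Synthetic 1x2 image standing in for the bytes of image.data
fake = bytearray([77, 85, 78, 10, 20, 30])
rgb = spark_bytes_to_rgb(fake, height=1, width=2)

# One 2-D matrix per color, each of shape (height, width)
red, green, blue = rgb[:, :, 0], rgb[:, :, 1], rgb[:, :, 2]
print(rgb[0, 0])  # [78 85 77]
```

With the real data, the call would take the height, width and nChannels values from the same DataFrame row as the bytes.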
      

      So if I understood correctly, Spark reads the image as one big flat array, (606300,) here, in which the elements are ordered pixel by pixel and each group of 3 consecutive values holds the color channels of one pixel. Judging by the comparison with Keras above, where the first pixel comes out reversed, the channels appear to be stored in B G R order rather than R G B.
      After my small transformation, I obtain 430 matrices of 470 rows x 3 columns. Since my image is 470x430 (width x height), each matrix corresponds to one pixel row (a height position), and inside each matrix there are 470 rows, one per width position, and 3 columns, one per color channel.
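
The layout described above can also be written as index arithmetic: the value for channel c of the pixel at (row, col) sits at offset (row * width + col) * nChannels + c in the flat array. Here is a rough sketch of looking up one pixel and comparing it between two images, as in the sky example from the question. The helpers pixel_at and same_pixel are hypothetical, the data is synthetic, and the tol parameter is an assumption added for illustration, since exact equality between two real photos is unlikely.

```python
import numpy as np

def pixel_at(data, width, n_channels, row, col):
    """Return the channel values of one pixel from a flat, row-wise buffer."""
    start = (row * width + col) * n_channels
    return tuple(data[start:start + n_channels])

def same_pixel(data_a, data_b, width, n_channels, row, col, tol=0):
    """True if every channel differs by at most tol at the given position."""
    a = np.array(pixel_at(data_a, width, n_channels, row, col), dtype=np.int16)
    b = np.array(pixel_at(data_b, width, n_channels, row, col), dtype=np.int16)
    return bool(np.all(np.abs(a - b) <= tol))

# Two synthetic 2x2 images with 3 channels, standing in for two days' photos
day1 = bytearray(range(12))
day2 = bytearray(day1)
day2[6] = 9  # change one channel of the pixel at row 1, col 0

print(pixel_at(day1, 2, 3, 1, 0))                # (6, 7, 8)
print(same_pixel(day1, day2, 2, 3, 1, 0))         # False
print(same_pixel(day1, day2, 2, 3, 1, 0, tol=5))  # True
```

The lookup agrees with indexing the reshaped array at [row, col], so either form can be used.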

      Hope that helps someone :)!
