Tensorflow Removing JFIF


Question


I am quite new to TensorFlow, and I would like to understand clearly what the code below does.

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
import os

num_skipped = 0
for folder_name in ("Cat", "Dog"):
    print("folder_name:",folder_name) #folder_name: Cat
    folder_path = os.path.join("Dataset/PetImages", folder_name)
    print("folder_path:",folder_path) #folder_path: Dataset/PetImages/Cat
    for fname in os.listdir(folder_path):
        print("fname:",fname) #fname: 5961.jpg
        fpath = os.path.join(folder_path, fname)
        print("fpath:", fpath) #fpath: Dataset/PetImages/Cat/10591.jpg
        try:
            fobj = open(fpath, "rb")
            is_jfif = tf.compat.as_bytes("JFIF") in fobj.peek(10)
        finally:
            fobj.close()

        if not is_jfif:
            num_skipped += 1
            # Delete corrupted image
            os.remove(fpath)

print("Deleted %d images" % num_skipped)

The Keras website comments on the above code:

When working with lots of real-world image data, corrupted images are a common occurrence. Let's filter out badly-encoded images that do not feature the string "JFIF" in their header.

Specifically, I want to know what the line below does and how it works:

 is_jfif = tf.compat.as_bytes("JFIF") in fobj.peek(10)

I checked the API documentation but could not clearly understand it.

A better explanation would be a great help.

Thanks

Solution

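Wikipedia explains that JPEG files in the JFIF format contain the string "JFIF" near the beginning of the file, encoded as bytes. The identifier sits at byte offsets 6 to 9, right after the start-of-image marker, the APP0 marker, and a two-byte segment length, so it falls within the first 10 bytes.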

So:

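  • tf.compat.as_bytes("JFIF") converts the string "JFIF" to bytes by encoding it as UTF-8. You could also just write b"JFIF", which gives the same value.
  • fobj.peek(10) returns bytes from the start of the file without advancing the read position. The argument is only a hint: the call may return fewer or more than 10 bytes, and in practice it often returns the whole read buffer, which for a small file is the entire file.
  • is_jfif then just checks whether the bytes b"JFIF" occur anywhere in the result of fobj.peek. Because peek can return more than 10 bytes, the match is not strictly limited to the header.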
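
The three steps above can be tried without TensorFlow. The following is a minimal sketch, not the Keras implementation: the helper names has_jfif_marker and peek_length and the demo file contents are made up for illustration, and read(n) is swapped in for peek(n) so that the check really is limited to the first n bytes.

```python
import os
import tempfile


def has_jfif_marker(path, n=10):
    """Return True if b"JFIF" occurs within the first n bytes of the file."""
    with open(path, "rb") as fobj:
        header = fobj.read(n)  # read(n) returns at most n bytes, unlike peek(n)
    return b"JFIF" in header


def peek_length(path, n=10):
    """Return how many bytes peek(n) actually hands back for this file."""
    with open(path, "rb") as fobj:
        return len(fobj.peek(n))


# Hypothetical demo data: a minimal JFIF-style header followed by padding.
# FF D8 = start of image, FF E0 = APP0 marker, 00 10 = segment length.
jfif_bytes = b"\xff\xd8\xff\xe0\x00\x10JFIF\x00" + b"\x00" * 100
other_bytes = b"not an image at all" + b"\x00" * 100

with tempfile.TemporaryDirectory() as tmp:
    good = os.path.join(tmp, "good.jpg")
    bad = os.path.join(tmp, "bad.jpg")
    with open(good, "wb") as f:
        f.write(jfif_bytes)
    with open(bad, "wb") as f:
        f.write(other_bytes)

    good_result = has_jfif_marker(good)  # header contains b"JFIF"
    bad_result = has_jfif_marker(bad)    # header does not contain b"JFIF"
    peeked = peek_length(good)           # usually larger than the 10 requested

print(good_result, bad_result, peeked)
```

Printing peeked on your own machine shows how much data peek(10) really returns. Note that this kind of check only detects a missing JFIF identifier; a valid JPEG that uses a different header (for example Exif) would also be flagged and deleted by the original loop.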
