Tensorflow Removing JFIF
I am quite new to TensorFlow and would like to understand clearly what the code below does.
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
import os

num_skipped = 0
for folder_name in ("Cat", "Dog"):
    print("folder_name:", folder_name)  # folder_name: Cat
    folder_path = os.path.join("Dataset/PetImages", folder_name)
    print("folder_path:", folder_path)  # folder_path: Dataset/PetImages/Cat
    for fname in os.listdir(folder_path):
        print("fname:", fname)  # fname: 5961.jpg
        fpath = os.path.join(folder_path, fname)
        print("fpath:", fpath)  # fpath: Dataset/PetImages/Cat/10591.jpg
        try:
            fobj = open(fpath, "rb")
            is_jfif = tf.compat.as_bytes("JFIF") in fobj.peek(10)
        finally:
            fobj.close()

        if not is_jfif:
            num_skipped += 1
            # Delete corrupted image
            os.remove(fpath)

print("Deleted %d images" % num_skipped)
The Keras website comments on the above code:
When working with lots of real-world image data, corrupted images are a common occurrence. Let's filter out badly-encoded images that do not feature the string "JFIF" in their header.
Specifically, I want to know what the line below does, and how it does it:
is_jfif = tf.compat.as_bytes("JFIF") in fobj.peek(10)
I checked the API documentation but couldn't clearly understand it.
A better explanation will be of much help.
Thanks
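Wikipedia explains that JPEG files in the JFIF format contain the string "JFIF" near the beginning of the file, encoded as bytes. It sits at byte offsets 6 to 9, right after the start-of-image marker and the APP0 marker with its length field, so it falls within the first 10 bytes.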
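As a rough illustration of that layout, the sketch below builds the first bytes of a JFIF file by hand and looks for the identifier. The byte values follow the standard JFIF header (SOI marker, APP0 marker, segment length, identifier); the variable names are made up for this example, and no real image file is involved.

```python
# Hand-built start of a JFIF file (illustrative, not read from a real image).
header = bytes([
    0xFF, 0xD8,  # SOI: start-of-image marker
    0xFF, 0xE0,  # APP0 marker
    0x00, 0x10,  # APP0 segment length (16 bytes)
])
header += b"JFIF\x00"  # identifier, NUL-terminated

first_ten = header[:10]               # what a 10-byte look at the file would see
jfif_offset = header.find(b"JFIF")    # position of the identifier
is_jfif = b"JFIF" in first_ten        # bytes-in-bytes substring test

print(first_ten)
print(jfif_offset, is_jfif)
```

The `in` operator on two `bytes` objects is a plain substring search, which is all the check in the question relies on.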
So:
- tf.compat.as_bytes("JFIF") converts the string "JFIF" to bytes. You could also just use b"JFIF"; as far as I know, as_bytes simply encodes the text as UTF-8, so the result is the same.
- fobj.peek(10) theoretically returns the first 10 bytes of the file without advancing the read position, but in practice it often returns everything that is currently buffered, which for a small file is the entire file.
- is_jfif then just checks whether the bytes b"JFIF" occur anywhere in the result of fobj.peek.
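To make the behaviour of peek concrete, here is a minimal sketch that uses only the standard library (b"JFIF" instead of tf.compat.as_bytes). The helper name has_jfif_marker and the demo file contents are hypothetical, written just for this illustration. Unlike the code in the question, it slices the peeked data to 10 bytes, so the check really is limited to the header.

```python
import os
import tempfile


def has_jfif_marker(path):
    """Return True if b"JFIF" appears in the first 10 bytes of the file."""
    with open(path, "rb") as fobj:
        # peek() may hand back more than 10 bytes, so slice explicitly.
        return b"JFIF" in fobj.peek(10)[:10]


def demo():
    """Write one JFIF-like file and one non-JPEG file, then test both."""
    with tempfile.TemporaryDirectory() as tmp:
        good = os.path.join(tmp, "good.jpg")
        bad = os.path.join(tmp, "bad.jpg")
        with open(good, "wb") as f:
            f.write(b"\xff\xd8\xff\xe0\x00\x10JFIF\x00" + b"\x00" * 100)
        with open(bad, "wb") as f:
            f.write(b"not a jpeg at all" + b"\x00" * 100)
        with open(good, "rb") as f:
            peeked_len = len(f.peek(10))  # how much peek(10) actually returned
        return has_jfif_marker(good), has_jfif_marker(bad), peeked_len


good_ok, bad_ok, peeked_len = demo()
print(good_ok, bad_ok, peeked_len)
```

Without the slice, a file that happens to contain the bytes "JFIF" anywhere in the buffered data would also pass the test, which is probably harmless for this dataset but worth knowing. Using a with block also avoids the case in the original try/finally where open itself fails and fobj is never assigned.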