Keras ImageDataGenerator for Cloud ML Engine

Problem description
I need to train a neural net fed by raw images that I store on GCloud Storage. To do that, I'm using the flow_from_directory method of my Keras image generator to find all the images and their related labels on the storage.
training_data_directory = args.train_dir
testing_data_directory = args.eval_dir

training_gen = datagenerator.flow_from_directory(
    training_data_directory,
    target_size=(img_width, img_height),
    batch_size=32)

validation_gen = datagenerator.flow_from_directory(
    testing_data_directory,
    target_size=(img_width, img_height),
    batch_size=32)
My GCloud Storage architecture is the following:

brad-bucket/data/train
brad-bucket/data/eval
The gsutil command allows me to make sure my folders exist.
brad$ gsutil ls gs://brad-bucket/data/
gs://brad-bucket/data/eval/
gs://brad-bucket/data/train/
So here is the script I'm running to launch the training on ML Engine, with the strings I use for the paths of my directories (train_dir, eval_dir).
BUCKET="gs://brad-bucket"
JOB_ID="training_"$(date +%s)
JOB_DIR="gs://brad-bucket/jobs/train_keras_"$(date +%s)
TRAIN_DIR="gs://brad-bucket/data/train/"
EVAL_DIR="gs://brad-bucket/data/eval/"
CONFIG_PATH="config/config.yaml"
PACKAGE="trainer"
gcloud ml-engine jobs submit training $JOB_ID \
--stream-logs \
--verbosity debug \
--module-name trainer.task \
--staging-bucket $BUCKET \
--package-path $PACKAGE \
--config $CONFIG_PATH \
--region europe-west1 \
-- \
--job_dir $JOB_DIR \
--train_dir $TRAIN_DIR \
--eval_dir $EVAL_DIR \
--dropout_one 0.2 \
--dropout_two 0.2
However, what I'm doing throws an OSError.
ERROR 2018-01-10 09:41:47 +0100 service File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/keras/_impl/keras/preprocessing/image.py", line 1086, in __init__
ERROR 2018-01-10 09:41:47 +0100 service for subdir in sorted(os.listdir(directory)):
ERROR 2018-01-10 09:41:47 +0100 service OSError: [Errno 2] No such file or directory: 'gs://brad-bucket/data/train/'
When I'm using another data structure (reading the data in another way), everything works fine, but when I use flow_from_directory to read from directories and subdirectories I always get this same error. Is it possible to use this method to retrieve data from Cloud Storage, or do I have to feed the data in a different way?
Recommended answer
If you check the source code, you see that the error arises when Keras (or TF) is trying to construct the classes from your directories. Since you are giving it a GCS directory (a gs:// path), this will not work. You can bypass this error by providing the classes argument yourself, e.g. in the following way:
import os

from google.cloud import storage


def get_classes(file_dir):
    if not file_dir.startswith("gs://"):
        # Local directory: every subdirectory is a class.
        classes = [c.replace('/', '') for c in os.listdir(file_dir)]
    else:
        # GCS directory: list the "subdirectories" (prefixes) under the path.
        bucket_name = file_dir.replace('gs://', '').split('/')[0]
        prefix = file_dir.replace("gs://" + bucket_name + '/', '')
        if not prefix.endswith("/"):
            prefix += "/"
        client = storage.Client()
        bucket = client.get_bucket(bucket_name)
        iterator = bucket.list_blobs(delimiter="/", prefix=prefix)
        response = iterator.get_next_page_response()
        classes = [c.replace('/', '') for c in response['prefixes']]
    return classes
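To make the string handling in the GCS branch explicit: list_blobs(delimiter="/") returns the full prefixes under the queried path (e.g. data/train/cats/), so the bare class names have to be recovered from them. A standalone sketch of that step, isolated so it can be checked without touching GCS (prefixes_to_classes is a hypothetical name, not part of the answer's code):

```python
def prefixes_to_classes(prefixes, prefix):
    """Recover class names from the 'prefixes' list returned by
    list_blobs(delimiter="/"), given the prefix that was queried."""
    classes = []
    for p in prefixes:
        # Drop the queried prefix, then the trailing slash.
        classes.append(p[len(prefix):].rstrip('/'))
    return classes


print(prefixes_to_classes(['data/train/cats/', 'data/train/dogs/'],
                          'data/train/'))
# → ['cats', 'dogs']
```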
Passing these classes to flow_from_directory will solve your error, but it will not recognize the files themselves (I now get, e.g., Found 0 images belonging to 2 classes.).
The only 'direct' workaround that I have found is to copy your files to local disk and read them from there. It would be great to have another solution (since, e.g., in the case of images, copying can take a long time).
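One way to do that copy inside the training code itself is a small download loop over the bucket; a sketch, assuming google-cloud-storage is available in the job's environment (download_directory and blob_to_local_path are hypothetical names). After the download, training_data_directory can simply point at local_root:

```python
import os


def blob_to_local_path(blob_name, prefix, local_root):
    """Map a blob name like 'data/train/cats/1.jpg' to a path under
    local_root, keeping the class subdirectories that
    flow_from_directory expects."""
    relative = blob_name[len(prefix):]  # e.g. 'cats/1.jpg'
    return os.path.join(local_root, relative)


def download_directory(bucket_name, prefix, local_root):
    """Copy every blob under gs://<bucket_name>/<prefix> to local_root."""
    from google.cloud import storage  # assumes google-cloud-storage is installed
    client = storage.Client()
    bucket = client.get_bucket(bucket_name)
    for blob in bucket.list_blobs(prefix=prefix):
        if blob.name.endswith('/'):  # skip "directory" placeholder blobs
            continue
        destination = blob_to_local_path(blob.name, prefix, local_root)
        os.makedirs(os.path.dirname(destination), exist_ok=True)
        blob.download_to_filename(destination)
```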
Other resources also suggest using TensorFlow's file_io function when interacting with GCS from Cloud ML Engine, but in this case that would require you to fully rewrite flow_from_directory yourself.
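A rewrite along those lines would at minimum need to list the files per class and attach labels; a rough sketch of that first step only (list_gcs_images and label_files are hypothetical names, and the decoding/batching parts of flow_from_directory are not reproduced here):

```python
def label_files(files_per_class):
    """Pair each file path with the integer index of its class.
    files_per_class maps class name -> list of file paths; classes are
    indexed in sorted order, as Keras does."""
    pairs = []
    for index, name in enumerate(sorted(files_per_class)):
        for path in files_per_class[name]:
            pairs.append((path, index))
    return pairs


def list_gcs_images(data_dir, classes):
    """List files per class via TensorFlow's file_io, which handles
    gs:// and local paths alike (assumes TensorFlow is installed)."""
    from tensorflow.python.lib.io import file_io
    files_per_class = {}
    for name in classes:
        pattern = data_dir.rstrip('/') + '/' + name + '/*'
        files_per_class[name] = file_io.get_matching_files(pattern)
    return files_per_class
```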