Write custom Data Generator for Keras
Question
I have each datapoint stored in a .npy file, with shape=(1024,7,8). I want to load them into a Keras model in a manner similar to ImageDataGenerator, so I wrote and tried several custom generators, but none of them work. Here is one I adapted from an existing example:
import os
import random

import numpy as np


def find(dirpath, prefix=None, suffix=None, recursive=True):
    """Find all files with a specific prefix and suffix in a directory.

    Searches subdirectories too when recursive is True.
    Return a list of paths.
    """
    l = []
    if not prefix:
        prefix = ''
    if not suffix:
        suffix = ''
    for (folders, subfolders, files) in os.walk(dirpath):
        for filename in [f for f in files if f.startswith(prefix) and f.endswith(suffix)]:
            l.append(os.path.join(folders, filename))
        # os.walk yields the top-level directory first, so stop there if not recursive
        if not recursive:
            break
    return l

def generate_data(directory, batch_size):
    i = 0
    file_list = find(directory)
    while True:
        array_batch = []
        for b in range(batch_size):
            if i == len(file_list):
                i = 0
                random.shuffle(file_list)
            sample = file_list[i]
            i += 1
            array = np.load(sample)
            array_batch.append(array)
        yield array_batch

I found that this generator yields no labels, so it cannot be used to train the model with fit_generator. How can I add the labels to this generator, given that I can store them in a NumPy array?
Recommended answer
import numpy as np
from tensorflow.keras.utils import Sequence


class Mygenerator(Sequence):
    def __init__(self, x_set, y_set, batch_size):
        # x_set and y_set are equal-length sequences (here, lists of file names)
        self.x, self.y = x_set, y_set
        self.batch_size = batch_size

    def __len__(self):
        # Number of batches per epoch; the last batch may be smaller
        return int(np.ceil(len(self.x) / float(self.batch_size)))

    def __getitem__(self, idx):
        batch_x = self.x[idx * self.batch_size:(idx + 1) * self.batch_size]
        batch_y = self.y[idx * self.batch_size:(idx + 1) * self.batch_size]
        # read your data here using the batch lists, batch_x and batch_y
        # (my_readfunction is a placeholder for your own loader, not a library call)
        x = [my_readfunction(filename) for filename in batch_x]
        y = [my_readfunction(filename) for filename in batch_y]
        return np.array(x), np.array(y)
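
The answer above reads both inputs and labels through a placeholder loader. For the situation described in the question (one .npy file per sample, labels already held in a NumPy array), a minimal sketch might look like the following. The class name NpySequence, the shuffle flag, and the argument names are illustrative assumptions rather than part of the original answer, and the import fallback exists only so the sketch can be tried without TensorFlow installed.

```python
import numpy as np

try:
    from tensorflow.keras.utils import Sequence
except ImportError:
    # Fallback so the sketch can be exercised without TensorFlow;
    # real training needs the Keras base class.
    Sequence = object


class NpySequence(Sequence):
    """Hypothetical Sequence: one .npy file per sample, labels in a NumPy array."""

    def __init__(self, file_paths, labels, batch_size, shuffle=True):
        super().__init__()
        assert len(file_paths) == len(labels), "one label per file is assumed"
        self.file_paths = list(file_paths)
        self.labels = np.asarray(labels)
        self.batch_size = batch_size
        self.shuffle = shuffle
        # Shuffle an index array so files and labels always stay aligned
        self.indices = np.arange(len(self.file_paths))
        self.on_epoch_end()

    def __len__(self):
        # Number of batches per epoch; the last batch may be smaller
        return int(np.ceil(len(self.file_paths) / self.batch_size))

    def __getitem__(self, idx):
        batch_idx = self.indices[idx * self.batch_size:(idx + 1) * self.batch_size]
        # Each file holds one sample, e.g. shape (1024, 7, 8); stack into a batch
        x = np.stack([np.load(self.file_paths[i]) for i in batch_idx])
        y = self.labels[batch_idx]
        return x, y

    def on_epoch_end(self):
        # Keras calls this hook after each epoch; reshuffle if requested
        if self.shuffle:
            np.random.shuffle(self.indices)
```

Assuming a compiled model and a file list such as the one returned by find(directory, suffix='.npy'), an instance could be passed to model.fit(NpySequence(paths, labels, batch_size=32), epochs=10). In older TensorFlow releases, model.fit_generator takes the same object. Indexing the labels with the same shuffled indices as the files is what keeps each sample paired with its label.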