How to train a deep learning model with many data sets present inside the directories
Question
Experts, I need to train a model with many data sets saved in the directories train_data and valid_data. Each file in the corresponding directories holds a NumPy array with 456 rows and 55 columns. Additionally, I have 100 training data files in total, and 20 files for validation. Each file in both directories contains clean data (data1) and noisy data (data2) in a single .npz file. Below is my generator code, but it doesn't train the model properly. Can anybody help me find where the problem lies?
def tf_train_generator(file_list, batch_size = 256):
    i = 0
    while True:
        if i*batch_size >= len(file_list):
            i = 0
            np.random.shuffle(file_list)
        else:
            file_chunk = file_list[i*batch_size:(i+1)*batch_size]
            print(len(file_chunk))
            for file in file_chunk:
                print(file)
                temp = np.load(file)
                X = temp['data1']
                Y = temp['data2']
                i = i + 1
                yield X, Y
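For reference, the likely problems are that the counter i is advanced once per file inside the inner loop rather than once per chunk, and that the generator yields one file's arrays at a time instead of a stacked batch. Below is a corrected sketch (the function name train_generator and the synthetic files are illustrative, not from the question), demonstrated on synthetic .npz files shaped like the 456 x 55 data described above:

```python
import os
import tempfile
import numpy as np

def train_generator(file_list, batch_size=4):
    """Yield (clean, noisy) batches, stacking `batch_size` files per yield."""
    i = 0
    while True:
        if i * batch_size >= len(file_list):
            i = 0                          # wrap around once every file is used
            np.random.shuffle(file_list)
        file_chunk = file_list[i * batch_size:(i + 1) * batch_size]
        X, Y = [], []
        for path in file_chunk:
            with np.load(path) as npz:
                X.append(npz['data1'])     # clean data
                Y.append(npz['data2'])     # noisy data
        i += 1                             # advance once per chunk, not per file
        yield np.asarray(X), np.asarray(Y)

# Synthetic stand-ins for the question's files: 456 x 55 arrays in .npz form
tmp_dir = tempfile.mkdtemp()
files = []
for k in range(8):
    path = os.path.join(tmp_dir, 'sample_%d.npz' % k)
    np.savez(path, data1=np.random.rand(456, 55), data2=np.random.rand(456, 55))
    files.append(path)

gen = train_generator(files, batch_size=4)
X, Y = next(gen)
print(X.shape, Y.shape)  # (4, 456, 55) (4, 456, 55)
```

Each yield now produces arrays of shape (batch_size, 456, 55), which is what Keras's fit expects from a generator.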
Answer
If the .npz data is image-like, you can use Keras's ImageDataGenerator. It supports both flow_from_directory and flow_from_dataframe.
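A minimal sketch of the flow_from_directory workflow the answer refers to (using a synthetic directory of random PNGs, since the question's arrays are not actual image files; the class names and sizes here are illustrative):

```python
import os
import tempfile
import numpy as np
from PIL import Image
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Build a tiny synthetic image directory: one subfolder per class
root = tempfile.mkdtemp()
for cls in ('clean', 'noisy'):
    os.makedirs(os.path.join(root, cls))
    for k in range(3):
        arr = (np.random.rand(32, 32, 3) * 255).astype('uint8')
        Image.fromarray(arr).save(os.path.join(root, cls, 'img_%d.png' % k))

datagen = ImageDataGenerator(rescale=1.0 / 255)
it = datagen.flow_from_directory(
    root,
    target_size=(32, 32),
    batch_size=4,
    class_mode='binary')

x, y = next(it)
print(x.shape, y.shape)  # (4, 32, 32, 3) (4,)
```

Note that flow_from_directory infers class labels from the subfolder names, so this only fits the question's setup if the clean/noisy pairs are reorganized into per-class image folders.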