How to train a deep learning model with many data sets present inside the directories


Problem description

Experts, I need to train a model with many data sets saved in the directories train_data and valid_data. Each file in these directories holds NumPy arrays with 456 rows and 55 columns. In total I have 100 training files and 20 validation files. Each file in both directories contains clean data (data1) and noisy data (data2) in a single .npz file. Below is my generator code, but it doesn't train the model properly. Can anybody help me find where the problem lies?

def tf_train_generator(file_list, batch_size=256):
    i = 0
    while True:
        if i * batch_size >= len(file_list):
            i = 0
            np.random.shuffle(file_list)
        else:
            file_chunk = file_list[i * batch_size:(i + 1) * batch_size]
            print(len(file_chunk))
            for file in file_chunk:
                print(file)
                temp = np.load(file)
                X = temp['data1']
                Y = temp['data2']
                i = i + 1
                yield X, Y
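One likely bug in the code above: `i` is incremented once per yielded file inside the inner loop, while the slice is taken in units of `batch_size`, so whenever `batch_size` is smaller than the file count the index jumps past most of the list after the first chunk. A minimal corrected sketch (the function and file names here are illustrative, assuming each .npz file holds one (456, 55) clean/noisy pair as described):

```python
import os
import tempfile
import numpy as np

def npz_pair_generator(file_list, batch_size=4):
    """Yield (clean, noisy) pairs file by file; the chunk index
    advances once per chunk, not once per yielded file."""
    i = 0
    while True:
        if i * batch_size >= len(file_list):
            i = 0
            np.random.shuffle(file_list)  # reshuffle each epoch
        file_chunk = file_list[i * batch_size:(i + 1) * batch_size]
        for path in file_chunk:
            archive = np.load(path)
            yield archive['data1'], archive['data2']
        i += 1  # moved out of the inner loop

# Synthetic .npz files matching the layout from the question.
tmp_dir = tempfile.mkdtemp()
files = []
for n in range(10):
    path = os.path.join(tmp_dir, 'sample_%d.npz' % n)
    np.savez(path, data1=np.zeros((456, 55)), data2=np.ones((456, 55)))
    files.append(path)

gen = npz_pair_generator(files, batch_size=4)
X, Y = next(gen)
print(X.shape, Y.shape)  # (456, 55) (456, 55)
```

Note that this still yields one file at a time, so each yield is a single (456, 55) pair rather than a batch.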

Recommended answer

If the .npz data is image-like, you can use ImageDataGenerator. It supports both flow_from_directory and flow_from_dataframe.
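Since the arrays here are numeric rather than images, another common reason training misbehaves is that the generator yields single samples while Keras's fit() expects batches with a leading batch axis. A NumPy-only sketch of stacking per-file yields into batches (the `batched` helper and the stand-in sample generator are mine, not from the question):

```python
import numpy as np

def batched(sample_gen, batch_size=8):
    # Stack single (rows, cols) samples into (batch, rows, cols)
    # arrays, the leading batch axis that Keras's fit() expects.
    while True:
        pairs = [next(sample_gen) for _ in range(batch_size)]
        xs, ys = zip(*pairs)
        yield np.stack(xs), np.stack(ys)

# Stand-in per-file generator with the shapes from the question.
def fake_samples():
    while True:
        yield np.zeros((456, 55)), np.ones((456, 55))

X, Y = next(batched(fake_samples(), batch_size=8))
print(X.shape, Y.shape)  # (8, 456, 55) (8, 456, 55)
```

The same wrapping works around any per-file generator, so the file-reading logic stays unchanged.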
