python - Reading multiple CSVs for Keras LSTM


Question

I'm trying to implement an LSTM network using Keras, but I'm having problems with reading the input. My dataset is in the form of multiple CSV files (all files have the same dimensions, 68x250, with each entry containing 2 values). There are about 200 CSV files, spread across various classes. Preview of one of the CSVs: [image not included]

How do I take these multiple CSVs as input?

Recommended answer

I did something similar recently. As Pedro said, you should use fit_generator and write a custom generator.

Here is an example of the generator:

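import numpy as np
import pandas as pd

# batch_size and pad_batch are not shown in the answer; both must be defined elsewhere.
def generator(files):
    print('start generator')
    while True:  # Keras expects the generator to loop indefinitely
        print('loop generator')
        for file in files:
            try:
                df = pd.read_csv(file)
                batches = int(np.ceil(len(df) / batch_size))
                for i in range(batches):
                    # Slice one batch of rows; the last batch may be shorter.
                    yield pad_batch(df.iloc[i * batch_size:(i + 1) * batch_size])
            except EOFError:
                print("error " + file)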

You pass the list of filenames to the generator, which iterates through the files and returns their content in batches. In my case, load_data is a function that reads the CSVs with pandas and does some preprocessing (the example above calls pd.read_csv directly instead). pad_batch does the padding for the LSTM.
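For the layout described in the question, where each CSV file is one sequence rather than a table of many rows, a variant of the same generator idea could batch whole files instead of rows. The following is only a minimal sketch under stated assumptions: `file_batch_generator` and its `labels` argument are hypothetical names, every file is assumed to be a header-less, purely numeric table of identical shape, and one integer class id per file is assumed to be available. Because all files share a shape under these assumptions, no padding step is needed here.

```python
import numpy as np
import pandas as pd


def file_batch_generator(files, labels, batch_size):
    """Yield (X, y) batches forever, treating each CSV file as one sequence.

    Assumptions (not from the original answer): every file holds a purely
    numeric table of identical shape (timesteps, features), has no header
    row, and `labels` holds one integer class id per file.
    """
    files = list(files)
    labels = np.asarray(labels)
    while True:  # Keras expects the generator to loop indefinitely
        for start in range(0, len(files), batch_size):
            chunk = files[start:start + batch_size]
            # Stack the per-file arrays into (batch, timesteps, features).
            X = np.stack([
                pd.read_csv(path, header=None).to_numpy(dtype="float32")
                for path in chunk
            ])
            y = labels[start:start + len(chunk)]
            yield X, y
```

With this variant, one pass over the data takes ceil(len(files) / batch_size) steps, and the last batch of each pass may contain fewer files than batch_size.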

Usage:

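# Note: in TensorFlow 2.1 and later, fit_generator is deprecated and
# model.fit accepts a generator directly.
model.fit_generator(
      generator=generator(trainingFiles),
      steps_per_epoch=steps,  # number of batches drawn from the generator per epoch
      epochs=num_epochs,
      validation_data=(x_test, y_test),  # Keras documents this as a tuple
      verbose=1)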
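The answer does not show how `steps` is computed. One plausible choice, assuming an epoch should cover every file exactly once, is the total number of batches the row-based generator yields in a single pass over all files. The helper below is a hypothetical sketch (the name `count_steps` is not from the answer); it mirrors the `ceil(len(df) / batch_size)` calculation used inside the generator.

```python
import math

import pandas as pd


def count_steps(files, batch_size):
    """Count the batches produced by one full pass over `files`.

    Assumes each file is read with pd.read_csv defaults (first line is a
    header) and split into row batches of at most `batch_size` rows.
    """
    total = 0
    for path in files:
        n_rows = len(pd.read_csv(path))
        total += math.ceil(n_rows / batch_size)
    return total
```

This reads every file once up front just to count rows, which is acceptable for a few hundred small CSVs but may be worth caching for larger datasets.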
