Read multiple csv files (size mxm) and load as an n dimensional array (size nxmxm) (not concatenate)

Views: 76 Published: 2021/4/27 19:51:10 python pandas csv numpy

Problem description

I'm working on a program that needs to load a large number of csv files (thousands of them) into an array.

The csv files each have dimension 45x100, and I want to create a 3-d array with dimension nx45x100. For now, I load each csv file with pd.read_csv() and convert it to an array with np.array(). I then build the 3-d array with np.array([data_0, data_1, ..., data_n]), which gives me an array with the required dimensions.

Although it works, it is very tedious. Is there a way to do this without reading and processing each csv file individually?

   # this is my current code
   import numpy as np
   import pandas as pd

   # read each file into its own DataFrame
   mBGS5L = pd.read_csv("strain5.csv")  # 45x100
   mBGS8L = pd.read_csv("strain8.csv")
   mBGS10L = pd.read_csv("strain10.csv")

   # convert each DataFrame to a 2-d numpy array
   mBGS5L_ = np.array(mBGS5L)
   mBGS8L_ = np.array(mBGS8L)
   mBGS10L_ = np.array(mBGS10L)

   # combine the 2-d arrays into one 3-d array
   mBGS = np.array([mBGS5L_, mBGS8L_, mBGS10L_])
   # mBGS.shape is (3, 45, 100)

Note: I have checked other stackoverflow links on loading multiple csv files into 1 dataframe, which is how I learned about glob for getting the list of all the csv files I need. My problem is that reading the files found by glob gives me a list and not a 3-d array, and I can't convert that list to a numpy array because it raises an error.

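   import numpy as np
   import pandas as pd
   from glob import glob

   strain = glob("strain*.csv")  # list of matching file names
   df = [pd.read_csv(f) for f in strain]  # list of DataFrames, one per file
   df_ = np.asarray(df)
   # this raises an error: cannot copy sequence with size 45 to array axis with dimension 30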
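
One possible reading of this error is that the files do not all parse to the same shape, since numpy can only build a 3-d array from 2-d pieces of identical shape. The following is a minimal sketch for checking that, assuming the files match the same strain*.csv pattern; the helper name report_shapes is hypothetical and not part of pandas or numpy.

```python
from glob import glob

import pandas as pd


def report_shapes(pattern, **read_csv_kwargs):
    """Return {file name: (rows, columns)} for every csv matching pattern.

    report_shapes is a hypothetical helper for illustration. Extra keyword
    arguments (for example header=None) are passed on to pd.read_csv.
    """
    shapes = {}
    for name in sorted(glob(pattern)):
        shapes[name] = pd.read_csv(name, **read_csv_kwargs).shape
    return shapes


# Example use: print the shape of every file so mismatches stand out.
for name, shape in report_shapes("strain*.csv").items():
    print(name, shape)
```

Any file whose shape differs from (45, 100) would be a candidate cause. Note that pd.read_csv treats the first line as a header by default, so a file without a header row loses one data row unless header=None is passed.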

Any help would be greatly appreciated. Thanks
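
For reference, one way the loading described in the question could be sketched is to read every file in a loop and combine the results with np.stack, which adds a new leading axis and raises a ValueError when the shapes differ. This is only a sketch under the assumption that every file parses to the same shape; load_csv_stack is a hypothetical name, not an existing library function.

```python
from glob import glob

import numpy as np
import pandas as pd


def load_csv_stack(pattern, **read_csv_kwargs):
    """Read every csv matching pattern into one (n, rows, cols) array.

    load_csv_stack is a hypothetical helper for illustration. Returns the
    stacked array together with the file names in the order they were used.
    """
    files = sorted(glob(pattern))  # glob order is arbitrary, so sort it
    if not files:
        raise FileNotFoundError(f"no files match {pattern!r}")
    arrays = [pd.read_csv(f, **read_csv_kwargs).to_numpy() for f in files]
    # np.stack raises ValueError if the 2-d arrays differ in shape.
    return np.stack(arrays), files
```

A call such as `mBGS, files = load_csv_stack("strain*.csv")` would then give an array of shape (n, 45, 100) if all n files are 45x100. The sort is lexicographic, so strain10.csv comes before strain5.csv; the returned file list records which slice belongs to which file.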
