Import multiple Excel files into Python pandas and concatenate them into one dataframe


Problem description

I would like to read several Excel files from a directory into pandas and concatenate them into one big dataframe, but I have not been able to figure it out. I need some help with the for loop and with building the concatenated dataframe.

Here is what I have so far:

import sys
import csv
import glob
import pandas as pd

# get data file names
path =r'C:\DRO\DCL_rawdata_files\excelfiles'
filenames = glob.glob(path + "/*.xlsx")

dfs = []

for df in dfs: 
    xl_file = pd.ExcelFile(filenames)
    df=xl_file.parse('Sheet1')
    dfs.concat(df, ignore_index=True)

Solution

As mentioned in the comments, one error you are making is that you are looping over an empty list.
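For comparison, here is a minimal sketch of the loop from the question with that error corrected. It iterates over the file names rather than over the empty dfs list, reads one file at a time (the original passed the whole list to pd.ExcelFile), and calls pd.concat once at the end, since a Python list has no concat method. The function name load_excel_folder and its reader parameter are hypothetical and exist only so the reading step can be swapped out; this is one possible arrangement, not the only one.

```python
import glob
import os

import pandas as pd


def load_excel_folder(folder, pattern="*.xlsx", sheet_name="Sheet1",
                      reader=pd.read_excel):
    """Read every matching workbook in `folder` and stack the results.

    `reader` is a hypothetical hook (defaults to pd.read_excel) so the
    file-reading step can be replaced, for example in tests.
    """
    # Loop over the file names, not over the (empty) list of results.
    filenames = sorted(glob.glob(os.path.join(folder, pattern)))

    # Read one file per iteration and collect the dataframes in a list.
    frames = [reader(name, sheet_name=sheet_name) for name in filenames]

    # pd.concat raises on an empty list, so fail with a clearer message.
    if not frames:
        raise FileNotFoundError(f"no files matching {pattern!r} in {folder!r}")

    # Concatenate once; ignore_index=True renumbers the rows 0..n-1.
    return pd.concat(frames, ignore_index=True)
```

With the path from the question, the call would look like load_excel_folder(r'C:\DRO\DCL_rawdata_files\excelfiles').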

Here is how I would do it, using an example of having 5 identical Excel files that are appended one after another.

(1) Imports:

import os
import pandas as pd

(2) List files:

path = os.getcwd()
files = os.listdir(path)
files

Output:

['.DS_Store',
 '.ipynb_checkpoints',
 '.localized',
 'Screen Shot 2013-12-28 at 7.15.45 PM.png',
 'test1 2.xls',
 'test1 3.xls',
 'test1 4.xls',
 'test1 5.xls',
 'test1.xls',
 'Untitled0.ipynb',
 'Werewolf Modelling',
 '~$Random Numbers.xlsx']

(3) Pick out 'xls' files:

files_xls = [f for f in files if f[-3:] == 'xls']
files_xls

Output:

['test1 2.xls', 'test1 3.xls', 'test1 4.xls', 'test1 5.xls', 'test1.xls']
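The slice test f[-3:] == 'xls' is case-sensitive and does not match .xlsx files. A slightly more defensive filter, sketched below, compares the real file extension instead. The helper name pick_excel_files is hypothetical, and skipping names that start with ~$ (the temporary lock files Excel creates while a workbook is open) is an added assumption rather than part of the original answer.

```python
import os


def pick_excel_files(names, extensions=(".xls", ".xlsx")):
    """Return the names whose extension is in `extensions`, sorted."""
    picked = []
    for name in names:
        # splitext returns (root, ext); lower() makes the test case-insensitive.
        ext = os.path.splitext(name)[1].lower()
        # Skip Excel's "~$..." lock files, which are not readable workbooks.
        if ext in extensions and not name.startswith("~$"):
            picked.append(name)
    return sorted(picked)


files = ['.DS_Store', 'test1 2.xls', 'test1.xls', 'Untitled0.ipynb',
         '~$Random Numbers.xlsx']
print(pick_excel_files(files))
```

Applied to the directory listing from step (2), this selects the same five test1 files as the slice-based filter.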

(4) Initialize empty dataframe:

df = pd.DataFrame()

(5) Loop over list of files to append to empty dataframe:

for f in files_xls:
    data = pd.read_excel(f, 'Sheet1')
    df = df.append(data)
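Note that DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0, so the loop above raises an AttributeError on current versions. A common replacement is to collect the dataframes in a list and call pd.concat once. The sketch below uses small in-memory dataframes as stand-ins for what pd.read_excel(f, 'Sheet1') would return; their values are illustrative only.

```python
import pandas as pd

# Stand-ins for the dataframes read from each Excel file (illustrative data).
frames = [
    pd.DataFrame({"Result": ["a", "b"], "Sample": [1, 2]}),
    pd.DataFrame({"Result": ["a", "b"], "Sample": [1, 2]}),
]

# One concat call replaces the repeated df = df.append(data).
df = pd.concat(frames)
print(df)
```

With real files, the list can be built directly from the file names, for example pd.concat([pd.read_excel(f, 'Sheet1') for f in files_xls]). Concatenating once is also cheaper than appending in a loop, because each append copied all the rows accumulated so far.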

(6) Enjoy your new dataframe. :-)

df

Output:

  Result  Sample
0      a       1
1      b       2
2      c       3
3      d       4
4      e       5
5      f       6
6      g       7
7      h       8
8      i       9
9      j      10
0      a       1
1      b       2
2      c       3
3      d       4
4      e       5
5      f       6
6      g       7
7      h       8
8      i       9
9      j      10
0      a       1
1      b       2
2      c       3
3      d       4
4      e       5
5      f       6
6      g       7
7      h       8
8      i       9
9      j      10
0      a       1
1      b       2
2      c       3
3      d       4
4      e       5
5      f       6
6      g       7
7      h       8
8      i       9
9      j      10
0      a       1
1      b       2
2      c       3
3      d       4
4      e       5
5      f       6
6      g       7
7      h       8
8      i       9
9      j      10
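In the output above the index restarts at 0 for every file, because each file's original row labels are kept. If a continuous index is preferred, pd.concat accepts ignore_index=True, or the combined dataframe can be renumbered afterwards with reset_index(drop=True). A minimal sketch with made-up data:

```python
import pandas as pd

# Illustrative stand-in for one file's contents.
part = pd.DataFrame({"Result": list("abc"), "Sample": [1, 2, 3]})

# Default behaviour: the row labels of each piece are preserved.
stacked = pd.concat([part, part])

# Option 1: renumber while concatenating.
renumbered = pd.concat([part, part], ignore_index=True)

# Option 2: renumber an already combined dataframe.
same = stacked.reset_index(drop=True)

print(list(stacked.index))
print(list(renumbered.index))
```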
