How can I iterate through multiple dataframes to select a column in each in Python?

Views: 146  Published: 2017/3/26 3:59:43  Tags: python pandas dataframe analytics jupyter

Problem description

For my project I'm reading in CSV files with data for every state in the US. My function converts each file into a separate DataFrame, as I need to perform operations on each state's information.

import pandas as pd

def RanktoDF(csvFile):
    df = pd.read_csv(csvFile)
    df = df[pd.notnull(df['Index'])]  # Drop rows where 'Index' is null
    df = df[df.Index != 'Index']      # Drop repeated header rows
    df = df.set_index('State')        # Set State as the index
    return df

I apply this function to each of my files and store each resulting DataFrame under a variable name built from my array varNames:

import glob

for name, s in zip(glob.glob('*.csv'), varNames):
    vars()["Crime" + s] = RanktoDF(name)

All of that works perfectly. My problem is that I also want to create a DataFrame that's made up of one column from each of those State DataFrames.

I have tried iterating through a list of my DataFrames, selecting the column I want (Population), and appending it to a new DataFrame:

[Image: contents of dfList]

dfNewIndex = pd.DataFrame(index=CrimeRank_1980_df.index) # Create new DF with Index


for name in dfList:  #dfList is my list of dataframes. See image
    newIndex = name['Population']
    dfNewIndex.append(newIndex)

    #dfNewIndex = pd.concat([dfNewIndex, dfList[name['Population']], axis=1)

My error is always the same, and it tells me that name is being treated as a string rather than an actual DataFrame:

TypeError                                 Traceback (most recent call last)
<ipython-input-30-5aa85b0174df> in <module>()
      3 
      4 for name in dfList:
----> 5     newIndex = name['Index']
      6     dfNewIndex.append(newIndex)
      7 #     dfNewIndex = pd.concat([dfNewIndex, dfList[name['Population']], axis=1)

TypeError: string indices must be integers

I understand that my list is a list of strings rather than variables/DataFrames, so my question is: how can I correct my code to do what I want, or is there an easier way of doing this?
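For reference, the error can be reproduced in isolation. The sketch below is only an illustration: `df_list` is a hypothetical stand-in for dfList and holds variable names as plain strings, which is what the traceback suggests the real list contains.

```python
# Hypothetical stand-in for dfList: variable names stored as plain strings.
df_list = ["CrimeRank_1980_df", "CrimeRank_1981_df"]

error_message = None
for name in df_list:
    try:
        # `name` is a str here, so this indexes a string with a string
        # instead of selecting a column from a DataFrame.
        name["Population"]
    except TypeError as exc:
        error_message = str(exc)
        break

print(error_message)
```

The loop never reaches a DataFrame at all; each `name` is just text, so the column lookup fails before pandas is involved.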

Any solutions I've looked up explicitly type out the DataFrames to be concatenated, but I have 50, so that's not really feasible. Any help would be appreciated.

Solution

One way would be to index into vars(), e.g.

for name in dfList:
    newIndex = vars()[name]["Population"]
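A slightly fuller sketch of the same lookup, under some assumptions: the two small frames and their values are made up for illustration, and `globals()` is used instead of a bare `vars()`. With no argument, `vars()` behaves like `locals()`, which matches `globals()` only at module level, so the explicit form should behave the same if the loop is later moved into a function.

```python
import pandas as pd

# Made-up stand-ins for two of the per-file frames (values are illustrative only).
states = pd.Index(["Alabama", "Alaska"], name="State")
globals()["CrimeRank_1980_df"] = pd.DataFrame({"Population": [100, 200]}, index=states)
globals()["CrimeRank_1981_df"] = pd.DataFrame({"Population": [110, 210]}, index=states)

dfList = ["CrimeRank_1980_df", "CrimeRank_1981_df"]  # names stored as strings

columns = {}
for name in dfList:
    # Look the DataFrame up by its variable name, then select the column.
    columns[name] = globals()[name]["Population"]

# A dict of Series becomes one column per key, aligned on the State index.
population_by_frame = pd.DataFrame(columns)
print(population_by_frame)
```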

Alternatively, I think it would be neater to store your DataFrames in a container and iterate through that, e.g.

frames = {}

for name, s in zip(glob.glob('*.csv'), varNames):
    frames["Crime" + s] = RanktoDF(name)

for name in frames:
    newIndex = frames[name]["Population"]
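To go from the per-frame column to the single combined DataFrame the question asks for, one option is `pd.concat` with `axis=1`. This is a sketch rather than a drop-in fix: the `frames` contents below are made-up stand-ins for what RanktoDF would return, and the key names are hypothetical. Note that the loop in the question calls `dfNewIndex.append(...)` without keeping the result; `DataFrame.append` returned a new object rather than modifying in place, and it was removed in pandas 2.0.

```python
import pandas as pd

# Hypothetical stand-ins for the frames RanktoDF would return.
states = pd.Index(["Alabama", "Alaska"], name="State")
frames = {
    "Crime1980": pd.DataFrame({"Index": [1, 2], "Population": [100, 200]}, index=states),
    "Crime1981": pd.DataFrame({"Index": [2, 1], "Population": [110, 210]}, index=states),
}

# Passing a dict of Series to concat with axis=1 gives one column per key,
# with rows aligned on the shared State index.
population = pd.concat(
    {name: df["Population"] for name, df in frames.items()},
    axis=1,
)
print(population)
```

Because the rows are aligned on the index, this relies on every frame using State as its index, which is what RanktoDF sets up.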
