Looping through a list of pandas dataframes

Views: 82 | Published: 2020/5/2 5:52:05 | Tags: python, list, pandas

Question

Two quick pandas questions.

I have a list of dataframes I would like to apply a filter to.

countries = [us, uk, france]
for df in countries:
    df = df[(df["Send Date"] > '2016-11-01') & (df["Send Date"] < '2016-11-30')]

When I run this, the df's don't change afterwards. Why is that? If I loop through the dataframes to create a new column, as below, this works fine, and changes each df in the list.

for df in countries:
    df["Continent"] = "Europe"

As a follow up question, I noticed something strange when I created a list of dataframes for different countries. I defined the list then applied transformations to each df in the list. After I transformed these different dfs, I called the list again. I was surprised to see that the list still pointed to the unchanged dataframes, and I had to redefine the list to update the results. Could anybody shed any light on why that is?

Recommended answer

Taking a look at this answer, you can see that for df in countries: is equivalent to something like

for idx in range(len(countries)):
    df = countries[idx]
    # do something with df

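which does not modify anything in your list: an assignment such as df = df[...] only rebinds the loop variable df to a new, filtered DataFrame, while countries[idx] still refers to the original object. Your second loop behaves differently because df["Continent"] = "Europe" changes the existing DataFrame in place instead of rebinding the name. It is also generally bad practice to modify a list while iterating over it in a loop like this.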
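To see the difference between rebinding the loop variable and changing the object it refers to, here is a minimal sketch. The two small frames and their dates are made-up sample data standing in for the real uk and france dataframes, and pd.Timestamp is used for the comparison instead of a date string.

```python
import pandas as pd

# Hypothetical sample data; the real frames are not shown in the question.
uk = pd.DataFrame({"Send Date": pd.to_datetime(["2016-10-15", "2016-11-10"])})
france = pd.DataFrame({"Send Date": pd.to_datetime(["2016-11-20", "2016-12-05"])})
countries = [uk, france]

cutoff = pd.Timestamp("2016-11-01")

# Rebinding: df now names a new, filtered object; the list is untouched.
for df in countries:
    df = df[df["Send Date"] > cutoff]
rows_after_rebinding = [len(frame) for frame in countries]

# In-place change: the column is added to the very object the list holds.
for df in countries:
    df["Continent"] = "Europe"
has_continent = ["Continent" in frame.columns for frame in countries]

print(rows_after_rebinding)  # both frames still have all their rows
print(has_continent)         # both frames gained the new column
```

The list elements are still the same objects as uk and france throughout, which is why only the in-place change is visible afterwards.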

A better approach would be a list comprehension; you can try something like

countries = [us, uk, france]
countries = [df[(df["Send Date"] > '2016-11-01') & (df["Send Date"] < '2016-11-30')]
             for df in countries]

Notice that with a list comprehension like this, we aren't actually modifying the original list - instead we are creating a new list, and assigning it to the variable which held our original list.
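A small sketch of what "creating a new list" means in practice, again with made-up sample frames (the data, the start and end names, and the original variable are assumptions for illustration only):

```python
import pandas as pd

# Hypothetical sample data standing in for the real frames.
us = pd.DataFrame({"Send Date": pd.to_datetime(["2016-10-15", "2016-11-10"])})
uk = pd.DataFrame({"Send Date": pd.to_datetime(["2016-11-20", "2016-12-05"])})

start, end = pd.Timestamp("2016-11-01"), pd.Timestamp("2016-11-30")

original = [us, uk]
countries = original
countries = [
    df[(df["Send Date"] > start) & (df["Send Date"] < end)]
    for df in countries
]

is_new_list = countries is not original          # a different list object
filtered_rows = [len(df) for df in countries]    # rows left after filtering
original_rows = [len(df) for df in original]     # the old list is unchanged

# The names us and uk still refer to the unfiltered frames;
# unpack the new list if those names should be updated as well.
us, uk = countries
```

The same reasoning applies to the follow-up question: a list stores references to objects, so rebinding a name such as us after the list has been built leaves the list pointing at the old object.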

Also, you might consider placing all of your data in a single DataFrame with an additional country column or something along those lines - Python-level loops are generally slower and a list of DataFrames is often much less convenient to work with than a single DataFrame, which can fully leverage the vectorized pandas methods.
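One possible way to build such a single DataFrame is sketched below. The sample data, the frames dictionary, and the column name "Country" are assumptions for illustration; pd.concat and DataFrame.assign are standard pandas calls.

```python
import pandas as pd

# Hypothetical sample data standing in for the real frames.
us = pd.DataFrame({"Send Date": pd.to_datetime(["2016-10-15", "2016-11-10"])})
uk = pd.DataFrame({"Send Date": pd.to_datetime(["2016-11-20", "2016-12-05"])})
france = pd.DataFrame({"Send Date": pd.to_datetime(["2016-11-05"])})

frames = {"us": us, "uk": uk, "france": france}

# Tag each frame with its country, then stack them into one DataFrame.
combined = pd.concat(
    [df.assign(Country=name) for name, df in frames.items()],
    ignore_index=True,
)

# One vectorized filter now covers every country at once.
start, end = pd.Timestamp("2016-11-01"), pd.Timestamp("2016-11-30")
mask = (combined["Send Date"] > start) & (combined["Send Date"] < end)
november = combined[mask]

# Per-country results remain available through groupby.
rows_per_country = november.groupby("Country").size().to_dict()
```

With this layout the date filter is written once, and a single country can still be pulled out with a condition on the "Country" column when needed.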
