Combining rows in pandas

Views: 43 | Published: 2020/5/23 22:38:28 | Tags: python, pandas


Question

I have a DataFrame whose index, called city_id, holds cities in the format [city],[state] (e.g., new york,ny), and whose columns contain integer counts. The problem is that I have multiple rows for the same city, and I want to collapse the rows sharing a city_id by adding their column values. I looked at groupby(), but it wasn't immediately obvious how to apply it to this problem.

An example: I'd like to change this:

city_id    val1 val2 val3
houston,tx    1    2    0
houston,tx    0    0    1
houston,tx    2    1    1

to this:

city_id    val1 val2 val3
houston,tx    3    3    2

if there are ~10-20k rows.

Recommended answer

Starting from

>>> df
              val1  val2  val3
city_id                       
houston,tx       1     2     0
houston,tx       0     0     1
houston,tx       2     1     1
somewhere,ew     4     3     7

I would probably do

>>> df.groupby(df.index).sum()
              val1  val2  val3
city_id                       
houston,tx       3     3     2
somewhere,ew     4     3     7

or

>>> df.reset_index().groupby("city_id").sum()
              val1  val2  val3
city_id                       
houston,tx       3     3     2
somewhere,ew     4     3     7
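
To try both calls outside an interactive session, here is a minimal self-contained sketch. It assumes the example frame can be rebuilt by hand as shown (the original post does not say how df was created), with city_id as a named index.

```python
import pandas as pd

# Rebuild the example frame by hand (an assumption: the post does not show
# how df was constructed). city_id is the index, not a column.
df = pd.DataFrame(
    {
        "val1": [1, 0, 2, 4],
        "val2": [2, 0, 1, 3],
        "val3": [0, 1, 1, 7],
    },
    index=pd.Index(
        ["houston,tx", "houston,tx", "houston,tx", "somewhere,ew"],
        name="city_id",
    ),
)

# Approach 1: group on the index values directly.
by_index = df.groupby(df.index).sum()

# Approach 2: turn the index into a column, then group on that column.
by_column = df.reset_index().groupby("city_id").sum()

# Each result has one row per city_id, with the column values added up.
print(by_index)
print(by_column)
```

Both results should contain the same summed values, so the choice between them is mostly a matter of whether city_id is more convenient as an index or as a column elsewhere in your code.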

The first approach passes the index values (here, the city_id values) to groupby and tells it to use them as the group keys. The second resets the index, which turns city_id into a regular column, and then groups on that column by name. See the group-by section of the pandas documentation for more examples. Note that DataFrameGroupBy objects have many other methods, too:

>>> df.groupby(df.index)
<pandas.core.groupby.DataFrameGroupBy object at 0x1045a1790>
>>> df.groupby(df.index).max()
              val1  val2  val3
city_id                       
houston,tx       2     2     1
somewhere,ew     4     3     7
>>> df.groupby(df.index).mean()
              val1  val2      val3
city_id                           
houston,tx       1     1  0.666667
somewhere,ew     4     3  7.000000
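
If different columns need different aggregations, the same grouped object can be reused. The sketch below is an illustration rather than part of the original answer: it assumes the same hand-built example frame, groups by the index level name with groupby(level="city_id"), and applies a per-column mapping through agg. The choice of which function goes with which column is arbitrary.

```python
import pandas as pd

# Same hand-built example frame as in the answer (construction is assumed).
df = pd.DataFrame(
    {"val1": [1, 0, 2, 4], "val2": [2, 0, 1, 3], "val3": [0, 1, 1, 7]},
    index=pd.Index(
        ["houston,tx", "houston,tx", "houston,tx", "somewhere,ew"],
        name="city_id",
    ),
)

# Group by the named index level instead of passing df.index.
grouped = df.groupby(level="city_id")

# Apply a different aggregation to each column (illustrative choice).
summary = grouped.agg({"val1": "sum", "val2": "max", "val3": "mean"})

# Count how many original rows were collapsed into each city.
rows_per_city = grouped.size()

print(summary)
print(rows_per_city)
```

Keeping the grouped object in a variable avoids repeating the groupby call when several summaries of the same groups are needed.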
