Concatenate multiple pandas Series efficiently
Question
I understand that I can use combine_first to merge two series:
import pandas as pd
series1 = pd.Series([1,2,3,4,5],index=['a','b','c','d','e'])
series2 = pd.Series([1,2,3,4,5],index=['f','g','h','i','j'])
series3 = pd.Series([1,2,3,4,5],index=['k','l','m','n','o'])
Combine1 = series1.combine_first(series2)
print(Combine1)
Output:
a 1.0
b 2.0
c 3.0
d 4.0
e 5.0
f 1.0
g 2.0
h 3.0
i 4.0
j 5.0
dtype: float64
What if I need to merge three or more series?
I know that print(series1 + series2 + series3) yields:
a NaN
b NaN
c NaN
d NaN
e NaN
f NaN
...
dtype: float64
Can I merge multiple series efficiently without calling combine_first multiple times?
Thanks
Recommended answer
Combining Series with non-overlapping indexes
To combine Series vertically, use pd.concat.
# Setup
series_list = [
pd.Series(range(1, 6), index=list('abcde')),
pd.Series(range(1, 6), index=list('fghij')),
pd.Series(range(1, 6), index=list('klmno'))
]
pd.concat(series_list)
a 1
b 2
c 3
d 4
e 5
f 1
g 2
h 3
i 4
j 5
k 1
l 2
m 3
n 4
o 5
dtype: int64
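If you want pandas to confirm that the indexes really do not overlap, pd.concat accepts a verify_integrity flag. The following is a minimal sketch, not part of the original answer; the `overlapping` list is hypothetical data added only to show the failure case.

```python
import pandas as pd

series_list = [
    pd.Series(range(1, 6), index=list('abcde')),
    pd.Series(range(1, 6), index=list('fghij')),
    pd.Series(range(1, 6), index=list('klmno')),
]

# verify_integrity=True makes concat raise ValueError if any index label repeats
combined = pd.concat(series_list, verify_integrity=True)

# Hypothetical input with a shared label 'b', used only to trigger the check
overlapping = [
    pd.Series([1, 2], index=['a', 'b']),
    pd.Series([3, 4], index=['b', 'c']),
]
try:
    pd.concat(overlapping, verify_integrity=True)
    raised = False
except ValueError:
    raised = True
```

The check costs extra time on large inputs, so it is probably best kept for cases where silent duplicate labels would be a bug.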
Combining Series with overlapping indexes
series_list = [
pd.Series(range(1, 6), index=list('abcde')),
pd.Series(range(1, 6), index=list('abcde')),
pd.Series(range(1, 6), index=list('kbmdf'))
]
If the Series have overlapping indexes, you can either add up the values that share a label,
pd.concat(series_list, axis=1, sort=False).sum(axis=1)
a 2.0
b 6.0
c 6.0
d 12.0
e 10.0
k 1.0
m 3.0
f 5.0
dtype: float64
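The same per-label totals can also be computed by stacking the Series end to end and grouping on the index. This is a sketch of an alternative, not the answer's own method; it assumes you are fine with the result being sorted by label, which is the groupby default. Since no NaN placeholders are introduced along the way, the integer dtype should be preserved.

```python
import pandas as pd

series_list = [
    pd.Series(range(1, 6), index=list('abcde')),
    pd.Series(range(1, 6), index=list('abcde')),
    pd.Series(range(1, 6), index=list('kbmdf')),
]

# Stack vertically (duplicate labels are kept), then sum the values per label
stacked = pd.concat(series_list)
totals = stacked.groupby(level=0).sum()
```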
or, if you want to keep only the first or last value for each duplicated label, drop the duplicated index entries.
res = pd.concat(series_list, axis=0)
# keep first value
res[~res.index.duplicated(keep='first')]
# keep last value
res[~res.index.duplicated(keep='last')]
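In the series_list above, every repeated label happens to carry the same value, so keep='first' and keep='last' return the same values. The sketch below uses small hypothetical Series (first_series and second_series are illustration-only names) where the shared label 'b' has different values, to make the difference visible.

```python
import pandas as pd

# Hypothetical data: label 'b' appears in both Series with different values
first_series = pd.Series([1, 2], index=['a', 'b'])
second_series = pd.Series([20, 30], index=['b', 'c'])

res = pd.concat([first_series, second_series])

# keep='first' keeps b=2 from first_series
keep_first = res[~res.index.duplicated(keep='first')]

# keep='last' keeps b=20 from second_series
keep_last = res[~res.index.duplicated(keep='last')]
```

Keeping the first value gives the same precedence as chaining combine_first from left to right, at least when the earlier Series contain no NaN.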