Get last "column" after .str.split() operation on column in pandas DataFrame
Question
I have a column in a pandas DataFrame that I would like to split on a single space. The splitting is simple enough with .str.split(' ') on the column, but I can't make a new column from the last entry. When I .str.split() the column I get a list of arrays, and I don't know how to manipulate it to get a new column for my DataFrame.
Here is an example. Each entry in the column contains 'symbol date price', and I would like to split off the price (and eventually remove the leading "p" or "c", each of which appears in about half the cases).
import pandas as pd
temp = pd.DataFrame({'ticker' : ['spx 5/25/2001 p500', 'spx 5/25/2001 p600', 'spx 5/25/2001 p700']})
temp2 = temp.ticker.str.split(' ')
which yields
0 ['spx', '5/25/2001', 'p500']
1 ['spx', '5/25/2001', 'p600']
2 ['spx', '5/25/2001', 'p700']
But temp2[0] just gives one list entry's array, and temp2[:][-1] fails. How can I convert the last entry in each array to a new column? Thanks!
Recommended answer
You could use the tolist method as an intermediary:
In [99]: import pandas as pd
In [100]: d1 = pd.DataFrame({'ticker' : ['spx 5/25/2001 p500', 'spx 5/25/2001 p600', 'spx 5/25/2001 p700']})
In [101]: d1.ticker.str.split().tolist()
Out[101]:
[['spx', '5/25/2001', 'p500'],
['spx', '5/25/2001', 'p600'],
['spx', '5/25/2001', 'p700']]
From which you could make a new DataFrame:
In [102]: d2 = pd.DataFrame(d1.ticker.str.split().tolist(),
.....: columns="symbol date price".split())
In [103]: d2
Out[103]:
symbol date price
0 spx 5/25/2001 p500
1 spx 5/25/2001 p600
2 spx 5/25/2001 p700
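As an alternative to going through tolist, newer pandas versions let .str.split build the DataFrame directly. The following is a minimal sketch, assuming a pandas release that supports the expand parameter; it reuses the d1 frame and the column names from above:

```python
import pandas as pd

d1 = pd.DataFrame({'ticker': ['spx 5/25/2001 p500',
                              'spx 5/25/2001 p600',
                              'spx 5/25/2001 p700']})

# expand=True returns a DataFrame with one column per token
# instead of a Series of lists.
d2 = d1['ticker'].str.split(expand=True)

# The new columns are numbered 0, 1, 2, so name them explicitly.
d2.columns = ['symbol', 'date', 'price']
print(d2)
```

This only works cleanly when every row splits into the same number of tokens; rows with fewer tokens are padded with missing values.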
For good measure, you could fix the price:
In [104]: d2["price"] = d2["price"].str.replace("p","").astype(float)
In [105]: d2
Out[105]:
symbol date price
0 spx 5/25/2001 500
1 spx 5/25/2001 600
2 spx 5/25/2001 700
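The question mentions that the prefix may be "c" as well as "p". One way to handle both is a small regular expression. This is only a sketch: the prices Series below is made-up sample data, and the regex keyword assumes a pandas version that accepts it:

```python
import pandas as pd

# Hypothetical sample with both prefixes mentioned in the question.
prices = pd.Series(['p500', 'c600', 'p700'])

# Keep the prefix letter in case it is needed later.
kind = prices.str[0]

# Remove a single leading "p" or "c", then convert to float.
numeric = prices.str.replace(r'^[pc]', '', regex=True).astype(float)
print(kind.tolist(), numeric.tolist())
```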
PS: but if you really just want the last column, apply would suffice:
In [113]: temp2.apply(lambda x: x[2])
Out[113]:
0 p500
1 p600
2 p700
Name: ticker
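Note that x[2] picks the third token, which is the last one only because every entry here has exactly three. If the goal is the last token regardless of count, the .str accessor can index into each list directly. A minimal sketch, reusing the temp frame from the question (the price column name is just illustrative):

```python
import pandas as pd

temp = pd.DataFrame({'ticker': ['spx 5/25/2001 p500',
                                'spx 5/25/2001 p600',
                                'spx 5/25/2001 p700']})

# .str[-1] takes the last element of each list produced by split.
temp['price'] = temp['ticker'].str.split(' ').str[-1]

# rsplit with n=1 splits only once, from the right, so the rest
# of the string is not broken up unnecessarily.
last = temp['ticker'].str.rsplit(' ', n=1).str[-1]
print(temp)
```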