Convert column elements to column name in pandas


Problem description

I have a large .csv file that is constantly being updated in real time. It contains several thousand lines in the following format:

 time1,stockA,bid,1
 time2,stockA,ask,1.1
 time3,stockB,ask,2.1
 time4,stockB,bid,2.0
 time5,stockA,bid,1.1
 time6,stockA,ask,1.2



What is the fastest way to read this into a dataframe that looks like this:

   time     stock       bid    ask
   time1    stockA      1      
   time2    stockA             1.1
   time3    stockB             2.1
   time4    stockB      2.0    
   time5    stockA      1.1
   time6    stockA             1.2

Any help would be appreciated.

Recommended answer

You can use read_csv, specify header=None because the file has no header row, and pass the column names as a list via names:

In [124]:

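import io
import pandas as pd

t="""time1,stockA,bid,1
time2,stockA,ask,1.1
time3,stockB,ask,2.1
time4,stockB,bid,2.0"""

# The data has no header row, so the column names are supplied explicitly
df = pd.read_csv(io.StringIO(t), header=None, names=['time', 'stock', 'bid', 'ask'])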
df
Out[124]:
     time   stock  bid  ask
0   time1  stockA  bid  1.0
1   time2  stockA  ask  1.1
2   time3  stockB  ask  2.1
3   time4  stockB  bid  2.0

You'll then have to re-encode the bid column, which still holds the strings 'bid' and 'ask', as 1 or 2:

In [126]:

df['bid'] = df['bid'].replace('bid', 1)
df['bid'] = df['bid'].replace('ask', 2)
df
Out[126]:
     time   stock  bid  ask
0   time1  stockA    1  1.0
1   time2  stockA    2  1.1
2   time3  stockB    2  2.1
3   time4  stockB    1  2.0
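The two replace calls above can also be written as a single lookup. The following is a minimal sketch rather than part of the original answer; it assumes the third column only ever contains 'bid' or 'ask', and the names `raw` and `side_codes` are illustrative.

```python
import io

import pandas as pd

raw = """time1,stockA,bid,1
time2,stockA,ask,1.1
time3,stockB,ask,2.1
time4,stockB,bid,2.0"""

df = pd.read_csv(io.StringIO(raw), header=None,
                 names=['time', 'stock', 'bid', 'ask'])

# Map both labels in one pass; a label missing from the dict would become NaN
side_codes = {'bid': 1, 'ask': 2}
df['bid'] = df['bid'].map(side_codes)
print(df)
```

Unlike replace, map turns unexpected labels into NaN, which can make malformed rows easier to spot.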

Edit

Based on your updated sample data and desired output, the following works:

In [29]:

import io
import pandas as pd

t="""time1,stockA,bid,1
time2,stockA,ask,1.1
time3,stockB,ask,2.1
time4,stockB,bid,2.0
time5,stockA,bid,1.1
time6,stockA,ask,1.2"""

df = pd.read_csv(io.StringIO(t), header=None, names=['time', 'stock', 'bid', 'ask'])
df
Out[29]:
     time   stock  bid  ask
0   time1  stockA  bid  1.0
1   time2  stockA  ask  1.1
2   time3  stockB  ask  2.1
3   time4  stockB  bid  2.0
4   time5  stockA  bid  1.1
5   time6  stockA  ask  1.2
In [30]:

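# Cast to object first so empty strings can sit next to the floats (recent pandas versions warn on or reject this otherwise)
df['ask'] = df['ask'].astype(object)
# For bid rows, move the price into 'bid' and blank out 'ask'; for ask rows, blank out 'bid'
df.loc[df['bid'] == 'bid', 'bid'] = df['ask']
df.loc[df['bid'] != 'ask', 'ask'] = ''
df.loc[df['bid'] == 'ask','bid'] = ''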
df
Out[30]:
     time   stock  bid  ask
0   time1  stockA    1     
1   time2  stockA       1.1
2   time3  stockB       2.1
3   time4  stockB    2     
4   time5  stockA  1.1     
5   time6  stockA       1.2
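If numeric columns are preferred over empty strings, the same layout can be built with NaN for the missing side. This is a sketch of an alternative, not the method from the answer above; the column names `side` and `price` and the variable names `raw`, `long_df`, and `wide` are assumptions chosen for illustration.

```python
import io

import pandas as pd

raw = """time1,stockA,bid,1
time2,stockA,ask,1.1
time3,stockB,ask,2.1
time4,stockB,bid,2.0
time5,stockA,bid,1.1
time6,stockA,ask,1.2"""

# 'side' and 'price' are illustrative names for the third and fourth fields
long_df = pd.read_csv(io.StringIO(raw), header=None,
                      names=['time', 'stock', 'side', 'price'])

# Keep the price only where the side matches; the other cells become NaN
wide = long_df[['time', 'stock']].copy()
wide['bid'] = long_df['price'].where(long_df['side'] == 'bid')
wide['ask'] = long_df['price'].where(long_df['side'] == 'ask')
print(wide)
```

Because bid and ask stay float columns here, later arithmetic such as computing a spread should work without converting strings back to numbers.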
