Pandas: Drop() int64 based on value returns object
Question
I need to drop all rows where one column is below a certain value. I used the commands below, but they return the column as object. I need to keep it as int64:
df["customer_id"] = df.drop(df["customer_id"][df["customer_id"] < 9999999].index)
df = df.dropna()
Afterwards I tried to re-cast the field as int64, but this raises the following error, which involves data from a totally different column:
invalid literal for long() with base 10: '2014/03/09 11:12:27'
Recommended answer
I think you need boolean indexing with reset_index:
import pandas as pd

df = pd.DataFrame({'a': ['s', 'd', 'f', 'g'],
                   'customer_id': [99999990, 99999997, 1000, 8888]})
print(df)
   a  customer_id
0  s     99999990
1  d     99999997
2  f         1000
3  g         8888
df1 = df[df["customer_id"] > 9999999].reset_index(drop=True)
print(df1)
   a  customer_id
0  s     99999990
1  d     99999997
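To confirm that the filtered column stays int64, a minimal sketch along these lines may help. It reuses the sample frame from above; the explicit `dtype="int64"` and the name `filtered` are additions for illustration only. It also uses `>=`, which is the exact complement of the `< 9999999` condition in the question, whereas `>` would additionally drop a row equal to 9999999.

```python
import pandas as pd

# Sample data from the answer; dtype is pinned to int64 for the illustration
df = pd.DataFrame({
    "a": ["s", "d", "f", "g"],
    "customer_id": pd.Series([99999990, 99999997, 1000, 8888], dtype="int64"),
})

# Keep rows at or above the threshold, then renumber the index from 0
filtered = df[df["customer_id"] >= 9999999].reset_index(drop=True)

# Boolean indexing only selects rows, so the column dtype is unchanged
print(filtered["customer_id"].dtype)  # int64
```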
Solution with drop, but it is slower:
df2 = df.drop(df.loc[df["customer_id"] < 9999999, 'customer_id'].index)
print(df2)
   a  customer_id
0  s     99999990
1  d     99999997
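One point worth noting is that `drop` returns a whole DataFrame, so its result is assigned back to a frame here rather than to a single column as in the question's code. The sketch below, under the same sample data (the names `mask`, `dropped` and `kept` are hypothetical), suggests that the drop variant and the inverted boolean mask give the same rows and leave the dtype alone.

```python
import pandas as pd

# Sample data from the answer; dtype is pinned to int64 for the illustration
df = pd.DataFrame({
    "a": ["s", "d", "f", "g"],
    "customer_id": pd.Series([99999990, 99999997, 1000, 8888], dtype="int64"),
})

# Rows that should be removed
mask = df["customer_id"] < 9999999

# Variant 1: drop the rows by their index labels
dropped = df.drop(df[mask].index)

# Variant 2: keep everything that does not match the mask
kept = df[~mask]

print(dropped.equals(kept))          # True
print(dropped["customer_id"].dtype)  # int64
```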
Timings:
In [12]: %timeit df[df["customer_id"] > 9999999].reset_index(drop=True)
1000 loops, best of 3: 676 µs per loop
In [13]: %timeit (df.drop(df.loc[df["customer_id"] < 9999999, 'customer_id'].index))
1000 loops, best of 3: 921 µs per loop
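Outside IPython, a comparable measurement could be sketched with the standard `timeit` module. The helper names `with_mask` and `with_drop` and the repetition count are assumptions made for this example, and the absolute numbers will differ by machine and pandas version, so no expected output is shown.

```python
import timeit

import pandas as pd

df = pd.DataFrame({
    "a": ["s", "d", "f", "g"],
    "customer_id": [99999990, 99999997, 1000, 8888],
})

def with_mask():
    # Boolean indexing followed by an index reset
    return df[df["customer_id"] > 9999999].reset_index(drop=True)

def with_drop():
    # Drop the rows whose customer_id is below the threshold
    return df.drop(df.loc[df["customer_id"] < 9999999, "customer_id"].index)

# Total seconds for 200 calls of each variant
t_mask = timeit.timeit(with_mask, number=200)
t_drop = timeit.timeit(with_drop, number=200)

print(f"boolean indexing: {t_mask:.4f} s for 200 calls")
print(f"drop:             {t_drop:.4f} s for 200 calls")
```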