Check if a certain value is contained in a DataFrame column in pandas

Question

I am trying to check whether a certain value is contained in a pandas DataFrame column. I'm using df.date.isin(['07311954']), which I don't doubt is a good tool. The problem is that I have over 350K rows, and the output does not show all of them, so I cannot see whether the value is actually present. Put simply, I just want to know (Y/N) whether a specific value is contained in a column. My code follows:

import numpy as np
import pandas as pd
import glob

# Pipe-delimited file with no header row; the 14th column is named 'date'.
df = pd.read_csv(
    '/home/jayaramdas/anaconda3/Thesis/FEC_data/itpas2_data/itpas214.txt',
    sep='|', header=None, low_memory=False,
    names=['1', '2', '3', '4', '5', '6', '7', '8', '9', '10', '11', '12',
           '13', 'date', '15', '16', '17', '18', '19', '20', '21', '22'])

# Returns a boolean Series with one entry per row, not a single yes/no.
df.date.isin(['07311954'])


Recommended answer

I think you need str.contains if you want the rows where the value in column date contains the string 07311954:

print(df[df['date'].astype(str).str.contains('07311954')])

Or, if the date column already has string dtype:

print(df[df['date'].str.contains('07311954')])
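Since the question asks only for a yes/no answer, one option is to reduce the boolean mask to a single value with .any(). The following is a minimal sketch on a small made-up frame (the values are borrowed from the sample in this answer; the frame itself is an assumption, not the asker's file). Passing regex=False treats the search term as a literal string.

```python
import pandas as pd

# Hypothetical stand-in for the real file: integer dates, as in the sample output.
df = pd.DataFrame({'date': [8152007, 9262007, 7311954, 2252011]})

# str.contains yields one boolean per row; .any() reduces them to a single bool.
mask = df['date'].astype(str).str.contains('7311954', regex=False)
found = bool(mask.any())
print(found)

# The same reduction for a value that does not occur in the column.
missing = bool(df['date'].astype(str).str.contains('1231999', regex=False).any())
print(missing)
```

The same .any() call can be appended to an isin(...) mask when an exact match is wanted rather than a substring match.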

If you want to check whether the last 4 digits in column date are 1954:

print(df[df['date'].astype(str).str[-4:].str.contains('1954')])
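As a possible alternative to slicing the last four characters and then searching them, str.endswith tests the suffix directly. This is only a sketch on invented values (the third one is chosen so that 1954 appears at the start rather than the end), not part of the original answer.

```python
import pandas as pd

# Made-up integer dates; the last value contains '1954' but does not end with it.
dates = pd.Series([8152007, 7311954, 1954201], name='date')

as_text = dates.astype(str)
by_slice = as_text.str[-4:].str.contains('1954')   # approach used in the answer
by_suffix = as_text.str.endswith('1954')           # direct suffix test

print(by_suffix.tolist())
print(by_slice.equals(by_suffix))
```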

Sample:

print(df['date'])
0    8152007
1    9262007
2    7311954
3    2252011
4    2012011
5    2012011
6    2222011
7    2282011
Name: date, dtype: int64

print(df['date'].astype(str).str[-4:].str.contains('1954'))
0    False
1    False
2     True
3    False
4    False
5    False
6    False
7    False
Name: date, dtype: bool

print(df[df['date'].astype(str).str[-4:].str.contains('1954')])
     cmte_id trans_typ entity_typ state  employer  occupation     date  \
2  C00119040       24K        CCM    MD       NaN         NaN  7311954   

   amount     fec_id    cand_id  
2    1000  C00140715  H2MD05155  
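The sample shows date stored as int64, so a value written as 07311954 in the file is held as the integer 7311954, and isin(['07311954']) compares a string against integers and cannot match. One possible workaround, assuming the column can be read as text through the dtype argument of read_csv, is sketched below. The two-column inline data is invented for illustration and stands in for the real file.

```python
import io
import pandas as pd

# Invented pipe-delimited sample standing in for the real file.
raw = "C00119040|07311954\nC00140715|02252011\n"

# Default parsing turns the date into an integer and drops the leading zero.
as_int = pd.read_csv(io.StringIO(raw), sep='|', header=None,
                     names=['cmte_id', 'date'])
print(as_int['date'].isin(['07311954']).any())

# Reading the column as str keeps '07311954' intact, so isin can match it.
as_str = pd.read_csv(io.StringIO(raw), sep='|', header=None,
                     names=['cmte_id', 'date'], dtype={'date': str})
print(as_str['date'].isin(['07311954']).any())
```

With the column kept as text, the original isin call combined with .any() gives the single yes/no answer the question asks for.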
