Search entire Excel sheet with Pandas for word(s)

Question

I am trying to essentially replicate the Find function (Ctrl+F) in Python with Pandas. I want to search an entire sheet (all rows and columns) to see whether any cell on the sheet contains a word, and then print out the row in which the word was found. I'd like to do this across multiple sheets as well.

I have imported the sheet:

pdTestDataframe = pd.read_excel(TestFile, sheet_name="Sheet Name", 
keep_default_na= False, na_values=[""])

I then tried to build a list of columns that I could use to index into the values of every cell, but it still excludes many of the cells in the sheet. The attempted code is below.

columnList = []
for i, data in enumerate(pdTestDataframe.columns):
    columnList.append(pdTestDataframe.columns[i])
for j, data1 in enumerate(pdTestDataframe.index):
    print(pdTestDataframe[columnList[i]][j])

I want to make sure that, no matter how the Excel sheet is formatted, every cell that contains data can be searched for the word(s). Would love any help I can get!

Recommended answer

Pandas has a different way of thinking about this. Calling df[df.text_column.str.contains('whatever')] returns all the rows in which the text appears in one specific column. To search the entire dataframe, you can use:

import numpy as np
# r"\^" matches a literal caret; replace it with the word (or regex) to find
mask = np.column_stack([df[col].str.contains(r"\^", na=False) for col in df])
df.loc[mask.any(axis=1)]

(Source: here)
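The snippet above relies on the `.str` accessor, which only works on string-like columns, and it handles a single dataframe. The following is a minimal sketch of one way to extend the same idea to mixed-type columns and to several sheets. The helper names `find_rows` and `search_sheets` are hypothetical, and the sample data stands in for a real workbook. It assumes the sheets arrive as a dict of dataframes keyed by sheet name, which is what `pd.read_excel(path, sheet_name=None)` returns.

```python
import re

import pandas as pd


def find_rows(df: pd.DataFrame, word: str, case: bool = False) -> pd.DataFrame:
    """Return the rows of df in which any cell contains word (hypothetical helper)."""
    pattern = re.escape(word)  # treat the search term literally, not as a regex
    # Cast every cell to str so numeric and date columns can be searched too.
    as_text = df.astype(str)
    contains = as_text.apply(lambda s: s.str.contains(pattern, case=case, na=False))
    # Empty cells become strings like "nan" after the cast, so mask them out.
    mask = contains & df.notna()
    return df[mask.any(axis=1)]


def search_sheets(sheets: dict, word: str) -> dict:
    """Search every sheet; keep only the sheets that have matching rows."""
    hits = {}
    for name, frame in sheets.items():
        found = find_rows(frame, word)
        if not found.empty:
            hits[name] = found
    return hits


# Sample data standing in for pd.read_excel(path, sheet_name=None)
sheets = {
    "Orders": pd.DataFrame({"item": ["Apple pie", "Tea"], "qty": [3, 12]}),
    "Notes": pd.DataFrame({"text": ["nothing here", None]}),
}

hits = search_sheets(sheets, "apple")
for name, rows in hits.items():
    print(f"Sheet: {name}")
    print(rows)
```

The match is case-insensitive by default, to mirror a typical Ctrl+F search; pass `case=True` to `find_rows` for an exact-case match. Note that numbers are matched against their string form, so searching for "12" also matches a cell containing 120.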
