Search entire Excel sheet with Pandas for word(s)
Question
I am trying to essentially replicate the Find function (Ctrl+F) in Python with Pandas. I want to search an entire sheet (all rows and columns) to see if any of the cells on the sheet contain a word, and then print out the row in which the word was found. I'd like to do this across multiple sheets as well.
I have imported the sheet:
import pandas as pd
pdTestDataframe = pd.read_excel(TestFile, sheet_name="Sheet Name",
                                keep_default_na=False, na_values=[""])
I then tried to build a list of columns that I could use to index into the values of all of the cells, but it still excludes many of the cells in the sheet. The attempted code is below.
columnList = []
for i, data in enumerate(pdTestDataframe.columns):
    columnList.append(pdTestDataframe.columns[i])
    for j, data1 in enumerate(pdTestDataframe.index):
        print(pdTestDataframe[columnList[i]][j])
I want to make sure that, no matter how the Excel sheet is formatted, all cells with data inside can be searched for the word(s). Would love any help I can get!
Recommended answer
Pandas has a different way of thinking about this. Calling df[df.text_column.str.contains('whatever')] shows you all the rows in which the text appears in one specific column. To search the entire dataframe, you can use:
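import numpy as np
# r"\^" matches a literal caret character; replace it with the word to search for.
mask = np.column_stack([df[col].str.contains(r"\^", na=False) for col in df])
df.loc[mask.any(axis=1)]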
(Source: here)
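One caveat with the snippet above: the .str accessor raises an AttributeError on columns that are not string-typed, which is common for sheets loaded with read_excel. Below is a minimal sketch of one way around that, extended to several sheets. It is not the original answer's method: the function names find_rows and find_in_sheets and the sample data are hypothetical, and the sketch assumes the search word should be matched literally and case-insensitively.

```python
import re

import pandas as pd


def find_rows(df, word, case=False):
    """Return the rows of df in which any cell contains word (hypothetical helper)."""
    # Cast every cell to text so numeric and date columns can be searched too.
    text = df.astype(str)
    # Escape the word so characters such as "." or "^" are matched literally.
    pattern = re.escape(word)
    mask = text.apply(lambda col: col.str.contains(pattern, case=case, na=False))
    # Ignore cells that were missing before the cast (NaN may become the text "nan").
    mask = mask & df.notna()
    return df.loc[mask.any(axis=1)]


def find_in_sheets(sheets, word):
    """Search a {sheet name: DataFrame} mapping and keep only sheets with hits."""
    hits = {}
    for name, frame in sheets.items():
        rows = find_rows(frame, word)
        if not rows.empty:
            hits[name] = rows
    return hits


# In-memory stand-in for a workbook (made-up data for illustration).
sheets = {
    "Sheet1": pd.DataFrame({"a": ["apple pie", "pear"], "b": [1, 2]}),
    "Sheet2": pd.DataFrame({"c": ["plum", "cherry"], "d": ["x", "Apple"]}),
    "Sheet3": pd.DataFrame({"e": ["nothing here"]}),
}
results = find_in_sheets(sheets, "apple")
for name, rows in results.items():
    print(name)
    print(rows)
```

With a real workbook, pd.read_excel(path, sheet_name=None) returns a dictionary of DataFrames keyed by sheet name, which can be passed straight to find_in_sheets. If the first row of a sheet holds data rather than headers, passing header=None to read_excel keeps that row searchable as well.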