Count the occurrences of a string, case-insensitive search

Views: 118 Published: 2019/6/14 1:17:28 Python


Problem description

I am trying to read a file and count the occurrences of a string regardless of case (upper or lower), but my code does not give the desired result.

Why is that? Also, how can I make my search case-insensitive?
The code is:

import os,re


fileName_path = input ("Please input the file name with location: ")
directory = os.path.dirname(fileName_path)
os.chdir(directory)

fileName = os.path.basename(fileName_path)
openFile = open(fileName ,"r")

cnt = 0

with openFile as readFile:
    for searchpattern in readFile:
        if 'tempCharSearch' in searchpattern:
            cnt += 1

openFile.close()
print (cnt)

The text file contains 14 occurrences of tempCharSearch, but the result shows only 3. Why is that?
The text file is attached here:

Lorem Ipsum is simply dummy text of tempCharSearch:='100-111-875' printing and typesetting industry. Lorem Ipsum has been tempCharSearch:='100-111-875' industry's standard dummy text ever since tempCharSearch:='100-111-875' 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book.

It has survived not only five centuries, but also tempCharSearch:='100-111-875' leap into electronic typesetting, remaining essentially unchanged. It was popularised in tempCharSearch:='100-111-875' 1960s with tempCharSearch:='100-111-875' release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum.

tempCharSearch:='100-111-875're are many variations of passages of Lorem Ipsum available, but tempCharSearch:='100-111-875' majority have suffered alteration in some form, by injected humour, or randomised words which don't look even slightly believable. If you are going to use a passage of Lorem Ipsum, you need to be sure tempCharSearch:='100-111-875're isn't anything embarrassing hidden in tempCharSearch:='100-111-875' middle of text. All tempCharSearch:='100-111-875' Lorem Ipsum generators on tempCharSearch:='100-111-875' Internet tend to repeat predefined chunks as necessary, making this tempCharSearch:='100-111-875' first true generator on tempCharSearch:='100-111-875' Internet. It uses a dictionary of over 200 Latin words, combined with a handful of model sentence structures, to generate Lorem Ipsum which looks reasonable. tempCharSearch:='100-111-875' generated Lorem Ipsum is tempCharSearch:='100-111-875'refore always free from repetition, injected humour, or non-characteristic words etc.

Solution

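Your code does not count the number of occurrences of 'tempCharSearch' in the file; it counts the number of lines in which the pattern occurs. The test `'tempCharSearch' in searchpattern` is either true or false for a line, so each line adds at most 1 to cnt no matter how many matches it contains. As your input file appears to have just three lines of text, each containing multiple occurrences, your result is 3.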
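The difference can be seen in a minimal sketch. The two-line sample below is made up for illustration and is not the asker's file. A membership test contributes at most 1 per line, whereas `str.count` adds every non-overlapping occurrence:

```python
# Hypothetical sample: two lines, three occurrences in total.
sample_lines = [
    "tempCharSearch one, tempCharSearch two\n",
    "only one tempCharSearch here\n",
]

# Membership test: adds 1 per matching line, however many hits the line has.
lines_with_match = sum(1 for line in sample_lines if "tempCharSearch" in line)

# str.count: adds every non-overlapping occurrence found in each line.
total_occurrences = sum(line.count("tempCharSearch") for line in sample_lines)

print(lines_with_match, total_occurrences)
```

The first number is what the loop in the question computes; the second is what the asker was expecting.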

Use Python's built-in string count method to count all occurrences in a line:
cnt += searchpattern.count('tempCharSearch')
If you want a case-insensitive comparison, convert both the line and your search pattern to lower case before running the count, for example:
for line in readFile:
    cnt += line.lower().count('tempcharsearch')
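Putting the pieces together, one possible complete version is sketched below. The function names `count_occurrences` and `count_occurrences_re` and their parameters are hypothetical, chosen only for this example. The first function follows the lower-casing approach described above; the second is an alternative, not part of the original answer, that uses the `re` module (which the question already imports) with the `re.IGNORECASE` flag.

```python
import re


def count_occurrences(path, needle, ignore_case=True):
    """Count non-overlapping occurrences of needle in the file at path."""
    if ignore_case:
        needle = needle.lower()
    total = 0
    # The with block closes the file, so no explicit close() call is needed.
    with open(path, "r") as handle:
        for line in handle:
            if ignore_case:
                line = line.lower()
            total += line.count(needle)
    return total


def count_occurrences_re(text, needle):
    """Alternative: count matches with a case-insensitive regular expression."""
    # re.escape keeps characters such as ':' or '.' in the needle literal.
    return len(re.findall(re.escape(needle), text, flags=re.IGNORECASE))
```

A call such as `count_occurrences(fileName_path, 'tempCharSearch')` would then replace the counting loop in the question. Opening the file by its full path also makes the `os.chdir` step unnecessary.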
