Find all strings within a string


Question

I get StrTxt, an HTML string, from the response text of an HTTP request. I want to find all occurrences of '"string"' in StrTxt.

Something like this:

for each string in StrTxt
StrTxt = "all matched strings from StrTxt"
do something StrTxt.

Edit: This was flagged as a possible duplicate, but it is not. "How to loop through each word in a word document - VBA Macro" explains how to find a string in a document, not in a string.

It is simple: how do I find all strings within a string? Doesn't my title explain everything?

Edit 2

Based on Ansgar Wiechers' answer, I tried the following.

Do
i = InStr(strtxt, "startstring")
      If i > 0 Then
        strtxt = Mid(strtxt, i, Len(strtxt) - i)
        i = InStr(InStr(strtxt, "midstring") + 1, strtxt, "endstring")
        If i > 0 Then
         strtxt = Left(strtxt, i + Len(endstring)) ' I am using i+4 as I know the length
        WScript.Echo strtxt
        End If
      End If
Loop While i > 0

It finds only one occurrence. How do I loop correctly?

Recommended answer

If you want to use InStr to search a string for all occurrences of a particular substring, you need to call the function repeatedly, starting each new search at least one character after the previous match.

response = "..."  'your HTTP response string
srch     = "..."  'the string you want to find in the response

start = 1
Do
  pos = InStr(start, response, srch, vbTextCompare)
  If pos > 0 Then
    start = pos + 1  'alternatively: start = pos + Len(srch)
    WScript.Echo pos
    WScript.Echo Mid(response, pos, Len(srch))
  End If
Loop While pos > 0

If you want the comparison to be case-sensitive, replace vbTextCompare with vbBinaryCompare.
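The loop above can also be wrapped in a reusable function. The following is only a sketch: the name FindAllPositions and its allowOverlap parameter are hypothetical and not part of the original answer. It collects the 1-based position of every match in an array and makes the two ways of advancing the start position explicit (pos + 1 finds overlapping matches, pos + Len(term) skips past each match).

```vbscript
' Hypothetical helper (not from the original answer): returns an array with
' the 1-based position of every case-insensitive occurrence of term in text.
Function FindAllPositions(text, term, allowOverlap)
  Dim result(), count, pos, start
  count = 0
  start = 1

  ' InStr reports a match at the start position for an empty search string,
  ' which would make the loop below run forever.
  If Len(term) = 0 Then
    FindAllPositions = Array()
    Exit Function
  End If

  pos = InStr(start, text, term, vbTextCompare)
  Do While pos > 0
    ReDim Preserve result(count)
    result(count) = pos
    count = count + 1
    If allowOverlap Then
      start = pos + 1          ' next search begins one character later
    Else
      start = pos + Len(term)  ' next search begins after the whole match
    End If
    pos = InStr(start, text, term, vbTextCompare)
  Loop

  If count = 0 Then
    FindAllPositions = Array()
  Else
    FindAllPositions = result
  End If
End Function

WScript.Echo Join(FindAllPositions("aaaa", "aa", True), ",")   ' 1,2,3
WScript.Echo Join(FindAllPositions("aaaa", "aa", False), ",")  ' 1,3
```

Which variant fits depends on whether matches in your data are allowed to overlap. For distinct markers in an HTML response, the non-overlapping form is usually sufficient.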

For finding patterns that start with one string, contain another, and end with a third, it is probably best to use a regular expression. @TylerStandishMan already showed the basic principle in his answer, but there are some things to observe in your scenario.

response = "..."  'your HTTP response string

startTerm = "startstring"
midTerm   = "midstring"
endTerm   = "endstring"

Set re = New RegExp
re.Pattern    = startTerm & "[\s\S]*?" & midTerm & "[\s\S]*?" & endTerm
re.Global     = True
re.IgnoreCase = True  'if required

For Each m In re.Execute(response)
  WScript.Echo m
Next

Some characters in a regular expression have a special meaning (e.g. . matches any character except newlines), so you need to make sure that any such character in your start, mid and end terms is properly escaped (e.g. use \. to match a literal dot). If the substring you want to match spans more than one line, the parts of the expression that match arbitrary text between your search terms must also match newline characters (e.g. [\s\S] matches any whitespace or non-whitespace character). You probably also want the match to be non-greedy; otherwise you would get a single match from the first occurrence of startTerm to the last occurrence of endTerm. That is what the modifier *? is for.
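VBScript's RegExp object has no built-in function for escaping a search term, so the escaping described above has to be done by hand. The following is a minimal sketch under that assumption: the helper name EscapeRegex is hypothetical and not part of the original answer. It prefixes each regular-expression metacharacter with a backslash so that the term is matched literally.

```vbscript
' Hypothetical helper (not from the original answer): returns term with every
' regular-expression metacharacter escaped by a preceding backslash.
Function EscapeRegex(term)
  Dim esc
  Set esc = New RegExp
  esc.Global  = True
  ' Character class listing the metacharacters \ ^ $ . | ? * + ( ) [ ] { }
  esc.Pattern = "([\\\^\$\.\|\?\*\+\(\)\[\]\{\}])"
  ' $1 is the captured metacharacter; the leading backslash is inserted literally.
  EscapeRegex = esc.Replace(term, "\$1")
End Function

WScript.Echo EscapeRegex("a.b")      ' a\.b
WScript.Echo EscapeRegex("price($)") ' price\(\$\)
```

With such a helper the pattern could be assembled as EscapeRegex(startTerm) & "[\s\S]*?" & EscapeRegex(midTerm) & "[\s\S]*?" & EscapeRegex(endTerm), which keeps terms like "file.txt" or "f(x)" from being interpreted as regular-expression syntax.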
