Need help to capture SSDeep string with regex
Problem description
My subject line is pretty much the question. I have a regular expression that works in some cases, but at times it returns other, undesired values. I hope there is a better or more precise expression I can use to capture the SSDEEP string in question.
Here is the HTML I want to capture the string from:
<div class="floated-field-value">768:pHC0p5mwel+twV39TD8mRF5rKJZsF6No2:o0p5mwelJ9TD8mv5ImGo</div>
The regular expression I am working on looks like this:
Dim SSDEEP As New Regex("(?<=<div class=""floated-field-value"">)([^\""]+)(</div>)", RegexOptions.IgnoreCase)
I can only seem to get close with it: </div> still remains on the end of the string, so I excluded it from the string with some code:
For X = 0 To RichTextBox3.Lines.Length - 2
    Dim MyString As String = RichTextBox3.Lines(X).ToString
    Label28.Text = MyString
Next
I hope this is enough for someone to help me.
Thank you in advance!
Recommended answer
OK, got it.
Instead of trying to match the tags, match the pattern of the data itself.
Input string:
<div class="floated-field-value">768:pHC0p5mwel+twV39TD8mRF5rKJZsF6No2:o0p5mwelJ9TD8mv5ImGo</div>
Regex 1 used:
((\d{3}):(\w*)\+(\w*):(\w*))
Regex 2 used:
((\d{3}):(\w*):(\w*)|(\d{3}):(\w*)\+(\w*):(\w*))
Regex 3 used:
((\d*):(\w*):(\w*)|(\d*):(\w*)\+(\w*):(\w*))
Output:
768:pHC0p5mwel+twV39TD8mRF5rKJZsF6No2:o0p5mwelJ9TD8mv5ImGo
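The three patterns can be checked against the input string with a small console program. This is only a sketch of what such a test app might look like: the module name PatternCheck is made up for illustration, and the only library call used is Regex.Match from System.Text.RegularExpressions. Each pattern should report the same value shown under Output.

```vb
Imports System
Imports System.Text.RegularExpressions

Module PatternCheck
    Sub Main()
        ' The three patterns from the answer, copied unchanged.
        Dim patterns As String() = {
            "((\d{3}):(\w*)\+(\w*):(\w*))",
            "((\d{3}):(\w*):(\w*)|(\d{3}):(\w*)\+(\w*):(\w*))",
            "((\d*):(\w*):(\w*)|(\d*):(\w*)\+(\w*):(\w*))"
        }

        ' The input string from the question (quotes doubled for VB).
        Dim html As String = "<div class=""floated-field-value"">768:pHC0p5mwel+twV39TD8mRF5rKJZsF6No2:o0p5mwelJ9TD8mv5ImGo</div>"

        For i As Integer = 0 To patterns.Length - 1
            Dim m As Match = Regex.Match(html, patterns(i))
            ' m.Value is the text matched by the whole pattern.
            Console.WriteLine("Regex {0}: {1}", i + 1, If(m.Success, m.Value, "(no match)"))
        Next
    End Sub
End Module
```

Because the patterns describe the data and not the tags, the surrounding <div> and </div> are never part of the match.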
The two outer "()" wrap the whole search pattern. I am not sure whether they are needed when parsing a site.
"(\d{3})" matches exactly three digits
":" matches that character next
"(\w*)" matches a run of word characters (letters, digits, underscore) of any length
"\+" escapes the plus, so it matches a literal plus sign next
"(\w*)" matches another run of word characters of any length
":" matches that character next
"(\w*)" matches the last word to extract
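The same breakdown can be written into the pattern itself. The sketch below rewrites Regex 1 with inline comments using RegexOptions.IgnorePatternWhitespace, which tells the .NET engine to skip unescaped whitespace in the pattern and treat "#" as the start of a comment. The module name and the way the pattern is assembled from separate lines are illustrative choices, not part of the original answer.

```vb
Imports System
Imports System.Text.RegularExpressions

Module AnnotatedPattern
    Sub Main()
        ' Regex 1, one piece per line. Each "#" comment runs to the end of
        ' its line, so the pieces are joined with newlines.
        Dim pattern As String = String.Join(Environment.NewLine, {
            "(            # outer group: the whole value",
            "  (\d{3})    # exactly three digits",
            "  :          # literal colon",
            "  (\w*)      # word characters, any length",
            "  \+         # literal plus sign",
            "  (\w*)      # word characters, any length",
            "  :          # literal colon",
            "  (\w*)      # last run of word characters",
            ")"
        })

        Dim sample As String = "768:pHC0p5mwel+twV39TD8mRF5rKJZsF6No2:o0p5mwelJ9TD8mv5ImGo"
        Dim m As Match = Regex.Match(sample, pattern, RegexOptions.IgnorePatternWhitespace)

        Console.WriteLine(m.Success)          ' whether the pattern matched
        Console.WriteLine(m.Groups(2).Value)  ' group 2 is the three-digit prefix
    End Sub
End Module
```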
That's it. Like I said, I am not sure how it would work on a real site.
It should work as long as every data value contains a "+"; otherwise it would need to be modified for that type, for example with an "Or" alternative that doesn't use the "+" but keeps most everything else the same.
It does work in a small test app.
I hope this is not your homework :)
Edit:
After looking up what SSDEEP is, I tested the other two regexes added above.
The second one matches whether the "+" is there or not.
The third one: after reviewing SSDEEP, the first section could be longer than 3 characters, so I changed it to accept any number of digits.
As best I can tell, the two outer "()" need to be there to match the entire pattern.
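Two details are worth adding here. In .NET, Match.Value always holds the text matched by the whole pattern, so the outer "()" are optional. Also, the two hash parts of an ssdeep value are drawn from the Base64 alphabet, so "/" or more than one "+" can appear, and "\w" would miss those while also accepting "_". The sketch below is one possible tightening under those assumptions. The helper name ExtractSsdeep and the constant SsdeepPattern are hypothetical. It requires at least one digit for the block size and uses a capture group so that </div> is matched but not returned.

```vb
Imports System
Imports System.Text.RegularExpressions

Module SsdeepFromHtml
    ' Assumption: block size is one or more digits, and both hash parts use
    ' only Base64 characters (A-Z, a-z, 0-9, "+", "/"); either part may be empty.
    Const SsdeepPattern As String = "\d+:[A-Za-z0-9+/]*:[A-Za-z0-9+/]*"

    ' Hypothetical helper: returns the value inside the floated-field-value
    ' div, or Nothing when no ssdeep-shaped value is found there.
    Function ExtractSsdeep(html As String) As String
        Dim pattern As String =
            "<div class=""floated-field-value"">\s*(" & SsdeepPattern & ")\s*</div>"
        Dim m As Match = Regex.Match(html, pattern)
        ' Group 1 holds only the hash, without the surrounding tags.
        Return If(m.Success, m.Groups(1).Value, Nothing)
    End Function

    Sub Main()
        Dim html As String = "<div class=""floated-field-value"">768:pHC0p5mwel+twV39TD8mRF5rKJZsF6No2:o0p5mwelJ9TD8mv5ImGo</div>"
        Console.WriteLine(ExtractSsdeep(html))
    End Sub
End Module
```

Tying the match to the div keeps other colon-separated text on the page from being picked up, which may help with the undesired values mentioned in the question. For anything beyond a single known tag, an HTML parser is usually a safer choice than a regex.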