Select the next line after a regex match


Question

I'm currently using the scanning software "Drivve Image" to extract certain information from each scanned page. The software can run regular expressions when needed, and it appears to use the UltraEdit regex engine.

I get the following scan result:

 1. 21Sid1
 2. Ordernr
 3. E17222
 4. By
 5. Seller

I need to search the string for the text Ordernr and then pick the following line, E17222, which will end up as the filename of the scanned document. I never know the exact position of these two values in the string. That is why I need to anchor on Ordernr: the text I need always follows on the next line.

For this to work, E17222 has to be the only thing in the match result. I am only allowed to enter plain regular expressions.

There is already a great thread on this: Regex to get the words after matching string

I've tested "\bOrdernr\s+\K\S+", which works great.

Or it would, if the software allowed \K to be used. Is there any other way to get the effect of \K?

Continuation

However, if the sample text has a character after "Ordernr" on the same line, the current answer doesn't do what I need. Take this sample:

21Sid1

Ordernr 1

Ordernr 1

E17222

By

Seller

In the matched group, the current solution picks up "1" and not the next line, which would be "E17222". I wanted to point that out for anyone looking further into the issue.
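The behaviour described above can be reproduced outside the scanning software. The following is a minimal sketch using Python's re module rather than the UltraEdit engine, and it assumes a sample with a single "Ordernr 1" line directly followed by the value. The second pattern is only one possible adjustment, not something confirmed to work in Drivve Image: it skips the rest of the Ordernr line and captures the first token on the line after it.

```python
import re

# Assumed sample: one "Ordernr 1" line, with the wanted value on the next line.
scan_text = "21Sid1\nOrdernr 1\nE17222\nBy\nSeller"

# Pattern from the answer: \s+ also matches the space, so group 1 is "1".
current_result = re.search(r"\bOrdernr\s+(\S+)", scan_text).group(1)

# Possible adjustment: consume everything up to the line break first,
# then capture the first non-whitespace token on the following line.
next_line_pattern = re.compile(r"\bOrdernr[^\r\n]*\r?\n(\S+)")
next_line_result = next_line_pattern.search(scan_text).group(1)

# The adjusted pattern also handles the original sample, where nothing follows Ordernr.
original_text = "21Sid1\nOrdernr\nE17222\nBy\nSeller"
original_result = next_line_pattern.search(original_text).group(1)
```

Whether the target engine accepts character classes such as [^\r\n] and the \r?\n line break is an assumption that would need checking against its documentation.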

Recommended answer

I did some googling, and from what I can gather, the last parameter to REGEXP.MATCH is the capture group to use. That means you could use your own regex without the \K and just wrap the value you want to extract in a capture group.

 \bOrdernr\s+(\S+)

This means that the value ends up in capture group 1 (the whole match is in group 0, which I assume is what you've been using).
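The difference between group 0 and group 1 can be illustrated with a short sketch. It uses Python's re module purely for demonstration, which is an assumption: the scanning software uses its own engine, and the variable names here are made up for the example.

```python
import re

# Scan result from the question, one value per line.
scan_text = "21Sid1\nOrdernr\nE17222\nBy\nSeller"

# Same pattern as above, with the wanted token wrapped in a capture group.
pattern = re.compile(r"\bOrdernr\s+(\S+)")
match = pattern.search(scan_text)

whole_match = match.group(0)   # the keyword, the line break and the value
order_number = match.group(1)  # only the captured value
```

Group 0 still contains Ordernr and the line break, so it is group 1 that carries the bare value for the filename.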

The documentation isn't crystal clear, but I guess the syntax is

REGEXP.MATCH(<ZoneName>, "REGEX", CaptureGroup)

which means you should use

REGEXP.MATCH(<ZoneName>, "\bOrdernr\s+(\S+)", 1)

There's a fair amount of guessing here though... ;)
