regexMatcher in String Manipulation KNIME

Question

I'm trying to use regexMatcher from the String Manipulation node in KNIME, but it doesn't work. I'm writing regexMatcher($Document$,"/\w") to extract all sentences that contain /s, /p, w/p, or /200. However, even though my table has such cases, nothing is retrieved. I would appreciate your help.

Recommended answer

I got the following:

|Document      |isOK |other|strict|
|--------------|-----|-----|------|
|Some /p with q|True |False|False |
|/200          |True |True |False |
|/p            |True |True |True  |
|/s            |True |True |True  |
|w/p           |True |False|False |
|no slash      |False|False|False |

For the expressions:

  • isOK: regexMatcher($Document$, ".*?/\\w.*") (I guess this is what you are after.)
  • other: regexMatcher($Document$, "/\\w.*")
  • strict: regexMatcher($Document$, "/\\w")

(The Document values contain nothing after the last visible character.)

The problems you might run into are the escaping rules of the String Manipulation node and the semantics of regexMatcher.

The String literal in the expression is just a Java String, so you have to escape the \ (and some other characters); it becomes \\.
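To make the escaping concrete, here is a minimal plain-Java sketch (outside KNIME). The class and method names are hypothetical, and it rests on the statement above that the literal is an ordinary Java String. It shows that the source text "/\\w" produces the three-character regex /\w. Written with a single backslash, "/\w" is not a valid Java string literal at all.

```java
public class EscapingSketch {
    // Hypothetical helper: returns the regex text the regex engine receives.
    // The source literal contains two backslashes; the resulting value has one.
    static String regexSeenByEngine() {
        return "/\\w";
    }

    public static void main(String[] args) {
        String regex = regexSeenByEngine();
        System.out.println(regex);          // prints /\w
        System.out.println(regex.length()); // prints 3
    }
}
```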

regexMatcher matches against the whole String, so you have to add .*? (non-greedy match of anything) before the value you are looking for and .* (greedy match of anything) after it. (Obviously, if I misunderstood your question, the semantics may already be what you want.)
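The whole-String behaviour can be reproduced in plain Java. The sketch below assumes that regexMatcher behaves like java.util.regex.Matcher.matches(), which succeeds only when the entire input matches. That is an assumption drawn from the description above, not something checked against KNIME's implementation, and the class and method names are hypothetical. Under that assumption the three expressions give the same results as the table.

```java
import java.util.regex.Pattern;

public class FullMatchSketch {
    // Assumption: regexMatcher is a whole-string match, like Matcher.matches().
    static boolean fullMatch(String input, String regex) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    public static void main(String[] args) {
        String[] docs = {"Some /p with q", "/200", "/p", "/s", "w/p", "no slash"};
        for (String doc : docs) {
            // One line per document: the isOK, other, and strict expressions.
            System.out.printf("%-14s isOK=%b other=%b strict=%b%n", doc,
                    fullMatch(doc, ".*?/\\w.*"),
                    fullMatch(doc, "/\\w.*"),
                    fullMatch(doc, "/\\w"));
        }
    }
}
```

Only the isOK pattern tolerates text on both sides of the slash, which is why it is the only one that accepts "Some /p with q" and "w/p".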

BTW: in case you want to filter, you should check the Rule-based Row Filter node, as it offers an option to filter directly by regex. It uses different escaping rules (shown here for the isOK option):

  • $Document$ MATCHES ".*?/\w.*" => TRUE (escaping is not allowed within quotes)
  • $Document$ MATCHES /.*?\/\\w.*/ => TRUE (escaping is allowed within slashes; / and \ need to be escaped, but " does not)
