Escaping an angled bracket acts similar to look-ahead
Question
Why does escaping the angled bracket `>` exhibit look-ahead-like behavior?
To be clear, I understand that the angled bracket does not need to be escaped.
The question is: how is the pattern being interpreted such that it yields the matches shown?
## match bracket, with or without underscore
## replace with "greater_"
strings <- c("ten>eight", "ten_>_eight")
repl <- "greater_"
## Unescaped. Yields desired results
gsub(">_?", repl, strings)
# [1] "tengreater_eight" "ten_greater_eight"
## All five of these yield the same result
gsub("\\>_?", repl, strings) # (a)
gsub("\\>(_?)", repl, strings) # (b)
gsub("\\>(_)?", repl, strings) # (c)
gsub("\\>?", repl, strings) # (d)
gsub("\\>", repl, strings) # (e)
# [1] "tengreater_>eightgreater_" "ten_greater_>_eightgreater_"
gregexpr("\\>?", strings)
Some follow-up questions:
1. Why do `(a)` and `(d)` yield the same result?
2. Why is the end-of-string matched?
3. Why do none of `a, b, or c` match the underscore?
Recommended answer
`\\>` is an end-of-word boundary: a zero-width assertion that matches between a word character (on its left) and either a non-word character (on its right) or the end-of-line anchor `$`.
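To make the distinction concrete, here is a minimal sketch contrasting the boundary reading of `\\>` with two ways of forcing a literal `>`. It reuses `strings` and `repl` from the question. The `perl = TRUE` line rests on the assumption that the PCRE engine treats a backslash before a non-alphanumeric character as a plain escape, so treat it as an illustration rather than a guarantee for every R build.

```r
strings <- c("ten>eight", "ten_>_eight")
repl <- "greater_"

# Default engine: "\\>" is the end-of-word anchor, so ">" itself is never consumed.
boundary <- gsub("\\>", repl, strings)

# A bracket expression forces a literal ">" in the default engine.
literal_class <- gsub("[>]_?", repl, strings)

# Assumption: with perl = TRUE (PCRE), "\\>" is just an escaped literal ">".
literal_perl <- gsub("\\>_?", repl, strings, perl = TRUE)

print(boundary)
print(literal_class)
print(literal_perl)
```

If the assumption holds, the last two calls should agree with the unescaped pattern from the question, while the first reproduces the surprising output.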
> strings <- c("ten>eight", "ten_>_eight")
> gsub("\\>", "greater_", strings)
[1] "tengreater_>eightgreater_" "ten_greater_>_eightgreater_"
In the first element, it matches the boundary between the word character `n` and the non-word character `>`, and then the boundary between `t` and the end of the line. In the second element, it matches between `_` (also a word character) and `>`, and then between `t` and the end of the line (i.e., `$`). Because the boundary is zero-width, the `>` itself is never consumed, which is why it survives in the output. Finally, the matched boundaries are replaced with the string you specified.
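The walkthrough above also suggests why patterns `(a)` to `(c)` never pick up the underscore: whatever follows an end-of-word boundary is a non-word character (or the end of the line), and `_` is a word character, so `_?` can only match the empty string there. A small sketch using `regexpr` to inspect the first match; the variable names are illustrative only.

```r
x <- "ten_>_eight"
repl <- "greater_"

# Locate the first match of the escaped pattern in the default engine.
m <- regexpr("\\>_?", x)
first_pos <- as.integer(m)          # index of the character just after the boundary
first_len <- attr(m, "match.length") # width of the match

# The optional underscore adds nothing, so both patterns substitute identically.
same_result <- identical(gsub("\\>_?", repl, x), gsub("\\>", repl, x))

print(first_pos)
print(first_len)
print(same_result)
```

A match length of zero here would confirm that only the boundary, and no actual character, was matched.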
A simple example:
> gsub("\\>", "*", "f:r(:")
[1] "f*:r*(:"
Consider the input string below (`w` means a word character, `N` means a non-word character):
f : r ( :
w N w N N
So `\\>` matches between:
- `f` and `:`
- `r` and `(`
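The same bookkeeping can be done by hand, without the regex anchor. The sketch below uses a hypothetical helper, `mark_word_ends` (not part of base R), that classifies each character as word or non-word and inserts a marker wherever a word character is followed by a non-word character or by the end of the string. It assumes word characters are letters, digits and the underscore.

```r
# Hypothetical helper: mark end-of-word boundaries manually.
mark_word_ends <- function(x, mark = "*") {
  chars <- strsplit(x, "", fixed = TRUE)[[1]]
  if (length(chars) == 0) return(x)
  # Assumption: a word character is alphanumeric or an underscore.
  is_word <- grepl("^[[:alnum:]_]$", chars)
  # The position after the last character counts as non-word (end of line).
  next_is_word <- c(is_word[-1], FALSE)
  ends <- is_word & !next_is_word
  paste0(chars, ifelse(ends, mark, ""), collapse = "")
}

print(mark_word_ends("f:r(:"))
print(mark_word_ends("f"))
print(mark_word_ends("ten_>_eight"))
```

For the inputs used in this answer, the helper places its markers at the same spots where `gsub("\\>", "*", ...)` inserts them.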
Example 2:
> gsub("\\>", "*", "f")
[1] "f*"
Input string:
f$
||----End of the line anchor
w
Replacing the matched boundary with `*` gives the result above.