Splitting up text with a regular expression in Google Apps Script (DocumentApp) with findText
Problem description
In a Google Doc (not a spreadsheet) I have a wall of text that looks like
°foo bar header°foo bar bat paragraph°and another paragraph°and yet an other paragraph°and so on
and I want to split up the text into paragraphs.
So I would like to get the text between the ° chars with the help of a regexp. I would like to use
var rangeElement = body.findText("°([^°]*)°");
but that regexp makes Google Docs fail with "Service unavailable: Docs". Using a regexp like "°.?°" alleviates that problem, but does not isolate the text I want.
What is a regexp that would work? How can I proceed to process the substring from within the ( and )?
Recommended answer
To get all (multiple) matches, you can use a JavaScript RegExp with the g (global) modifier.
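var doc = DocumentApp.getActiveDocument(); // assumes a script bound to the document
var text = doc.getBody().getText();        // read the body text once
var rx = /°([^°]*)/g;
var m;                                     // declared to avoid an implicit global
while ((m = rx.exec(text)) !== null) {
  Logger.log("Matched: " + m[1]);          // m[1] is the text after each °
}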
The /°([^°]*)/g is a regex literal (mind that there should be no quotes around it), where the / characters are the regex delimiters and °([^°]*) is the pattern. The pattern matches a ° symbol and then captures into Group 1 any 0 or more chars other than ° (the ([^°]*) part). The g flag makes the regex find every occurrence in the text rather than only the first one.
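To make the difference between the whole match and Group 1 concrete, here is a minimal sketch that runs the pattern once against the sample string from the question. It is plain JavaScript with no Google services involved; the variable names `sample` and `first` are only for illustration, and in Apps Script you would log with Logger.log instead of console.log.

```javascript
// Sample string shortened from the question; nothing here touches DocumentApp.
var sample = "°foo bar header°foo bar bat paragraph°and another paragraph";
var rx = /°([^°]*)/g;

// A single exec call returns the first match only.
var first = rx.exec(sample);

// first[0] is the whole match, including the leading ° symbol.
console.log(first[0]); // "°foo bar header"
// first[1] is Group 1: the text after the ° up to the next ° (or end of string).
console.log(first[1]); // "foo bar header"
```

Because the pattern does not consume the ° that follows, the next search can start at that ° and pick up the following segment.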
To actually match all the occurrences, you need to call RegExp#exec multiple times and grab the Group 1 value (m[1]) each time.
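As a rough sketch of that loop, the helper below collects every Group 1 value into an array. The function name `extractSegments` is hypothetical and not part of any Google API. It takes a plain string, so in Apps Script you would presumably pass it the result of doc.getBody().getText().

```javascript
// Hypothetical helper: returns the text that follows each ° marker.
function extractSegments(text) {
  var rx = /°([^°]*)/g;
  var segments = [];
  var m;
  // With the g flag, each exec call resumes where the previous match ended
  // and returns null once there are no more matches.
  while ((m = rx.exec(text)) !== null) {
    segments.push(m[1]);
  }
  return segments;
}

// Sample string from the question.
var sample = "°foo bar header°foo bar bat paragraph°and another paragraph°and yet an other paragraph°and so on";
var segments = extractSegments(sample);
console.log(segments.length); // 5
console.log(segments[0]);     // "foo bar header"
```

Each array entry could then be turned into its own paragraph, for example with Body.appendParagraph, though how the document should be rebuilt is left open here.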