How can I fix this wiki link parsing regular expression?

Question

I've got an old wiki that I'm converting to a new wiki that uses Markdown and the [[]] wiki link format. Unfortunately, the old wiki is really old and had many ways of producing links, including CamelCase and single-bracket ([]) wiki links, among others.

I'm doing the conversion with regular expressions in sed, and I use the following substitution to convert stand-alone CamelCase links to double-bracket ([[]]) wiki links:

s/([^[|])([A-Z][a-z]+[A-Z][A-Za-z]+)([^]|])/\1\[\[\2\]\]\3/g

Unfortunately, there is one problem with the above. Because the old wiki mixes both styles, I'm trying not to convert CamelCase words inside existing single-bracket wiki links, yet something like [BluetoothConnection|UsingBluetoothIndex] gets converted to [BluetoothConnection|Using[[BluetoothInde]]x].

How can I resolve this issue and force the match to be greedier, so that it fails and makes no substitution in that case? If sed's extended regular expressions turn out to be too limiting, I'm willing to pass the text through perl instead of sed.

Recommended answer

Well, try this:

$ echo "UsingBluetoothIndex" | sed -E 's!([^\[\|]?)([A-Z][a-z]+[A-Z][A-Za-z]+)($|\b|[]|])!\1\[\[\2\]\]\3!g'
Output: [[UsingBluetoothIndex]]

$ echo "[BluetoothConnection|UsingBluetoothIndex]" | sed -E 's!([^\[\|]?)([A-Z][a-z]+[A-Z][A-Za-z]+)($|\b|[]|])!\1\[\[\2\]\]\3!g'
Output: [[[BluetoothConnection]]|[[UsingBluetoothIndex]]]

Update:

Alright, I believe I now have a regex for your problem, using Perl's negative lookahead assertion, (?!...). Here it is:

perl -pe 's#(^|\b)((?![|\[])[A-Z][a-z]+[A-Z][A-Za-z]+(?![|\]]))($|\b)#\[\[\2\]\]#g'

echo "BluetoothConnection" | perl -pe 's#(^|\b)((?![|\[])[A-Z][a-z]+[A-Z][A-Za-z]+(?![|\]]))($|\b)#\[\[\2\]\]#g'
Output: [[BluetoothConnection]]

echo "[BluetoothConnection|UsingBluetoothIndex]" | perl -pe 's#(^|\b)((?![|\[])[A-Z][a-z]+[A-Z][A-Za-z]+(?![|\]]))($|\b)#\[\[\2\]\]#g'
Output: [BluetoothConnection|UsingBluetoothIndex]

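All it does is check that the text does not start with '|' or '[' and does not end with '|' or ']', and then enclose it in [[ and ]]. Note that the leading (?![|\[]) inspects the word's own first letter, which is always an uppercase letter, so in practice it is the trailing (?![|\]]) together with the \b anchors that rejects words inside a single-bracket link.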
