Creating matching brackets: awk/sed
Question
I have a data set that contains three patterns:
First:
abrasion abrade:stem<>ion:suffix
abstainer abstain:stem<>er:suffix
abstention abstain:stem<>ion:suffix
Second:
inaccurate in:prefix<>accurate:stem
inactive in:prefix<>active:stem
Third:
incommunicable in:prefix<>communicate:stem<>able:suffix
incompatibility in:prefix<>compatible:stem<>ity:suffix
I need to convert the above to the following form, with the brackets matched in the Penn Treebank style (http://languagelog.ldc.upenn.edu/myl/PennTreebank1995.pdf):
First:
abrasion ((abrade:stem) ion:suffix)
abstainer ((abstain:stem)er:suffix)
abstention ((abstain:stem)ion:suffix)
Second:
inaccurate (in:prefix(accurate:stem))
inactive (in:prefix(active:stem))
Third:
incommunicable (in:prefix ((communicate:stem)able:suffix))
incompatibility (in:prefix ((compatible:stem)ity:suffix))
The code I am working on uses awk:
{
n = gsub(/<>/,")",$2)
s = sprintf("%*s",n,"")
gsub(/ /,"(",s)
print "(" $1, s "((" $2 "))"
}
Edit
A more complex form:
nationalistic national: stem <>ism:suffix<>ist:suffix<>ic:suffix
should become:
nationalistic ((((national: stem) ism:suffix)ist:suffix)ic:suffix)
The code above does not produce the expected output shown in the examples.
Recommended answer
This should be general enough, as it takes :stem, :prefix, and :suffix into account for matching:
awk 'BEGIN{FS=OFS="\n"}{
a=gensub(/([a-zA-Z]*):stem/,"(\\1:stem)", "g");
b=gensub(/(\([a-zA-Z]*:stem\))<>([a-zA-Z]*):suffix/,"(\\1\\2:suffix)", "g", a);
c=gensub(/([a-zA-Z]*:prefix)<>(.*)/,"(\\1\\2)", "g", b);
print c;}' testfile
Demo here: https://ideone.com/U3ux91
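To see how the three substitutions build on each other, the intermediate strings can be printed. The following is only a sketch: trace_brackets is a hypothetical helper name, and it assumes GNU awk is installed as gawk, since gensub() is a gawk extension.

```shell
# Hypothetical helper (not from the answer): prints the result of each
# gensub() stage so the order of bracketing is visible.
# Assumes GNU awk is available as "gawk".
trace_brackets() {
  gawk '{
    # Stage a: wrap the stem in its own brackets.
    a = gensub(/([a-zA-Z]*):stem/, "(\\1:stem)", "g")
    # Stage b: wrap the bracketed stem together with the suffix that follows it.
    b = gensub(/(\([a-zA-Z]*:stem\))<>([a-zA-Z]*):suffix/, "(\\1\\2:suffix)", "g", a)
    # Stage c: wrap the prefix around everything to its right.
    c = gensub(/([a-zA-Z]*:prefix)<>(.*)/, "(\\1\\2)", "g", b)
    print "a: " a
    print "b: " b
    print "c: " c
  }'
}

# Usage:
#   printf "%s\n" "incommunicable in:prefix<>communicate:stem<>able:suffix" | trace_brackets
# prints:
#   a: incommunicable in:prefix<>(communicate:stem)<>able:suffix
#   b: incommunicable in:prefix<>((communicate:stem)able:suffix)
#   c: incommunicable (in:prefix((communicate:stem)able:suffix))
```

The stem is bracketed first, then stem plus suffix, and the prefix is applied last, which is why the prefix ends up in the outermost bracket.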
Edit
This should take care of multiple suffixes and prefixes:
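awk 'BEGIN{FS=OFS="\n"}{
# gensub() is a gawk extension, so run this with GNU awk.
a=gensub(/([a-zA-Z]*):stem/,"(\\1:stem)", "g");
# Wrap the stem with each suffix; loop only while a substitution can still match.
while ( a ~ /\([a-zA-Z]*:stem\).*<>[a-zA-Z]*:suffix/) {
a=gensub(/(\([a-zA-Z]*:stem\).*)<>([a-zA-Z]*):suffix/,"(\\1\\2:suffix)", "g", a);
}
# Wrap each prefix around everything to its right.
while ( a ~ /[a-zA-Z]*:prefix<>/) {
a=gensub(/([a-zA-Z]*:prefix)<>(.*)/,"(\\1\\2)", "g", a);
}
print a;}' test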
Demo here: https://ideone.com/U7LYXi (sorry if antinationalistic is not a word, but for testing's sake...)
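Since gensub() is a GNU awk extension and not part of POSIX awk, a different technique may be needed where only a plain awk is available. The sketch below swaps the regex substitutions for a split on <> and builds the brackets outward from the stem. It is not the answer's method: bracket_morphs is a hypothetical helper name, and it assumes each line holds exactly one :stem part, that everything before the stem is a prefix and everything after it is a suffix, and that stray spaces inside the analysis (as in the nationalistic example) can be dropped.

```shell
# Hypothetical helper (not from the answer): brackets "word morph<>morph..."
# lines read from stdin using only POSIX awk features.
bracket_morphs() {
  awk '{
    # Join fields 2..NF so stray spaces inside the analysis are removed
    # (assumption: such spaces carry no meaning).
    rest = ""
    for (i = 2; i <= NF; i++) rest = rest $i

    # Split the analysis into its parts and locate the stem.
    n = split(rest, part, "<>")
    stem = 0
    for (i = 1; i <= n; i++) if (part[i] ~ /:stem$/) stem = i
    if (!stem) { print; next }   # no stem found: leave the line unchanged

    # Start from the stem, then wrap suffixes left to right.
    s = "(" part[stem] ")"
    for (i = stem + 1; i <= n; i++) s = "(" s part[i] ")"
    # Finally wrap prefixes, innermost (closest to the stem) first.
    for (i = stem - 1; i >= 1; i--) s = "(" part[i] s ")"

    print $1, s
  }'
}

# Usage:
#   printf "%s\n" "incompatibility in:prefix<>compatible:stem<>ity:suffix" | bracket_morphs
# prints:
#   incompatibility (in:prefix((compatible:stem)ity:suffix))
```

Because the nesting is built by position rather than by pattern matching, there is no loop that depends on a regex continuing to match, so malformed lines are passed through unchanged instead of stalling the script.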