Metaprogramming in awk: convert a file to HTML table format
Question
I have the following files:
table.txt (comma-separated)
1,Example Title
COL1,COL2,COL3,COL4,COL5
BRCC,ACGC,15869,105A,1
BCAS,GAAG,73345,369T,2
template.awk
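# Translate a .tawk template into an awk program (gensub requires gawk)
function parse_print(s){
    s = gensub(/^\s+|\s+$/,"","g",s)        # trim surrounding whitespace
    s = gensub(/[\42]/,"\\\\\042","g",s)    # escape double quotes
    s = gensub(/\$[0-9]+/,"\" & \"","g",s)  # splice field references ($1, $2, ...) out of the string
    s = gensub(/\$e/,"\" & \"","g",s)       # splice the loop reference $e out of the string
    return s;
}
# Plain template line: emit it as a print statement
/^[^%]/{print "print \"" parse_print($0) "\""; next}
# %BEGIN or %END opens the corresponding awk block
/^%BEGIN$|^%END$/{print substr($1,2) "{"; next}
# %ENDBEGIN or %ENDEND closes it
/^%END.+$/{print "}"; next}
# Any other line "%pattern [%FOREACH] text" becomes "pattern{ ... }"
{print substr($1,2) "{"}
{
    if($2 == "%FOREACH"){
        # Repeat the text once per field, with $e standing for the current field
        pprint = gensub(/(\S+\s+){2}(.*)/,"\\2","g")
        print "for(e=1; e<=NF; ++e) print \"" parse_print(pprint) "\""
    }else{
        pprint = gensub(/\S+\s+(.*)/,"\\1","g")
        print "print \"" parse_print(pprint) "\""
    }
}
{print "}"}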
table.tawk
%BEGIN
<style>
.my_table {border-bottom:3px double black; border-collapse: collapse; }
.my_table tr.header{border-bottom:3px double black;}
.my_table td{text-align: center;}
</style>
<table class="my_table">
%ENDBEGIN
%NR==1 <caption>Table $1. $2</caption>
%NR==2 <tr class="header">
%NR>2 <tr>
%NR==2 %FOREACH <th>$e</th>
%NR>2 %FOREACH <td>$e</td>
%NR!=1 </tr>
%END
</table>
%ENDEND
metaprogramming.sh
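#!/bin/sh
# metaprogram: $1 is the template (.tawk), $2 is the data file
# Stage 1: translate the template into an awk program (@include is a gawk extension)
awk '@include "template"' "$1" > .table.awk
# Stage 2: run the generated program on the comma-separated data
awk -v FS="," -f .table.awk "$2"
rm .table.awk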
The idea is to use metaprogramming to separate the logic from the presentation. It is based on a comment by @kent on the question "How to format text in html using awk", and it converts a text file to an HTML table:
./metaprogramming.sh table.tawk table.txt > table.html
This produces:
<style>
.my_table {border-bottom:3px double black; border-collapse: collapse; }
.my_table tr.header{border-bottom:3px double black;}
.my_table td{text-align: center;}
</style>
<table class="my_table">
<caption>Table 1. Example Title</caption>
<tr class="header">
<th>COL1</th>
<th>COL2</th>
<th>COL3</th>
<th>COL4</th>
<th>COL5</th>
</tr>
<tr>
<td>BRCC</td>
<td>ACGC</td>
<td>15869</td>
<td>105A</td>
<td>1</td>
</tr>
<tr>
<td>BCAS</td>
<td>GAAG</td>
<td>73345</td>
<td>369T</td>
<td>2</td>
</tr>
</table>
Question 1
Is there a way to make the call without creating the temporary file .table.awk, or even without the shell script (calling awk directly)?
Bonus question
Is there a better way to do this? Is there an awk library that already does it?
Recommended answer
TXR is a tool that provides a language for template-based extraction and formatting of data, combined with an original Lisp dialect.
In format.txr, we have:
@num,@title
@(coll)@{heading /[^,]+/}@(end)
@(collect)
@ (coll)@{data /[^,]+/}@(end)
@(end)
@(output :filter :tohtml)
<style>
.my_table {border-bottom:3px double black; border-collapse: collapse; }
.my_table tr.header{border-bottom:3px double black;}
.my_table td{text-align: center;}
</style>
<table class="my_table">
<caption>Table @num. @title</caption>
<tr class="header">
@ (repeat)
<th>@heading</th>
@ (end)
</tr>
@ (repeat)
<tr>
@ (repeat)
<td>@data</td>
@ (end)
</tr>
@ (end)
</table>
@(end)
We apply it to the data file like this:
$ txr format.txr data
<style>
.my_table {border-bottom:3px double black; border-collapse: collapse; }
.my_table tr.header{border-bottom:3px double black;}
.my_table td{text-align: center;}
</style>
<table class="my_table">
<caption>Table 1. Example Title</caption>
<tr class="header">
<th>COL1</th>
<th>COL2</th>
<th>COL3</th>
<th>COL4</th>
<th>COL5</th>
</tr>
<tr>
<td>BRCC</td>
<td>ACGC</td>
<td>15869</td>
<td>105A</td>
<td>1</td>
</tr>
<tr>
<td>BCAS</td>
<td>GAAG</td>
<td>73345</td>
<td>369T</td>
<td>2</td>
</tr>
</table>
Note that :filter :tohtml takes care of escaping characters for HTML; if the data contains &, for instance, we get &amp; and so on.
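For comparison, the awk approach from the question prints fields verbatim, so any escaping would have to be added by hand. The following is a minimal sketch of what that could look like, not code from the question or the answer: html_escape and escape_fields are hypothetical names, and the set of characters handled here is an assumption rather than a statement of what TXR's filter covers.

```shell
#!/bin/sh
# Hypothetical helper: wrap each comma-separated field in <td>...</td>,
# escaping the usual HTML special characters first.
escape_fields() {
    awk -F, '
    # html_escape is an illustrative name, not a built-in awk function.
    function html_escape(s) {
        gsub(/&/, "\\&amp;", s)   # ampersand first, so later entities are not re-escaped
        gsub(/</, "\\&lt;", s)
        gsub(/>/, "\\&gt;", s)
        gsub(/"/, "\\&quot;", s)
        return s
    }
    { for (i = 1; i <= NF; ++i) print "<td>" html_escape($i) "</td>" }
    '
}

# Example: two fields containing characters that need escaping
printf '%s\n' 'A&B,x<y' | escape_fields
```

In the replacement strings, "\\&" yields a literal ampersand; a bare & would instead stand for the matched text.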
The vertical collect and horizontal coll directives implicitly gather the matched pattern variables into nested lists; repeat implicitly unwraps those lists, so that only simple variable references like @data appear in both the input-matching section and the output.
Under Vim's syntax highlighting for TXR, it is very clear which parts are plain template material and which are TXR syntax.
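Returning to Question 1, which the answer above leaves open: one way to avoid the temporary file is to hand the generated program to the second awk as its program-text argument through shell command substitution. The sketch below is an illustration under stated assumptions, not part of the original answer; generate_program is a hypothetical stand-in for the template.awk stage, so the example runs on its own with any POSIX awk.

```shell
#!/bin/sh
# Stage 1 (hypothetical stand-in for template.awk): write awk source to stdout.
generate_program() {
    printf '%s\n' \
        'NR==1{ print "<caption>Table " $1 ". " $2 "</caption>" }' \
        'NR>1{ for (e = 1; e <= NF; ++e) print "<td>" $e "</td>" }'
}

# Stage 2: pass the generated source as the program text, so no
# intermediate file such as .table.awk is written.
render() {
    awk -F, "$(generate_program)" "$@"
}

# Example: render two comma-separated records read from stdin
printf '%s\n' '1,Example Title' 'BRCC,ACGC' | render
```

With the question's files, the equivalent call would presumably be gawk -F, "$(gawk -f template.awk table.tawk)" table.txt, using gawk because template.awk relies on gensub. This still needs a shell for the substitution, so it removes the temporary file but not the shell.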