Filter a content file into a table

Views: 95   Published: 2016/7/28 15:06:59   Tags: linux file awk sed

Question

This is the input I have generated. It shows the course versions for both Jany and Marco at different times.

on 10:00 the course of jany 1 is :
course:theory:nothing
course:applicaton:onehour

on 10:00 the course of jany 2 is :
course:theory:math
course:applicaton:twohour

on 10:00 the course of Marco 1 is :
course:theory:geo
course:applicaton:halfhour

on 10:00 the course of Marco 2 is :
course:theory:history
course:applicaton:nothing

on 14:00 the course of jany 1 is :
course:theory:nothing
course:applicaton:twohours

on 14:00 the course of jany 2 is :
course:theory:music
course:applicaton:twohours

on 14:00 the course of Marco 1 is :
course:theory:programmation
course:applicaton:onehours

on 14:00 the course of Marco 2 is :
course:theory:philosophy
course:applicaton:nothing

Using awk commands, I managed to filter it:

# The input spells the label "applicaton", so the second pattern must use that spelling to match.
awk -F '[ :]' '/the course of/{h=$2;m=$3} /theory/{print h":"m" theory:"$3}' f.txt
awk -F '[ :]' '/the course of/{h=$2;m=$3} /applicaton/{print h":"m" application:"$3}' f.txt

10:00 theory:nothing
14:00 theory:nothing

10:00 application:onehour
14:00 application:twohours

Now I would like to improve the filter by adding the names (Jany, Marco) and the versions (1 or 2), as shown below.

Jany 1,10:00,14:00
theory,nothing,nothing
application,onehour,twohours

Jany 2,10:00,14:00
theory,math,music
application,twohour,twohours

Marco 1,10:00,14:00
theory,geo,programmation
application,halfhour,onehours

Marco 2,10:00,14:00
theory,history,philosophy
application,nothing,nothing

I am stuck on how to extract the name and number, and on how to collect the information about their courses into a sorted and filtered table.

Recommended answer

Try this:

BEGIN {
    # set records separated by empty lines
    RS=""
    # set fields separated by newline, each record has 3 fields
    FS="\n"
}
{
    # remove undesired parts of every first line of a record
    sub("the course of ", "", $1)
    sub(" is :", "", $1)
    sub("^on ", "", $1)
    # now $1 looks like "10:00 jany 1"; store it in time and course
    time=$1
    course=$1
    # remove the time from the string to extract the course title
    sub("^[^ ]* ", "", course)
    # remove everything from the first space onward to keep only the time
    # (this avoids a trailing space and avoids using course as a regex)
    sub(" .*$", "", time)
    # get theory info from second line per record
    sub("course:theory:", "", $2)
    # get application info from third line
    sub("course:applicaton:", "", $3)
    # if new course
    if (! (course in header)) {
        # save header information (first words of each line in output)
        header[course] = course
        theory[course] = "theory"
        app[course] = "application"
    }
    # append the relevant info to the output strings
    header[course] = header[course] "," time
    theory[course] = theory[course] "," $2
    app[course] = app[course] "," $3

}
END {
    # now for each course found
    # (for-in visits keys in an unspecified order, so the blocks may not come out sorted)
    for (key in header) {
        # print the strings constructed
        print header[key]
        print theory[key]
        print app[key]
        print ""
    }
}

I hope the comments are self-explanatory. If you have questions about the script, be sure to ask.
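One caveat worth noting: awk's for (key in array) loop does not guarantee any particular order, so the per-course blocks may come out in a different order than they appear in the file. The following is a minimal sketch of one way around that, assuming first-seen order is what you want. The names order and n are introduced here for illustration only, and the inline sample is a shortened copy of the input from the question (including its "applicaton" spelling). It also uses split() on the header line instead of the chain of sub() calls, which is a swapped-in alternative and not the approach of the script above.

```shell
#!/bin/sh
# Shortened copy of the question's input (assumption: same layout,
# records separated by one blank line, label spelled "applicaton").
input='on 10:00 the course of jany 1 is :
course:theory:nothing
course:applicaton:onehour

on 10:00 the course of Marco 1 is :
course:theory:geo
course:applicaton:halfhour

on 14:00 the course of jany 1 is :
course:theory:nothing
course:applicaton:twohours

on 14:00 the course of Marco 1 is :
course:theory:programmation
course:applicaton:onehours'

result=$(printf '%s\n' "$input" | awk '
BEGIN {
    RS = ""      # paragraph mode: records are separated by blank lines
    FS = "\n"    # each line of a record is one field
}
{
    # $1 looks like "on 10:00 the course of jany 1 is :"
    split($1, w, " ")
    time   = w[2]            # e.g. "10:00"
    course = w[6] " " w[7]   # e.g. "jany 1"

    # strip the labels, keeping only the values
    sub(/^course:theory:/, "", $2)
    sub(/^course:applicaton:/, "", $3)

    if (!(course in header)) {
        order[++n] = course  # remember the order of first appearance
        header[course] = course
        theory[course] = "theory"
        app[course]    = "application"
    }
    header[course] = header[course] "," time
    theory[course] = theory[course] "," $2
    app[course]    = app[course] "," $3
}
END {
    # walk the courses by index, so the output order is deterministic
    for (i = 1; i <= n; i++) {
        c = order[i]
        print header[c]
        print theory[c]
        print app[c]
        if (i < n) print ""
    }
}')

printf '%s\n' "$result"
```

With the sample above this prints the "jany 1" block first and the "Marco 1" block second, each with its 10:00 and 14:00 values in file order. Names keep the capitalization used in the input, so "jany" is not turned into "Jany". To run it on a real file, replace the printf pipe with the file name as the last argument to awk.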
