Format and filter file to CSV table
Problem description
I have a file that contains many log entries:
PS: this question is inspired by a previous question, but slightly extended.
at 10:00 carl 1 STR0 STR1 STR2 STR3 <STR4 STR5> [STR6 STR7] STR8:
academy/course1:oftheory:SMTGHO:nothing:
academy/course1:ofapplicaton:SMTGHP:onehour:
at 10:00 carl 2 STR0 STR1 STR2 STR3 <STR4 STR78> [STR6 STR111] STR8:
academy/course2:oftheory:SMTGHM:math:
academy/course2:ofapplicaton:SMTGHN:twohour:
at 10:00 david 1 STR0 STR1 STR2 STR3 <STR4 STR758> [STR6 STR155] STR8:
academy/course3:oftheory:SMTGHK:geo:
academy/course3:ofapplicaton:SMTGHL:halfhour:
at 10:00 david 2 STR0 STR1 STR2 STR3 <STR4 STR87> [STR6 STR74] STR8:
academy/course4:oftheory:SMTGH:SMTGHI:history:
academy/course4:ofapplicaton:SMTGHJ:nothing:
at 14:00 carl 1 STR0 STR1 STR2 STR3 <STR4 STR11> [STR6 STR784] STR8:
academy/course5:oftheory:SMTGHG:nothing:
academy/course5:ofapplicaton:SMTGHH:twohours:
at 14:00 carl 2 STR0 STR1 STR2 STR3 <STR4 STR86> [STR6 STR85] STR8:
academy/course6:oftheory:SMTGHE:music:
academy/course6:ofapplicaton:SMTGHF:twohours:
at 14:00 david 1 STR0 STR1 STR2 STR3 <STR4 STR96> [STR6 STR01] STR8:
academy/course7:oftheory:SMTGHC:programmation:
academy/course7:ofapplicaton:SMTGHD:onehours:
at 14:00 david 2 STR0 STR1 STR2 STR3 <STR4 STR335> [STR6 STR66] STR8:
academy/course8:oftheory:SMTGHA:philosophy:
academy/course8:ofapplicaton:SMTGHB:nothing:
I tried the code below, but in vain:
BEGIN {
    # set records separated by empty lines
    RS=""
    # set fields separated by newline, each record has 3 fields
    FS="\n"
}
{
    # remove undesired parts of every first line of a record
    sub("at ", "", $1)
    # now store the rest in time and course
    time=$1
    course=$1
    # remove time from string to extract the course title
    sub("^[^ ]* ", "", course)
    # remove course title to retrieve time from string
    sub(course, "", time)
    # get theory info from second line per record
    sub("course:theory:", "", $2)
    # get application info from third line
    sub("course:applicaton:", "", $3)
    # if new course
    if (! (course in header)) {
        # save header information (first words of each line in output)
        header[course] = course
        theory[course] = "theory"
        app[course] = "application"
    }
    # append the relevant info to the output strings
    header[course] = header[course] "," time
    theory[course] = theory[course] "," $2
    app[course] = app[course] "," $3
}
END {
    # now for each course found
    for (key in header) {
        # print the strings constructed
        print header[key]
        print theory[key]
        print app[key]
        print ""
    }
}
Is there any way to get rid of the STR* and SMTGH* strings in order to get this output:
carl 1,10:00,14:00
applicaton,halfhour,onehours
theory,geo,programmation
carl 2,10:00,14:00
applicaton,nothing,nothing
theory,history,philosophy
david 1,10:00,14:00
applicaton,onehour,twohours
theory,nothing,nothing
david 2,10:00,14:00
applicaton,twohour,twohours
theory,math,music
Solution
GNU awk
awk -F: -v OFS=, '
    /^at/ {
        split($0, f, " ")
        time = f[2]
        course = f[3] " " f[4]
        times[course] = times[course] OFS time
    }
    $2 == "oftheory"     {th[course] = th[course] OFS $(NF-1)}
    $2 == "ofapplicaton" {ap[course] = ap[course] OFS $(NF-1)}
    END {
        PROCINFO["sorted_in"] = "@ind_str_asc"
        for (c in times) {
            printf "%s%s\n", c, times[c]
            printf "application%s\n", ap[c]
            printf "theory%s\n", th[c]
            print ""
        }
    }
' file
Output:

carl 1,10:00,14:00
application,onehour,twohours
theory,nothing,nothing

carl 2,10:00,14:00
application,twohour,twohours
theory,math,music

david 1,10:00,14:00
application,halfhour,onehours
theory,geo,programmation

david 2,10:00,14:00
application,nothing,nothing
theory,history,philosophy
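
The `$(NF-1)` expression in the answer relies on how `-F:` splits lines that end in a colon: the trailing `:` produces an empty final field, so the value of interest sits one field before the end, regardless of how many `SMTGH*` tokens precede it. A minimal sketch of that behaviour, using two lines from the sample log; `extract_value` is a hypothetical helper name introduced only for this illustration:

```shell
# extract_value is a hypothetical helper, not part of the original answer.
# It splits one log line on ":" and prints the second-to-last field.
# Because each line ends with ":", the last field ($NF) is empty and
# the wanted value is $(NF-1).
extract_value() {
    printf '%s\n' "$1" | awk -F: '{ print $(NF-1) }'
}

# Line with one SMTGH* token before the value
extract_value 'academy/course1:oftheory:SMTGHO:nothing:'

# Line with two SMTGH* tokens before the value
extract_value 'academy/course4:oftheory:SMTGH:SMTGHI:history:'
```

Counting from the end of the line rather than from the start is what lets the answer ignore the variable number of `SMTGH*` fields, as in the `course4` line.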