Use grep and awk to transfer data from .srt to .csv/xls

Question

I've got an interesting project to do! I'm thinking about converting an SRT file into a CSV/XLS file.

An SRT file looks like this:

1
00:00:00,104 --> 00:00:02,669
Hi, I'm shell-scripting.

2
00:00:02,982 --> 00:00:04,965
I'm not sure if it would work,
but I'll try it!

3
00:00:05,085 --> 00:00:07,321
There must be a way to do it!

I want to output it as a CSV file like this:

"1","00:00:00,104","00:00:02,669","Hi, I'm shell-scripting."
"2","00:00:02,982","00:00:04,965","I'm not sure if it would work"
,,,"but I'll try it!"
"3","00:00:05,085","00:00:07,321","There must be a way to do it!"

As you can see, a subtitle with two lines of text takes up two rows. My idea is to use grep to put the SRT data into the XLS file, and then use awk to format that file.

What do you guys think? How am I supposed to write it? I tried

$ grep filename.srt > filename.xls

It seems that all the data, including the time codes and the subtitle text, ended up in column A of the XLS file, but I want the text to be in column B. How can awk help with the formatting?

Thank you in advance! :)

Solution

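$ cat tst.awk
# Paragraph mode: records are separated by blank lines, one field per line.
BEGIN { RS=""; FS="\n"; OFS=","; q="\""; s=q OFS q }
{
    # $2 is "start --> end"; split it on the " --> " separator.
    split($2,a,/ .* /)
    # First row: index, start time, end time, first subtitle line, all quoted.
    print q $1 s a[1] s a[2] s $3 q
    # Any further subtitle lines get a row of their own with three empty cells.
    for (i=4;i<=NF;i++) {
        print "", "", "", q $i q
    }
}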

$ awk -f tst.awk file
"1","00:00:00,104","00:00:02,669","Hi, I'm shell-scripting."
"2","00:00:02,982","00:00:04,965","I'm not sure if it would work,"
,,,"but I'll try it!"
"3","00:00:05,085","00:00:07,321","There must be a way to do it!"
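
Two things may be worth guarding against with real subtitle files, so here is a minimal sketch of a variant rather than a replacement for the script above. It assumes the file may have Windows (CRLF) line endings, in which case the "blank" lines contain a carriage return and paragraph mode no longer splits the records, and it assumes subtitle text may contain double quotes, which CSV readers generally expect to be doubled. The file names sample.srt and sample.csv and the helper function csv() are made up for this illustration.

```shell
# Build a small CRLF-terminated test file (sample.srt is a hypothetical name).
printf '%s\r\n' \
    '1' '00:00:00,104 --> 00:00:02,669' 'She said "hi".' '' \
    '2' '00:00:02,982 --> 00:00:04,965' 'first line,' 'second line' > sample.srt

# Strip carriage returns first so that blank lines are really empty,
# then convert each blank-line-separated record to CSV rows.
tr -d '\r' < sample.srt | awk '
BEGIN { RS=""; FS="\n"; OFS="," }
# csv(): wrap a value in double quotes, doubling any embedded quotes.
function csv(v) { gsub(/"/, "\"\"", v); return "\"" v "\"" }
{
    split($2, t, / --> /)          # t[1] = start time, t[2] = end time
    print csv($1), csv(t[1]), csv(t[2]), csv($3)
    for (i = 4; i <= NF; i++)      # extra subtitle lines, one row each
        print "", "", "", csv($i)
}' > sample.csv

cat sample.csv
```

The only differences from tst.awk are the tr step in front and the csv() helper; the record and field handling is the same paragraph-mode approach.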
