Special timestamp format of CSV
Problem description
I have CSV timestamp data in this format:
8.11.2012 16:15:10
21.11.2012 15:00:54
11.11.2012 0:24:24
8.11.2012 16:06:53
9.11.2012 0:49:37
I want to apply a special timestamp format like this, with no single-digit fields:
08_11_2012_16_15_10
21_11_2012_15_00_54
11_11_2012_00_24_24
08_11_2012_16_06_53
I have tried with regex, search and replace, but got this:
8_11_2012_16_15_10
21_11_2012_15_00_54
11_11_2012_0_24_24
8_11_2012_16_06_53
Does anyone have another idea, maybe with shell awk?
Solution
You could do it in two passes. Find a character or short sequence of characters that never occurs in your data file; I will use =#= here. The first pass is very similar to what you have already tried, but it adds =#=0 before every number that should become a two-digit number. So 8.11.2012 16:15:10 is changed to =#=08_=#=011_2012_=#=016_=#=015_=#=010. The second pass removes the =#= and the unwanted zeros: search with the regular expression =#=0*(\d\d[^\d]) and replace with \1 (the captured group).
If the file contains only dates and times, you might be able to add the leading zeros to the text before doing the change that you have already tried. A regular expression search for \b(\d)\b with the replacement 0\1 converts any single digit to two digits. This has to happen before the separators become underscores: \b(\d)\b will not see _6_ as a single digit, because \b matches at word boundaries and _ is considered part of a word. Searching for ([^\d])(\d)([^\d]) and replacing with \10\2\3 does not work well: it may not handle the start and end of a line or file as wanted, and it would need to be run twice to process 6.5.2013, because the . between 6 and 5 is consumed by the first match and is no longer available to the second.
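Both approaches can be sketched in Python. This is only an illustration under some assumptions: each line holds exactly one timestamp of the form day.month.year hour:minute:second, the marker =#= never occurs in the data, and the function names two_pass and pad_then_replace are made up for this example. The second pass uses a lookahead, (?!\d), in place of the [^\d] from the answer so that the last field at the end of a line is matched as well.

```python
import re

MARKER = "=#="  # assumption: this sequence never occurs in the data


def two_pass(line: str) -> str:
    """Marker approach: tag and over-pad every field, then trim."""
    m = MARKER + "0"
    # Pass 1: switch the separators to "_" and put MARKER + "0" in front
    # of every field except the four-digit year.
    tagged = re.sub(
        r"^(\d+)\.(\d+)\.(\d+) (\d+):(\d+):(\d+)$",
        rf"{m}\g<1>_{m}\g<2>_\g<3>_{m}\g<4>_{m}\g<5>_{m}\g<6>",
        line,
    )
    # Pass 2: drop the marker plus any surplus zeros, keeping exactly the
    # last two digits of each tagged field.
    return re.sub(re.escape(MARKER) + r"0*(\d\d)(?!\d)", r"\g<1>", tagged)


def pad_then_replace(line: str) -> str:
    """Simpler approach: pad lone digits first, then swap the separators."""
    # "." ":" and " " are not word characters, so \b isolates lone digits.
    padded = re.sub(r"\b(\d)\b", r"0\g<1>", line)
    return re.sub(r"[.: ]", "_", padded)


if __name__ == "__main__":
    samples = [
        "8.11.2012 16:15:10",
        "21.11.2012 15:00:54",
        "11.11.2012 0:24:24",
        "8.11.2012 16:06:53",
        "9.11.2012 0:49:37",
    ]
    for s in samples:
        print(two_pass(s), pad_then_replace(s))
```

The order inside pad_then_replace matters: applying the \b(\d)\b substitution to 8_11_2012_0_24_24 changes nothing, because the underscores count as word characters.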