How do I extract and parse quoted strings in Perl?
Problem description
Good day.
My text file, tmp.txt (a very large file), contains lines like these:
constant fixup private AlarmFileName = <A "C:\\TMP\\ALARM.LOG"> /* A Format */
constant fixup ConfigAlarms = <U1 0> /* U1 Format */
constant fixup ConfigEvents = <U2 0> /* U2 Format */
My parsing code is below.
It can't handle the quoted string "C:\\TMP\\ALARM.LOG".
I don't know how to change the "\s+([a-zA-Z0-9])+>" part of the regex so that it handles both a plain [a-zA-Z0-9] value (the 0 above) and a quoted string ("C:\\TMP\\ALARM.LOG" above).
$source_file = "tmp.txt";
$dest_xml_file = "my.xml";
# Open the input and output files
open(SOURCE_FILE, "$source_file") || die "Failed to open file $source_file";
open(DEST_XML_FILE, ">$dest_xml_file") || die "Could not open output file $dest_xml_file";
$x = 0;
print DEST_XML_FILE "<!-- from tmp.txt-->\n";
while (<SOURCE_FILE>)
{
&ConstantParseAndPrint;
}
sub ConstantParseAndPrint
{
if ($x == 0)
{
if(/^\s*(constant)\s*(fixup|\/\*fixup\*\/|)\s*(private|)\s*(\w+)\s+=\s+<([a-zA-Z0-9]+)\s+([a-zA-Z0-9])+>\s*(\/\*\s*(.*?)\s*\*\/|)(\r|\n|\s)/)
{
$name1 = $1;
$name2 = $2;
$name3 = $3;
$name4 = $4;
$name5 = $5;
$name6 = $6;
$name7 = $7;
printf DEST_XML_FILE "\t\t$name1";
printf DEST_XML_FILE "\t\t$name2";
printf DEST_XML_FILE "\t\t$name3";
printf DEST_XML_FILE "\t\t$name4";
printf DEST_XML_FILE "\t\t$name5";
printf DEST_XML_FILE "\t\t$name6";
printf DEST_XML_FILE "\t\t$name7";
$x = 1;
}
}
}
Thank you for your input.
**Hello,
Thanks for so many great solutions. I am a newbie, and I would like to study this further based on your posts.
Thank you very much.**
Recommended answer
#!/usr/bin/perl
$source_file = "tmp.txt";
$dest_xml_file = "my.xml";
# Open the input and output files
open(SOURCE_FILE, "$source_file") || die "Failed to open file $source_file";
open(DEST_XML_FILE, ">$dest_xml_file") || die "Could not open output file $dest_xml_file";
$x = 0;
print DEST_XML_FILE "<!-- from tmp.txt-->\n";
while (<SOURCE_FILE>)
{
&ConstantParseAndPrint;
}
sub ConstantParseAndPrint
{
if ($x == 0)
{
# if(/^\s*(constant)\s*(fixup|\/\*fixup\*\/|)\s*(private|)\s*(\w+)\s+=\s+<([a-zA-Z0-9]+)\s+([a-zA-Z0-9])+>\s*(\/\*\s*(.*?)\s*\*\/|)(\r|\n|\s)/)
if(/^\s*(constant)\s*(fixup|\/\*fixup\*\/|)\s*(private|)\s*(\w+)\s+=\s+<([a-zA-Z0-9]+)\s+(["']?)([a-zA-Z0-9.:\\]+)\6>\s*(\/\*\s*(.*?)\s*\*\/|)(\r|\n|\s)/)
{
$name1 = $1;
$name2 = $2;
$name3 = $3;
$name4 = $4;
$name5 = $5;
$name6 = $7;
$name7 = $8;
# print, not printf: the captured text is data, not a format string
print DEST_XML_FILE "\t\t$name1";
print DEST_XML_FILE "\t\t$name2";
print DEST_XML_FILE "\t\t$name3";
print DEST_XML_FILE "\t\t$name4";
print DEST_XML_FILE "\t\t$name5";
print DEST_XML_FILE "\t\t$name6";
print DEST_XML_FILE "\t\t$name7\n";
# $x = 1;
}
}
}
Use the following parse code:
if(/^\s*(constant)\s*(fixup|\/\*fixup\*\/|)\s*(private|)\s*(\w+)\s+=\s+<([a-zA-Z0-9]+)\s+(["']?)([a-zA-Z0-9.:\\]+)\6>\s*(\/\*\s*(.*?)\s*\*\/|)(\r|\n|\s)/)
I have added handling for both single and double quotes, using a backreference (\6) so that the closing quote must be the same character as the opening one. I have also widened the character class for the value so that it can hold a path: it now includes the colon (:), dot (.), and backslash (\) along with alphanumeric characters. Because the optional quote is a new capture group, the value is now in $7 and the comment in $8.
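To see the quote backreference in isolation, here is a minimal sketch that wraps the same pattern in a small function and runs it on the three sample lines from the question. The name parse_constant and the /x layout are assumptions made for illustration; they are not part of the original answer.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): returns the captured
# fields of one "constant" line, or an empty list if the line does not match.
sub parse_constant {
    my ($line) = @_;
    return unless $line =~ m{
        ^\s*(constant)               # group 1: keyword
        \s*(fixup|/\*fixup\*/|)      # group 2: optional fixup marker
        \s*(private|)                # group 3: optional private keyword
        \s*(\w+)                     # group 4: constant name
        \s+=\s+<([a-zA-Z0-9]+)       # group 5: format code such as A or U1
        \s+(["']?)                   # group 6: optional opening quote
        ([a-zA-Z0-9.:\\]+)           # group 7: value, may look like a path
        \6>                          # closing quote must equal group 6
        \s*(/\*\s*(.*?)\s*\*/|)      # group 8: optional trailing comment
        (\r|\n|\s)                   # group 10: trailing whitespace
    }x;
    return ($1, $2, $3, $4, $5, $7, $8);
}

# A single-quoted heredoc keeps the backslashes exactly as typed.
my @lines = split /^/, <<'END';
constant fixup private AlarmFileName = <A "C:\\TMP\\ALARM.LOG"> /* A Format */
constant fixup ConfigAlarms = <U1 0> /* U1 Format */
constant fixup ConfigEvents = <U2 0> /* U2 Format */
END

for my $line (@lines) {
    my @fields = parse_constant($line) or next;
    print join("\t", @fields), "\n";
}
```

For the first line the sixth field comes back without its surrounding quotes, and a value whose opening and closing quotes differ (for example "abc') is rejected, because \6 has to repeat whatever group 6 captured.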