How do I extract and parse quoted strings in Perl?


Question

Good day.

The content of my text file, tmp.txt (a very large file), is below.

constant fixup private AlarmFileName = <A "C:\\TMP\\ALARM.LOG">  /* A Format */

constant fixup ConfigAlarms = <U1 0>         /*  U1 Format  */

constant fixup ConfigEvents = <U2 0>         /*  U2 Format  */

My parsing code is below. It can't handle the quoted string C:\\TMP\\ALARM.LOG. I don't know how to change the "\s+([a-zA-Z0-9])+>" part of the pattern so that it handles both a plain [a-zA-Z0-9] value (the 0 above) and a quoted string ("C:\TMP\ALARM.LOG" above).

$source_file = "tmp.txt";
$dest_xml_file = "my.xml";

#Open the input and output files
open(SOURCE_FILE, "$source_file") || die "Failed to open file $source_file";
open(DEST_XML_FILE, ">$dest_xml_file") || die "Could not open output file $dest_xml_file";

$x = 0;

print DEST_XML_FILE  "<!-- from tmp.txt-->\n";
while (<SOURCE_FILE>) 
{
    &ConstantParseAndPrint;

}

sub ConstantParseAndPrint
{
 if ($x == 0)
 {

     if(/^\s*(constant)\s*(fixup|\/\*fixup\*\/|)\s*(private|)\s*(\w+)\s+=\s+<([a-zA-Z0-9]+)\s+([a-zA-Z0-9])+>\s*(\/\*\s*(.*?)\s*\*\/|)(\r|\n|\s)/)
                {
                    $name1 = $1;
                    $name2 = $2;
                    $name3 = $3;
                    $name4 = $4;
                    $name5 = $5;
                    $name6 = $6;
                    $name7 = $7;
                    printf DEST_XML_FILE "\t\t$name1";
                    printf DEST_XML_FILE "\t\t$name2";
                    printf DEST_XML_FILE "\t\t$name3";
                    printf DEST_XML_FILE "\t\t$name4";
                    printf DEST_XML_FILE "\t\t$name5";
                    printf DEST_XML_FILE "\t\t$name6";
                    printf DEST_XML_FILE "\t\t$name7";
                    $x = 1;
  }
 }
}

Thank you for your input.

**Hello,

Thanks for so many great solutions. I am a newbie, and I would like to study further based on your posts.

Thank you very much.**

Recommended answer

#!/usr/bin/perl
use strict;
use warnings;

my $source_file   = "tmp.txt";
my $dest_xml_file = "my.xml";

# Open the input file for reading and the output file for writing
open(my $source_fh, '<', $source_file) || die "Failed to open file $source_file: $!";
open(my $dest_fh, '>', $dest_xml_file) || die "Could not open output file $dest_xml_file: $!";

print $dest_fh "<!-- from tmp.txt-->\n";
while (<$source_fh>)
{
    ConstantParseAndPrint();
}
close($source_fh);
close($dest_fh);

# Parses the current line in $_ and prints its fields when it matches.
# Every line is checked; the question's $x flag stopped after the first match.
sub ConstantParseAndPrint
{
    # Old pattern, which accepted only an unquoted alphanumeric value:
#   if(/^\s*(constant)\s*(fixup|\/\*fixup\*\/|)\s*(private|)\s*(\w+)\s+=\s+<([a-zA-Z0-9]+)\s+([a-zA-Z0-9])+>\s*(\/\*\s*(.*?)\s*\*\/|)(\r|\n|\s)/)
    # New pattern: $6 is the optional quote, and \6 requires the same closing quote
    if(/^\s*(constant)\s*(fixup|\/\*fixup\*\/|)\s*(private|)\s*(\w+)\s+=\s+<([a-zA-Z0-9]+)\s+(["']?)([a-zA-Z0-9.:\\]+)\6>\s*(\/\*\s*(.*?)\s*\*\/|)(\r|\n|\s)/)
    {
        # $6 holds the quote character, so the value is $7 and the comment is $8
        my ($name1, $name2, $name3, $name4, $name5, $name6, $name7) = ($1, $2, $3, $4, $5, $7, $8);
        # print rather than printf, so a % in the data is not read as a format directive
        print $dest_fh "\t\t$name1\t\t$name2\t\t$name3\t\t$name4\t\t$name5\t\t$name6\t\t$name7\n";
    }
}




Use the following parsing code:

if(/^\s*(constant)\s*(fixup|\/\*fixup\*\/|)\s*(private|)\s*(\w+)\s+=\s+<([a-zA-Z0-9]+)\s+(["']?)([a-zA-Z0-9.:\\]+)\6>\s*(\/\*\s*(.*?)\s*\*\/|)(\r|\n|\s)/) 

I have added handling for both single and double quotes. A back-reference (\6) makes the closing quote match the opening one. I have also updated the character class for the path: it now includes the colon (:), dot (.), and backslash (\) along with alphanumeric characters.
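To see how the back-reference behaves, here is a minimal sketch that runs the pattern above against the three sample lines from the question plus one extra line with mismatched quotes. The helper name parse_constant and the "Broken" sample line are made up for illustration and are not part of the original post.

```perl
use strict;
use warnings;

# The pattern from the answer, unchanged. Group 6 is the optional opening
# quote, \6 must repeat it before '>', and group 7 is the value itself.
my $re = qr/^\s*(constant)\s*(fixup|\/\*fixup\*\/|)\s*(private|)\s*(\w+)\s+=\s+<([a-zA-Z0-9]+)\s+(["']?)([a-zA-Z0-9.:\\]+)\6>\s*(\/\*\s*(.*?)\s*\*\/|)(\r|\n|\s)/;

# Hypothetical helper (not in the original post): returns the name, format,
# and value of a matching line, or an empty list when the line does not match.
sub parse_constant {
    my ($line) = @_;
    return unless $line =~ $re;
    return ($4, $5, $7);
}

# The three sample lines from the question, plus an invented line whose
# opening and closing quotes differ. The quoted heredoc keeps backslashes literal.
my @samples = split /^/, <<'END';
constant fixup private AlarmFileName = <A "C:\\TMP\\ALARM.LOG">  /* A Format */
constant fixup ConfigAlarms = <U1 0>         /*  U1 Format  */
constant fixup ConfigEvents = <U2 0>         /*  U2 Format  */
constant fixup Broken = <A "C:\\TMP\\X.LOG'>  /* mismatched quotes */
END

for my $line (@samples) {
    my @fields = parse_constant($line);
    print @fields ? join("\t", @fields) . "\n" : "no match\n";
}
```

The first three lines yield their name, format, and value, while the last one prints "no match" because \6 demands the same quote character that group 6 captured. Note that the character class still excludes spaces and underscores, so a quoted value containing either would not match without widening the class.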
