XML Data extraction
Problem description
<Filer>
<ID>123456789</ID>
<Name>
<BusinessNameLine1>Stackoverflow</BusinessNameLine1>
</Name>
<NameControl>stack</NameControl>
<USAddress>
<AddressLine1>123 CHERRY HILL LANE</AddressLine1>
<City>LA</City>
<State>CA</State>
<ZIPCode>90210</ZIPCode>
</USAddress>
</Filer>
Here is a sample of the XML I was given. I need to pull a certain element out of it.
I simply need to extract every <BusinessNameLine1> from the file. The issue is that this tag appears multiple times throughout the file, but I only need to extract it if it falls inside the <Filer> tag.
I would do this with PHP, but I am at work and cannot run PHP code because I am not able to install software on my computer. I can execute bash files, however. The file is also extremely large, so I cannot open it in Excel. I have no idea how to do this and would appreciate some help or guidance on where to start.
Recommended answer
You could try this combination of awk and sed:
$ awk -v RS='</Filer>' '/^<Filer>/ {gsub (/\n/," "); print}' file | sed -r 's/.*<BusinessNameLine1>([^<]*)<\/BusinessNameLine1>.*/\1/g'
Stackoverflow
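The one-liner above depends on a multi-character RS, which POSIX does not guarantee, and on sed -r, which is a GNU option. It also only matches records that begin with <Filer>. If either of those is a problem on your machine, a line-by-line awk state machine may be easier to adapt. The following is only a sketch: it assumes the opening and closing BusinessNameLine1 tags sit on the same line, and the function name extract_filer_names and the sample data (including the Return and Preparer elements) are made up for illustration.

```shell
#!/bin/sh
# Print the text of every <BusinessNameLine1> found between <Filer> and </Filer>.
# Assumption: the opening and closing BusinessNameLine1 tags are on one line.
extract_filer_names() {
  awk '
    /<Filer>/ { in_filer = 1 }           # entering a Filer block
    in_filer && match($0, /<BusinessNameLine1>[^<]*<\/BusinessNameLine1>/) {
      name = substr($0, RSTART, RLENGTH) # the matched element only
      gsub(/<[^>]*>/, "", name)          # strip the surrounding tags
      print name
    }
    /<\/Filer>/ { in_filer = 0 }         # leaving the Filer block
  ' "$1"
}

# Hypothetical sample: one name inside <Filer>, one outside it.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<Return>
<Filer>
<Name>
<BusinessNameLine1>Stackoverflow</BusinessNameLine1>
</Name>
</Filer>
<Preparer>
<Name>
<BusinessNameLine1>Other Corp</BusinessNameLine1>
</Name>
</Preparer>
</Return>
EOF

extract_filer_names "$sample"
```

On the sample this prints only Stackoverflow, because the second name sits outside <Filer>. Since awk reads one line at a time, the size of the file should not matter much. If the real file puts everything on a single line, this sketch will not work as written and the record-separator approach is the better starting point.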