Parse XML using shell tools like grep, awk or sed
Question
I have the following XML and want to extract the value of the <name> tag based on the value of the <type> tag, that is, only when type == 'hosted'. I would like to do this with bash tools such as grep, sed and awk. I have extracted a single tag value unconditionally before, but never with a condition. I could easily do it in Python or any other programming language I know, but it would be ideal if it could be done in a shell script.
...
<repositories-item>
<name>hosted-npm</name>
<type>hosted</type>
</repositories-item>
<repositories-item>
<name>proxied-npm</name>
<type>proxied</type>
</repositories-item>
...
Recommended answer
xmlstarlet is a command-line XML toolkit that can express complex XSLT templates as a short sequence of command-line switches.
Suppose we are given a well-formed XML document, repos.xml:
<repositories>
<repositories-item>
<name>hosted-npm</name>
<type>hosted</type>
</repositories-item>
<repositories-item>
<name>proxied-npm</name>
<type>proxied</type>
</repositories-item>
</repositories>
If you run it through an XMLStarlet filter with the following switches:
$ cat repos.xml | xmlstarlet sel -t -m '//repositories-item' \
-i 'type="hosted"' -v 'name' -n
you will get a single line of output:
hosted-npm
Let's look at the XMLStarlet command line.
- We run the command in select mode, specified with the sel switch.
- We specify the selection template with the -t switch.
- We restrict processing to <repositories-item> elements by matching the //repositories-item XPath expression given with the -m switch.
- We keep only those elements whose type child element has the value "hosted", a condition given with the -i switch.
- We print the value of the name element, specified with the -v switch.
- After each printed value we emit a newline, specified with the -n switch.
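Since the question asks specifically about awk, the same match / test / print steps can be roughly approximated without XMLStarlet. The following is only a sketch, not a real XML parser: it assumes the exact layout shown in repos.xml (one element per line, no attributes, comments, CDATA or entities), and the function name hosted_names is a hypothetical helper introduced here for illustration.

```shell
#!/bin/sh
# Sketch only: line-oriented awk filter, NOT a general XML parser.
# Assumes each element sits on its own line, as in the sample repos.xml.
# hosted_names is a hypothetical helper name, not part of any tool.
hosted_names() {
    # Splitting on '<' and '>' makes $2 the tag name and $3 the text content.
    awk -F'[<>]' '
        $2 == "repositories-item"  { name = ""; type = "" }   # new record: reset state
        $2 == "name"               { name = $3 }              # remember the name
        $2 == "type"               { type = $3 }              # remember the type
        $2 == "/repositories-item" { if (type == "hosted") print name }
    ' "$@"
}
```

Calling hosted_names repos.xml on the sample document prints hosted-npm. The condition is evaluated at the closing tag, so the order of <name> and <type> inside an item does not matter. For anything beyond this simple layout, a real XML tool such as XMLStarlet remains the safer choice.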
Here is the equivalent XSLT generated by XMLStarlet:
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:exslt="http://exslt.org/common" version="1.0" extension-element-prefixes="exslt">
<xsl:output omit-xml-declaration="yes" indent="no"/>
<xsl:template match="/">
<xsl:for-each select="//repositories-item">
<xsl:choose>
<xsl:when test="type=&quot;hosted&quot;">
<xsl:call-template name="value-of-template">
<xsl:with-param name="select" select="name"/>
</xsl:call-template>
<xsl:value-of select="'&#10;'"/>
</xsl:when>
</xsl:choose>
</xsl:for-each>
</xsl:template>
<xsl:template name="value-of-template">
<xsl:param name="select"/>
<xsl:value-of select="$select"/>
<xsl:for-each select="exslt:node-set($select)[position()>1]">
<xsl:value-of select="'&#10;'"/>
<xsl:value-of select="."/>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
Per Charles Duffy's suggestion, it is worth noting that this XSLT stylesheet can be generated by XMLStarlet itself using the -C option:
xmlstarlet sel -C -t -m '//repositories-item' \
-i 'type="hosted"' -v 'name' -n > hosted-repos.xslt
The generated XSLT stylesheet can then be used directly with xsltproc:
cat repos.xml | xsltproc hosted-repos.xslt -