How do you use PowerShell to extract Epub metadata (XML)?

Question

I'm not new to PowerShell, but I am new to XML parsing. I want to extract the title, creator, and publisher from an OPF file, which is just an XML file. The file below is from Moby-Dick in Google's EPUB 3 sample collection.

<?xml version="1.0" encoding="UTF-8"?>
<package xmlns="http://www.idpf.org/2007/opf" version="3.0" xml:lang="en" unique-identifier="pub-id" prefix="cc: http://creativecommons.org/ns#">
    <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
        <dc:title id="title">Moby-Dick</dc:title>
        <meta refines="#title" property="title-type">main</meta>
        <dc:creator id="creator">Herman Melville</dc:creator>
        <meta refines="#creator" property="file-as">MELVILLE, HERMAN</meta>
        <meta refines="#creator" property="role" scheme="marc:relators">aut</meta>
        <dc:identifier id="pub-id">code.google.com.epub-samples.moby-dick-basic</dc:identifier>
        <dc:language>en-US</dc:language>
        <meta property="dcterms:modified">2012-01-18T12:47:00Z</meta>
        <dc:publisher>Harper &amp; Brothers, Publishers</dc:publisher>
        <dc:contributor id="contrib1">Dave Cramer</dc:contributor>
        <meta refines="#contrib1" property="role" scheme="marc:relators">mrk</meta>
        <dc:rights>This work is shared with the public using the Attribution-ShareAlike 3.0 Unported (CC BY-SA 3.0) license.</dc:rights>        
        <link rel="cc:license" href="http://creativecommons.org/licenses/by-sa/3.0/"/>
        <meta property="cc:attributionURL">http://code.google.com/p/epub-samples/</meta>
    </metadata>
</package>

I tried:

[xml]$opf = gc path/to/package.opf
$opf.package.metadata

I'm only able to get the tag and attribute information with this and not the text.

Recommended answer

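You need to use the #text property to get some of the values. Elements that carry attributes (here dc:title and dc:creator, which both have an id) come back as XML element objects, so their text is exposed through '#text'. dc:publisher has no attributes, so PowerShell returns its text directly as a string: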

[xml] $opf = gc .\moby.opf

$title = $opf.package.metadata.title.'#text'
$creator = $opf.package.metadata.creator.'#text'
$publisher = $opf.package.metadata.publisher

Write-Host "$title written by $creator and published by $publisher"
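If switching between '#text' and plain property access feels fragile, one alternative is to select the nodes with namespace-aware XPath and read InnerText, which behaves the same whether or not an element has attributes. This is a different technique from the one in the answer above, shown only as a minimal sketch. The prefix names opf and dc and the variable names are arbitrary choices for illustration, and the OPF content is inlined (trimmed to the three fields of interest) so the snippet runs on its own.

```powershell
# Trimmed copy of the sample OPF, inlined so the sketch is self-contained.
# In practice you would load the file instead: [xml]$opf = Get-Content .\moby.opf
$opfText = @'
<?xml version="1.0" encoding="UTF-8"?>
<package xmlns="http://www.idpf.org/2007/opf" version="3.0" unique-identifier="pub-id">
    <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
        <dc:title id="title">Moby-Dick</dc:title>
        <dc:creator id="creator">Herman Melville</dc:creator>
        <dc:identifier id="pub-id">code.google.com.epub-samples.moby-dick-basic</dc:identifier>
        <dc:publisher>Harper &amp; Brothers, Publishers</dc:publisher>
    </metadata>
</package>
'@

$opf = [xml]$opfText

# Map prefixes (our own choice of names) to the namespace URIs used in the file.
$ns = New-Object System.Xml.XmlNamespaceManager($opf.NameTable)
$ns.AddNamespace('opf', 'http://www.idpf.org/2007/opf')
$ns.AddNamespace('dc',  'http://purl.org/dc/elements/1.1/')

# InnerText returns the element's text regardless of whether it has attributes.
$title     = $opf.SelectSingleNode('/opf:package/opf:metadata/dc:title', $ns).InnerText
$creator   = $opf.SelectSingleNode('/opf:package/opf:metadata/dc:creator', $ns).InnerText
$publisher = $opf.SelectSingleNode('/opf:package/opf:metadata/dc:publisher', $ns).InnerText

Write-Output "$title written by $creator and published by $publisher"
```

SelectSingleNode returns $null when an element is missing, so a real script would probably check for that before reading InnerText.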
