PowerShell script that finds and changes hyperlinks in Word, saves the document, and creates a new copy as PDF

Problem description

I am trying to figure out how to write a script that goes through a folder, opens every Word document in it, searches for a hyperlink, and changes that link to another one. The script should then save the Word document and also create a second copy of it converted to PDF.

How can I adjust the script below so that it grabs all Word documents in a folder and changes every hyperlink that points to "https://www.yahoo.com" into "https://www.google.com"? It needs to loop through the entire document, checking ALL hyperlinks, then save the document and export a new PDF from it.

Is this possible?

What I have so far:

    $word = New-Object -ComObject word.application
        $document = $word.documents.open("path to folder")
        $hyperlinks = @($document.Hyperlinks) 
        $hyperlinks | ForEach {
            $newURI = ([uri]$_.address).AbsoluteUri
            Write-Verbose ("Updating {0} to {1}" -f $_.Address,$newURI) -Verbose
            $_.address = $newURI
        }
        If (_$.address -eq "https://www.yahoo.com/") {
            $_.Address = "https://www.google.com/"
        } ElseIf ($_.Address -eq "http://def.com/motorists") {
            $_.Address = "http://hij.com/"
        }
        $document.save()
        $word.quit()

    Get-ChildItem -Path $document -Include *.doc, *.docx -Recurse |
        ForEach-Object {
            $doc = $word.Documents.Open($_.Fullname)
            $pdf = $_.FullName -replace $_.Extension, '.pdf'
            $doc.ExportAsFixedFormat($pdf,17)
            $doc.Close()
        }
    $word.Quit()

I am new to PowerShell. Could someone please help walk me through these steps? I hear PowerShell is probably the best and strongest language for getting this sort of thing done.

Recommended answer

I hadn't done this before, so it was nice to figure it out. We both get to learn today! You were very close. It just needed a few adjustments and a loop for handling multiple files. I'm sure someone more knowledgeable will drop in, but this should get you the desired result.

$NewDomain1 = "google"
$NewDomain2 = "hij"
$OurDocuments = Get-ChildItem -Path "C:\Apps\testing" -Filter "*.doc*" -Recurse

$Word = New-Object -ComObject word.application
$Word.Visible = $false

$OurDocuments | ForEach-Object {
    $Document = $Word.documents.open($_.FullName)
    "Processing file: {0}" -f $Document.FullName
    $Document.Hyperlinks | ForEach-Object {
        if ($_.Address -like "https://www.yahoo.com/*") {
            # Swap the old domain for the one stored in $NewDomain1
            $NewAddress = $_.Address -Replace "yahoo",$NewDomain1
            "Updating {0} to {1}" -f $_.Address,$NewAddress
            $_.Address = $_.TextToDisplay = $NewAddress
        } elseif ($_.Address -like "http://def.com/*") {
            # Swap the old domain for the one stored in $NewDomain2
            $NewAddress = $_.Address -Replace "def",$NewDomain2
            "Updating {0} to {1}" -f $_.Address,$NewAddress
            $_.Address = $_.TextToDisplay = $NewAddress
        }
    }

    "Saving changes to {0}" -f $Document.Fullname
    $Document.Save()    

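    # -replace takes a regex, so escape the extension and anchor it to the end of the path
    $Pdf = $Document.FullName -replace ([regex]::Escape($_.Extension) + '$'), '.pdf'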
    "Saving document {0} as PDF {1}" -f $Document.Fullname,$Pdf
    $Document.ExportAsFixedFormat($Pdf,17)

    "Completed processing {0} `r`n" -f $Document.Fullname
    $Document.Close()
}

$Word.Quit()

Let's take a look...

We'll first move the new domain names into a couple of variables so they are easy to reference and change in the future. You can also add the addresses that you're looking for here, replacing the hard-coded strings as needed. The third line uses a filter to grab all .DOC and .DOCX files in the directory, which we'll iterate over. Note that "*.doc*" also matches other extensions that start with .doc, such as .docm. Personally, I would be careful with the -Recurse switch, as you run the risk of making unintended changes to a file deeper in the directory structure.

$NewDomain1 = "google"
$NewDomain2 = "hij"
$OurDocuments = Get-ChildItem -Path "C:\Apps\testing" -Filter "*.doc*" -Recurse
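If the broad filter or -Recurse worries you, here is a minimal sketch of a stricter file lookup. The function name Get-WordDocument is made up for this example and is not part of the original script. It assumes PowerShell 3.0 or later (for -File and -in), keeps only .doc and .docx, skips the "~$" owner files that Word creates next to an open document, and only descends into subfolders when you ask for it.

```powershell
# Hypothetical helper (not from the original answer): list only the Word
# documents we actually want to touch.
function Get-WordDocument {
    param(
        [Parameter(Mandatory)][string]$Path,
        [switch]$Recurse   # subfolders are opt-in rather than the default
    )

    Get-ChildItem -Path $Path -Filter '*.doc*' -File -Recurse:$Recurse |
        Where-Object {
            # Keep .doc/.docx only (drops .docm and friends) and skip
            # Word's temporary "~$" owner files.
            $_.Extension -in '.doc', '.docx' -and
            $_.Name -notlike '~$*'
        }
}

# Usage (assumed path, same as in the answer):
# $OurDocuments = Get-WordDocument -Path 'C:\Apps\testing'
```

The result can be piped into the same ForEach-Object loop, since it still yields file info objects with FullName and Extension.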

Instantiate the Word COM object and keep it hidden from view.

$Word = New-Object -ComObject word.application
$Word.Visible = $false

Stepping into our ForEach-Object loop...

For each document that we gathered in $OurDocuments, we open it and pipe its hyperlinks into another ForEach-Object, where we check the value of the Address property. If it matches one of the addresses we're looking for, we update the property with the new value. You'll notice that we also update the TextToDisplay property. This is the text that you see in the document, as opposed to Address, which controls where the hyperlink actually goes.

This... $_.Address = $_.TextToDisplay = $NewAddress ...is an example of chained (multiple) assignment. Since Address and TextToDisplay will be set to the same value, we assign them in a single statement.

$Document = $Word.documents.open($_.FullName)
"Processing file: {0}" -f $Document.FullName
$Document.Hyperlinks | ForEach-Object {
    if ($_.Address -like "https://www.yahoo.com/*") {
        # Swap the old domain for the one stored in $NewDomain1
        $NewAddress = $_.Address -Replace "yahoo",$NewDomain1
        "Updating {0} to {1}" -f $_.Address,$NewAddress
        $_.Address = $_.TextToDisplay = $NewAddress
    } elseif ($_.Address -like "http://def.com/*") {
        # Swap the old domain for the one stored in $NewDomain2
        $NewAddress = $_.Address -Replace "def",$NewDomain2
        "Updating {0} to {1}" -f $_.Address,$NewAddress
        $_.Address = $_.TextToDisplay = $NewAddress
    }
}
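One thing to be aware of: -Replace swaps every occurrence of the pattern, so a path that happens to contain "yahoo" or "def" further along would be changed too. Below is a rough sketch of a prefix-based alternative that also avoids adding an elseif branch per site. The function name Convert-LinkAddress and the $Rules table are assumptions made for this illustration; they are not in the script above.

```powershell
# Hypothetical rule table: old URL prefix -> new URL prefix.
$Rules = [ordered]@{
    'https://www.yahoo.com/' = 'https://www.google.com/'
    'http://def.com/'        = 'http://hij.com/'
}

# Hypothetical helper: returns the rewritten address, or the original
# address unchanged when no rule applies.
function Convert-LinkAddress {
    param(
        [string]$Address,
        [System.Collections.IDictionary]$Rules
    )

    foreach ($old in $Rules.Keys) {
        if ($Address.StartsWith($old, [System.StringComparison]::OrdinalIgnoreCase)) {
            # Replace only the matching prefix and keep the rest of the URL.
            return $Rules[$old] + $Address.Substring($old.Length)
        }
    }
    return $Address
}

Convert-LinkAddress -Address 'https://www.yahoo.com/news/yahoo-story' -Rules $Rules
```

Inside the hyperlink loop you would compare the result with $_.Address and only assign Address and TextToDisplay when the two differ.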

Save any changes made...

"Saving changes to {0}" -f $Document.Fullname
$Document.Save()    

Here we build the new file name used when saving as a PDF. Notice $_.Extension in our first line. We switch to the pipeline object for the file extension because the current pipeline object is still the file info object from our Get-ChildItem. Since the $Document object doesn't have an extension property, you'd otherwise have to do some slicing of the file name to get the same result. Because -replace treats its pattern as a regular expression, we escape the extension and anchor it to the end of the path so that only the real extension is swapped.

$Pdf = $Document.FullName -replace ([regex]::Escape($_.Extension) + '$'), '.pdf'
"Saving document {0} as PDF {1}" -f $Document.Fullname,$Pdf
$Document.ExportAsFixedFormat($Pdf,17)
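As an alternative for building the PDF path, the .NET method [System.IO.Path]::ChangeExtension does the same job without any regex involved. The sample path below is made up for illustration.

```powershell
# Example input path (an assumption for this sketch).
$Source = 'C:\Apps\testing\report.v2.docx'

# ChangeExtension replaces only the final extension of the path.
$Pdf = [System.IO.Path]::ChangeExtension($Source, '.pdf')
$Pdf
```

In the loop you could pass $_.FullName as the first argument and hand the result to ExportAsFixedFormat, where 17 is the value of Word's wdExportFormatPDF constant.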

Close the document, and the loop moves on to the next file in $OurDocuments.

"Completed processing {0} `r`n" -f $Document.Fullname
$Document.Close()

Once we have run through all the documents, we close Word.

$Word.Quit()
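If the script throws partway through, Quit() is never reached and a hidden WINWORD.EXE can be left running. Here is a minimal sketch of a cleanup helper you could call from a finally block. The name Close-WordSession is invented for this example; the Marshal calls are standard .NET.

```powershell
# Hypothetical helper (not part of the original answer): quit Word and
# release the COM reference held by PowerShell.
function Close-WordSession {
    param($Word)

    if ($null -eq $Word) { return }
    try {
        $Word.Quit()
    }
    finally {
        # Only real COM objects can be released; the check keeps the
        # helper safe to call with anything else.
        if ([System.Runtime.InteropServices.Marshal]::IsComObject($Word)) {
            [void][System.Runtime.InteropServices.Marshal]::ReleaseComObject($Word)
        }
    }
}

# Usage sketch:
# $Word = New-Object -ComObject word.application
# try     { <# open, edit, save and export the documents here #> }
# finally { Close-WordSession $Word }
```

Whether the Word process exits right away can still depend on other outstanding references (for example a $Document that was never closed), so treat this as a starting point rather than a guarantee.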

I hope that all makes sense!
