Extracting text from many .doc* files in a directory with PowerShell?
Hi!
This seems to be the best way to extract text from Word files from what I've read in other forums:
//
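# Start Word through COM and open a single document
$wd = New-Object -ComObject Word.Application
$doc = $wd.Documents.Open("c:\file.doc")
# Whole document as one string
$doc.Range().Text
# Or one string per paragraph
$doc.Range().Paragraphs | ForEach-Object { $_.Range.Text }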
//
But instead of extracting the text from one document, how do I iterate through all the Word documents in a directory, extract the text, and then combine the text from the multiple files into a new .txt/.csv file in a different directory? I am new to scripting, so my apologies if this is naive!
Give this a try:
$inFolder = "c:\test"
$outFile = "c:\out.txt"
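One way the rest of the script could look is sketched below. It is an assumption, not the original answer's code: the function name `Export-WordText` and its parameter names are made up for illustration, and it presumes Word is installed on the machine. It reuses the `Documents.Open` and `Range().Text` calls from the question, loops over every file matching `*.doc*` in the input folder, and writes the combined text to one output file.

```powershell
# Hypothetical helper (name and parameters are illustrative, not from the original answer).
function Export-WordText {
    param(
        [string]$InFolder = "c:\test",
        [string]$OutFile  = "c:\out.txt"
    )

    # Start Word invisibly through COM (requires Word to be installed).
    $wd = New-Object -ComObject Word.Application
    $wd.Visible = $false

    try {
        # *.doc* matches .doc, .docx and .docm; skip folders and Word's ~$ lock files.
        $files = Get-ChildItem -Path $InFolder -Filter '*.doc*' |
                 Where-Object { -not $_.PSIsContainer -and $_.Name -notlike '~$*' }

        # Collect the text of each document in turn.
        $text = foreach ($file in $files) {
            $doc = $wd.Documents.Open($file.FullName)
            try {
                $doc.Range().Text
            }
            finally {
                $doc.Close()
            }
        }

        # Write everything to a single output file.
        $text | Set-Content -Path $OutFile
    }
    finally {
        # Always shut Word down, even if a document failed to open.
        $wd.Quit()
        [void][System.Runtime.InteropServices.Marshal]::ReleaseComObject($wd)
    }
}
```

Calling `Export-WordText -InFolder "c:\test" -OutFile "c:\out.txt"` would then process the folder from the answer above. Word marks paragraph ends with a bare carriage return, so the output may need its line endings normalized before it displays well in some editors.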