How do I convert repeated XML nodes into a comma-delimited string using PowerShell?


Question

I have some 13,000 log files in XML format, and I need to convert all of them into a single spreadsheet/CSV file.

As you will see, I'm not a programmer, but I've tried.
I have written a PowerShell script that pulls out the first nodes and builds a comma-delimited string, but I am stuck on the last node, which can contain anything from no entries to dozens.

Example XML file:

<?xml version="1.0" encoding="utf-8"?>
<MigrationUserStatus>
  <User>username@domain.com</User>
  <StoreList>
    <EmailMigrationStatus>
      <MigrationStatus value="Success" />
      <FolderList>
        <TotalCount value="6" />
        <SuccessCount value="3" />
        <FailCount value="3" />
        <FailedMessages>
          <ErrorMessage>GDSTATUS_BAD_REQUEST:Permanent failure: BadAttachment</ErrorMessage>
          <SentTime>1601-01-01T00:00:00.000Z</SentTime>
          <ReceiveTime>1601-01-01T00:00:00.000Z</ReceiveTime>
        </FailedMessages>
        <FailedMessages>
          <ErrorMessage>GDSTATUS_BAD_REQUEST:Permanent failure: BadAttachment</ErrorMessage>
          <SentTime>1601-01-01T00:00:00.000Z</SentTime>
          <ReceiveTime>1601-01-01T00:00:00.000Z</ReceiveTime>
        </FailedMessages>
        <FailedMessages>
          <MessageSubject>Hey</MessageSubject>
          <ErrorMessage>GDSTATUS_BAD_REQUEST:Permanent failure: BadAttachment</ErrorMessage>
          <SentTime>2013-01-07T02:51:17.000Z</SentTime>
          <ReceiveTime>2013-01-07T02:51:17.000Z</ReceiveTime>
          <MessageSize value="2881" />
        </FailedMessages>
        <StartTime>2013-01-07T01:52:46.000Z</StartTime>
        <EndTime>2013-01-07T04:41:59.000Z</EndTime>
      </FolderList>
      <StartTime>2013-01-07T01:52:43.000Z</StartTime>
      <EndTime>2013-01-07T04:41:59.000Z</EndTime>
    </EmailMigrationStatus>
    <StartTime>2013-01-07T01:52:43.000Z</StartTime>
    <EndTime>2013-01-07T04:41:59.000Z</EndTime>
  </StoreList>
</MigrationUserStatus>

With this code I can easily build the first part of each CSV line:

$folder = "C:\temp"
$outfile = [IO.File]::OpenWrite("alluserslogs.csv")
$csv = "User,Total Emails, Successful emails,Failed emails,Failures`r`n"

dir Status-*.log | foreach {
[xml]$Status = Get-Content $_
$csvpt1 +=$Status.MigrationUserStatus.User + "," + $Status.MigrationUserStatus.StoreList.EmailMigrationStatus.FolderList.TotalCount.value + "," + $Status.MigrationUserStatus.StoreList.EmailMigrationStatus.FolderList.SuccessCount.value + "," + $Status.MigrationUserStatus.StoreList.EmailMigrationStatus.FolderList.FailCount.value

The next bit is where I'm coming unstuck. I want to read each FailedMessages node and build another comma-delimited string from them:

foreach ($FMessage in $Status.MigrationUserStatus.StoreList.EmailMigrationStatus.FolderList.FailedMessages) {
$csvpt2 +=$FMessage + ","
}

Desired output:

GDSTATUS_BAD_REQUEST:Permanent failu... 1601-01-01T00:00:00.000Z                1601-01-01T00:00:00.000Z,GDSTATUS_BAD_REQUEST:Permanent failu... 1601-01-01T00:00:00.000Z                1601-01-01T00:00:00.000Z,.......

I get either a blank $FMessage or a "Method invocation failed" error because of the + "," at the end, so I need this fixed.

Then I'll concatenate everything into one final string and write it to the file:

$csv +=$csvpt1 + "," + $csvpt2
$outfile.WriteLine($csv)
}
$outfile.Close()

As an added wish-list item, it would also be great to generate the Failures column headers for n columns, where n is the largest number of FailedMessages nodes found in any file.

Many thanks for your help.

Recommended answer

PowerShell has native support for XML, so maybe this will help get you started?

It also has a native CSV exporter, Export-Csv :)

# Load one log file as an XML document.
[xml]$XMLfile = Get-Content C:\Temp\migration.xml

# Create an empty record with one property per CSV column.
$MasterArray = "" | Select-Object User, Result, TotalEmails, SuccessfulEmails, FailedEmails, Failures

$MasterArray.User = $XMLfile.MigrationUserStatus.User
$MasterArray.Result = $XMLfile.MigrationUserStatus.StoreList.EmailMigrationStatus.MigrationStatus.value
$MasterArray.TotalEmails = $XMLfile.MigrationUserStatus.StoreList.EmailMigrationStatus.FolderList.TotalCount.value
$MasterArray.SuccessfulEmails = $XMLfile.MigrationUserStatus.StoreList.EmailMigrationStatus.FolderList.SuccessCount.value
$MasterArray.FailedEmails = $XMLfile.MigrationUserStatus.StoreList.EmailMigrationStatus.FolderList.FailCount.value

# Build one "error,sent,received" string per FailedMessages node.
$Failures = $XMLfile.MigrationUserStatus.StoreList.EmailMigrationStatus.FolderList.FailedMessages
$ConcatFailures = @()
foreach ($Failure in $Failures)
{
    # The element in the log is named ReceiveTime, not ReceivedTime.
    $ConcatFailures += $Failure.ErrorMessage + "," + $Failure.SentTime + "," + $Failure.ReceiveTime
}

# Join all failures into a single CSV field, separated by "|".
$MasterArray.Failures = $ConcatFailures -join "|"
$MasterArray
$MasterArray | Export-Csv -NoTypeInformation "C:\Temp\export.csv"
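
As for why the loop in the question fails: each $FMessage is a System.Xml.XmlElement rather than a string, so $FMessage + "," has no addition operator to call. Below is a minimal sketch of the alternative, reading the child elements and joining them. The inline $sample document and the $parts variable are made up for illustration; the element names come from the question's log.

```powershell
# Inline sample trimmed from the question's log; $sample is illustrative only.
[xml]$sample = @'
<FolderList>
  <FailedMessages>
    <ErrorMessage>BadAttachment</ErrorMessage>
    <SentTime>1601-01-01T00:00:00.000Z</SentTime>
    <ReceiveTime>1601-01-01T00:00:00.000Z</ReceiveTime>
  </FailedMessages>
  <FailedMessages>
    <ErrorMessage>BadAttachment</ErrorMessage>
    <SentTime>2013-01-07T02:51:17.000Z</SentTime>
    <ReceiveTime>2013-01-07T02:51:17.000Z</ReceiveTime>
  </FailedMessages>
</FolderList>
'@

# Each $FMessage is an XmlElement; its child elements are exposed as strings.
$parts = foreach ($FMessage in $sample.FolderList.FailedMessages) {
    $FMessage.ErrorMessage, $FMessage.SentTime, $FMessage.ReceiveTime -join ' '
}

# -join only places separators between items, so no trailing comma is left over.
$csvpt2 = $parts -join ','
$csvpt2
```

Collecting the pieces and joining them once also avoids the manual += "," bookkeeping from the question.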

For the other fields, you can check whether they exist and add them if they do. This should work:

foreach ($Failure in $Failures)
{
    # Only add the fields that are actually present on this failure.
    if ($Failure.ErrorMessage) { $ConcatFailures += $Failure.ErrorMessage }
    if ($Failure.SentTime) { $ConcatFailures += $Failure.SentTime }
    if ($Failure.ReceiveTime) { $ConcatFailures += $Failure.ReceiveTime }
    if ($Failure.MessageSubject) { $ConcatFailures += $Failure.MessageSubject }
    # MessageSize stores its number in a "value" attribute.
    if ($Failure.MessageSize) { $ConcatFailures += $Failure.MessageSize.value }
}
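
If the set of child elements is not known in advance, the per-field checks can be replaced by a loop over whatever children exist. This is only a sketch under that assumption: Get-FailureValue is a hypothetical helper name, and the rule "use the value attribute if present, otherwise the element text" is inferred from the sample log rather than from any documented schema.

```powershell
# Hypothetical helper: returns every value found under one FailedMessages node.
function Get-FailureValue {
    param([System.Xml.XmlElement]$Failure)

    foreach ($child in $Failure.ChildNodes) {
        # Skip comments and other non-element nodes.
        if ($child -isnot [System.Xml.XmlElement]) { continue }

        # Elements such as <MessageSize value="2881" /> keep their data in an attribute.
        if ($child.HasAttribute('value')) { $child.GetAttribute('value') }
        else { $child.InnerText }
    }
}

# Illustrative input copied from the third failure in the question's log.
[xml]$sample = @'
<FailedMessages>
  <MessageSubject>Hey</MessageSubject>
  <ErrorMessage>BadAttachment</ErrorMessage>
  <SentTime>2013-01-07T02:51:17.000Z</SentTime>
  <ReceiveTime>2013-01-07T02:51:17.000Z</ReceiveTime>
  <MessageSize value="2881" />
</FailedMessages>
'@

$values = Get-FailureValue $sample.FailedMessages
$line = $values -join ','
$line
```

The values come back in document order, so failures with different fields will produce strings of different shapes.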

To handle all of the XML files, add an outer loop that goes through each file and appends its data to an array that you build up as you go. This should do what you want, with some adjustments to the paths used:

# Adjust the folder and filter to match your logs (the question uses Status-*.log).
$XMLFiles = Get-ChildItem "C:\Temp\" -Filter "*.xml"
$MasterArray = @()

foreach ($File in $XMLFiles)
{
    # Use a separate variable for the parsed document: PowerShell variable names are
    # case-insensitive, so reusing the loop variable's name would overwrite it.
    [xml]$Xml = Get-Content $File.FullName

    # Create an empty record with one property per CSV column.
    $TempArray = "" | Select-Object User, Result, TotalEmails, SuccessfulEmails, FailedEmails, Failures

    $TempArray.User = $Xml.MigrationUserStatus.User
    $TempArray.Result = $Xml.MigrationUserStatus.StoreList.EmailMigrationStatus.MigrationStatus.value
    $TempArray.TotalEmails = $Xml.MigrationUserStatus.StoreList.EmailMigrationStatus.FolderList.TotalCount.value
    $TempArray.SuccessfulEmails = $Xml.MigrationUserStatus.StoreList.EmailMigrationStatus.FolderList.SuccessCount.value
    $TempArray.FailedEmails = $Xml.MigrationUserStatus.StoreList.EmailMigrationStatus.FolderList.FailCount.value

    $Failures = $Xml.MigrationUserStatus.StoreList.EmailMigrationStatus.FolderList.FailedMessages
    $ConcatFailures = @()

    foreach ($Failure in $Failures)
    {
        # Only add the fields that are actually present on this failure.
        if ($Failure.ErrorMessage) { $ConcatFailures += $Failure.ErrorMessage }
        if ($Failure.SentTime) { $ConcatFailures += $Failure.SentTime }
        if ($Failure.ReceiveTime) { $ConcatFailures += $Failure.ReceiveTime }
        if ($Failure.MessageSubject) { $ConcatFailures += $Failure.MessageSubject }
        # MessageSize stores its number in a "value" attribute.
        if ($Failure.MessageSize) { $ConcatFailures += $Failure.MessageSize.value }
    }
    $TempArray.Failures = $ConcatFailures -join "|"

    $MasterArray += $TempArray
}

$MasterArray
$MasterArray | Export-Csv -NoTypeInformation "C:\Temp\export.csv"
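
The wish-list item from the question (one column per failure, with as many columns as the largest failure count) is not covered by the code above. One possible approach is sketched below. The $rows data is a stand-in for records built from the log files, and the Failure1..FailureN column names are an assumption. Because the CSV cmdlets take the header from the first object they receive, every row is padded to the same set of columns.

```powershell
# Stand-in rows; in practice each one would be built from a single log file.
$rows = @(
    @{ User = 'a@domain.com'; Failures = @('err1', 'err2', 'err3') },
    @{ User = 'b@domain.com'; Failures = @() },
    @{ User = 'c@domain.com'; Failures = @('err4') }
)

# The widest row decides how many Failure columns the header needs.
$max = ($rows | ForEach-Object { $_.Failures.Count } | Measure-Object -Maximum).Maximum

$table = foreach ($row in $rows) {
    $obj = New-Object PSObject
    $obj | Add-Member -MemberType NoteProperty -Name User -Value $row.User
    for ($i = 0; $i -lt $max; $i++) {
        # Pad short rows with empty strings so every object has the same columns.
        $value = if ($i -lt $row.Failures.Count) { $row.Failures[$i] } else { '' }
        $obj | Add-Member -MemberType NoteProperty -Name "Failure$($i + 1)" -Value $value
    }
    $obj
}

# ConvertTo-Csv is used here so the result can be inspected; Export-Csv writes it to a file.
$csvLines = $table | ConvertTo-Csv -NoTypeInformation
$csvLines
```

Swapping the last step for `$table | Export-Csv -NoTypeInformation "C:\Temp\export.csv"` would write the same rows to disk.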
