Automating replacing tables from external files

Question

I'm trying to replace multiple tables from a large (~300 MB) XML file with external XML files.

There are roughly 30,000 tables, and there are 23,000 XML files because some tables are left unchanged.

For example, if I have:

<?xml version="1.0" encoding="UTF-8"?>
<INI>
   <TABLE name="People">
      <ROW>
         <ID>1</ID>
         <Name><![CDATA[Bob]]></Name>
      </ROW>
   </TABLE>
   <TABLE name="Animals">
      <ROW>
         <ID>1</ID>
         <Name><![CDATA[Golden]]></Name>
      </ROW>
   </TABLE>
</INI>

I would have files called People.xml and Animals.xml that should be replaced.

If People.xml is:

   <TABLE name="People">
      <ROW>
         <ID>1</ID>
         <Name><![CDATA[Mary]]></Name>
      </ROW>
      <ROW>
         <ID>2</ID>
         <Name><![CDATA[Bob]]></Name>
      </ROW>
      <ROW>
         <ID>3</ID>
         <Name><![CDATA[Dan]]></Name>
      </ROW>
   </TABLE>

then the main large XML file would become:

<?xml version="1.0" encoding="UTF-8"?>
<INI>
   <TABLE name="People">
      <ROW>
         <ID>1</ID>
         <Name><![CDATA[Mary]]></Name>
      </ROW>
      <ROW>
         <ID>2</ID>
         <Name><![CDATA[Bob]]></Name>
      </ROW>
      <ROW>
         <ID>3</ID>
         <Name><![CDATA[Dan]]></Name>
      </ROW>
   </TABLE>
   <TABLE name="Animals">
      <ROW>
         <ID>1</ID>
         <Name><![CDATA[Golden]]></Name>
      </ROW>
   </TABLE>
</INI>

and then the same for Animals.xml.

I've tried looking into String.Split(), but I couldn't find a way to do it like that.

Any help is appreciated. Thanks in advance!

Recommended answer

You can take the basic logic for streaming an XmlReader to an XmlWriter from Mark Fussell's article "Combining the XmlReader and XmlWriter classes for simple streaming transformations" and use it to patch the contents of one XML file into another:

using System;
using System.Diagnostics;
using System.IO;
using System.Xml;
using System.Xml.Linq;

public abstract class XmlStreamingEditorBase
{
    readonly XmlReader reader;
    readonly XmlWriter writer;
    readonly Predicate<XmlReader> shouldTransform;

    public XmlStreamingEditorBase(XmlReader reader, XmlWriter writer, Predicate<XmlReader> shouldTransform)
    {
        this.reader = reader;
        this.writer = writer;
        this.shouldTransform = shouldTransform;
    }

    protected XmlReader Reader { get { return reader; } }

    protected XmlWriter Writer { get { return writer; } }

    public void Process()
    {
        while (Reader.Read())
        {
            if (Reader.NodeType == XmlNodeType.Element)
            {
                if (shouldTransform(Reader))
                {
                    EditCurrentElement();
                    continue;
                }
            }
            Writer.WriteShallowNode(Reader);
        }
    }

    protected abstract void EditCurrentElement();
}

public class XmlStreamingEditor : XmlStreamingEditorBase
{
    readonly Action<XmlReader, XmlWriter> transform;

    public XmlStreamingEditor(XmlReader reader, XmlWriter writer, Predicate<XmlReader> shouldTransform, Action<XmlReader, XmlWriter> transform)
        : base(reader, writer, shouldTransform)
    {
        this.transform = transform;
    }

    protected override void EditCurrentElement()
    {
        using (var subReader = Reader.ReadSubtree())
        {
            transform(subReader, Writer);
        }
    }
}

public class XmlStreamingPatcher
{
    readonly XmlReader patchReader;
    readonly XmlReader reader;
    readonly XmlWriter writer;
    readonly Predicate<XmlReader> shouldPatchFrom;
    readonly Func<XmlReader, XmlReader, bool> shouldPatchFromTo;
    bool patched = false;

    public XmlStreamingPatcher(XmlReader reader, XmlWriter writer, XmlReader patchReader, Predicate<XmlReader> shouldPatchFrom, Func<XmlReader, XmlReader, bool> shouldPatchFromTo)
    {
        if (reader == null || writer == null || patchReader == null || shouldPatchFrom == null || shouldPatchFromTo == null)
            throw new ArgumentNullException();
        this.reader = reader;
        this.writer = writer;
        this.patchReader = patchReader;
        this.shouldPatchFrom = shouldPatchFrom;
        this.shouldPatchFromTo = shouldPatchFromTo;
    }

    public bool Process()
    {
        patched = false;
        while (patchReader.Read())
        {
            if (patchReader.NodeType == XmlNodeType.Element)
            {
                if (shouldPatchFrom(patchReader))
                {
                    var editor = new XmlStreamingEditor(reader, writer, ShouldPatchTo, PatchNode);
                    editor.Process();
                    return patched;
                }
            }
        }
        return false;
    }

    bool ShouldPatchTo(XmlReader reader)
    {
        return shouldPatchFromTo(patchReader, reader);
    }

    void PatchNode(XmlReader reader, XmlWriter writer)
    {
        using (var subReader = patchReader.ReadSubtree())
        {
            while (subReader.Read())
            {
                writer.WriteShallowNode(subReader);
                patched = true;
            }
        }
    }
}

public static class XmlReaderExtensions
{
    public static XName GetElementName(this XmlReader reader)
    {
        if (reader == null)
            return null;
        if (reader.NodeType != XmlNodeType.Element)
            return null;
        // Use LocalName, not Name: Name includes any prefix ("p:TABLE"), which XName.Get rejects.
        string localName = reader.LocalName;
        string uri = reader.NamespaceURI;
        return XName.Get(localName, uri);
    }
}

public static class XmlWriterExtensions
{
    public static void WriteShallowNode(this XmlWriter writer, XmlReader reader)
    {
        // adapted from http://blogs.msdn.com/b/mfussell/archive/2005/02/12/371546.aspx
        if (reader == null)
            throw new ArgumentNullException("reader");

        if (writer == null)
            throw new ArgumentNullException("writer");

        switch (reader.NodeType)
        {
            case XmlNodeType.Element:
                writer.WriteStartElement(reader.Prefix, reader.LocalName, reader.NamespaceURI);
                writer.WriteAttributes(reader, true);
                if (reader.IsEmptyElement)
                {
                    writer.WriteEndElement();
                }
                break;

            case XmlNodeType.Text:
                writer.WriteString(reader.Value);
                break;

            case XmlNodeType.Whitespace:
            case XmlNodeType.SignificantWhitespace:
                writer.WriteWhitespace(reader.Value);
                break;

            case XmlNodeType.CDATA:
                writer.WriteCData(reader.Value);
                break;

            case XmlNodeType.EntityReference:
                writer.WriteEntityRef(reader.Name);
                break;

            case XmlNodeType.XmlDeclaration:
            case XmlNodeType.ProcessingInstruction:
                writer.WriteProcessingInstruction(reader.Name, reader.Value);
                break;

            case XmlNodeType.DocumentType:
                writer.WriteDocType(reader.Name, reader.GetAttribute("PUBLIC"), reader.GetAttribute("SYSTEM"), reader.Value);
                break;

            case XmlNodeType.Comment:
                writer.WriteComment(reader.Value);
                break;

            case XmlNodeType.EndElement:
                writer.WriteFullEndElement();
                break;

            default:
                Debug.WriteLine("unknown NodeType " + reader.NodeType);
                break;

        }
    }
}

To create XmlReader and XmlWriter instances that read and write XML files, use XmlReader.Create(string) and XmlWriter.Create(string). Also, be sure to stream the large file into a temporary file, and replace the original only after editing has finished.

Then, to test:

public static class TestXmlStreamingPatcher
{
    public static void Test()
    {
        string mainXml = @"<?xml version=""1.0"" encoding=""UTF-8""?>
<INI>
   <TABLE name=""People"">
      <ROW>
         <ID>1</ID>
         <Name><![CDATA[Bob]]></Name>
      </ROW>
   </TABLE>
   <TABLE name=""Animals"">
      <ROW>
         <ID>1</ID>
         <Name><![CDATA[Golden]]></Name>
      </ROW>
   </TABLE>
</INI>
";
        string patchXml = @"<TABLE name=""People"">
      <ROW>
         <ID>1</ID>
         <Name><![CDATA[Mary]]></Name>
      </ROW>
      <ROW>
         <ID>2</ID>
         <Name><![CDATA[Bob]]></Name>
      </ROW>
      <ROW>
         <ID>3</ID>
         <Name><![CDATA[Dan]]></Name>
      </ROW>
   </TABLE>
";
        var patchedXml1 = TestPatch(mainXml, patchXml);
        Debug.WriteLine(patchedXml1);
    }

    private static string TestPatch(string mainXml, string patchXml)
    {
        using (var mainReader = new StringReader(mainXml))
        using (var mainXmlReader = XmlReader.Create(mainReader))
        using (var patchReader = new StringReader(patchXml))
        using (var patchXmlReader = XmlReader.Create(patchReader))
        using (var mainWriter = new StringWriter())
        {
            using (var mainXmlWriter = XmlWriter.Create(mainWriter))
            {
                var patcher = new XmlStreamingPatcher(mainXmlReader, mainXmlWriter, patchXmlReader, ShouldPatchFrom, ShouldPatchFromTo);
                patcher.Process();
            }
            return mainWriter.ToString();
        }
    }

    static bool ShouldPatchFrom(XmlReader reader)
    {
        return reader.GetElementName() == "TABLE";
    }

    static bool ShouldPatchFromTo(XmlReader patchReader, XmlReader toReader)
    {
        if (patchReader.GetElementName() != toReader.GetElementName())
            return false;
        string name = patchReader.GetAttribute("name");
        if (string.IsNullOrEmpty(name))
            return false;
        return name == toReader.GetAttribute("name");
    }
}

The output of TestXmlStreamingPatcher.Test() is:

<?xml version="1.0" encoding="UTF-8"?>
<INI>
   <TABLE name="People">
      <ROW>
         <ID>1</ID>
         <Name><![CDATA[Mary]]></Name>
      </ROW>
      <ROW>
         <ID>2</ID>
         <Name><![CDATA[Bob]]></Name>
      </ROW>
      <ROW>
         <ID>3</ID>
         <Name><![CDATA[Dan]]></Name>
      </ROW>
   </TABLE>
   <TABLE name="Animals">
      <ROW>
         <ID>1</ID>
         <Name><![CDATA[Golden]]></Name>
      </ROW>
   </TABLE>
</INI>

This is what you want.
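
For the file-based case mentioned earlier (XmlReader.Create(string), XmlWriter.Create(string), and a temporary file that replaces the original only at the end), the following is a minimal sketch and not the XmlStreamingPatcher above. XmlStreamingPatcher applies one patch element per pass over the main file, so this variant instead makes a single pass and looks up a patch file for each table as it goes, using the built-in XmlWriter.WriteNode and XmlReader.Skip. It assumes each patch file is named after its table (People.xml for name="People") and has a single TABLE root, as in the question. The class name SinglePassTablePatcher, the method PatchAll, and the ".tmp" suffix are hypothetical, and node types that do not occur in the sample (DOCTYPE, entity references) are not handled.

```csharp
using System;
using System.IO;
using System.Xml;

public static class SinglePassTablePatcher
{
    // Hypothetical helper: streams mainPath into a temporary file, swapping in
    // "<table name>.xml" from patchDirectory wherever such a file exists.
    // Returns the number of tables replaced.
    public static int PatchAll(string mainPath, string patchDirectory)
    {
        string tempPath = mainPath + ".tmp"; // assumed temp-file naming
        int replaced = 0;

        using (var reader = XmlReader.Create(mainPath))
        using (var writer = XmlWriter.Create(tempPath))
        {
            reader.Read();
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element && reader.LocalName == "TABLE")
                {
                    string name = reader.GetAttribute("name");
                    string patchPath = string.IsNullOrEmpty(name)
                        ? null
                        : Path.Combine(patchDirectory, name + ".xml");

                    if (patchPath != null && File.Exists(patchPath))
                    {
                        using (var patchReader = XmlReader.Create(patchPath))
                        {
                            patchReader.MoveToContent();          // position on the patch's root element
                            writer.WriteNode(patchReader, false); // copy the replacement table
                        }
                        reader.Skip(); // drop the original table
                        replaced++;
                    }
                    else
                    {
                        writer.WriteNode(reader, false); // copy the unchanged table as-is
                    }
                    continue; // Skip() and WriteNode() have already advanced the reader
                }

                CopyShallow(reader, writer);
                reader.Read();
            }
        }

        // Both files are closed here, so the original can be replaced safely.
        File.Replace(tempPath, mainPath, null);
        return replaced;
    }

    // Copies only the current node (no children) for everything outside the tables.
    static void CopyShallow(XmlReader reader, XmlWriter writer)
    {
        switch (reader.NodeType)
        {
            case XmlNodeType.XmlDeclaration:
                writer.WriteStartDocument(); // the writer emits its own declaration
                break;
            case XmlNodeType.Element:
                writer.WriteStartElement(reader.Prefix, reader.LocalName, reader.NamespaceURI);
                writer.WriteAttributes(reader, false);
                if (reader.IsEmptyElement) writer.WriteEndElement();
                break;
            case XmlNodeType.EndElement:
                writer.WriteFullEndElement();
                break;
            case XmlNodeType.Whitespace:
            case XmlNodeType.SignificantWhitespace:
                writer.WriteWhitespace(reader.Value);
                break;
            case XmlNodeType.Text:
                writer.WriteString(reader.Value);
                break;
            case XmlNodeType.CDATA:
                writer.WriteCData(reader.Value);
                break;
            case XmlNodeType.Comment:
                writer.WriteComment(reader.Value);
                break;
            case XmlNodeType.ProcessingInstruction:
                writer.WriteProcessingInstruction(reader.Name, reader.Value);
                break;
        }
    }
}
```

A call such as SinglePassTablePatcher.PatchAll("main.xml", "patches") would then rewrite main.xml in place, where both paths are placeholders. Because unchanged tables are copied with WriteNode and replaced ones are skipped with Skip, neither the main file nor any patch file is loaded into memory as a whole.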
