Parse an XML file without changing the encoding and while preserving the file format
Question
The original XML file is encoded as UTF-8 without a BOM:
<?xml version="1.0" encoding="UTF-8"?>
<some_text>
<ada/>
<file/>
<title><![CDATA[]]></title>
<code/>
<parathrhseis/>
</some_text>
I try to set the text of the title element in this function:
Dim myXmlDocument As XmlDocument = New XmlDocument()
Dim node As XmlNode
Dim s As String
s = "name.xml"
If System.IO.File.Exists(s) = False Then
    Return False
End If
myXmlDocument.Load(s)
node = myXmlDocument.DocumentElement
Try
    For Each node In node.ChildNodes
        If node.Name = "title" Then
            node.FirstChild.InnerText = "text"
            Exit For
        End If
    Next
    myXmlDocument.Save(s)
Catch e As Exception
    MsgBox("Error in XmlParsing: " + e.Message)
    Return False
End Try
Return True
The text is written correctly, but the encoding changes to UTF-8 with a BOM, and a space is added before the "/>" of every self-closing tag:
<?xml version="1.0" encoding="UTF-8"?>
<some_text>
<ada /> <- here
<file /> <- here
<title><![CDATA[text]]></title>
<code /> <- here
<parathrhseis /> <- here
</some_text>
How can I fix these problems?
SOLUTION (with the help of Bradley Uffner)
Dim fileReader As String
Try
    fileReader = My.Computer.FileSystem.ReadAllText("original.xml")
    ' Remove the space that was inserted before "/>" in each self-closing tag.
    fileReader = fileReader.Replace("<ada />", "<ada/>")
    fileReader = fileReader.Replace("<file />", "<file/>")
    fileReader = fileReader.Replace("<code />", "<code/>")
    fileReader = fileReader.Replace("<parathrhseis />", "<parathrhseis/>")
    ' File.WriteAllText stores the text as UTF-8 without a BOM by default.
    System.IO.File.WriteAllText("copy.xml", fileReader)
Catch ex As Exception
    MsgBox("Error: " + ex.Message)
    Return
End Try
Recommended answer
This actually isn't a problem with parsing the file; it's a problem with saving it.
See this post for how to save XML without a BOM: "XDocument: saving XML to file without BOM".
The relevant code is:
Using writer = New XmlTextWriter(".\file.xml", New UTF8Encoding(False))
    doc.Save(writer)
End Using
Typically you can control the formatting of the document through the writer (the Formatting and Indentation properties of XmlTextWriter, or an XmlWriterSettings object passed to XmlWriter.Create), but I don't see a property that controls the spacing of self-closing elements. You might have better luck post-processing the output: save it to a stream, manually remove any spaces before "/>", and only then write the result to the filesystem.
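The post-processing idea above could be sketched roughly as follows. This is only an illustration, not the answerer's code: the helper name SaveCompact and its filePath parameter are hypothetical, and the regular expression assumes that no CDATA section or comment in the document contains whitespace followed by "/>" that must be kept.

```vb
Imports System.IO
Imports System.Text
Imports System.Text.RegularExpressions
Imports System.Xml

Module XmlSaveSketch
    ' Hypothetical helper: serializes the document in memory, removes the
    ' whitespace written before "/>", and saves the result as UTF-8 without a BOM.
    Sub SaveCompact(doc As XmlDocument, filePath As String)
        ' False = do not emit a byte order mark.
        Dim noBom As New UTF8Encoding(False)
        Dim xml As String

        Using buffer As New MemoryStream()
            ' A MemoryStream is used instead of a StringWriter so that the
            ' writer's encoding stays UTF-8 rather than UTF-16.
            Dim settings As New XmlWriterSettings() With {.Encoding = noBom, .Indent = True}
            Using writer As XmlWriter = XmlWriter.Create(buffer, settings)
                doc.Save(writer)
            End Using
            xml = noBom.GetString(buffer.ToArray())
        End Using

        ' Turn "<ada />" into "<ada/>" for every self-closing element.
        ' Assumption: no CDATA or comment content relies on this pattern.
        xml = Regex.Replace(xml, "\s+/>", "/>")

        File.WriteAllText(filePath, xml, noBom)
    End Sub
End Module
```

In the question's function, a call such as SaveCompact(myXmlDocument, s) would take the place of myXmlDocument.Save(s). Unlike the hard-coded Replace calls, the pattern does not need to know the element names in advance. If the original line breaks and indentation also need to survive, setting PreserveWhitespace = True on the XmlDocument before calling Load may be worth trying.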