Foreign Characters in XML

Question

Hi all,

I posted a couple of weeks ago with what I thought was a problem with
the file system reading accented characters. However, after debugging
line by line, I have now found the true problem.

I am storing a list of files in an XML file as a sort of database.
Some of these filenames have accented characters (e.g. á é í ó ú
or ñ). However, upon writing the filename to the XML file, the
accented character is dropped. This causes a problem when re-reading
the filenames: the program cannot find the files because their
"saved" filenames are now different. For example, the word "más" is
saved in the XML file as "ms".

Any ideas how I can work around this? I could strip out the accents
and replace them with their "normal" equivalents, i.e. á becomes a. But
this is a sort of bodge fix, as I will lose the link to the original
file. Also, I can see a scenario where a file may get overwritten
because the modified filename is the same as that of an existing file.

So, to put it bluntly, I'm stuck! Help!
Thanks

Recommended answer

"Hugh Janus" <my*************@hotmail.com> wrote:
I am storing a list of files in an XML file as a sort of database.
Some of these filenames have accented characters (e.g. á é í ó ú
or ñ). However, upon writing the filename to the XML file, the
accented character is dropped. This causes a problem when re-reading
the filenames: the program cannot find the files because their
"saved" filenames are now different. For example, the word "más" is
saved in the XML file as "ms".



How are you currently writing data to the XML file? Which classes are you
using? It's likely that the problem is caused by the wrong encoding being
used to persist the data.

--
M S Herfried K. Wagner
M V P <URL:http://dotnet.mvps.org/>
V B <URL:http://classicvb.org/petition/>
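
To illustrate the kind of failure suspected here, the following minimal Python sketch shows how an encoding that cannot represent á can silently drop it, turning "más" into "ms". It is an analogy only: the thread is about .NET, the Python calls below are not what the poster's program runs, and the variable names are made up for the illustration.

```python
# Hypothetical filename, taken from the example in the question.
filename = "más"

# ASCII has no code for "á". With errors="ignore" the character is
# silently discarded instead of raising an error.
dropped = filename.encode("ascii", errors="ignore").decode("ascii")

# With errors="replace" the loss is at least visible as a "?".
replaced = filename.encode("ascii", errors="replace").decode("ascii")

# UTF-8 can represent every Unicode character, so the round trip is lossless.
round_trip = filename.encode("utf-8").decode("utf-8")

print(dropped)     # ms
print(replaced)    # m?s
print(round_trip)  # más
```

Whether the original program loses the character on writing or on reading cannot be told from the posts; the sketch only shows that a lossy or mismatched encoding is enough to produce exactly the "ms" symptom.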


Hi, Hugh,
I'm not sure what your code looks like, but you may need to "tokenize" (encode) these characters. They should be stored or read in as either UTF-8 or Unicode (XML processors are supposed to recognize these). This should "just work" if you are using the .NET Framework's System.Xml code to generate or read an XML document -- you shouldn't have to do anything. Are you generating your own XML instead, and parsing it on your own? If so, you will need to do the encoding yourself, and will need to make sure the file you create has the appropriate header detailing the text type -- it sounds like you're translating them to bare ASCII when you're writing them out. You can use the System.Text.UTF8Encoding class to translate between "normal" strings and UTF-8, for example.

Let us know how it goes. (I'll be away for a few days, but will check back when I get back Friday.)

--Matt Gertz--*
VB Compiler Dev Lead
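
As a rough illustration of the "let the XML library do the encoding" advice, here is a minimal Python sketch (not the .NET System.Xml API mentioned above) that writes filenames through an XML library with an explicit UTF-8 declaration and reads them back. The element names `files` and `file`, the two helper functions, and the sample filenames are assumptions made up for the example.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def save_filenames(path, filenames):
    """Write one <file> element per name, encoded as UTF-8."""
    root = ET.Element("files")
    for name in filenames:
        ET.SubElement(root, "file").text = name
    # xml_declaration=True emits the header that states the encoding,
    # so any conforming XML parser knows how to decode the bytes.
    ET.ElementTree(root).write(path, encoding="utf-8", xml_declaration=True)


def load_filenames(path):
    """Read the names back; the parser honours the declared encoding."""
    root = ET.parse(path).getroot()
    return [node.text for node in root.findall("file")]


names = ["más.txt", "año.doc"]  # hypothetical accented filenames
path = os.path.join(tempfile.mkdtemp(), "files.xml")

save_filenames(path, names)
restored = load_filenames(path)

print(restored == names)  # True
```

The point of the design is that the program never converts characters to bytes by hand: the library both writes the encoding declaration and uses it when parsing, so the two sides cannot disagree.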

-----Original Message-----
From: Hugh Janus
Posted At: Monday, January 09, 2006 11:41 AM
Posted To: microsoft.public.dotnet.languages.vb
Conversation: Foreign Characters in XML
Subject: Foreign Characters in XML
Hi all,

I posted a couple of weeks ago with what I thought was a problem with
the file system reading accented characters. However, after debugging
line by line, I have now found the true problem.

I am storing a list of files in an XML file as a sort of database.
Some of these filenames have accented characters (e.g. á é í ó ú
or ñ). However, upon writing the filename to the XML file, the
accented character is dropped. This causes a problem when re-reading
the filenames: the program cannot find the files because their
"saved" filenames are now different. For example, the word "más" is
saved in the XML file as "ms".

Any ideas how I can work around this? I could strip out the accents
and replace them with their "normal" equivalents, i.e. á becomes a. But
this is a sort of bodge fix, as I will lose the link to the original
file. Also, I can see a scenario where a file may get overwritten
because the modified filename is the same as that of an existing file.

So, to put it bluntly, I'm stuck! Help!
Thanks


Herfried K. Wagner [MVP] wrote:
How are you currently writing data to the XML file? Which classes are you
using? It's likely that the problem is caused by the wrong encoding being
used to persist the data.




I am using the StreamReader and StreamWriter classes. Can I specify the
encoding with these in order to keep the accented characters?
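
In .NET, both StreamWriter and StreamReader have constructor overloads that take a System.Text.Encoding argument, so the encoding can be stated explicitly on both sides. The underlying idea, sketched below in Python rather than VB as a language-neutral illustration, is that the writer and the reader must agree on the encoding. The file name and variable names are hypothetical.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "files.txt")

# Writer: state the encoding explicitly instead of relying on a default.
with open(path, "w", encoding="utf-8") as writer:
    writer.write("más\n")

# Reader using the same encoding: the accented character survives.
with open(path, "r", encoding="utf-8") as reader:
    same_encoding = reader.read().strip()

# Reader using a different encoding: the bytes are misinterpreted.
with open(path, "r", encoding="latin-1") as reader:
    wrong_encoding = reader.read().strip()

print(same_encoding == "más")   # True
print(wrong_encoding == "más")  # False
```

If the file is meant to be XML, the encoding chosen for the writer should also match the one named in the XML declaration at the top of the file.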

