Problem with Unicode
Hi all,
I read a string from a .txt file that was saved as UTF-8.
The .txt file contains the string "frédéric". My problem is that when I
read the file, the bytes of é come out like this: 101, 180.
é has a length of 2 in a UTF-8 file. How can I change this length of 2 to 1?
I want to use é as the single byte 233, not as the 2 bytes 101, 180.
Please help me.
Can I correct this after reading the .txt file?
Thus wrote tvin,

> Hi all
> I brought a string from a .txt file which was saved like utf-8.
> In the .txt file i have this string "frédéric". My problem is that when
> i read this file .txt, the bytes of é are like this : 101,180.
> é length are 2 in utf-8 file. how can i change this 2 length to 1.

That's how UTF-8 works.

> my problem that i want to use é like 233 byte and not like 2 bytes
> 101,180

Then use an 8-bit encoding like Windows-1252 or ISO-8859-1 or -15.

But why on earth do you want *one* byte? It's not 1976 anymore, and no
8-bit encoding on this planet has coverage similar to Unicode. BTW, all
functionality in the BCL to process text files uses UTF-8 by default.

Cheers,
--
Joerg Jooss
ne********@joergjooss.de
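To make the two options in this reply concrete, here is a minimal sketch in Python (the thread itself concerns the .NET BCL, so Python is only a stand-in). It assumes each é is the single code point U+00E9, and shows the two bytes UTF-8 uses for it next to the single byte 233 produced by an 8-bit encoding such as ISO-8859-1 or Windows-1252.

```python
# "frédéric" with a precomposed é (U+00E9); this spelling is an assumption.
text = "fr\u00e9d\u00e9ric"

# Encode the same string with UTF-8 and with two 8-bit encodings.
utf8_bytes = text.encode("utf-8")
latin1_bytes = text.encode("iso-8859-1")
cp1252_bytes = text.encode("cp1252")

# The byte values used for é alone in each kind of encoding.
e_acute_utf8 = list("\u00e9".encode("utf-8"))
e_acute_latin1 = list("\u00e9".encode("iso-8859-1"))

print(e_acute_utf8)    # [195, 169]
print(e_acute_latin1)  # [233]
print(len(utf8_bytes), len(latin1_bytes), len(cp1252_bytes))  # 10 8 8
```

Re-encoding to an 8-bit encoding only works for characters that encoding contains; in Python, any other character raises `UnicodeEncodeError` under the default error handling.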
Joerg,
I can use 2 bytes, 3 bytes... I don't have a problem with that.
But the length of frédéric is len(frédéric) = 10;
the length of frédéric should be 8 to insert correctly into the SQL database.
Please help me, Joerg.
Thus wrote tvin,

> i can use 2 bytes, 3 bytes... i don't have a problem
> but the length of frédéric is len(frédéric) = 10,
> the length of frédéric should be 8 to insert correctly in sql database.
> please help me joerg

You're confusing bytes and characters. Frédéric has 8 characters, but it
may take 8 or more bytes depending on the character encoding being used:
10 in UTF-8, and if you were using UTF-32, it would be a whopping 32 ;-)

Doesn't your database support the nvarchar type for Unicode characters?
--
Joerg Jooss
ne********@joergjooss.de
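The distinction between characters and bytes can be checked directly. The Python sketch below is only an illustration, not the poster's code, and it assumes each é is the single code point U+00E9. Under that assumption the string has 8 characters; a length of 10 appears when counting UTF-8 bytes, or when those bytes are decoded with an 8-bit encoding by mistake.

```python
# "frédéric" with a precomposed é (U+00E9); this spelling is an assumption.
text = "fr\u00e9d\u00e9ric"

# Number of characters (code points), independent of any encoding.
char_count = len(text)

# Number of bytes per encoding; the "-le" variants are used so that
# no byte order mark is added to the count.
byte_counts = {
    name: len(text.encode(name))
    for name in ("iso-8859-1", "utf-8", "utf-16-le", "utf-32-le")
}

# Decoding UTF-8 bytes with the wrong 8-bit encoding turns each é into
# two characters, which is one way to end up with a length of 10.
raw = text.encode("utf-8")
misread = raw.decode("cp1252")
correct = raw.decode("utf-8")

print(char_count)                  # 8
print(byte_counts)
print(len(misread), len(correct))  # 10 8
```

If the string read from the file reports a length of 10, it is worth checking which encoding was used to decode it; decoding as UTF-8 yields the 8-character string that should be passed to the database.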