ISO-8859-1 encoding and binary data preservation

Published: 2016/11/19 12:44:19 · Tags: java character-encoding iso-8859-1


Problem description

I read in a comment to an answer by @Esailija to a question of mine that

ISO-8859-1 is the only encoding to fully retain the original binary data, with exact byte<->codepoint matches

I also read in this answer by @AaronDigulla that:

In Java, ISO-8859-1 (a.k.a. ISO-Latin1) is a 1:1 mapping

This will fail (as illustrated here):

// \u00F6 is ö
System.out.println(Arrays.toString("\u00F6".getBytes("utf-8")));
// prints [-61, -74]
System.out.println(Arrays.toString("\u00F6".getBytes("ISO-8859-1")));
// prints [-10]

Questions

I admit I do not quite get it - why does the code above not get the bytes back?

Most importantly, where is this (the byte-preserving behavior of ISO-8859-1) specified? Links to the source, or to the JLS, would be nice. Is it the only encoding with this property?

Is it related to ISO-8859-1 being the "default default"?

See also this question for nice counterexamples from other charsets.

Recommended answer

"\u00F6" is not a byte array. It's a string containing a single char. Execute the following test instead:

public static void main(String[] args) throws Exception {
    byte[] b = new byte[] {(byte) 0x00, (byte) 0xf6};
    String s = new String(b, "ISO-8859-1"); // decoding
    byte[] b2 = s.getBytes("ISO-8859-1"); // encoding
    System.out.println("Are the bytes equal : " + Arrays.equals(b, b2)); // true
}

To check that this is true for any byte, just extend the code and loop through all the byte values:

public static void main(String[] args) throws Exception {
    byte[] b = new byte[256];
    for (int i = 0; i < b.length; i++) {
        b[i] = (byte) i;
    }
    String s = new String(b, "ISO-8859-1");
    byte[] b2 = s.getBytes("ISO-8859-1");
    System.out.println("Are the bytes equal : " + Arrays.equals(b, b2));
}
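For contrast, here is a rough sketch (not part of the original answer) that runs the same 256-byte round trip through both ISO-8859-1 and UTF-8. The class name RoundTripContrast and the helper roundTrips are made up for illustration. It relies on the documented behavior of new String(byte[], Charset), which replaces malformed input with the charset's replacement character instead of throwing, so bytes that are not valid UTF-8 on their own cannot be recovered.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class, not from the original answer.
class RoundTripContrast {

    // Decodes all 256 byte values with the given charset, re-encodes the
    // resulting string, and reports whether the bytes came back unchanged.
    static boolean roundTrips(Charset cs) {
        byte[] original = new byte[256];
        for (int i = 0; i < original.length; i++) {
            original[i] = (byte) i;
        }
        String decoded = new String(original, cs);   // bytes -> chars
        byte[] reencoded = decoded.getBytes(cs);     // chars -> bytes
        return Arrays.equals(original, reencoded);
    }

    public static void main(String[] args) {
        System.out.println("ISO-8859-1 round trip: " + roundTrips(StandardCharsets.ISO_8859_1));
        System.out.println("UTF-8 round trip: " + roundTrips(StandardCharsets.UTF_8));
    }
}
```

The ISO-8859-1 line should report true and the UTF-8 line false, because byte values 0x80 to 0xFF are not valid UTF-8 sequences when they appear in this order.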

ISO-8859-1 is a standard encoding, so the language used (Java, C# or whatever) doesn't matter.

Here's a Wikipedia reference that claims that every byte is covered:

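In 1992, the IANA registered the character map ISO_8859-1:1987, more commonly known by its preferred MIME name of ISO-8859-1 (note the extra hyphen over ISO 8859-1), a superset of ISO 8859-1, for use on the Internet. This map assigns the C0 and C1 control characters to the unassigned code values, thus providing 256 characters via every possible 8-bit value.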

(emphasis mine)
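The "every possible 8-bit value" claim can be checked directly in Java. The sketch below is an illustration added here, not part of the original answer, and the names Latin1CodePoints, bytesMatchCodePoints and encodeLatin1 are hypothetical. It decodes each byte value on its own and compares the resulting char with the unsigned value of the byte. It also encodes "\u00F6" to show why the question saw [-10]: the single byte 0xF6 prints as -10 because Java bytes are signed.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not from the original answer.
class Latin1CodePoints {

    // True if every byte value 0..255 decodes to exactly one char whose
    // numeric value equals the unsigned value of the byte.
    static boolean bytesMatchCodePoints() {
        for (int i = 0; i < 256; i++) {
            String s = new String(new byte[] {(byte) i}, StandardCharsets.ISO_8859_1);
            if (s.length() != 1 || s.charAt(0) != i) {
                return false;
            }
        }
        return true;
    }

    // Encodes a string as ISO-8859-1 bytes.
    static byte[] encodeLatin1(String s) {
        return s.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        System.out.println("byte == code point for all 256 values: " + bytesMatchCodePoints());
        byte[] encoded = encodeLatin1("\u00F6");
        // The signed byte and its unsigned hex form describe the same bits.
        System.out.println("signed: " + encoded[0] + ", unsigned hex: "
                + Integer.toHexString(encoded[0] & 0xFF));
    }
}
```

So the ISO-8859-1 output in the question was already the 1:1 mapping at work: U+00F6 became the byte 0xF6. The UTF-8 call simply used a different, multi-byte encoding for the same character.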
