Extract IPTC using ImageMagick without Entities but UTF-8

Views: 176 | Published: 2018/7/30 14:09:48 | Tags: java, utf-8, imagemagick, iptc

Problem description

I have an image containing IPTC data and use the following command to extract the IPTC as textual data:

convert image.jpg IPTCTEXT:iptc.txt

The problem is that the output seems to use entities for "special characters":

2#120#Caption="Beschreibung f&#195;&#188;r den Import aus IPTC"

It should actually read "für" here. But instead of getting the correct entity &#252; for the "ü" character, I get two entities (probably each of the two bytes of the UTF-8 encoded character was converted to an entity separately), and I cannot parse these two entities correctly.
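The guess about the two bytes can be checked directly: in UTF-8, "ü" (U+00FC) is encoded as the bytes 0xC3 0xBC, which are 195 and 188 in decimal and match the two entities in the output above. A minimal Java sketch of that check follows; the class and method names are made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, only meant to illustrate the byte values involved.
final class Utf8ByteCheck {

    // Returns the UTF-8 bytes of s as unsigned values (0..255).
    static int[] unsignedUtf8Bytes(String s) {
        byte[] raw = s.getBytes(StandardCharsets.UTF_8);
        int[] values = new int[raw.length];
        for (int i = 0; i < raw.length; i++) {
            values[i] = raw[i] & 0xFF; // Java bytes are signed, so mask to 0..255
        }
        return values;
    }

    public static void main(String[] args) {
        // "\u00fc" is the character ü
        System.out.println(Arrays.toString(unsignedUtf8Bytes("\u00fc"))); // prints [195, 188]
    }
}
```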

Is there any way to get the correct entity, or to disable the entities completely so that plain UTF-8 characters are returned?

Edit:
I tried parsing the entities using StringEscapeUtils.unescapeXml in Java, but I get two characters ("Ã¼") instead of the "ü", because the two entities are unescaped separately.
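The "Ã¼" result is what one would expect if each entity is read as a Unicode code point: 195 is U+00C3 ("Ã") and 188 is U+00BC ("¼"). One possible workaround, sketched below, is to treat every numeric entity in the range 0 to 255 as a raw byte and decode the resulting byte sequence as UTF-8. This rests on the assumption that the IPTCTEXT output escapes bytes one at a time and that the underlying IPTC text is UTF-8; it is not a documented ImageMagick behavior, and the class name IptcEntityDecoder is hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical decoder: assumes each &#N; with N <= 255 stands for one raw byte
// of UTF-8 encoded text, as the sample output in the question suggests.
final class IptcEntityDecoder {

    private static final Pattern BYTE_ENTITY = Pattern.compile("&#(\\d{1,3});");

    static String decode(String text) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        Matcher m = BYTE_ENTITY.matcher(text);
        int last = 0;
        while (m.find()) {
            int value = Integer.parseInt(m.group(1));
            if (value > 255) {
                continue; // not a single byte; leave the entity as literal text
            }
            writeLiteral(bytes, text.substring(last, m.start()));
            bytes.write(value); // the entity value becomes one raw byte
            last = m.end();
        }
        writeLiteral(bytes, text.substring(last));
        // Reassemble the byte sequence as UTF-8 so multi-byte characters are joined.
        return new String(bytes.toByteArray(), StandardCharsets.UTF_8);
    }

    private static void writeLiteral(ByteArrayOutputStream out, String s) {
        byte[] b = s.getBytes(StandardCharsets.UTF_8);
        out.write(b, 0, b.length);
    }

    public static void main(String[] args) {
        String line = "2#120#Caption=\"Beschreibung f&#195;&#188;r den Import aus IPTC\"";
        System.out.println(decode(line));
    }
}
```

If the embedded IPTC text were in a different encoding (for example ISO-8859-1), the charset passed to the final String constructor would have to change accordingly.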

Edit 2:
Sample image: http://fs1.directupload.net/images/150615/5eiv6wwf.jpg
