XMLStarlet - UTF-8 Nordic characters

Problem description

I'm using XMLStarlet (Windows) to edit an RSS feed, but I've run into issues with the Norwegian characters 'ÆØÅ'.

I'm using an example I found on this site (https://stackoverflow.com/a/14397390/3168446).

This is my feed.xml. (Notepad++ says it's encoded in UTF-8.)

<?xml version="1.0" encoding="utf-8"?>
<rss xmlns:atom="http://www.w3.org/2005/Atom" version="2.0">
  <channel>
    <title>My RSS Feed</title>
    <description>This is my RSS Feed</description>
  </channel>
</rss>
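
A file that contains only ASCII characters, like the one above, is byte-for-byte identical in UTF-8 and in the common single-byte code pages, so an editor's encoding label says little until non-ASCII text has been written to it. As a rough cross-check, the sketch below tests whether a file decodes cleanly as UTF-8. It assumes an `iconv` binary is available on the PATH (for example through Git Bash, Cygwin, or WSL on Windows); `is_utf8` is a hypothetical helper name, not part of XMLStarlet.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if the file decodes cleanly as UTF-8.
# Assumes an iconv binary is on PATH.
is_utf8() {
    iconv -f UTF-8 -t UTF-8 "$1" >/dev/null 2>&1
}

# Usage: sh check_utf8.sh feed.xml
if [ -n "$1" ]; then
    if is_utf8 "$1"; then
        echo "$1: valid UTF-8"
    else
        echo "$1: contains bytes that are not valid UTF-8"
    fi
fi
```

Running this against feed.xml after each edit would show at which step invalid bytes first enter the file.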

I'm not using the following example directly because it's a Linux shell script, but my long command line below does roughly the same thing.

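#!/bin/sh

TITLE="Test title ÆØÅ"
LINK="http://www.example.com"
# A literal RFC 822 date; a fixed value needs no command substitution.
DATE="Sat, 26 Jul 2014 01:14:30 +0200"

# Append an empty <item>, add title/link/pubDate to the first <item>,
# and delete any items beyond the tenth. -L edits feed.xml in place.
xmlstarlet ed -L   -a "//channel" -t elem -n item -v ""  \
     -s "//item[1]" -t elem -n title -v "$TITLE" \
     -s "//item[1]" -t elem -n link -v "$LINK" \
     -s "//item[1]" -t elem -n pubDate -v "$DATE" \
     -d "//item[position()>10]"  feed.xml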

Windows command line (what I'm using):

xml.exe ed -L -a "//channel" -t elem -n item -v "" -s "//item[1]" -t elem -n title -v "Test title ÆØÅ" -s "//item[1]" -t elem -n link -v "http://www.example.com" -s "//item[1]" -t elem -n pubDate -v "Sat, 26 Jul 2014 01:14:30 +0200" -d "//item[position()>10]" feed.xml

The 'ÆØÅ' characters cause problems once I add a second item containing them. Strictly speaking, the first item is already written incorrectly, but no error message appears until the second item is added:

feed.xml:8.23: Input is not proper UTF-8, indicate encoding !
Bytes: 0xC6 0xD8 0xC5 0x3C: Bytes: 0xC6 0xD8 0xC5 0x3C

    <title>Test title ãÏ┼</title>
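
The bytes reported in the error, 0xC6 0xD8 0xC5, match the single-byte ISO-8859-1 / Windows-1252 codes for 'ÆØÅ', whereas UTF-8 uses two bytes for each of these characters. That suggests the title passed on the command line was written to the file in a legacy code page instead of UTF-8, and the parser only notices when it reads the file back on the next run. The sketch below prints both byte sequences for comparison; `hex` is a hypothetical helper, and the octal escapes are used so the result does not depend on the encoding of the script file itself.

```shell
#!/bin/sh
# Hypothetical helper: print stdin as a compact lowercase hex string.
hex() { od -An -tx1 | tr -d ' \n'; }

# "ÆØÅ" in ISO-8859-1 / Windows-1252: one byte per character.
latin1_hex=$(printf '\306\330\305' | hex)

# "ÆØÅ" in UTF-8: two bytes per character.
utf8_hex=$(printf '\303\206\303\230\303\205' | hex)

echo "latin1: $latin1_hex"
echo "utf8:   $utf8_hex"
```

The first line corresponds to what ended up in feed.xml; the second is what a UTF-8 parser expects to find there.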

Does anyone have any tips? I guess it's an encoding issue, but I don't understand why, because feed.xml is UTF-8 and the encoding declared in the feed is utf-8.

Recommended answer

I can confirm this problem is solved in the XMLStarlet version 1.6.1+ win32 build!
