Using Swift, how do you re-encode then decode a String like this short script in Python?

Question

The XKCD API has some odd encoding issues.

Minor encoding issue with xkcd alt texts in chat

The solution (in Python) is to encode it as latin1 then decode as utf8, but how do I do this in Swift?

Test string:

"Be careful\u00e2\u0080\u0094it's breeding season"

Expected output:

Be careful—it's breeding season

Python (from above link):

import json
a = '''"Be careful\u00e2\u0080\u0094it's breeding season"'''
print(json.loads(a).encode('latin1').decode('utf8'))

How is this done in Swift?

let strdata = "Be careful\\u00e2\\u0080\\u0094it's breeding season".data(using: .isoLatin1)!
let str = String(data: strdata, encoding: .utf8)

That doesn't work!

Recommended answer

You have to decode the JSON data first, then extract the string, and finally "fix" the string. Here is a self-contained example with the JSON from https://xkcd.com/1814/info.0.json:

let data = """
    {"month": "3", "num": 1814, "link": "", "year": "2017", "news": "",
    "safe_title": "Color Pattern", "transcript": "",
    "alt": "\\u00e2\\u0099\\u00ab When the spacing is tight / And the difference is slight / That's a moir\\u00c3\\u00a9 \\u00e2\\u0099\\u00ab",
    "img": "https://imgs.xkcd.com/comics/color_pattern.png",
    "title": "Color Pattern", "day": "22"}
""".data(using: .utf8)!

// Alternatively:
// let url = URL(string: "https://xkcd.com/1814/info.0.json")!
// let data = try! Data(contentsOf: url)

do {
    if let dict = (try JSONSerialization.jsonObject(with: data, options: [])) as? [String: Any],
        var alt = dict["alt"] as? String {

        // Now try to fix the "alt" string
        if let isoData = alt.data(using: .isoLatin1),
            let altFixed = String(data: isoData, encoding: .utf8) {
            alt = altFixed
        }

        print(alt)
        // ♫ When the spacing is tight / And the difference is slight / That's a moiré ♫
    }
} catch {
    print(error)
}
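
The "fix" step above can also be pulled out into a small helper. This is only a sketch: the function name `repairingLatin1Mojibake` is made up for illustration and is not part of Foundation. It assumes the damage is always UTF-8 bytes that were misread as ISO Latin-1, and it returns nil for anything that does not fit that pattern so the caller can keep the original string.

```swift
import Foundation

// Hypothetical helper (not from the original answer): reinterprets a string
// whose UTF-8 bytes were mistakenly decoded as ISO Latin-1.
func repairingLatin1Mojibake(_ s: String) -> String? {
    // nil if `s` contains characters that do not exist in Latin-1.
    guard let bytes = s.data(using: .isoLatin1) else { return nil }
    // nil if those bytes are not valid UTF-8.
    return String(data: bytes, encoding: .utf8)
}

let broken = "That's a moir\u{00C3}\u{00A9}"   // the two characters left by a misread é
let fixed = repairingLatin1Mojibake(broken) ?? broken
print(fixed)
// That's a moiré
```

Falling back to the original string with `??` mirrors the `if let` in the example above: a string that was never damaged is left alone instead of being dropped.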

If you have just a string of the form

Be careful\u00e2\u0080\u0094it's breeding season

then you can still use JSONSerialization to decode the \uNNNN escape sequences, and then continue as above.

A simple example (error checking omitted for brevity):

let strbad = "Be careful\\u00e2\\u0080\\u0094it's breeding season"
let decoded = try! JSONSerialization.jsonObject(with: Data("\"\(strbad)\"".utf8), options: .allowFragments) as! String
let strgood = String(data: decoded.data(using: .isoLatin1)!, encoding: .utf8)!
print(strgood)
// Be careful—it's breeding season
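
Since the example above omits error checking, here is one possible way to write the same steps without force-unwrapping. Treat it as a sketch under assumptions: `RepairError` and `decodeEscapedMojibake` are hypothetical names introduced here, and the input is assumed to contain no unescaped double quotes, since those would break the JSON wrapping.

```swift
import Foundation

// Hypothetical error type for this sketch; not part of Foundation.
enum RepairError: Error {
    case notAJSONString
    case notLatin1
    case notUTF8
}

// Hypothetical helper: decodes \uNNNN escapes, then re-encodes as Latin-1
// and decodes as UTF-8, throwing instead of crashing on bad input.
func decodeEscapedMojibake(_ escaped: String) throws -> String {
    // Wrap the text in quotes so it parses as a JSON string fragment.
    // An unescaped double quote inside `escaped` makes the parser throw.
    let json = Data("\"\(escaped)\"".utf8)
    let object = try JSONSerialization.jsonObject(with: json, options: .allowFragments)
    guard let decoded = object as? String else { throw RepairError.notAJSONString }
    guard let latin1 = decoded.data(using: .isoLatin1) else { throw RepairError.notLatin1 }
    guard let repaired = String(data: latin1, encoding: .utf8) else { throw RepairError.notUTF8 }
    return repaired
}

do {
    print(try decodeEscapedMojibake("Be careful\\u00e2\\u0080\\u0094it's breeding season"))
} catch {
    print("Could not repair string: \(error)")
}
```

For the test string this should print the same repaired sentence as the short example above. Separate error cases make it easier to tell whether the JSON step or one of the two encoding steps failed.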
