How to get a single Unicode character from a string

Question

I wonder how I can get a Unicode character from a string. For example, if the string is "你好", how can I get the first character "你"?

From another source I found one way:

var str = "你好"
runes := []rune(str)
fmt.Println(string(runes[0]))

It does work, but I still have some questions:

  1. Is there another way to do it?

  2. Why does str[0] in Go return a byte rather than a Unicode character?

Recommended answer

First, you may want to read https://blog.golang.org/strings, which will answer part of your question.

A string in Go can contain arbitrary bytes. When you write str[i], the result is a byte, and the index i is always a byte offset, not a character position.
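
To make the byte-versus-rune distinction concrete, here is a small sketch using the string from the question. The helper names firstByte and firstRuneOf are made up for this illustration and are not part of any library; both assume a non-empty string.

```go
package main

import "fmt"

// firstByte returns what s[0] yields: the first byte of the string.
// Hypothetical helper for illustration; s must be non-empty.
func firstByte(s string) byte {
	return s[0]
}

// firstRuneOf returns the first rune by converting the whole string to
// a rune slice, as in the question. Hypothetical helper; s must be non-empty.
func firstRuneOf(s string) rune {
	return []rune(s)[0]
}

func main() {
	str := "你好"
	fmt.Println(len(str))                 // 6: each of these characters takes 3 bytes in UTF-8
	fmt.Println(len([]rune(str)))         // 2: the string holds two runes
	fmt.Printf("%#x\n", firstByte(str))   // 0xe4: only the first byte of "你"
	fmt.Printf("%c\n", firstRuneOf(str))  // 你
}
```

Indexing returns 0xe4 because "你" (U+4F60) is encoded as the three bytes E4 BD A0, and str[0] is just the first of them.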

Most of the time, though, strings are encoded in UTF-8, and Go gives you several ways to work with that encoding.

For instance, you can use the for...range statement to iterate over a string rune by rune.

var first rune
for _,c := range str {
    first = c
    break
}
// first now contains the first rune of the string

You can also use the unicode/utf8 package. For instance:

r, size := utf8.DecodeRuneInString(str)
// r contains the first rune of the string
// size is the size of the rune in bytes
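
One detail worth knowing when using this call: utf8.DecodeRuneInString returns (utf8.RuneError, 0) for an empty string and (utf8.RuneError, 1) for an invalid encoding. The sketch below wraps that check in a helper; the name decodeFirst is hypothetical and chosen only for this example.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// decodeFirst is a hypothetical helper. It returns the first rune of s
// and true, or (0, false) if s is empty or does not start with valid UTF-8.
func decodeFirst(s string) (rune, bool) {
	r, size := utf8.DecodeRuneInString(s)
	// A correctly encoded U+FFFD has size 3, so size <= 1 together with
	// RuneError means "empty" (0) or "invalid encoding" (1).
	if r == utf8.RuneError && size <= 1 {
		return 0, false
	}
	return r, true
}

func main() {
	for _, s := range []string{"你好", "", "\xff"} {
		r, ok := decodeFirst(s)
		fmt.Printf("%q -> %q %v\n", s, r, ok)
	}
}
```

Unlike the []rune conversion, this decodes only the first character and does not allocate a slice for the whole string.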

If the string is encoded in UTF-8, there is no direct way to access the nth rune, because the size of a rune (in bytes) is not constant: finding it means decoding the string from the start. If you need this feature, you can easily write your own helper function (with for...range, or with the unicode/utf8 package).
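
As a minimal sketch of such a helper, the version below uses for...range. The name nthRune, the 0-based index, and the boolean "found" result are choices made for this example, not an established API.

```go
package main

import "fmt"

// nthRune is a hypothetical helper. It returns the rune at 0-based
// position n in s and true, or (0, false) if s has fewer than n+1 runes
// or n is negative. It walks the string from the start, so it runs in
// time proportional to n.
func nthRune(s string, n int) (rune, bool) {
	if n < 0 {
		return 0, false
	}
	i := 0
	for _, r := range s {
		if i == n {
			return r, true
		}
		i++
	}
	return 0, false
}

func main() {
	if r, ok := nthRune("你好", 1); ok {
		fmt.Println(string(r)) // 好
	}
	_, ok := nthRune("你好", 2)
	fmt.Println(ok) // false: the string has only two runes
}
```

Note that for...range yields utf8.RuneError (U+FFFD) for each invalid byte, so this sketch counts an invalid byte as one rune. If you need many lookups on the same string, converting once with []rune(s) and indexing the slice is usually simpler.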
