How to get the number of characters in a string?

Views: 195; Published: 2016/11/18 15:34:13; Tags: string, go, character, string-length


Question

How can I get the number of characters of a string in Go?

For example, if I have a string "hello" the method should return 5. I saw that len(str) returns the number of bytes and not the number of characters so len("£") returns 2 instead of 1 because £ is encoded with two bytes in UTF-8.
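To make the gap between bytes and characters concrete, here is a minimal sketch that prints the UTF-8 bytes behind "£". It uses only the standard library; the helper name byteLen is made up for illustration and is not part of the question.

```go
package main

import "fmt"

// byteLen is a hypothetical helper: it only wraps len, which counts bytes.
func byteLen(s string) int {
	return len(s)
}

func main() {
	fmt.Println(byteLen("hello")) // five ASCII characters, one byte each
	fmt.Println(byteLen("£"))     // one character, but two bytes in UTF-8
	fmt.Printf("% x\n", "£")      // hex dump of the encoded bytes: c2 a3
}
```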
Solution
You can try RuneCountInString from the utf8 package.

  returns the number of runes in p
As this script illustrates, the length of "World" might be 6 when written in Chinese ("世界"), but its rune count is 2:
package main

import "fmt"
import "unicode/utf8"

func main() {
    fmt.Println("Hello, 世界", len("世界"), utf8.RuneCountInString("世界"))
}
Phrozen adds in the comments:

Actually you can do len() over runes by just type casting.

len([]rune("世界")) will print 2. At least in Go 1.3.



Stefan Steiger points to the blog post "Text normalization in Go":

What is a character?

  As was mentioned in the strings blog post, characters can span multiple runes.

  For example, an 'e' and '◌́' (acute "\u0301") can combine to form 'é' ("e\u0301" in NFD). Together these two runes are one character.
  
  The definition of a character may vary depending on the application.

  For normalization we will define it as:

  a sequence of runes that starts with a starter, a rune that does not modify or combine backwards with any other rune, followed by a possibly empty sequence of non-starters, that is, runes that do (typically accents).

  The normalization algorithm processes one character at a time.
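To see why rune counting alone can disagree with what a reader perceives as one character, this small sketch compares the two encodings of 'é' mentioned in the quote. It relies only on the standard library; the helper name runeCount is an assumption made for the example.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// runeCount is a hypothetical shorthand for utf8.RuneCountInString.
func runeCount(s string) int {
	return utf8.RuneCountInString(s)
}

func main() {
	precomposed := "\u00e9" // é as a single code point
	decomposed := "e\u0301" // e followed by the combining acute accent

	// Both render as é, yet their rune counts differ.
	fmt.Println(runeCount(precomposed), runeCount(decomposed))
	// The two strings are also not equal byte for byte.
	fmt.Println(precomposed == decomposed)
}
```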
Using the norm package (golang.org/x/text/unicode/norm) and its Iter type, the actual number of "characters" would be:
package main

import "fmt"
import "golang.org/x/text/unicode/norm"

func main() {
    var ia norm.Iter
    ia.InitString(norm.NFKD, "école")
    nc := 0
    for !ia.Done() {
        nc = nc + 1
        ia.Next()
    }
    fmt.Printf("Number of chars: %d\n", nc)
}
This example uses the Unicode normalization form NFKD ("Compatibility Decomposition").
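If pulling in golang.org/x/text is not an option, a rough standard-library approximation is to count every rune except nonspacing combining marks (Unicode category Mn). This is a swapped-in heuristic, not the normalization algorithm used by norm.Iter: it ignores other kinds of non-starters and does not handle cases such as emoji sequences. The function name countSkippingMarks is hypothetical.

```go
package main

import (
	"fmt"
	"unicode"
)

// countSkippingMarks counts runes that are not nonspacing marks (category Mn),
// so a base letter followed by combining accents is counted once.
// Assumption: this only approximates the "starter plus non-starters" rule.
func countSkippingMarks(s string) int {
	n := 0
	for _, r := range s {
		if unicode.Is(unicode.Mn, r) {
			continue // combining mark: belongs to the previous character
		}
		n++
	}
	return n
}

func main() {
	fmt.Println(countSkippingMarks("école"))       // precomposed é
	fmt.Println(countSkippingMarks("e\u0301cole")) // e + combining acute
}
```

Both calls should report the same count, since the combining accent in the second string is skipped.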
