Quick implementation of character n-grams for a word

Views: 37 | Published: 2021/6/26 18:33:40 | Tags: python-2.7, n-gram


Question

I wrote the following code to compute character bigrams; the output is shown right below it. My question is: how do I get an output that excludes the trailing single character (i.e. 't')? And is there a quicker, more efficient way to compute character n-grams?

>>> b = 'student'
>>> y = []
>>> for x in range(len(b)):
...     n = b[x:x+2]
...     y.append(n)
...
>>> y
['st', 'tu', 'ud', 'de', 'en', 'nt', 't']

Here is the result I would like to get: ['st', 'tu', 'ud', 'de', 'en', 'nt']

Thanks in advance for your suggestions.

Recommended answer

To generate bigrams:

In [8]: b='student'

In [9]: [b[i:i+2] for i in range(len(b)-1)]
Out[9]: ['st', 'tu', 'ud', 'de', 'en', 'nt']
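
It may help to see why the loop in the question produced the stray 't'. Python clamps a slice that runs past the end of a string instead of raising an error, so the last start index yields a one-character slice. The sketch below shows that, then compares the comprehension above with a `zip`-based variant. The variable names `last`, `bigrams` and `zipped` are only illustrative, and the `zip` form is an alternative offered here, not part of the original answer.

```python
b = 'student'

# Slicing past the end is clamped, not an error: index 6 is the last
# character, so b[6:8] contains just that one character.
last = b[6:8]

# Stopping the start index at len(b) - 2 (range(len(b) - 1)) means every
# slice is exactly two characters long.
bigrams = [b[i:i + 2] for i in range(len(b) - 1)]

# Alternative: pair each character with its successor. zip stops at the
# shorter sequence, so no short trailing item is produced.
zipped = [x + y for x, y in zip(b, b[1:])]

print(last)     # t
print(bigrams)  # ['st', 'tu', 'ud', 'de', 'en', 'nt']
print(zipped)   # ['st', 'tu', 'ud', 'de', 'en', 'nt']
```

Both forms run under Python 2.7 and Python 3. Whether one is measurably faster depends on the input, so it is worth timing them on your own data before choosing.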

To generalize to a different n:

In [10]: n=4

In [11]: [b[i:i+n] for i in range(len(b)-n+1)]
Out[11]: ['stud', 'tude', 'uden', 'dent']
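
The general form can be wrapped in a small helper so it is reusable. This is a minimal sketch: the function name `char_ngrams` and the decision to raise `ValueError` for a non-positive `n` are assumptions made for illustration, not something the original answer specifies.

```python
def char_ngrams(word, n):
    """Return all contiguous character n-grams of `word` as a list.

    `char_ngrams` is a hypothetical helper name. Rejecting n < 1 is an
    assumption; the original answer does not discuss invalid sizes.
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    # There are len(word) - n + 1 start positions. When n is larger than
    # the word, range() receives a non-positive number and yields nothing,
    # so the result is an empty list.
    return [word[i:i + n] for i in range(len(word) - n + 1)]


print(char_ngrams('student', 2))  # ['st', 'tu', 'ud', 'de', 'en', 'nt']
print(char_ngrams('student', 4))  # ['stud', 'tude', 'uden', 'dent']
print(char_ngrams('student', 8))  # []
```

Because each n-gram is a single slice, the work grows linearly with the length of the word for a fixed n, which is usually fast enough for word-level input.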
