Getting distance between two words in R
Question
Say I have a line in a file:
string <- "thanks so much for your help all along. i'll let you know when...."
I want to return a value indicating whether the word "know" is within 6 words of "help".
Recommended answer
This is essentially a very crude implementation of Crayon's answer as a basic function:
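withinRange <- function(string, term1, term2, threshold = 6) {
  # Split the line into tokens on single spaces
  x <- strsplit(string, " ")[[1]]
  # grep() returns the positions of tokens matching each term (as a regex);
  # compare the distance between those positions with the threshold
  abs(grep(term1, x) - grep(term2, x)) <= threshold
}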
withinRange(string, "help", "know")
# [1] TRUE
withinRange(string, "thanks", "know")
# [1] FALSE
I would suggest getting a basic idea of the text tools available to you, and using them to write such a function. Note Tyler's comment: as implemented, this can match multiple terms ("you" would match both "you" and "your"), leading to funny results. You'll need to decide how you want to deal with these cases to have a more useful function.
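One possible way to handle those cases is sketched below. It is not part of the original answer: the name withinRangeExact is hypothetical, and the choices to lower-case the text, strip punctuation, and return TRUE if any pair of occurrences is close enough are assumptions you may want to change. For comparison, withinRange(string, "you", "know") returns a length-two vector (FALSE TRUE), because grep() matches both "your" and "you".

```r
string <- "thanks so much for your help all along. i'll let you know when...."

# Hypothetical variant of withinRange(): matches whole words only and
# handles repeated occurrences by checking every pair of positions.
withinRangeExact <- function(string, term1, term2, threshold = 6) {
  # Lower-case, then split on runs of anything that is not a letter,
  # digit, or apostrophe, so "along." becomes "along" (assumed tokenization)
  words <- strsplit(tolower(string), "[^a-z0-9']+")[[1]]
  words <- words[nzchar(words)]  # drop an empty leading token, if any

  # Exact comparison instead of grep(), so "you" no longer matches "your"
  pos1 <- which(words == tolower(term1))
  pos2 <- which(words == tolower(term2))
  if (length(pos1) == 0 || length(pos2) == 0) return(FALSE)

  # outer() builds all pairwise position differences; any() collapses
  # them into a single TRUE/FALSE
  any(abs(outer(pos1, pos2, "-")) <= threshold)
}

withinRangeExact(string, "help", "know")
# [1] TRUE
withinRangeExact(string, "you", "know")
# [1] TRUE
```

Returning a single logical value keeps the function safe to use inside if(); if you need the actual distances instead, the outer() matrix can be returned directly.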