R count times word appears in element of list

Question

I have a list comprised of words.

> head(splitWords2)
[[1]]
 [1] "Some"        "additional"  "information" "that"        "we"          "would"       "need"        "to"          "replicate"   "the"        
[11] "experiment"  "is"          "how"         "much"        "vinegar"     "should"      "be"          "placed"      "in"          "each"       
[21] "identical"   "container"   "or"          "what"        "tool"        "use"         "measure"     "mass"        "of"          "four"       
[31] "different"   "samples"     "and"         "distilled"   "water"       "rinse"       "after"       "taking"      "them"        "out"        

[[2]]
 [1] "After"       "reading"     "the"         "expirement"  "I"           "realized"    "that"        "additional"  "information" "you"        
[11] "need"        "to"          "replicate"   "expireiment" "is"          "one"         "amant"       "of"          "vinegar"     "poured"     
[21] "in"          "each"        "container"   "two"         "label"       "containers"  "before"      "start"       "yar"         "and"        
[31] "three"       "write"       "a"           "conclusion"  "make"        "sure"        "results"     "are"         "accurate" 

I have a vector of words that I want to count the occurrences of in EACH element of the list, NOT the total number of occurrences in the entire list.

I think the way to do it is a combination of the str_count() function from the stringr package and one of the *ply() functions, but I can't make it work.

numWorder1 <- sapply(ifelse(str_count(unlist(splitWords2), ignore.case("we" ) )> 0, 1, 0))

where "we" will eventually be a word from a vector of words whose occurrences I want to count.

My ideal output would be something like:

lineNum       count
   1           0
   2           1
   3           1
   4           0
  ...         ...

Any suggestions?

Recommended answer

For one specific word:

library(stringr)

words <- list(a = c("a","b","c","a","a","b"), b = c("w","w","q","a"))
words
# $a
# [1] "a" "b" "c" "a" "a" "b"
#
# $b
# [1] "w" "w" "q" "a"

# One row per list element; count sums the matches of "a" within that element
wt <- data.frame(lineNum = seq_along(words))
wt$count <- sapply(words, function(x) sum(str_count(x, "a")))
wt
#   lineNum count
# 1       1     3
# 2       2     1
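
One caveat worth noting: str_count() treats its pattern as a regular expression, so a pattern such as "we" is also matched inside longer words like "between". If whole-word, case-insensitive counts are what you need (as the ignore.case() in the question suggests), a base R comparison may be closer to the intent. The following is only a sketch under that assumption; splitWords is made-up sample data and count_word is a hypothetical helper name, not something from the question or from stringr.

```r
# Hypothetical sample data: two "lines" of already-split words
splitWords <- list(
  c("Some", "information", "that", "We", "would", "need", "between"),
  c("After", "reading", "the", "experiment", "I", "realized")
)

# Count exact, case-insensitive matches of one word in each list element.
# count_word is an illustrative helper, not part of any package.
count_word <- function(word_list, target) {
  vapply(word_list,
         function(x) sum(tolower(x) == tolower(target)),
         integer(1))
}

wt <- data.frame(lineNum = seq_along(splitWords),
                 count   = count_word(splitWords, "we"))
wt
#   lineNum count
# 1       1     1
# 2       2     0
```

Here "between" is not counted because the comparison is against the whole element, and "We" is counted because both sides are lower-cased first.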

If vector w contains words that you want to count:

w <- c("a","q","e")

# Build one data frame per word in w, each with one row per list element
allwords <- lapply(w, function(z) data.frame(lineNum = seq_along(words),
            count = sapply(words, function(x) sum(str_count(x, z)))))
names(allwords) <- w
allwords
# $a
#   lineNum count
# a       1     3
# b       2     1
#
# $q
#   lineNum count
# a       1     0
# b       2     1
#
# $e
#   lineNum count
# a       1     0
# b       2     0
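
If a single table is easier to work with than a list of data frames, the same counts can be arranged as a matrix with one row per list element and one column per word. This is just an alternative sketch, not part of the answer above: it swaps str_count() for an exact comparison with ==, which gives the same numbers for this toy data but only counts whole-element matches.

```r
words <- list(a = c("a","b","c","a","a","b"), b = c("w","w","q","a"))
w <- c("a", "q", "e")

# Outer sapply loops over the words to count (columns);
# inner vapply loops over the list elements (rows).
counts <- sapply(w, function(z) {
  vapply(words, function(x) sum(x == z), integer(1))
})
counts
#   a q e
# a 3 0 0
# b 1 1 0
```

The row names come from the names of the list and the column names from w, so counts["b", "q"] looks up how often "q" appears in element b.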
