Using loops to find matching words between lists


Question

I have two datasets. The first is a list of 72 items, where each item is itself a list of 10 sentences. That gives 720 sentences in total, grouped into lists of 10.

The second dataset is a list of all the words in the first dataset that end in "ing".

For each list item, I want to check whether any of its ten sentences contains an "ing" word.

If so, which ing words are present in that list, and is this the first time the word appears in the dataset overall (i.e., the first time it shows up across all 720 sentences)? I then plan to compile all of that information into a table.

This is what I have so far. Before moving on to the more complicated parts, I just wanted to see whether it would print which lists each ing word was found in.

n <- 1

harvardList[1]
for(word in IngWords){
  if(IngWords==harvardList[n])
  print(harvardList[n])
  n <- n+1
}

When I run that script, I get these errors and output:

Error: unexpected 'in' in:
"for(word in IngWords){
  if(word in"
 print(harvardList[n])
$`List 1`
$`List 1`[[1]]
[1] "The birch canoe slid on the smooth planks."

etc., 

>   n <- n+1
> }
Error: unexpected '}' in "}"

This is a mini version of the sentence list:

$`List 1`[[1]]
[1] "The source of the huge river is the clear spring."

$`List 1`[[2]]
[1] "Help the woman get back to her feet."

$`List 1`[[3]]
[1] "A pot of tea helps to pass the evening."

$`List 2`[[1]]
[1] "The colt reared and threw the tall rider."

$`List 2`[[2]]
[1] "It snowed, rained, and hailed the same morning."

$`List 2`[[3]]
[1] "Read verse out loud for pleasure."

$`List 3`[[1]]
[1] "Take the winding path to reach the lake."

$`List 3`[[2]]
[1] "The meal was cooked before the bell rang."

$`List 3`[[3]]
[1] "What joy there is in living."

And here are the ing words:

spring, evening, morning, winding, living

Expected output:

[List Number] [ing-word]
1             spring, evening
2             morning
3             winding, living

Recommended answer

We can loop over each element of the list with lapply, split each sentence into words on spaces, remove the punctuation, and keep the words that are present in IngWords.

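stack(lapply(harvardList, function(x) {
   # Split every sentence on spaces and strip punctuation from each word
   all_words <- gsub("[[:punct:]]", "", unlist(strsplit(unlist(x), " ")))
   # Keep only the ing words and collapse them into one comma-separated string
   toString(all_words[all_words %in% IngWords])
}))[2:1]  # reorder the columns so the list name (ind) comes first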


#    ind          values
#1 List1 spring, evening
#2 List2         morning
#3 List3 winding, living
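
Since the question asks for a loop, here is a rough equivalent written with explicit for loops. It is only a sketch under the same assumptions as the lapply version: words are separated by single spaces, matching is case-sensitive, and the sample data from this answer is repeated so the block runs on its own. The name `matches` is made up for illustration. Note that R tests membership with `%in%`; the `in` keyword is only valid inside the parentheses of `for()`, which is likely what the "unexpected 'in'" error in the question refers to.

```r
# Sample data repeated from this answer so the block is self-contained.
harvardList <- list(
  List1 = list("The source of the huge river is the clear spring.",
               "Help the woman get back to her feet.",
               "A pot of tea helps to pass the evening."),
  List2 = list("The colt reared and threw the tall rider.",
               "It snowed, rained, and hailed the same morning.",
               "Read verse out loud for pleasure."),
  List3 = list("Take the winding path to reach the lake.",
               "The meal was cooked before the bell rang.",
               "What joy there is in living.")
)
IngWords <- c("living", "winding", "morning", "evening", "spring")

# `matches` (hypothetical name) holds one comma-separated string per list.
matches <- character(length(harvardList))
names(matches) <- names(harvardList)

for (list_name in names(harvardList)) {
  found <- character(0)
  for (sentence in harvardList[[list_name]]) {
    # Split the sentence on spaces, then strip punctuation from each word.
    words <- gsub("[[:punct:]]", "", strsplit(sentence, " ")[[1]])
    # Keep only the words that also appear in IngWords.
    found <- c(found, words[words %in% IngWords])
  }
  matches[list_name] <- toString(found)
}

print(matches)
```

A list with no ing words ends up with an empty string, because `toString(character(0))` returns `""`.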

Data

harvardList <- list(List1= list("The source of the huge river is the clear spring.",
              "Help the woman get back to her feet.", 
              "A pot of tea helps to pass the evening."), 
 List2 = list("The colt reared and threw the tall rider.", 
              "It snowed, rained, and hailed the same morning.", 
              "Read verse out loud for pleasure."), 
 List3 = list( "Take the winding path to reach the lake.", 
               "The meal was cooked before the bell rang.", 
               "What joy there is in living."))
IngWords <- c("living", "winding", "morning", "evening", "spring")
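
The question also asks whether each match is the first time the word appears across all sentences, which the code above does not cover. One possible way to sketch it is to collect one row per matched word in reading order and flag first appearances with `duplicated()`. The function name `flag_first_occurrences`, its arguments, and the column names are hypothetical, chosen only for this illustration; the sample data is repeated so the block runs on its own.

```r
# Sample data repeated from this answer so the block is self-contained.
harvardList <- list(
  List1 = list("The source of the huge river is the clear spring.",
               "Help the woman get back to her feet.",
               "A pot of tea helps to pass the evening."),
  List2 = list("The colt reared and threw the tall rider.",
               "It snowed, rained, and hailed the same morning.",
               "Read verse out loud for pleasure."),
  List3 = list("Take the winding path to reach the lake.",
               "The meal was cooked before the bell rang.",
               "What joy there is in living.")
)
IngWords <- c("living", "winding", "morning", "evening", "spring")

# Hypothetical helper: one row per matched word, in reading order.
flag_first_occurrences <- function(sentence_lists, target_words) {
  rows <- list()
  for (list_name in names(sentence_lists)) {
    sentences <- unlist(sentence_lists[[list_name]])
    for (i in seq_along(sentences)) {
      # Split on spaces and strip punctuation, as in the answer above.
      words <- gsub("[[:punct:]]", "", strsplit(sentences[i], " ")[[1]])
      hits <- words[words %in% target_words]
      if (length(hits) > 0) {
        rows[[length(rows) + 1]] <- data.frame(
          list_name = list_name, sentence = i, word = hits,
          stringsAsFactors = FALSE
        )
      }
    }
  }
  # No matches at all: return an empty table with the same columns.
  if (length(rows) == 0) {
    return(data.frame(list_name = character(0), sentence = integer(0),
                      word = character(0), first_occurrence = logical(0),
                      stringsAsFactors = FALSE))
  }
  out <- do.call(rbind, rows)
  # duplicated() is FALSE the first time a value is seen, so negating it
  # marks the first occurrence of each word across the whole dataset.
  out$first_occurrence <- !duplicated(out$word)
  out
}

ing_table <- flag_first_occurrences(harvardList, IngWords)
print(ing_table)
```

In the mini dataset every ing word appears only once, so each row is flagged as a first occurrence; with the full 720 sentences, repeated words would get FALSE on their later rows.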
