R Tidytext and unnest_tokens error

Question

I'm very new to R and have started using the tidytext package.

I'm trying to pass variables as arguments to the unnest_tokens function so I can run the same analysis on multiple columns. So instead of this:

library(janeaustenr)
library(tidytext)
library(dplyr)
library(stringr)

original_books <- austen_books() %>%
  group_by(book) %>%
  mutate(linenumber = row_number(),
         chapter = cumsum(str_detect(text, regex("^chapter [\\divxlc]",
                                                 ignore_case = TRUE)))) %>%
  ungroup()

original_books

tidy_books <- original_books %>%
              unnest_tokens(word, text)

the last line of code would be:

output <- "word"
input <- "text"

tidy_books <- original_books %>%
              unnest_tokens(output, input)

But I get this:

Error in check_input(x) : Input must be a character vector of any length or a list of character vectors, each of which has a length of 1.

I've tried using as.character() without much luck.

Any ideas on how this would work?

Recommended answer

Try

tidy_books <- original_books %>% 
              unnest_tokens_(output, input)

Note the underscore in unnest_tokens_.

unnest_tokens_ is the "standard evaluation" version of unnest_tokens, and allows you to pass in column names as strings. The plain unnest_tokens uses non-standard evaluation, so it treats output and input as literal column names rather than looking up the strings stored in those variables. See Non-standard evaluation for a discussion of standard vs non-standard evaluation.
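If unnest_tokens_ is not available in your installed version of tidytext (as far as I know, more recent releases moved away from the underscore variants in favor of tidy evaluation), the following sketch shows one alternative. It assumes a tidytext version whose unnest_tokens supports unquoting with !!, and it uses a small made-up tibble named docs in place of original_books so it runs on its own; sym() comes from the rlang package.

```r
library(dplyr)
library(tidytext)
library(rlang)

# Toy data standing in for original_books (hypothetical example data)
docs <- tibble(
  id   = 1:2,
  text = c("The quick brown fox", "jumps over the lazy dog")
)

# Column names held as strings, as in the question
output <- "word"
input  <- "text"

# sym() converts each string to a symbol and !! unquotes it, so
# unnest_tokens() receives the column names `word` and `text`
# (assumes a tidytext version with tidy evaluation support)
tidy_docs <- docs %>%
  unnest_tokens(!!sym(output), !!sym(input))

tidy_docs
```

Because the column names are ordinary strings here, the same call can be wrapped in a function or a loop to tokenize several columns in turn.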