r: Need content_transformer() called by tm_map() to change non-letters to spaces

Views: 44  Published: 2021/9/6 19:43:50  Tags: r text-mining

Problem description

In the following code, any text matching the pattern "/|@| \\|" is replaced with a space.

> library(tm)
> toSpace <- content_transformer(function(x, pattern) gsub(pattern, " ", x))
> docs <- tm_map(docs, toSpace, "/|@| \\|")
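To see what this pattern actually matches, the same gsub() call can be tried on a plain character vector, outside of tm. This is only an illustrative sketch, and the sample string is made up for the purpose. Note that the third alternative in the pattern is a space followed by a literal pipe, so a pipe with no space in front of it is left alone.

```r
# Hypothetical sample text (not from the original question)
x <- "user@example.com a/b c |d e|f"

# Same pattern and replacement as in toSpace above:
# "/" or "@" or (space followed by a literal "|") becomes a single space
out <- gsub("/|@| \\|", " ", x)
print(out)
```

Because " |" (two characters) is replaced by a single space, the result can be shorter than the input.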

What code would transform all non-letters into spaces? (That is, what goes where the xxxxx's are below?)

Listing every non-letter in a single pattern string is impractical: the list is very long, some characters are non-printable, and many need escaping. So I want to do the opposite of the above: specify the letters and replace everything else.

> toSpace_2 <- content_transformer(function xxxxxxxxxxxxxxxxxxxxxxx))
> docs <- tm_map(docs, toSpace_2, 
"a|b|c|d|e|f|g|h|i|j|k|l|m|n|o|p|q|r|s|t|u|v|w|x|y|z")

This needs to be done with a content_transformer() function so that the integrity of docs is maintained. This has to be really simple...

Thanks
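One possible approach, offered as a sketch and not as the answer recorded on the original page, is a negated character class: "[^[:alpha:]]" matches any single character that is not a letter, so the letters never need to be listed one by one. The helper name non_letters_to_space and the two-document corpus are hypothetical and used only for illustration. The tm part runs only if the tm package is installed.

```r
# Core replacement in base R: every character that is NOT a letter
# is replaced by a space. The helper name is hypothetical.
non_letters_to_space <- function(x) gsub("[^[:alpha:]]", " ", x)

res <- non_letters_to_space("Hello, World! 123")
print(res)

# Wrapped for tm, so that tm_map() keeps the documents as documents.
if (requireNamespace("tm", quietly = TRUE)) {
  library(tm)
  toSpace_2 <- content_transformer(non_letters_to_space)

  # Hypothetical two-document corpus, for illustration only
  docs <- VCorpus(VectorSource(c("Hello, World! 123", "e-mail@test")))
  docs <- tm_map(docs, toSpace_2)
  print(as.character(docs[[1]]))
}
```

What [[:alpha:]] treats as a letter depends on the locale, so accented letters may be kept. If only unaccented English letters should survive, "[^A-Za-z]" is a stricter alternative. Each non-letter becomes its own space, so a follow-up tm_map(docs, stripWhitespace) may be useful to collapse runs of spaces.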
