Splitting Strings and Generating Frequency Tables in R

Views: 75 Published: 2020/11/10 23:04:15 Tags: r string split frequency

Problem description

I have a column of firm names in an R data frame that looks something like this:

"ABC Industries"  
"ABC Enterprises"  
"123 and 456 Corporation"  
"XYZ Company"

And so on. I'm trying to generate a frequency table of every word that appears in this column, for example, something like this:

Industries   10  
Corporation  31  
Enterprise   40  
ABC          30  
XYZ          40

I'm relatively new to R, so I was wondering about a good way to approach this. Should I be splitting the strings and placing every distinct word into a new column? Is there a way to split a multi-word row into multiple rows with one word each?

Recommended answer

If you wanted to, you could do it in a one-liner:

R> text <- c("ABC Industries", "ABC Enterprises", 
+            "123 and 456 Corporation", "XYZ Company")
R> table(do.call(c, lapply(text, function(x) unlist(strsplit(x, " ")))))

        123         456         ABC         and     Company 
          1           1           2           1           1 
Corporation Enterprises  Industries         XYZ 
          1           1           1           1 
R>

Here I use strsplit() to break each entry into its components; it returns a list, which unlist() flattens to a character vector, so lapply() yields a list of vectors. I then use do.call() to simply concatenate all of these into one vector, which table() summarises.
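Since strsplit() is vectorised over its input, the same idea can probably be written without lapply() and do.call(). Below is a minimal sketch, assuming the names sit in a data frame column; the data frame `firms` and the column name `firm` are hypothetical and not from the question. It also sorts the counts and converts them to a data frame with one row per word.

```r
# Hypothetical data frame; `firms` and `firm` are assumed names for illustration.
firms <- data.frame(
  firm = c("ABC Industries", "ABC Enterprises",
           "123 and 456 Corporation", "XYZ Company"),
  stringsAsFactors = FALSE
)

# strsplit() accepts the whole column and returns a list with one
# character vector per row; unlist() flattens it into a single vector.
words <- unlist(strsplit(firms$firm, " ", fixed = TRUE))

# Count each distinct word, most frequent first.
freq <- sort(table(words), decreasing = TRUE)

# One row per word, with its count in a second column.
freq_df <- as.data.frame(freq, stringsAsFactors = FALSE)
print(freq_df)
```

Splitting on a single literal space is an assumption about the data; names with repeated spaces or punctuation would likely need a regular expression in place of `" "`.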
