How to calculate percentage of cells in data frame that start with sequence in R?


Problem description

I have data that looks like:

Row 1     Row 2     Row 3     Row 4     Row 5     Row 6     Row 7
abc89     abc62     67        abc513    abc512    abc81     abc10
abc6      pop       abc11     abc4      giant     13        abc15
abc90     abc16     abc123    abc33     abc22     abc08     9
111       abc15     abc72     abc36     abc57     abc9      abc55

I would like to calculate the percentage of cells in the data frame that begin with "abc". For example, there are 28 cells in total here, which can be obtained with prod(dim(df)). So I need the number of cells that start with "abc", divided by prod(dim(df)). Here the answer would be about 0.786 (22 of 28). How can this be done in R?

Solution

I would use:

> mean(grepl("^abc",unlist(dat)))
[1] 0.7857143

Using mean means you don't have to compute the numerator and denominator separately. grepl is the logical version of grep: it returns TRUE for every element that matches "^abc" (i.e., a string beginning with abc) and FALSE otherwise. Recall that the average of a Bernoulli (0/1) vector is the proportion of successes, so the mean of this logical vector is the share of cells that start with abc.
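To make the numerator/denominator point concrete, here is a minimal sketch that rebuilds the example data and computes the share both ways. The object name dat follows the answer; the column names V1 to V7 and the helper names (hits, n_hits, n_cells, manual, short) are placeholders chosen for illustration, not part of the original post.

```r
# Rebuild the example data from the question.
# Column names V1..V7 are placeholders (assumption).
dat <- data.frame(
  V1 = c("abc89", "abc6", "abc90", "111"),
  V2 = c("abc62", "pop", "abc16", "abc15"),
  V3 = c("67", "abc11", "abc123", "abc72"),
  V4 = c("abc513", "abc4", "abc33", "abc36"),
  V5 = c("abc512", "giant", "abc22", "abc57"),
  V6 = c("abc81", "13", "abc08", "abc9"),
  V7 = c("abc10", "abc15", "9", "abc55"),
  stringsAsFactors = FALSE
)

# One TRUE/FALSE per cell: does the cell start with "abc"?
hits <- grepl("^abc", unlist(dat))

# Long way: explicit numerator and denominator.
n_hits  <- sum(hits)       # cells starting with "abc"
n_cells <- prod(dim(dat))  # total number of cells
manual  <- n_hits / n_cells

# Short way: the mean of a logical vector is the proportion of TRUEs.
short <- mean(hits)

print(c(manual = manual, short = short))
```

Both values equal 22/28, which matches the 0.7857143 shown in the answer.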

If you wanted to do this by row or by column, you'd use apply, e.g. apply(dat, 1, function(x) mean(grepl("^abc", x))) to get the row-wise means. Passing 2 instead of 1 as the second argument gives the column-wise means.
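A short sketch of the row-wise and column-wise variants on the same example data. The names starts_abc, row_share and col_share are hypothetical helpers introduced here for illustration, and the data frame is rebuilt from a matrix so the block runs on its own.

```r
# Rebuild the example data as a 4 x 7 data frame, filled row by row.
dat <- as.data.frame(
  matrix(
    c("abc89", "abc62", "67",     "abc513", "abc512", "abc81", "abc10",
      "abc6",  "pop",   "abc11",  "abc4",   "giant",  "13",    "abc15",
      "abc90", "abc16", "abc123", "abc33",  "abc22",  "abc08", "9",
      "111",   "abc15", "abc72",  "abc36",  "abc57",  "abc9",  "abc55"),
    nrow = 4, byrow = TRUE
  ),
  stringsAsFactors = FALSE
)

# Hypothetical helper: share of elements in x that start with "abc".
starts_abc <- function(x) mean(grepl("^abc", x))

row_share <- apply(dat, 1, starts_abc)  # MARGIN = 1: one value per row
col_share <- apply(dat, 2, starts_abc)  # MARGIN = 2: one value per column

print(row_share)
print(col_share)
```

Note that apply converts the data frame to a character matrix first, which is harmless here because every cell is compared as text anyway.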
