How to calculate the percentage of cells in a data frame that start with a sequence in R?
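Problem description

I have data that looks like:

Row 1   Row 2   Row 3    Row 4    Row 5    Row 6   Row 7
abc89   abc62   67       abc513   abc512   abc81   abc10
abc6    pop     abc11    abc4     giant    13      abc15
abc90   abc16   abc123   abc33    abc22    abc08   9
111     abc15   abc72    abc36    abc57    abc9    abc55

I would like to calculate the percentage of cells in the data frame that begin with "abc". For example, there are 28 total cells here, which can be obtained with prod(dim(df)). So I need the number of cells that start with "abc", divided by prod(dim(df)). Here the answer would be 0.785. How can this be done in R?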
I would use:

> mean(grepl("^abc", unlist(dat)))
[1] 0.7857143
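For a reproducible check, the sketch below rebuilds the example data and runs the same one-liner. The object name dat follows the answer; building it from a matrix, the default column names V1 to V7, and stringsAsFactors = FALSE are assumptions made for illustration, since the question does not show how the data frame was created. It also computes the count and the total separately, as described in the question, so the two routes can be compared.

```r
# Rebuild the example data as a 4 x 7 data frame of strings.
# Default column names (V1..V7) are placeholders, not from the original post.
cells <- c(
  "abc89", "abc62", "67",     "abc513", "abc512", "abc81", "abc10",
  "abc6",  "pop",   "abc11",  "abc4",   "giant",  "13",    "abc15",
  "abc90", "abc16", "abc123", "abc33",  "abc22",  "abc08", "9",
  "111",   "abc15", "abc72",  "abc36",  "abc57",  "abc9",  "abc55"
)
dat <- as.data.frame(matrix(cells, nrow = 4, byrow = TRUE),
                     stringsAsFactors = FALSE)

# TRUE for every cell whose value starts with "abc"
hits <- grepl("^abc", unlist(dat))

# Route 1: count the matches and divide by the number of cells
n_match <- sum(hits)        # numerator
n_cells <- prod(dim(dat))   # denominator: rows * columns
prop_manual <- n_match / n_cells

# Route 2: the mean of a logical vector is the share of TRUE values
prop_mean <- mean(hits)

print(c(manual = prop_manual, mean = prop_mean))
```

Both values should agree, because mean() on a logical vector treats TRUE as 1 and FALSE as 0.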
Using mean means you don't have to get the numerator and denominator yourself separately. grepl is the logical version of grep: it returns TRUE whenever "^abc" (i.e., a string beginning with abc) is found. Recall that the average of a Bernoulli vector is the proportion of successes.

If you wanted to do this by row or by column you'd use apply, e.g. apply(dat, 1, function(x) mean(grepl("^abc", x))) to get the row-wise means.
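A sketch of the row-wise and column-wise versions on the example data follows. Rebuilding dat from a matrix with default column names is an assumption for illustration, and starts_abc is a hypothetical helper name. The second argument of apply selects the margin: 1 applies the function to each row, 2 to each column. startsWith is a base R function (R 3.3.0 or later) included here as a regex-free alternative; it is not part of the original answer.

```r
# Example data; default column names (V1..V7) are placeholders.
cells <- c(
  "abc89", "abc62", "67",     "abc513", "abc512", "abc81", "abc10",
  "abc6",  "pop",   "abc11",  "abc4",   "giant",  "13",    "abc15",
  "abc90", "abc16", "abc123", "abc33",  "abc22",  "abc08", "9",
  "111",   "abc15", "abc72",  "abc36",  "abc57",  "abc9",  "abc55"
)
dat <- as.data.frame(matrix(cells, nrow = 4, byrow = TRUE),
                     stringsAsFactors = FALSE)

# Hypothetical helper: share of elements in x that start with "abc"
starts_abc <- function(x) mean(grepl("^abc", x))

# MARGIN = 1: one value per row
row_props <- apply(dat, 1, starts_abc)

# MARGIN = 2: one value per column
col_props <- apply(dat, 2, starts_abc)

# Regex-free alternative for the overall share (needs character input)
overall <- mean(startsWith(unlist(dat), "abc"))

print(row_props)
print(col_props)
print(overall)
```

Note that apply converts the data frame to a character matrix first, which is harmless here because every cell is treated as a string anyway.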