R: How to count occurrences of values across multiple columns of a data frame and save the column-wise counts for a particular value as a new row?
Problem description
I have a large data frame (approx. 1,000 rows and 30,000 columns) that looks like this:
chr  pos   sample1  sample2  sample3  sample4
1    5050  1        NA       0        0.5
1    6300  1        0        0.5      1
1    7825  1        0        0.5      1
1    8200  0.5      0.5      0        1
where, at a given "chr" and "pos", the value for a given sample can be 0, 0.5, 1, or NA. I have a large number of queries to perform that will require subsetting and ordering the data frame based on summaries of the values for each sample.
I would like to count the number of occurrences of a given value (e.g. 0.5) in each column and save those counts as a new row in my data frame. My ultimate goal is to be able to use the values of the new row to subset and/or order the columns of my data frame. I've seen similar questions about counting occurrences, but I can't seem to find or recognize a solution that does this across all columns simultaneously and saves the column-wise counts for a particular value as a new row.
Recommended answer
You can apply a function to every column of your data.frame. Suppose you want to count the number of 'A' values in each column of the data.frame d:
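# A sample data.frame: two numeric columns and one column of random letters
L3 <- LETTERS[1:3]
(d <- data.frame(cbind(x = 1, y = 1:10), fac = sample(L3, 10, replace = TRUE)))
# Count the occurrences of "A" in each column (MARGIN = 2 applies the function column by column)
apply(X = d, MARGIN = 2, FUN = function(x) length(which(x == "A")))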
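The answer above counts a letter in a toy data.frame. As a rough sketch of how the same idea might be applied to the layout in the question, the example below counts 0.5 in each sample column, uses the counts to order and subset the columns, and then appends them as a new row. The names df, sample_cols, target, counts, and count_row are made up for illustration, and the sketch assumes that every column other than chr and pos is a numeric sample column. It uses colSums on a logical comparison rather than apply, which avoids converting the numeric columns to text first.

```r
# Hypothetical toy data mirroring the layout shown in the question.
df <- data.frame(
  chr     = c(1, 1, 1, 1),
  pos     = c(5050, 6300, 7825, 8200),
  sample1 = c(1, 1, 1, 0.5),
  sample2 = c(NA, 0, 0, 0.5),
  sample3 = c(0, 0.5, 0.5, 0),
  sample4 = c(0.5, 1, 1, 1)
)

# Assumption: every column except chr and pos is a sample column.
sample_cols <- setdiff(names(df), c("chr", "pos"))

# Count occurrences of the target value in each sample column; NAs are ignored.
target <- 0.5
counts <- colSums(df[sample_cols] == target, na.rm = TRUE)

# Order the sample columns by count (highest first), keeping chr and pos in front.
df_ordered <- df[c("chr", "pos", names(sort(counts, decreasing = TRUE)))]

# Keep only the samples with at least two occurrences of the target value.
df_subset <- df[c("chr", "pos", names(counts)[counts >= 2])]

# Append the counts as a new row; chr and pos have no count, so they get NA.
count_row <- c(chr = NA, pos = NA, counts)
df_with_counts <- rbind(df, data.frame(as.list(count_row)))
```

Keeping counts as a separate named vector, as done here before the final rbind, may be the more convenient choice for repeated queries: once the counts sit in the data frame as a row, any later column-wise summary has to remember to skip that row.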