How to handle multiple sets of data in R programming?


Question

cuts <- cut(data$Time, breaks=seq(0, max(data$Time)+400, 400))
by(data$Oxytocin, cuts, mean)

This works for only one person's data, but I have ten people, each with their own Time and Oxytocin data. How would I get their averages simultaneously? Also, instead of having this type of output:

cuts: (0,400]
[1] 0.7
------------------------------------------------------------ 
cuts: (400,800]
[1] 0.805

Is there a way I can get a list of those cuts?

Recommended answer

Here's a solution using the IRanges package.

idx assumes your data format is Time, data, Time, data, ... and so on. So it creates the indices 1, 3, 5, ..., ncol(df)-1.
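
As a quick illustration of that layout, here is a minimal sketch with a made-up data frame for two people. The column names (Time1, Oxytocin1, ...) and the values are hypothetical; only the alternating Time/value order matters.

```r
# Hypothetical toy data: two people, columns alternate Time, value
df <- data.frame(
    Time1     = c(100, 300, 500, 700),
    Oxytocin1 = c(0.6, 0.8, 0.9, 0.7),
    Time2     = c(50, 450, 850, 1250),
    Oxytocin2 = c(1.0, 0.2, 0.3, 1.1)
)

# Positions of the Time columns: every second column, starting at 1
idx <- seq(1, ncol(df), by=2)
idx
# [1] 1 3

# The matching value columns sit one position to the right
names(df)[idx + 1]
# [1] "Oxytocin1" "Oxytocin2"
```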

ir1 holds the intervals you want the mean for. IRanges intervals are closed integer ranges, so each one has width 401: the first spans 0 to 400 inclusive, the next 401 to 801, and so on. The intervals start at 0 and run up to max(Time) for each Time column (here columns 1 and 3).

ir2 represents the corresponding Time column as intervals of width 1.

Then I get the overlaps of ir1 with ir2, which tells me which intervals from ir2 fall within each interval of ir1 (which is what we want). From these I calculate the mean per interval and output a data.frame.
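
To see what the overlap step returns on its own, here is a small sketch with made-up ranges. It assumes the IRanges package (from Bioconductor) is installed; the names bins and pts and the values are hypothetical.

```r
library(IRanges)

# Two bins of width 401: [0, 400] and [401, 801]
bins <- IRanges(start=c(0, 401), width=401)
# Three made-up time points, each stored as a width-1 range
pts <- IRanges(start=c(10, 200, 500), width=1)

hits <- findOverlaps(bins, pts)
queryHits(hits)    # bin index for each hit
# [1] 1 1 2
subjectHits(hits)  # time-point index for each hit
# [1] 1 2 3
```

The first two time points land in bin 1 and the third in bin 2, so grouping values by queryHits() gives one group per bin.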

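library(IRanges)

# Time columns sit at odd positions: 1, 3, 5, ..., ncol(df)-1
idx <- seq(1, ncol(df), by=2)
o <- lapply(idx, function(i) {
    # Bins of width 401 (0-400, 401-801, ...) covering 0 to max(Time)
    ir1 <- IRanges(start=seq(0, max(df[[i]]), by=401), width=401)
    # Each time point as a range of width 1
    ir2 <- IRanges(start=df[[i]], width=1)
    hits <- findOverlaps(ir1, ir2)
    # Mean of the value column (i+1) per bin; subjectHits() keeps values aligned with their bins
    d <- data.frame(mean=tapply(df[[i+1]][subjectHits(hits)], queryHits(hits), mean))
    cbind(as.data.frame(ir1), d)
})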

> o
# [[1]]
#   start  end width      mean
# 1     0  400   401 0.6750000
# 2   401  801   401 0.8050000
# 3   802 1202   401 0.8750000
# 4  1203 1603   401 0.2285333

# [[2]]
#   start  end width    mean
# 1     0  400   401 0.73508
# 2   401  801   401 0.13408
# 3   802 1202   401 0.26408
# 4  1203 1603   401 1.06408
# 5  1604 2004   401 3.06408

For each Time column, you get a list element containing the intervals and the mean for each interval.
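
If installing IRanges is not an option, a similar per-person list can be sketched in base R by reusing the cut() call from the question. This is an alternative approach, not the answer's method; the helper name bin_means and the toy data are hypothetical. Note that cut() produces the half-open bins (0,400], (400,800] from the question, so a Time of exactly 0 would fall outside the first bin.

```r
# Hypothetical toy data: columns alternate Time, value for each person
df <- data.frame(
    Time1     = c(100, 300, 500, 700),
    Oxytocin1 = c(0.6, 0.8, 0.9, 0.7),
    Time2     = c(50, 450, 850, 1250),
    Oxytocin2 = c(1.0, 0.2, 0.3, 1.1)
)

# Hypothetical helper: mean of `value` within bins of `time`, each `width` wide
bin_means <- function(time, value, width=400) {
    # dig.lab=10 keeps large break labels out of scientific notation
    cuts <- cut(time, breaks=seq(0, max(time) + width, by=width), dig.lab=10)
    m <- tapply(value, cuts, mean)
    # An NA mean marks a bin that contains no observations
    data.frame(cuts=names(m), mean=as.vector(m))
}

idx <- seq(1, ncol(df), by=2)
res <- lapply(idx, function(i) bin_means(df[[i]], df[[i + 1]]))
res[[1]]
#        cuts mean
# 1   (0,400]  0.7
# 2 (400,800]  0.8
```

Each element of res is a data.frame, so the bins are available as an ordinary column rather than as printed by() output.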
