Calculate mean value of sets of 4 sub-locations from multiple locations in a larger matrix


Problem description

I am doing a data analysis on wall thickness measurements of circular tubes. I have the following matrix:

> head(datIn, 12)

        Component Tube.number Measurement.location Sub.location Interval  Unit      Start
1         In           1                    1            A      121      U6100  7/25/2000
2         In           1                    1            A      122      U6100  5/24/2001
3         In           1                    1            A      222      U6200  1/19/2001
4         In           1                    1            A      321      U6300   6/1/2000
5         In           1                    1            A      223      U6200  5/22/2002
6         In           1                    1            A      323      U6300  6/18/2002
7         In           1                    1            A       21      U6200  10/1/1997
8         In           1                    1            A      221      U6200   6/3/2000
9         In           1                    1            A      322      U6300 12/11/2000
10        In           1                    1            B      122      U6100  5/24/2001
11        In           1                    1            B      322      U6300 12/11/2000
12        In           1                    1            B       21      U6200  10/1/1997

        End Measurement Material.loss Material.loss.interval Run.hours.interval
1  5/11/2001         7.6           0.4                     NA            6653.10
2   2/7/2004         6.1           1.9                    1.5           15484.82
3   3/7/2002         8.5          -0.5                   -0.5            8826.50
4  12/1/2000         7.8           0.2                    0.2            4170.15
5  4/30/2003         7.4           0.6                    1.1            6879.73
6  9/30/2003         7.9           0.1                   -0.1            9711.56
7  4/20/2000         7.6           0.4                     NA           15159.94
8   1/5/2001         8.0           0.0                   -0.4            4728.88
9  5/30/2002         7.8           0.2                    0.0            9829.75
10  2/7/2004         5.9           2.1                    0.9           15484.82
11 5/30/2002         7.0           1.0                    0.7            9829.75
12 4/20/2000         8.2          -0.2                     NA           15159.94

 Run.hours.prior.to.interval Total.run.hours.end.interval
1                         0.00                      6653.10
2                      6653.10                     22137.92
3                     19888.82                     28715.32
4                         0.00                      4170.15
5                     28715.32                     35595.05
6                     30039.58                     39751.14
7                         0.00                     15159.94
8                     15159.94                     19888.82
9                     20209.83                     30039.58
10                     6653.10                     22137.92
11                    20209.83                     30039.58
12                        0.00                     15159.94

Straight.or.In.Out.Middle.bend.1 Straight.or.In.Out.Middle.bend.2
1                               Out                              Out
2                               Out                              Out
3                               Out                              Out
4                               Out                              Out
5                               Out                              Out
6                               Out                              Out
7                               Out                              Out
8                               Out                              Out
9                               Out                              Out
10                           Middle                              Out
11                           Middle                              Out
12                           Middle                              Out

The Sub.location column has values A, B, C, D. These are measurements at the same measurement location but at different positions in the cross section, i.e. at 0, 90, 180 and 270 degrees around the tube.

I would like to make a plot in which it becomes clear which measurement location has the biggest wall thickness decrease over time.

To do this I first want to calculate the mean value of the wall thickness of a tube at each measurement location at each unique interval (the running hours are coupled to the interval).

I tried doing this with the following code:

par(mfrow=c(1,2))
myfunction <- function(mydata1) { return(mean(mydata1,na.rm=TRUE))}
AVmeasloc <- tapply(datIn$Measurement,list(as.factor(datIn$Sub.location),as.factor(datIn$Measurement.location), myfunction))
AVmeasloc

This doesn't seem to work. I would like to keep the tapply function, as I also used it to calculate the standard deviation for some values and it lets me make plots easily.

Does anyone have any advice how to tackle this problem?

Solution

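In the code you've posted, there is a parenthesis error around list(): myfunction ends up inside the list() call instead of being passed to tapply as the function to apply. It should read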

AVmeasloc <-  tapply(datIn$Measurement,list(as.factor(datIn$Sub.location),as.factor(datIn$Measurement.location)), myfunction)

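This can now be cleaned up to the following, where columns 3 and 4 of datIn are Measurement.location and Sub.location (so the rows and columns of the result are swapped relative to the call above):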

AVmeasloc <- tapply(datIn$Measurement,datIn[,c(3,4)],mean,na.rm=TRUE)
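
As a side note, selecting the grouping columns by name rather than by position may be more robust if the column order ever changes. The following is only a minimal sketch on made-up data: toy is a hypothetical data frame that merely mimics the relevant column names of datIn, and its values are invented.

```r
# Hypothetical toy data that only mimics the relevant columns of datIn
toy <- data.frame(
  Measurement.location = c(1, 1, 1, 1, 2, 2, 2, 2),
  Sub.location         = c("A", "B", "C", "D", "A", "B", "C", "D"),
  Measurement          = c(7.6, 5.9, 7.0, NA, 8.0, 7.8, 8.2, 7.4)
)

# Group by column names instead of by positions c(3, 4)
by_name <- tapply(toy$Measurement,
                  toy[, c("Measurement.location", "Sub.location")],
                  mean, na.rm = TRUE)
by_name
```

Note that a cell whose only values are NA (location 1, sub-location D here) comes out as NaN when na.rm = TRUE, because the mean of an empty vector is undefined.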

Here's a working example:

test.data <- data.frame(cat1 = c("A","A","A","B","B","B","C","C","D"),
                    cat2 = c(1,1,2,2,1,NA,2,1,1),
                    val = c(0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9))


tapply(test.data$val, test.data[,c(1,2)],mean,na.rm=TRUE)

    cat2
cat1    1   2
   A 0.15 0.3
   B 0.50 0.4
   C 0.80 0.7
   D 0.90  NA
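
The question asks for the mean over the four sub-locations at each measurement location and interval, so one possible adaptation is to group by Measurement.location and Interval instead. This is a sketch under assumptions, not a result from the real data: toy is a hypothetical data frame with invented values, and only its column names are taken from datIn.

```r
# Hypothetical toy data: 2 measurement locations x 2 intervals x 4 sub-locations
toy <- data.frame(
  Measurement.location = rep(c(1, 2), each = 8),
  Interval             = rep(rep(c(121, 122), each = 4), times = 2),
  Sub.location         = rep(c("A", "B", "C", "D"), times = 4),
  Measurement          = c(7.6, 7.4, 7.8, 7.2,   # location 1, interval 121
                           6.1, 5.9, 6.3, NA,    # location 1, interval 122
                           8.0, 8.2, 7.8, 8.0,   # location 2, interval 121
                           7.9, 7.7, 7.5, 7.3)   # location 2, interval 122
)

# Grouping by location and interval collapses the sub-locations A-D
grp <- toy[, c("Measurement.location", "Interval")]
avg <- tapply(toy$Measurement, grp, mean, na.rm = TRUE)  # mean wall thickness
sdv <- tapply(toy$Measurement, grp, sd, na.rm = TRUE)    # spread around the tube
avg
```

Both results are matrices with one row per measurement location and one column per interval, so the same indexing works for the means and the standard deviations when plotting.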
