Making matrix numeric and name orders

Problem description

I have the following data:

yvar <- c(1:150)
replication <- c( rep(c(rep(1, 10), rep(2,10), rep(3,10)),5))
genotypes <- c(rep(paste("G", 1:10, sep= ""), 15))
environments <- c(rep(paste("E",5:1, sep = ""), each = 30))
mydf1 <- data.frame (yvar, replication, genotypes, environments)
mydf1$replication <- as.factor(mydf1$replication)

I want to summarize data:

mydf = data.frame(aggregate (yvar ~ genotypes + environments, data = mydf1, mean))

Now I create a matrix, hopefully numeric, but matm is not!

matm = as.matrix(aggregate(yvar ~ genotypes, mydf, 'c'))
colnames(matm) <- c("genotypes", levels(mydf$environments))
      genotypes E1    E2    E3    E4    E5   
 [1,] "G1"      "131" "101" " 71" " 41" " 11"
 [2,] "G10"     "140" "110" " 80" " 50" " 20"
 [3,] "G2"      "132" "102" " 72" " 42" " 12"
 [4,] "G3"      "133" "103" " 73" " 43" " 13"
 [5,] "G4"      "134" "104" " 74" " 44" " 14"
 [6,] "G5"      "135" "105" " 75" " 45" " 15"
 [7,] "G6"      "136" "106" " 76" " 46" " 16"
 [8,] "G7"      "137" "107" " 77" " 47" " 17"
 [9,] "G8"      "138" "108" " 78" " 48" " 18"
[10,] "G9"      "139" "109" " 79" " 49" " 19" 

I then converted it to a data.frame:

    matd <- data.frame(matm)

  genotypes       E1       E2       E3       E4       E5
1         G1 31.70000 26.76667 23.60000 30.73333 43.13333
2        G10 32.40000 17.86667 28.83333 32.43333 30.23333
3         G2 29.50000 24.60000 24.16667 33.43333 38.66667
4         G3 27.00000 28.83333 33.63333 43.83333 29.60000
5         G4 29.53333 29.90000 26.60000 26.13333 40.33333
6         G5 27.40000 32.43333 27.96667 40.43333 41.46667
7         G6 36.76667 32.26667 28.26667 38.73333 33.43333
8         G7 29.63333 27.00000 26.96667 34.90000 40.70000
9         G8 24.50000 23.26667 22.50000 27.60000 32.26667
10        G9 31.60000 24.96667 24.46667 27.56667 36.26667

I want to get rid of the genotypes column and then convert the result to a matrix:

matx = data.frame(matd[,-1])
matdm <- as.matrix(matx) 
matdm 
      E1         E2         E3         E4         E5        
 [1,] "31.70000" "26.76667" "23.60000" "30.73333" "43.13333"
 [2,] "32.40000" "17.86667" "28.83333" "32.43333" "30.23333"
 [3,] "29.50000" "24.60000" "24.16667" "33.43333" "38.66667"
 [4,] "27.00000" "28.83333" "33.63333" "43.83333" "29.60000"
 [5,] "29.53333" "29.90000" "26.60000" "26.13333" "40.33333"
 [6,] "27.40000" "32.43333" "27.96667" "40.43333" "41.46667"
 [7,] "36.76667" "32.26667" "28.26667" "38.73333" "33.43333"
 [8,] "29.63333" "27.00000" "26.96667" "34.90000" "40.70000"
 [9,] "24.50000" "23.26667" "22.50000" "27.60000" "32.26667"
[10,] "31.60000" "24.96667" "24.46667" "27.56667" "36.26667"

I have two questions:

(1) Is there a consistent way to make / assign a matrix as numeric?

(2) I can see that the values in the genotypes column are sorted alphabetically. My file has them in a different order. I am fine with this order as long as it is consistent; however, I am worried about the following line:

colnames(matm) <- c("genotypes", levels(mydf$environments))

Could the order produced by the aggregate function differ from the order of levels(mydf$environments)? Do they both sort alphabetically, or do they follow the order in the file?

I would appreciate your suggestions.

Solution

I think I see where the confusion is coming from. Backing up slightly, look at the aggregation that you want to turn into a matrix; try capturing its result and inspecting it:

myAgg <- aggregate(yvar ~ genotypes, mydf, 'c')
str(myAgg)

yields:

> str(myAgg)
'data.frame':   10 obs. of  2 variables:
 $ genotypes: Factor w/ 10 levels "G1","G10","G2",..: 1 2 3 4 5 6 7 8 9 10
 $ yvar     : num [1:10, 1:5] 131 140 132 133 134 135 136 137 138 139 ...

So the aggregate produces a somewhat atypical data.frame. The column yvar is actually the matrix you are interested in:

> myAgg$yvar
      [,1] [,2] [,3] [,4] [,5]
 [1,]  131  101   71   41   11
 [2,]  140  110   80   50   20
 [3,]  132  102   72   42   12
 [4,]  133  103   73   43   13
 [5,]  134  104   74   44   14
 [6,]  135  105   75   45   15
 [7,]  136  106   76   46   16
 [8,]  137  107   77   47   17
 [9,]  138  108   78   48   18
[10,]  139  109   79   49   19

So you can grab that directly:

matdm <- myAgg$yvar
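
If you also want the row and column labels on that matrix, something along these lines may work. It is only a sketch: it rebuilds the question's data, and it calls factor() explicitly (an assumption on my part, so that levels() returns something on R versions where data.frame() no longer converts strings to factors).

```r
# Rebuild the question's data; factor() is explicit here (an assumption)
# so that the grouping columns are factors on any R version.
yvar <- 1:150
genotypes <- factor(rep(paste0("G", 1:10), 15))
environments <- factor(rep(paste0("E", 5:1), each = 30))
mydf1 <- data.frame(yvar, genotypes, environments)
mydf <- aggregate(yvar ~ genotypes + environments, data = mydf1, FUN = mean)

# Collect the five environment means of each genotype into one matrix column
myAgg <- aggregate(yvar ~ genotypes, data = mydf, FUN = c)
matdm <- myAgg$yvar

# Label rows with the genotypes and columns with the environment levels;
# mydf is sorted by those levels, so the columns line up with them.
dimnames(matdm) <- list(as.character(myAgg$genotypes),
                        levels(mydf$environments))
matdm["G1", "E1"]
```

The matrix stays numeric because the genotype labels live in the dimnames rather than in a column of their own.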

Now to answer your specific questions...

1) The consistent way to make a matrix numeric is to ensure that the data going into the matrix() or as.matrix() functions are numeric. When you called

matm = as.matrix(aggregate(yvar ~ genotypes, mydf, 'c'))

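you created a character matrix because one of the columns (genotypes) was not numeric, and a matrix can hold only one type. Then you converted that matrix into a data.frame. This converted the columns into factors (the default behaviour of data.frame() before R 4.0.0; newer versions keep them as character). Then you selected a few columns which were, not surprisingly, still factors. So when you called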

matdm <- as.matrix(matx)

the factors got converted to characters.
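
If you are already stuck with a character matrix or with factor columns, a conversion along the following lines may help. This is a minimal sketch: the small matm below uses made-up stand-in values rather than your real data, and num and f are hypothetical names.

```r
# A small character matrix standing in for matm (hypothetical values)
matm <- cbind(genotypes = c("G1", "G10", "G2"),
              E1 = c("131", "140", "132"),
              E2 = c("101", "110", "102"))

# Drop the label column but keep the labels as row names
num <- matm[, -1, drop = FALSE]
rownames(num) <- matm[, "genotypes"]

# Change the storage mode in place; dim and dimnames are preserved
storage.mode(num) <- "double"
is.numeric(num)

# For a factor, go through as.character() first:
# as.numeric() on a factor returns the integer level codes, not the values.
f <- factor(c("31.7", "32.4", "29.5"))
as.numeric(as.character(f))
```

Avoiding the character column in the first place, as described above, is still the simpler route.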

2) The order of the variables created by

aggregate(yvar ~ genotypes, mydf, 'c')

is determined by the order of the factor levels of the variable genotypes. Levels are generally created in alphabetical order, but you can always check them with levels() to be totally sure. If the factor was created manually, with explicit levels, the order would not necessarily be alphabetical.
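
To make the point about levels concrete, here is a small sketch with made-up labels (g, g_default, g_file, df and res are hypothetical names, not objects from your script). It shows the default sorted levels next to levels set by hand in file order, and that aggregate() follows whichever order the factor carries.

```r
# Hypothetical labels, listed in the order they might appear in a file
g <- c("G1", "G2", "G10", "G3")

# Default: levels are the sorted unique values (a string sort)
g_default <- factor(g)
levels(g_default)

# Explicit levels: keep the order of first appearance instead
g_file <- factor(g, levels = unique(g))
levels(g_file)

# aggregate() returns one row per group, ordered by the factor levels
df <- data.frame(g = g_file, y = c(1, 2, 3, 4))
res <- aggregate(y ~ g, data = df, FUN = mean)
as.character(res$g)
```

So if you need the order in your file, setting the levels explicitly before aggregating should give consistent names on both sides.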
