Link data.frame and matrix


Problem description

I have a data.frame and a matrix with the same number of rows but different numbers of columns.

All elements of the matrix are integers, but the data.frame contains character values in some columns.

I want to link the rows of these two objects: if I delete a row from the matrix, the same row of the data.frame should be deleted automatically, and when I sort the data.frame by one of its columns, the rows of the matrix should be reordered accordingly.

Added note: I want to keep the matrix as an integer matrix, so I cannot use cbind.

Recommended answer

There are (at least) two solutions to this. The easy option is to make a new data.frame that holds the columns of both objects, and to pull the matrix back out whenever it is needed:

Sample data:

set.seed(123)
df <- data.frame(ID = 1:26, Group = sample(c("A", "B"), 26, TRUE))
mat <- matrix(rnorm(78), ncol = 3, dimnames = list(1:26, paste0("Val", 1:3)))

Make a new data.frame, storing the names of the matrix columns for later reference:

new_df <- cbind(df, mat)
mat_cols <- colnames(mat)

Do some subsetting:

new_df <- new_df[seq(1, 25, 2), ]

Extract the matrix when required:

as.matrix(new_df[, mat_cols])
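Since the question is specifically about keeping an integer matrix, the round trip can be checked with a small sketch. The objects `int_mat`, `info`, and `keep` are hypothetical and not part of the original answer. The point is that binding the matrix onto a data.frame keeps each column's type, whereas binding a character column onto the matrix itself would coerce everything to character.

```r
# Hypothetical integer matrix and data.frame with a character column
int_mat <- matrix(1:12, ncol = 3, dimnames = list(NULL, paste0("Val", 1:3)))
info <- data.frame(ID = 1:4, Group = c("A", "B", "A", "B"),
                   stringsAsFactors = FALSE)

# Binding a character vector onto the matrix coerces the whole matrix to character
as_char <- cbind(int_mat, Group = info$Group)
is.character(as_char)

# Binding the matrix onto the data.frame keeps the integer columns as integer
combined <- cbind(info, int_mat)

# Reorder and drop rows; both parts move together because they are one object
keep <- c(4, 2, 1)
combined <- combined[keep, ]

# Pull the matrix back out; all-integer columns give an integer matrix again
restored <- as.matrix(combined[, colnames(int_mat), drop = FALSE])
is.integer(restored)
```

The `drop = FALSE` argument is there so that the extraction still returns a matrix when only one matrix column is selected.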

The other option is to use an S3 or S4 class. The Bioconductor package Biobase has, for example, an ExpressionSet class which can hold a matrix and phenotype data, and subsetting an ExpressionSet subsets both (though its matrix has rows and columns the opposite way round).

If you want something simpler (ExpressionSets can be relatively complex to get your head around), here's an S3 implementation:

# Constructor: bundle a data.frame and a matrix that share the same rows
as.JoinedUp <- function(data_frame, matrix) {
  stopifnot(is.data.frame(data_frame), is.matrix(matrix), nrow(data_frame) == nrow(matrix))
  x <- list(data_frame = data_frame, matrix = matrix)
  class(x) <- "JoinedUp"
  x
}
# Subsetting: i selects rows in both parts, j selects columns by name
`[.JoinedUp` <- function(x, i = NULL, j = NULL) {
  if (is.null(i)) {
    # seq_len() is safe even when there are zero rows, unlike 1:nrow()
    i <- seq_len(nrow(x$data_frame))
  }
  if (is.null(j)) {
    j <- union(colnames(x$data_frame), colnames(x$matrix))
  }
  stopifnot(is.character(j))
  x$data_frame <- x$data_frame[i, intersect(j, colnames(x$data_frame)), drop = FALSE]
  x$matrix <- x$matrix[i, intersect(j, colnames(x$matrix)), drop = FALSE]
  x
}
# Assignment: route each named column of value to the part that owns it
`[<-.JoinedUp` <- function(x, i = NULL, j = NULL, value) {
  if (is.null(j)) {
    j <- union(colnames(x$data_frame), colnames(x$matrix))
  }
  if (is.null(i)) {
    i <- seq_len(nrow(x$data_frame))
  }
  stopifnot(is.character(j))
  # A plain vector is treated as a single row of values
  if (!is.matrix(value) && !is.data.frame(value)) {
    value <- as.data.frame(t(value), stringsAsFactors = FALSE)
  }
  stopifnot(ncol(value) == length(j))
  if (any(j %in% colnames(x$data_frame))) {
    df_cols <- intersect(j, colnames(x$data_frame))
    x$data_frame[i, df_cols] <- value[, match(df_cols, j)]
  }
  if (any(j %in% colnames(x$matrix))) {
    mat_cols <- intersect(j, colnames(x$matrix))
    x$matrix[i, mat_cols] <- data.matrix(value[, match(mat_cols, j)])
  }
  x
}

Example:

new_obj <- as.JoinedUp(df, mat)
new_obj[1:3, ]
new_obj[, c("ID", "Val1")]
new_obj[10:15, ]$matrix
new_obj <- new_obj[order(new_obj$matrix[, "Val1"]), ]
new_obj[1:5, c("ID", "Val1")] <- data.frame(ID = 20:24, Val1 = 0)
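The two operations the question asks about, deleting a row and sorting by a data.frame column, can be checked with a small standalone sketch. It repeats a compact copy of the constructor and the `[` method from above so that it runs on its own. The data (`demo_df`, `demo_mat`) is hypothetical and built so that `Val1` always equals `ID * 10`, which makes any misalignment easy to spot.

```r
# Compact copy of the constructor and `[` method so this block runs standalone
as.JoinedUp <- function(data_frame, matrix) {
  stopifnot(is.data.frame(data_frame), is.matrix(matrix),
            nrow(data_frame) == nrow(matrix))
  structure(list(data_frame = data_frame, matrix = matrix), class = "JoinedUp")
}
`[.JoinedUp` <- function(x, i = NULL, j = NULL) {
  if (is.null(i)) i <- seq_len(nrow(x$data_frame))
  if (is.null(j)) j <- union(colnames(x$data_frame), colnames(x$matrix))
  stopifnot(is.character(j))
  x$data_frame <- x$data_frame[i, intersect(j, colnames(x$data_frame)), drop = FALSE]
  x$matrix <- x$matrix[i, intersect(j, colnames(x$matrix)), drop = FALSE]
  x
}

# Hypothetical data: Val1 is ID * 10, so alignment is easy to verify
demo_df <- data.frame(ID = 1:5, Group = c("B", "A", "C", "A", "B"),
                      stringsAsFactors = FALSE)
demo_mat <- matrix(c(1:5 * 10L, 5:1), ncol = 2,
                   dimnames = list(NULL, c("Val1", "Val2")))
obj <- as.JoinedUp(demo_df, demo_mat)

# Delete the third row from both parts with a negative index
obj <- obj[-3, ]

# Sort both parts by a column of the data.frame
obj <- obj[order(obj$data_frame$Group), ]

# The rows still line up, and the matrix is still stored as integer
all(obj$matrix[, "Val1"] == obj$data_frame$ID * 10L)
is.integer(obj$matrix)
```

Because the matrix is never merged into the data.frame here, its storage mode is untouched by subsetting or reordering.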

This is only a skeleton of what you'd need; you'd probably also want to define methods for dim, nrow, ncol, etc.
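As one possible starting point for those extra methods, here is a minimal sketch of `dim` and `print` methods. The choice to report the combined column count and to print both parts side by side is an assumption made for illustration, not something the original answer specifies. In base R, `nrow()` and `ncol()` are ordinary functions that call `dim()`, so a `dim` method should be enough for them to work.

```r
# Compact copy of the constructor so this block runs standalone
as.JoinedUp <- function(data_frame, matrix) {
  stopifnot(is.data.frame(data_frame), is.matrix(matrix),
            nrow(data_frame) == nrow(matrix))
  structure(list(data_frame = data_frame, matrix = matrix), class = "JoinedUp")
}

# Assumed convention: shared row count, then data.frame columns plus matrix columns
dim.JoinedUp <- function(x) {
  c(nrow(x$data_frame), ncol(x$data_frame) + ncol(x$matrix))
}

# Assumed display: show both parts side by side; the stored matrix is not modified
print.JoinedUp <- function(x, ...) {
  print(cbind(x$data_frame, x$matrix), ...)
  invisible(x)
}

# Hypothetical data for a quick check
obj <- as.JoinedUp(
  data.frame(ID = 1:4, Group = c("A", "B", "A", "B"), stringsAsFactors = FALSE),
  matrix(1:8, ncol = 2, dimnames = list(NULL, c("Val1", "Val2")))
)

dim(obj)   # rows, then total columns across both parts
nrow(obj)  # works through dim()
ncol(obj)  # works through dim()
obj        # auto-printing dispatches to print.JoinedUp
```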
