How to delete a row from a data.frame without losing the attributes

Question

For starters: I have been searching for a solution to this problem for hours, so if the answer turns out to be trivial, please forgive me.

What I want to do is delete one row (no. 101) from a data.frame. It contains test data and should not appear in my analyses. My problem is that whenever I subset the data.frame, the attributes (especially the comments) are lost.

str(x)
# x has comments for each variable
x <- x[1:100,]
str(x)
# now x has lost all comments

It is well documented that subsetting drops attributes; that much is perfectly clear. The manual (e.g. http://stat.ethz.ch/R-manual/R-devel/library/base/html/Extract.data.frame.html) even suggests a way to preserve them:

## keeping special attributes: use a class with a
## "as.data.frame" and "[" method:


as.data.frame.avector <- as.data.frame.vector

`[.avector` <- function(x,i,...) {
  r <- NextMethod("[")
  mostattributes(r) <- attributes(x)
  r
}

d <- data.frame(i= 0:7, f= gl(2,4),
                u= structure(11:18, unit = "kg", class="avector"))
str(d[2:4, -1]) # 'u' keeps its "unit"

I am not yet far enough into R to understand exactly what happens here. However, simply running these lines (except the last three) does not change the behavior of my subsetting. Using subset() with an appropriate logical vector (100 times TRUE plus 1 FALSE) gives the same result. Simply storing the attributes in a variable and restoring them after subsetting does not work either.

# Does not work...
tmp <- attributes(x)
x <- x[1:100,]
attributes(x) <- tmp

Of course, I could write all comments to a vector (var => comment), subset, and write them back in a loop, but that does not seem like a sound solution. I am also quite sure I will encounter datasets with other relevant attributes in future analyses.

So this is where my efforts with Stack Overflow, Google, and brain power got stuck. I would very much appreciate it if anyone could help me out with a hint. Thanks!

Recommended answer

If I understand you correctly, you have some data in a data.frame, and the columns of the data.frame have comments associated with them. Perhaps something like the following?

set.seed(1)

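# stringsAsFactors=TRUE makes bb a factor even in R >= 4.0.0, where the default is FALSE
mydf<-data.frame(aa=rpois(100,4),bb=sample(LETTERS[1:5],
  100,replace=TRUE),stringsAsFactors=TRUE)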

comment(mydf$aa)<-"Don't drop me!"
comment(mydf$bb)<-"Me either!"

This gives you something like

> str(mydf)
'data.frame':   100 obs. of  2 variables:
 $ aa: atomic  3 3 4 7 2 7 7 5 5 1 ...
  ..- attr(*, "comment")= chr "Don't drop me!"
 $ bb: Factor w/ 5 levels "A","B","C","D",..: 4 2 2 5 4 2 1 3 5 3 ...
  ..- attr(*, "comment")= chr "Me either!"

When you subset this, the comments are dropped:

> str(mydf[1:2,]) # comment dropped.
'data.frame':   2 obs. of  2 variables:
 $ aa: num  3 3
 $ bb: Factor w/ 5 levels "A","B","C","D",..: 4 2

To preserve the comments, define the function [.avector as you did above (from the documentation), then add the appropriate class attribute to each column of your data.frame (EDIT: to keep the factor levels of bb, add "factor" to the class of bb):

mydf$aa<-structure(mydf$aa, class="avector")
mydf$bb<-structure(mydf$bb, class=c("avector","factor"))

so that the comments are preserved:

> str(mydf[1:2,])
'data.frame':   2 obs. of  2 variables:
 $ aa:Class 'avector'  atomic [1:2] 3 3
  .. ..- attr(*, "comment")= chr "Don't drop me!"
 $ bb: Factor w/ 5 levels "A","B","C","D",..: 4 2
  ..- attr(*, "comment")= chr "Me either!"
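
Putting the pieces together for the original goal of dropping row 101, a minimal end-to-end sketch might look like the following. It assumes a made-up data set with 101 rows (the variable names x and x_clean and the data itself are illustrative, not from the question) and reuses the two method definitions from the R documentation quoted in the question.

```r
# Methods taken from the R documentation example for Extract.data.frame.
as.data.frame.avector <- as.data.frame.vector

`[.avector` <- function(x, i, ...) {
  r <- NextMethod("[")               # ordinary subsetting (drops most attributes)
  mostattributes(r) <- attributes(x) # copy class, comment, levels, ... back
  r
}

# Hypothetical data: 100 real rows plus one test row (row 101).
set.seed(1)
x <- data.frame(aa = rpois(101, 4),
                bb = sample(LETTERS[1:5], 101, replace = TRUE),
                stringsAsFactors = TRUE)
comment(x$aa) <- "Don't drop me!"
comment(x$bb) <- "Me either!"

# Mark the columns as "avector"; keep "factor" for bb so its levels survive.
x$aa <- structure(x$aa, class = "avector")
x$bb <- structure(x$bb, class = c("avector", "factor"))

# Drop the test row with a negative index.
x_clean <- x[-101, ]

nrow(x_clean)
comment(x_clean$aa)
comment(x_clean$bb)
```

A negative row index is used here instead of 1:100 so that the row to remove is stated directly; either form goes through the same [.avector method for each column.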

If there are many columns in your data.frame that have attributes you want to preserve, you could use lapply (EDITED to include the original column class):

mydf2 <- data.frame( lapply( mydf, function(x) {
  structure( x, class = c("avector", class(x) ) )
} ) )

However, this drops comments associated with the data.frame itself (such as comment(mydf)<-"I'm a data.frame"), so if you have any, assign them to the new data.frame:

comment(mydf2)<-comment(mydf)

Then you have

> str(mydf2[1:2,])
'data.frame':   2 obs. of  2 variables:
 $ aa:Classes 'avector', 'numeric'  atomic [1:2] 3 3
  .. ..- attr(*, "comment")= chr "Don't drop me!"
 $ bb: Factor w/ 5 levels "A","B","C","D",..: 4 2
  ..- attr(*, "comment")= chr "Me either!"
 - attr(*, "comment")= chr "I'm a data.frame"
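
If this is needed for several data sets, the lapply step and the comment copy could be bundled into one small helper. The sketch below is only an illustration under stated assumptions: the helper name as_avector_df is hypothetical, the guard against adding "avector" twice is an addition not present in the steps above, and the example data is made up.

```r
# Methods taken from the R documentation example for Extract.data.frame.
as.data.frame.avector <- as.data.frame.vector

`[.avector` <- function(x, i, ...) {
  r <- NextMethod("[")
  mostattributes(r) <- attributes(x)
  r
}

# Hypothetical helper: prepend "avector" to every column's class and
# carry over the comment attached to the data.frame itself.
as_avector_df <- function(df) {
  out <- data.frame(lapply(df, function(col) {
    if (inherits(col, "avector")) return(col)  # already marked, leave as is
    structure(col, class = c("avector", class(col)))
  }))
  comment(out) <- comment(df)  # assigning NULL is harmless if there is none
  out
}

# Made-up data with a test row in position 101.
set.seed(1)
mydf <- data.frame(aa = rpois(101, 4),
                   bb = sample(LETTERS[1:5], 101, replace = TRUE),
                   stringsAsFactors = TRUE)
comment(mydf$aa) <- "Don't drop me!"
comment(mydf$bb) <- "Me either!"
comment(mydf) <- "I'm a data.frame"

cleaned <- as_avector_df(mydf)[-101, ]

comment(cleaned$aa)
comment(cleaned$bb)
comment(cleaned)
```

Only the comment of the data.frame is copied explicitly here; other custom attributes set on the data.frame itself would need the same treatment.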
