R - Extracting a vector of doubles from data.frame
Problem description
I ran into this question while using read.table() with or without header=T, trying to extract a vector of doubles from the resulting data.frame with as.double(as.character()) (see ?factor).
But that's just how I realized that I don't understand R's logic. So you won't see e.g. read.table in the code below, only the necessary parts. Could you tell me what the difference is between the following options?
With header=T equivalent:
(a <- data.frame(array(c(0.5,0.5,0.5,0.5), c(1,4))))
as.character(a)
# [1] "0.5" "0.5" "0.5" "0.5"
Without header=T equivalent:
b <- data.frame(array(c("a",0.5,"b",0.5,"c",0.5,"d",0.5), c(2,4)))
(a <- b[2,])
as.character(a)
# [1] "1" "1" "1" "1"
(a <- data.frame(a, row.names=NULL)) # now there's not even a visual difference
as.character(a)
# [1] "1" "1" "1" "1"
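The problem lies in the default setting of data.frame, where one of the options, stringsAsFactors, is set to TRUE (this was the default before R 4.0.0; newer versions default to FALSE). This is a problem in your scenario because when you use header = FALSE, the presence of character values in that row coerces the entire column to character, which is then converted to a factor (unless you set stringsAsFactors = FALSE). Calling as.character() on a row of such a data.frame then gives you the factors' integer codes, which is where the "1"s come from.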
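To see the mechanism in isolation, here is a minimal sketch on a single factor. The variable names (f, wrong, right) are illustrative and not from the original post.

```r
# Illustrative values only; `f` is not a variable from the original post.
f <- factor(c("a", "0.5"))

levels(f)       # the labels, stored once and sorted
as.integer(f)   # the integer codes that the factor actually holds

# Converting the codes gives a position in levels(f), not the value.
wrong <- as.double(as.integer(f[2]))

# Going through the label first recovers the number (see ?factor).
right <- as.double(as.character(f[2]))
```

Here "0.5" sorts before "a", so its code is 1, which matches the "1" seen in the question.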
Here are some examples to play with:
## Two similar `data.frame`s -- only stringsAsFactors differs
## (set explicitly, so the result does not depend on the R version's default)
b <- data.frame(array(c("a",0.5,"b",0.5,"c",0.5,"d",0.5), c(2,4)),
                stringsAsFactors = TRUE)
b2 <- data.frame(array(c("a",0.5,"b",0.5,"c",0.5,"d",0.5), c(2,4)),
                 stringsAsFactors = FALSE)
## First with "b"
as.character(b[2, ])
# [1] "1" "1" "1" "1"
sapply(b[2, ], as.character)
# X1 X2 X3 X4
# "0.5" "0.5" "0.5" "0.5"
as.matrix(b)[2, ]
# X1 X2 X3 X4
# "0.5" "0.5" "0.5" "0.5"
as.double(as.matrix(b)[2, ])
# [1] 0.5 0.5 0.5 0.5
## Now with "b2"
as.character(b2[2, ])
# [1] "0.5" "0.5" "0.5" "0.5"
as.double(as.character(b2[2, ]))
# [1] 0.5 0.5 0.5 0.5
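Since the question started with read.table(), the same idea can be sketched there as well. This is only an illustration under assumptions: the inline text in txt is a hypothetical stand-in for the asker's file, and the names with_header, no_header, x1 and x2 are made up for the example.

```r
# Hypothetical input standing in for the asker's file.
txt <- "a b c d
0.5 0.5 0.5 0.5"

# header = TRUE: the first line becomes column names, so the columns are numeric.
with_header <- read.table(text = txt, header = TRUE)
x1 <- as.double(unlist(with_header[1, ]))

# header = FALSE: the letters are data, so every column is character.
# stringsAsFactors = FALSE keeps them as strings on any R version.
no_header <- read.table(text = txt, header = FALSE, stringsAsFactors = FALSE)
x2 <- as.double(unlist(no_header[2, ]))
```

Passing stringsAsFactors = FALSE explicitly is a design choice here: it means the row can be flattened with unlist() and converted directly, without going through factor labels.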