Meaning of ddply error: 'names' attribute [9] must be the same length as the vector [1]


Problem description

I'm going through Machine Learning for Hackers, and I am stuck at this line:

from.weight <- ddply(priority.train, .(From.EMail), summarise, Freq = length(Subject))

This produces the following error:

Error in attributes(out) <- attributes(col) : 
  'names' attribute [9] must be the same length as the vector [1]

This is the output of traceback():

> traceback()
11: FUN(1:5[[1L]], ...)
10: lapply(seq_len(n), extract_col_rows, df = x, i = i)
9: extract_rows(x$data, x$index[[i]])
8: `[[.indexed_df`(pieces, i)
7: pieces[[i]]
6: function (i) 
   {
       piece <- pieces[[i]]
       if (.inform) {
           res <- try(.fun(piece, ...))
           if (inherits(res, "try-error")) {
               piece <- paste(capture.output(print(piece)), collapse = "\n")
               stop("with piece ", i, ": \n", piece, call. = FALSE)
           }
       }
       else {
           res <- .fun(piece, ...)
       }
       progress$step()
       res
   }(1L)
5: .Call("loop_apply", as.integer(n), f, env)
4: loop_apply(n, do.ply)
3: llply(.data = .data, .fun = .fun, ..., .progress = .progress, 
       .inform = .inform, .parallel = .parallel, .paropts = .paropts)
2: ldply(.data = pieces, .fun = .fun, ..., .progress = .progress, 
       .inform = .inform, .parallel = .parallel, .paropts = .paropts)
1: ddply(priority.train, .(From.EMail), summarise, Freq = length(Subject))

The priority.train object is a data frame. Here is more information about it:

> mode(priority.train)
[1] "list"
> names(priority.train)
[1] "Date"       "From.EMail" "Subject"    "Message"    "Path"      
> sapply(priority.train, mode)
       Date  From.EMail     Subject     Message        Path 
     "list" "character" "character" "character" "character" 
> sapply(priority.train, class)
$Date
[1] "POSIXlt" "POSIXt" 

$From.EMail
[1] "character"

$Subject
[1] "character"

$Message
[1] "character"

$Path
[1] "character"

> length(priority.train)
[1] 5
> nrow(priority.train)
[1] 1250
> ncol(priority.train)
[1] 5
> str(priority.train)
'data.frame':   1250 obs. of  5 variables:
 $ Date      : POSIXlt, format: "2002-01-31 22:44:14" "2002-02-01 00:53:41" "2002-02-01 02:01:44" "2002-02-01 10:29:23" ...
 $ From.EMail: chr  "removed@removed.ca" "removed@removed.net" "removed@removed.ca" "removed@removed.net" ...
 $ Subject   : chr  "please help a newbie compile mplayer :-)" "re: please help a newbie compile mplayer :-)" "re: please help a newbie compile mplayer :-)" "re: please help a newbie compile mplayer :-)" ...
 $ Message   : chr  "    \n Hello,\n   \n         I just installed redhat 7.2 and I think I have everything \nworking properly.  Anyway I want to in"| __truncated__ "Make sure you rebuild as root and you're in the directory that you\ndownloaded the file.  Also it might complain of a few depen"| __truncated__ "Lance wrote:\n\n>Make sure you rebuild as root and you're in the directory that you\n>downloaded the file.  Also it might compl"| __truncated__ "Once upon a time, rob wrote :\n\n>  I dl'd gcc3 and libgcc3, but I still get the same error message when I \n> try rpm --rebuil"| __truncated__ ...
 $ Path      : chr  "../03-Classification/data/easy_ham/01061.6610124afa2a5844d41951439d1c1068" "../03-Classification/data/easy_ham/01062.ef7955b391f9b161f3f2106c8cda5edb" "../03-Classification/data/easy_ham/01063.ad3449bd2890a29828ac3978ca8c02ab" "../03-Classification/data/easy_ham/01064.9f4fc60b4e27bba3561e322c82d5f7ff" ...
Warning messages:
1: In encodeString(object, quote = "\"", na.encode = FALSE) :
  it is not known that wchar_t is Unicode on this platform
2: In encodeString(object, quote = "\"", na.encode = FALSE) :
  it is not known that wchar_t is Unicode on this platform

I would post a sample, but the content is rather long and I don't think it is relevant here.

The same error also happens here:

> ddply(priority.train, .(Subject))
Error in attributes(out) <- attributes(col) : 
  'names' attribute [9] must be the same length as the vector [1]

Does anyone have a clue about what's going on here? The error seems to be generated by an object other than priority.train, because its names attribute apparently has 9 elements.

Any help would be appreciated. Thanks!

Problem solved

I found the problem thanks to @user1317221_G's tip to use the dput function. The problem is with the Date field, which at this point is a POSIXlt object, that is, a list containing 9 fields (sec, min, hour, mday, mon, year, wday, yday, isdst). To work around it, I converted the dates to a character vector, ran ddply, and then restored the original Date column:

> tmp <- priority.train$Date
> priority.train$Date <- as.character(priority.train$Date)
> from.weight <- ddply(priority.train, .(From.EMail), summarise, Freq = length(Subject))
> priority.train$Date <- tmp
> rm(tmp)
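
The difference between the two date-time classes can be inspected directly in base R. The following is a minimal sketch using made-up timestamps (not the original data); it only illustrates that a POSIXlt value is a list of time components while a POSIXct value is a plain numeric vector. The exact set of components reported for POSIXlt may differ between R versions.

```r
# Hypothetical timestamps, for illustration only
x <- c("2002-01-31 22:44:14", "2002-02-01 00:53:41")

lt <- as.POSIXlt(x, tz = "UTC")  # broken-down ("list") representation
ct <- as.POSIXct(lt)             # seconds since the epoch

# POSIXlt is a list of components such as sec, min, hour, ...
is.list(unclass(lt))
names(unclass(lt))  # the exact set of names depends on the R version

# POSIXct is an atomic numeric vector with one value per timestamp
is.numeric(unclass(ct))
length(ct)
```

Because each POSIXct element corresponds to exactly one row, it behaves like an ordinary column when a data frame is split into pieces.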

Recommended answer

I fixed the problem I was having by converting the column from POSIXlt to POSIXct, as Hadley suggests above. It takes one line of code:

    # Original conversion from a datetime string; class(mydata$datetime) is "POSIXlt" "POSIXt"
    mydata$datetime <- strptime(mydata$datetime, "%Y-%m-%d %H:%M:%S")
    # Convert to POSIXct so the column can be used in data frames / ddply
    mydata$datetime <- as.POSIXct(mydata$datetime)
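
When a data frame has several date-time columns, the same conversion could be applied to all of them before calling ddply. The sketch below uses base R only; the helper name `convert_posixlt_cols` and the example data frame are hypothetical and not part of the original answer.

```r
# Hypothetical helper: convert every POSIXlt column of a data frame to POSIXct
convert_posixlt_cols <- function(df) {
  # Flag the columns that are stored as POSIXlt
  is_lt <- vapply(df, function(col) inherits(col, "POSIXlt"), logical(1))
  # Replace each flagged column with its POSIXct equivalent
  for (nm in names(df)[is_lt]) {
    df[[nm]] <- as.POSIXct(df[[nm]])
  }
  df
}

# Made-up example data with one POSIXlt column
df <- data.frame(id = 1:2)
df$when <- as.POSIXlt(c("2002-01-31 22:44:14", "2002-02-01 00:53:41"), tz = "UTC")

df <- convert_posixlt_cols(df)
class(df$when)
```

Columns that are not POSIXlt are left untouched, so the helper can be run on a data frame without first checking which columns need converting.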
