Converting Rdata files to CSV - Error in data.frame arguments imply differing number of rows


Problem description


I'm trying to use the R code from this answer to convert a bunch of rdata files to CSV.

resave <- function(file){
  e <- new.env(parent = emptyenv())
  load(file, envir = e)
  objs <- ls(envir = e, all.names = TRUE)
  for(obj in objs) {
    .x <- get(obj, envir =e)
    message(sprintf('Saving %s as %s.csv', obj,obj) )
    write.csv(.x, file = paste0(obj, '.csv'))
  }
}

resave('yourData.RData')

However on one of the files I'm getting this error:

Error in data.frame(`2` = list(pos = c(6506L, 6601L, 21801L, 21811L, 21902L,  : 
  arguments imply differing number of rows: 7670, 9729, 114, 2422
Calls: resave ... as.data.frame -> as.data.frame.list -> eval -> eval -> data.frame

I tried searching for the error message but I can't really make heads or tails of it.

Was that rdata file created improperly somehow?

Is there a better way to convert arbitrary Rdata files to CSV? (I don't know the names of the objects inside the files ahead of time.)

Update:

Here's what I'm seeing in that rdata file. If it's any help?? (Keep in mind I can't really edit the rdata files so I'm trying to figure out something that will convert them to CSV as is.)

> load("indiv8-hmmprob.RData")
> ls()
[1] "dataa"
> write.csv(dataa, file="greg.csv")
Error in data.frame(`2` = list(pos = c(6506L, 6601L, 21801L, 21811L, 21902L,  : 
  arguments imply differing number of rows: 7670, 9729, 114, 2422
> names(dataa)
[1] "2" "3" "4" "X"
> str(dataa)
List of 4
 $ 2:'data.frame':  7670 obs. of  23 variables:
  ..$ pos              : int [1:7670] 6506 6601 21801 21811 21902 21931 22487 24071 26674 26713 ...
  ..$ ref              : chr [1:7670] "C" "A" "G" "A" ...
  ..$ cons             : chr [1:7670] "T" "T" "A" "G" ...
  ..$ reads            : chr [1:7670] "ttt" "tttt" "AAAAA" "GGGGG" ...
  ..$ quals            : chr [1:7670] "FBB" "IIIB" "IFIII" "FFIII" ...
  ..$ A                : int [1:7670] 0 0 5 0 0 0 1 0 0 1 ...
  ..$ C                : int [1:7670] 0 0 0 0 0 0 0 0 2 0 ...
  ..$ G                : int [1:7670] 0 0 0 5 11 0 0 0 0 0 ...
  ..$ T                : int [1:7670] 3 4 0 0 0 10 0 2 0 0 ...
  ..$ N                : int [1:7670] 0 0 0 0 0 0 0 0 0 0 ...
  ..$ bad              : chr [1:7670] NA NA NA NA ...
  ..$ par1ref          : chr [1:7670] "C" "A" "G" "A" ...
  ..$ par2ref          : chr [1:7670] "T" "T" "A" "G" ...
  ..$ read             : Factor w/ 8397 levels "1","2","3","4",..: 2 2 3 3 3 3 4 7 9 9 ...
  ..$ count            : int [1:7670] 3 4 5 5 11 10 1 2 2 1 ...
  ..$ read_allele      : chr [1:7670] "T" "T" "A" "G" ...
  ..$ Pr(y| par1/par1 ): num [1:7670] 9.30e-04 5.69e-04 3.47e-04 1.42e-04 1.90e-08 ...
  ..$ Pr(y| par1/par2 ): num [1:7670] 4.58e-02 1.64e-02 2.41e-03 4.09e-03 8.89e-07 ...
  ..$ Pr(y| par2/par2 ): num [1:7670] 1.61e-01 8.40e-02 8.94e-03 2.09e-02 3.29e-06 ...
  ..$ est              : int [1:7670] 3 3 3 3 3 3 3 3 3 3 ...
  ..$ Pr( par1/par1 |y): num [1:7670] 4.67e-25 2.25e-27 1.98e-31 2.93e-32 2.82e-34 ...
  ..$ Pr( par1/par2 |y): num [1:7670] 2.95e-11 2.86e-11 2.49e-14 1.98e-14 1.08e-14 ...
  ..$ Pr( par2/par2 |y): num [1:7670] 1 1 1 1 1 ...
  ..- attr(*, "badpos")= int [1:11386] 21900 21905 22840 24029 27149 27170 28024 42187 46927 46990 ...
 $ 3:'data.frame':  9729 obs. of  23 variables:
  ..$ pos              : int [1:9729] 6001 22537 25304 27228 28817 28842 30540 48903 48938 48943 ...
  ..$ ref              : chr [1:9729] "A" "A" "A" "C" ...
  ..$ cons             : chr [1:9729] "A" "G" "T" "C" ...
  ..$ reads            : chr [1:9729] "," "GGG" "TTTTT" "," ...
  ..$ quals            : chr [1:9729] "F" "BBB" "BFFFF" "B" ...
  ..$ A                : int [1:9729] 1 0 0 0 0 0 0 0 0 0 ...
  ..$ C                : int [1:9729] 0 0 0 1 1 0 0 0 0 1 ...
  ..$ G                : int [1:9729] 0 3 0 0 0 0 0 0 0 0 ...
  ..$ T                : int [1:9729] 0 0 5 0 0 1 1 1 1 0 ...
  ..$ N                : int [1:9729] 0 0 0 0 0 0 0 0 0 0 ...
  ..$ bad              : chr [1:9729] NA NA NA NA ...
  ..$ par1ref          : chr [1:9729] "A" "A" "A" "C" ...
  ..$ par2ref          : chr [1:9729] "G" "G" "T" "T" ...
  ..$ read             : Factor w/ 10640 levels "1","2","3","4",..: 1 3 4 5 7 7 8 10 10 10 ...
  ..$ count            : int [1:9729] 1 3 5 1 1 1 1 1 1 1 ...
  ..$ read_allele      : chr [1:9729] "A" "G" "T" "C" ...
  ..$ Pr(y| par1/par1 ): num [1:9729] 0.969856 0.002707 0.000372 0.969639 0.969856 ...
  ..$ Pr(y| par1/par2 ): num [1:9729] 0.48995 0.0567 0.00228 0.48988 0.48995 ...
  ..$ Pr(y| par2/par2 ): num [1:9729] 0.01005 0.26071 0.00798 0.01012 0.01005 ...
  ..$ est              : int [1:9729] 1 3 3 1 1 1 1 3 1 3 ...
  ..$ Pr( par1/par1 |y): num [1:9729] 2.18e-10 2.82e-11 2.67e-11 2.65e-11 2.63e-11 ...
  ..$ Pr( par1/par2 |y): num [1:9729] 0.688 0.688 0.688 0.688 0.688 ...
  ..$ Pr( par2/par2 |y): num [1:9729] 0.312 0.312 0.312 0.312 0.312 ...
  ..- attr(*, "badpos")= int [1:13707] 25259 27250 27810 27880 27888 28836 30507 48975 55998 58734 ...
 $ 4:'data.frame':  114 obs. of  23 variables:
  ..$ pos              : int [1:114] 21119 21194 42177 64136 64146 74463 74465 74521 79860 79884 ...
  ..$ ref              : chr [1:114] "T" "T" "C" "C" ...
  ..$ cons             : chr [1:114] "C" "A" "Y" "Y" ...
  ..$ reads            : chr [1:114] "cCCCCCCCCCCCCCcc" "aa" "T" "T" ...
  ..$ quals            : chr [1:114] "IBFFBFBFFFFFFBBF" "FF" "F" "I" ...
  ..$ A                : int [1:114] 0 2 0 0 0 0 0 0 2 0 ...
  ..$ C                : int [1:114] 16 0 0 0 1 0 1 1 0 0 ...
  ..$ G                : int [1:114] 0 0 0 0 0 0 0 0 0 2 ...
  ..$ T                : int [1:114] 0 0 1 1 0 1 0 0 0 0 ...
  ..$ N                : int [1:114] 0 0 0 0 0 0 0 0 0 0 ...
  ..$ bad              : chr [1:114] NA NA NA NA ...
  ..$ par1ref          : chr [1:114] "T" "T" "C" "C" ...
  ..$ par2ref          : chr [1:114] "C" "A" "T" "T" ...
  ..$ read             : Factor w/ 130 levels "1","2","3","4",..: 3 3 6 8 8 10 10 10 14 14 ...
  ..$ count            : int [1:114] 16 2 1 1 1 1 1 1 2 2 ...
  ..$ read_allele      : chr [1:114] "C" "A" "T" "T" ...
  ..$ Pr(y| par1/par1 ): num [1:114] 9.34e-12 4.99e-03 1.00e-02 1.00e-02 1.00e-02 ...
  ..$ Pr(y| par1/par2 ): num [1:114] 4.56e-10 2.33e-01 4.90e-01 4.90e-01 4.90e-01 ...
  ..$ Pr(y| par2/par2 ): num [1:114] 9.04e-10 8.61e-01 9.70e-01 9.70e-01 9.70e-01 ...
  ..$ est              : int [1:114] 3 3 3 3 3 3 3 3 3 3 ...
  ..$ Pr( par1/par1 |y): num [1:114] 6.50e-24 4.49e-24 1.10e-26 2.53e-31 1.51e-31 ...
  ..$ Pr( par1/par2 |y): num [1:114] 1.56e-10 1.54e-10 5.77e-11 6.60e-12 6.59e-12 ...
  ..$ Pr( par2/par2 |y): num [1:114] 1 1 1 1 1 ...
  ..- attr(*, "badpos")= int [1:73] 16621 16638 34177 34180 74448 74464 78954 79664 80045 94170 ...
 $ X:'data.frame':  2422 obs. of  23 variables:
  ..$ pos              : int [1:2422] 34630 45427 70728 70744 166279 189892 207276 207424 213012 232229 ...
  ..$ ref              : chr [1:2422] "T" "G" "G" "C" ...
  ..$ cons             : chr [1:2422] "T" "G" "G" "C" ...
  ..$ reads            : chr [1:2422] "a" "..." "^F." "." ...
  ..$ quals            : chr [1:2422] "<" "IIF" "F" "B" ...
  ..$ A                : int [1:2422] 1 0 0 0 0 0 0 4 0 1 ...
  ..$ C                : int [1:2422] 0 0 0 1 1 0 2 0 0 0 ...
  ..$ G                : int [1:2422] 0 3 1 0 0 1 0 1 1 0 ...
  ..$ T                : int [1:2422] 0 0 0 0 0 0 0 0 0 0 ...
  ..$ N                : int [1:2422] 0 0 0 0 0 0 0 0 0 0 ...
  ..$ bad              : chr [1:2422] NA NA NA NA ...
  ..$ par1ref          : chr [1:2422] "T" "G" "G" "C" ...
  ..$ par2ref          : chr [1:2422] "A" "A" "A" "T" ...
  ..$ read             : Factor w/ 2433 levels "1","2","3","4",..: 1 6 8 8 13 16 18 18 19 20 ...
  ..$ count            : int [1:2422] 1 3 1 1 1 1 2 5 1 1 ...
  ..$ read_allele      : chr [1:2422] "A" "G" "G" "C" ...
  ..$ Pr(y| par1/par1 ): num [1:2422] 0.0105 0.2732 0.9699 0.9696 0.9699 ...
  ..$ Pr(y| par1/par2 ): num [1:2422] 0.4895 0.0642 0.49 0.4899 0.49 ...
  ..$ Pr(y| par2/par2 ): num [1:2422] 0.96856 0.00134 0.01005 0.01012 0.01005 ...
  ..$ est              : int [1:2422] 3 1 1 1 1 1 1 1 1 1 ...
  ..$ Pr( par1/par1 |y): num [1:2422] 1 1 1 1 1 ...
  ..$ Pr( par1/par2 |y): num [1:2422] 3.70e-08 2.00e-08 1.06e-08 1.06e-08 1.59e-09 ...
  ..$ Pr( par2/par2 |y): num [1:2422] 3.70e-18 9.35e-20 2.36e-23 2.23e-23 3.26e-26 ...
  ..- attr(*, "badpos")= int [1:2327] 34776 45619 86591 86607 166220 193151 193159 212997 232221 233552 ...

Solution

That answer was designed to handle objects of class 'data.frame'. You have only a single object of class 'list' whose items happen to be data frames. So there isn't an object named "2" in your workspace, but there is an element named "2" in the 'dataa' list, and all of the other elements also appear to be data frames. write.csv tries to coerce the whole list into one data frame, which fails because the elements have different numbers of rows (7670, 9729, 114, 2422). So why not write each element to its own file:

lapply( names(dataa), function(nam) write.csv( dataa[[nam]], file=paste0(nam, ".csv") ) )
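
For the broader question of converting arbitrary files, the same idea can be folded back into the loader. The following is only a sketch under stated assumptions: `resave_all` and its `out_dir` argument are hypothetical names, the toy `dataa` list is a small stand-in for the real one, and the function only looks one level deep (top-level data frames, or top-level lists made up entirely of data frames). It first reproduces the error on the toy list, then writes one CSV per data frame.

```r
# Toy stand-in for 'dataa': a named list of data frames with different row counts.
dataa <- list(
  `2` = data.frame(pos = 1:3, ref = c("C", "A", "G")),
  X   = data.frame(pos = 1:2, ref = c("T", "G"))
)

# write.csv() coerces a list with data.frame(), which tries to place the
# elements side by side and fails because their row counts differ.
err <- tryCatch(
  { write.csv(dataa, file = tempfile(fileext = ".csv")); NULL },
  error = function(e) conditionMessage(e)
)
print(err)

# Hypothetical helper: write every data frame found in an .RData file to CSV,
# descending one level into lists whose elements are all data frames.
resave_all <- function(file, out_dir = ".") {
  e <- new.env(parent = emptyenv())
  load(file, envir = e)
  written <- character(0)
  for (obj in ls(envir = e, all.names = TRUE)) {
    x <- get(obj, envir = e)
    if (is.data.frame(x)) {
      items <- list(x)
      names(items) <- obj
    } else if (is.list(x) && length(x) > 0 &&
               all(vapply(x, is.data.frame, logical(1)))) {
      items <- x
      nms <- names(x)
      if (is.null(nms)) nms <- rep("", length(x))
      # Fall back to the position for unnamed elements.
      blank <- is.na(nms) | nms == ""
      nms[blank] <- as.character(seq_along(x))[blank]
      names(items) <- paste(obj, nms, sep = "_")
    } else {
      message(sprintf("Skipping %s: not a data frame or a list of data frames", obj))
      next
    }
    for (nam in names(items)) {
      path <- file.path(out_dir, paste0(nam, ".csv"))
      message(sprintf("Saving %s as %s", nam, path))
      write.csv(items[[nam]], file = path)
      written <- c(written, path)
    }
  }
  invisible(written)
}

# Usage: save the toy list to a temporary .RData file and convert it.
rdata <- tempfile(fileext = ".RData")
save(dataa, file = rdata)
out <- tempfile("csv_")
dir.create(out)
files <- resave_all(rdata, out_dir = out)
print(basename(files))
```

Prefixing each file name with the object name is a design choice to avoid collisions when several files contain elements called "2" or "X". Note that extra attributes on the data frames, such as "badpos" in the str() output above, are not columns and so will not end up in the CSV files.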
