R: convert a list of lists to a data frame

Question

I need to process data that is provided in a password-protected Excel (xlsx) workbook. For legal reasons, I cannot create an unprotected Excel file, a CSV file, or similar and process the data from there. None of the Excel import packages can deal with password-protected workbooks.

Using the answer to "Import password-protected xlsx workbook into R", I have managed to extract the data. However, it is imported as a list of lists, one inner list per column, with the column name as the first element. The dput of my list looks like this:

list(list("ID", "ID1", "ID2"),
     list("V2", NULL, "text2"),
     list("Name", "John Smith", "Mary Brown"),
     list("Score", 1, 2),
     list("email", "JS@gmail.com", "MB@gov.uk"))

What I want is a dataframe with columns ID, V2, etc. that looks like this:

   ID    V2     Name        Score  email
   ID1   NULL   John Smith  1      JS@gmail.com
   ID2   text2  Mary Brown  2      MB@gov.uk

There are empty cells in the original Excel workbook, so solutions with unlist will not work.
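To illustrate why unlist is a problem here, the following is a minimal sketch using one column from the dput above. The variable names col and filled are made up for this example. It shows that unlist drops NULL entries, so the column ends up shorter than the others unless NULL is replaced by NA first.

```r
# One column of the imported list, with an empty cell stored as NULL
col <- list("V2", NULL, "text2")

n_list   <- length(col)          # the NULL still occupies a position
n_unlist <- length(unlist(col))  # unlist() silently drops the NULL

# Replacing NULL with NA first keeps the column length intact
filled   <- lapply(col, function(x) if (is.null(x)) NA else x)
n_filled <- length(unlist(filled))

print(c(list = n_list, unlisted = n_unlist, filled = n_filled))
```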

Using a combination of answers from "R list to data frame" and other similar questions, I have the following code (where listform is the name of the list):

matform <- as.matrix(sapply(listform, function(s) s)) # retains empty
df <- data.frame(matform[2:nrow(matform),])
names(df) = matform[1,]

This is close, but the dataframe has lists as columns. So str(df) yields:

'data.frame':   2 obs. of  5 variables:
 $ ID:List of 2
  ..$ : chr "ID1"
  ..$ : chr "ID2"
 $ V2:List of 2
  ..$ : NULL
  ..$ : chr "text2"
and so on
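One possible way to finish this attempt in base R, sketched here as an assumption rather than as part of the original question, is to flatten each list column after the data frame has been built. The helper name null2na_scalar is hypothetical. Each cell is assumed to hold either NULL or a single value.

```r
listform <- list(list("ID", "ID1", "ID2"),
                 list("V2", NULL, "text2"),
                 list("Name", "John Smith", "Mary Brown"),
                 list("Score", 1, 2),
                 list("email", "JS@gmail.com", "MB@gov.uk"))

# The attempt from the question: produces a data frame of list columns
matform <- as.matrix(sapply(listform, function(s) s))
df <- data.frame(matform[2:nrow(matform), ])
names(df) <- matform[1, ]

# Hypothetical helper: replace an empty (NULL) cell with NA
null2na_scalar <- function(x) if (is.null(x)) NA else x

# Flatten every list column into an atomic vector, keeping the names
df[] <- lapply(df, function(col) sapply(col, null2na_scalar))

str(df)
```

Because the header row is removed before flattening, a numeric column such as Score is not coerced to character in this variant.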

Recommended answer

"setDT" from the "data.table" package converts a list to a data.table in place, which makes this conversion short:

> library(data.table)

> null2na <- function(x){ ifelse(is.null(x),NA,x)}

> f <- function(x){sapply(x,null2na)}

> L <- list(list("ID", "ID1", "ID2"),
+           list("V2", NULL, "text2"),
+           list("Name", "John Smith", "Mary Brown"),
+           list("S ..." ... [TRUNCATED] 

> L <- setDT(L)[, lapply(.SD, f)]

> setnames(L,colnames(L),unlist(L[1,]))

> L <- L[-1,]

> L
    ID    V2       Name Score        email
1: ID1    NA John Smith     1 JS@gmail.com
2: ID2 text2 Mary Brown     2    MB@gov.uk

> str(L)
Classes ‘data.table’ and 'data.frame':  2 obs. of  5 variables:
 $ ID   : chr  "ID1" "ID2"
 $ V2   : chr  NA "text2"
 $ Name : chr  "John Smith" "Mary Brown"
 $ Score: chr  "1" "2"
 $ email: chr  "JS@gmail.com" "MB@gov.uk"
 - attr(*, ".internal.selfref")=<externalptr> 

(A data.table is an enhanced data frame: as the str output shows, the result has both the 'data.table' and 'data.frame' classes.)

The function "f" does two jobs: it flattens each list column into an atomic vector (as "unlist" would) and turns NULL into NA, so that empty cells keep their position.
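As a small illustration, "f" can be tried on single columns without data.table. This is only a sketch using the definitions from the answer. The names v2 and score are made up for the example.

```r
# Definitions as given in the answer
null2na <- function(x){ ifelse(is.null(x), NA, x) }
f <- function(x){ sapply(x, null2na) }

# A column with an empty cell: the NULL becomes NA and keeps its position
v2 <- f(list("V2", NULL, "text2"))
print(v2)

# A column mixing the header string with numbers is coerced to character,
# which is why Score shows up as chr in the str output
score <- f(list("Score", 1, 2))
print(score)
```

If numeric columns are needed, one option is to convert them afterwards, for example with as.numeric on the Score column once the header row has been removed.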
