Transposing JSON list-of-dictionaries for analysis in R


Question

I have experimental data expressed as dicts of key-value pairs, one dict per experiment. A set of related experiments is serialized as a list of these dicts in JSON. This is parseable in R via the rjson package, but the data is loaded in a form that is challenging to analyze:

library(rjson)
data <- fromJSON('[{"k1":"v1","k2":"v2"}, {"k1":"v3","k2":"v4"}]')

yields:

[[1]]
[[1]]$k1
[1] "v1"

[[1]]$k2
[1] "v2"


[[2]]
[[2]]$k1
[1] "v3"

[[2]]$k2
[1] "v4"

Attempting to turn this into a data.frame directly with as.data.frame(data) yields:

  k1 k2 k1.1 k2.1
1 v1 v2   v3   v4

This clearly treats the sequence of key/value pairs across all experiments as a flat, one-dimensional list.
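The flattening can be reproduced without any JSON at all. The sketch below builds the same nested list by hand (an assumption made so that the example does not need rjson) and checks the shape of what as.data.frame returns.

```r
# Hand-built equivalent of the parsed JSON above (no rjson needed here).
data <- list(list(k1 = "v1", k2 = "v2"),
             list(k1 = "v3", k2 = "v4"))

# Each top-level element becomes a block of columns, placed side by side.
wide <- as.data.frame(data)

nrow(wide)  # number of rows
ncol(wide)  # number of columns
```

With two experiments of two keys each, this gives a single row with four columns rather than two rows with two columns.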

What I want is a more conventional table with a row for each experiment and a column for each unique key:

  k1 k2
1 v1 v2
2 v3 v4

How can I cleanly express this transform in R?

Recommended answer

The l*ply functions in plyr can be your best friend when doing list processing. Try this:

> library(plyr)
> ldply(data, data.frame)
  k1 k2
1 v1 v2
2 v3 v4

plyr does some very nice processing behind the scenes to deal with things like irregular lists (e.g. when each list doesn't contain the same number of elements). This is very common with JSON and XML, and is tricky to handle with the base functions.

Or, using base functions:

> do.call("rbind", lapply(data, data.frame))

You can use rbind.fill (from plyr) instead of rbind if you have irregular lists, but I'd advise just using plyr from the beginning to make your life easier.
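To illustrate why plain rbind is a poor fit for irregular lists, here is a small sketch using only base R. The records in x are made up for the example; the second one carries k3 where the first has k2, so the column names of the two one-row data frames do not line up.

```r
# Hypothetical irregular records: the second one has k3 instead of k2.
x <- list(list(k1 = 2, k2 = 3), list(k1 = 5, k3 = 9))

# tryCatch captures the error message instead of stopping the script.
res <- tryCatch(
  do.call("rbind", lapply(x, data.frame)),
  error = function(e) conditionMessage(e)
)

is.character(res)  # TRUE when rbind failed and a message was captured
```

Here rbind refuses to combine the rows because their column names differ, which is the case rbind.fill is meant to cover.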

Regarding your more complicated example, Hadley's suggestion handles it easily:

> x<-list(list(k1=2,k2=3),list(k2=100,k1=200),list(k1=5, k3=9))
> ldply(x, data.frame)
   k1  k2 k3
1   2   3 NA
2 200 100 NA
3   5  NA  9
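If adding plyr as a dependency is not an option, roughly the same result can be had in base R by collecting the union of keys first and padding each record with NA. This is only a sketch of that idea, not how plyr works internally; the names all_keys and to_row are invented for the example.

```r
x <- list(list(k1 = 2, k2 = 3), list(k2 = 100, k1 = 200), list(k1 = 5, k3 = 9))

# Collect every key that appears in any record, in first-seen order.
all_keys <- unique(unlist(lapply(x, names)))

# Turn one record into a one-row data.frame, filling absent keys with NA.
to_row <- function(rec) {
  vals <- lapply(all_keys, function(k) if (is.null(rec[[k]])) NA else rec[[k]])
  names(vals) <- all_keys
  as.data.frame(vals, stringsAsFactors = FALSE)
}

# Every row now has the same columns, so plain rbind is enough.
result <- do.call(rbind, lapply(x, to_row))
result
```

Because every row is padded to the same set of columns before binding, the key order inside each record does not matter, and missing keys show up as NA just as in the ldply output above.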
