R: Vector of JSONs to data.frame


Question

I have a vector of JSON strings, all with the same structure, and I want to convert it to a data.frame. The following example does exactly what I want.

require(jsonlite)   # fromJSON()
require(magrittr)   # for the pipeline only
require(data.table) # rbindlist()

jsons <- c('{"num":1,"char":"a","list":{"x":1,"y":2}}',
           '{"num":2,"char":"b","list":{"x":1,"y":2}}',
           '{"num":3,"char":"c","list":{"x":1,"y":2}}')

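df <- jsons %>%
  lapply(fromJSON) %>%                                      # parse each JSON string into a nested list
  lapply(as.data.frame.list, stringsAsFactors = FALSE) %>%  # one single-row data.frame per JSON
  rbindlist(fill = TRUE)                                    # stack the rows, filling missing columns with NA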

Some elements of each JSON are objects, so after fromJSON() some elements of the resulting list are lists themselves. I cannot call unlist() on each list because the variables have different types, so I use as.data.frame.list() instead. However, doing this for each JSON individually is too slow. Is there a more efficient way?

json <- '{"$schema":"http://json-schema.org/draft-04/schema#","title":"Product set","type":"array","items":{"title":"Product","type":"object","properties":{"id":{"description":"The unique identifier for a product","type":"number"},"name":{"type":"string"},"price":{"type":"number","minimum":0,"exclusiveMinimum":true},"tags":{"type":"array","items":{"type":"string"},"minItems":1,"uniqueItems":true},"dimensions":{"type":"object","properties":{"length":{"type":"number"},"width":{"type":"number"},"height":{"type":"number"}},"required":["length","width","height"]},"warehouseLocation":{"description":"Coordinates of the warehouse with the product","$ref":"http://json-schema.org/geo"}},"required":["id","name","price"]}}'
system.time(
  df <- json %>% rep(1000) %>%
    lapply(fromJSON) %>%
    lapply(as.data.frame.list, stringsAsFactors = FALSE) %>%
    rbindlist(fill = TRUE)
) # 2.72 s
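
To illustrate the type problem described above, here is a minimal sketch (not part of the original post) that parses a single element of the example vector. It only shows why unlist() is not an option: the nested object comes back as a nested list, and flattening it coerces every value to character.

```r
library(jsonlite)

# One element of the example vector from the question
parsed <- fromJSON('{"num":1,"char":"a","list":{"x":1,"y":2}}')

# The nested "list" object is itself a list inside the parsed result
str(parsed)

# unlist() removes the nesting but coerces all values to a single type
flat <- unlist(parsed)
class(flat)  # "character"
names(flat)  # "num" "char" "list.x" "list.y"
```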

I know there are plenty of similar questions, but most of the answers I saw were about using as.data.frame() or data.frame(). Nobody mentioned speed. Maybe there is no better solution.

Recommended answer

I finally found the answer: spread_all() from the tidyjson package. It will be on CRAN soon.

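# Install the development version from GitHub
devtools::install_github("jeremystan/tidyjson")
# Spread the scalar values, including those in nested objects, into columns
df <- tidyjson::spread_all(jsons)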

This function is about 10 times faster than my example above.
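
For comparison, here is a sketch of another approach that needs only jsonlite. It is not from the original answer and has not been benchmarked here. It assumes every element is a valid JSON object with the same structure: the strings are joined into one JSON array so the parser runs once, and flatten = TRUE expands the nested object into separate columns.

```r
library(jsonlite)

jsons <- c('{"num":1,"char":"a","list":{"x":1,"y":2}}',
           '{"num":2,"char":"b","list":{"x":1,"y":2}}',
           '{"num":3,"char":"c","list":{"x":1,"y":2}}')

# Join the records into a single JSON array: [ {...}, {...}, {...} ]
json_array <- paste0("[", paste(jsons, collapse = ","), "]")

# Parse once; flatten = TRUE turns the nested "list" object
# into the columns list.x and list.y
df <- fromJSON(json_array, flatten = TRUE)
str(df)
```

Column types are chosen per column by the parser, so num stays numeric and char stays character, which sidesteps the coercion caused by unlist().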
