What does 'records are values not objects' mean in tidyjson

Problem description

According to http://jsonlint.com/, the JSON in the following string is valid, but tidyjson objects to it:

library(dplyr)
library(tidyjson)

json <- '
    [{"country":"us","city":"Portland","topics":[{"urlkey":"videogame","name":"Video Games","id":4471},{"urlkey":"board-games","name":"Board Games","id":19585},{"urlkey":"computer-programming","name":"Computer programming","id":48471},{"urlkey":"opensource","name":"Open Source","id":563}],"joined":1416349237000,"link":"http://www.meetup.com/members/156440062","bio":"Analytics engineer.  Primarily work in the Hadoop space.","lon":-122.65,"other_services":{},"name":"Aaron Wirick","visited":1443078098000,"self":{"common":{}},"id":156440062,"state":"OR","lat":45.56,"status":"active"}]
    '
    json %>% as.tbl_json %>% gather_keys

I get:

Error in gather_keys(.) : 1 records are values not objects

Recommended answer

As mentioned in one of the comments, gather_keys expects each record to be a JSON object, but the top level of your JSON is an array, which is presumably why it complains that the record is not an object. What you should probably be using here is gather_array.
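To make the difference concrete, here is a minimal sketch using two small, made-up JSON strings (not the original data). It assumes a tidyjson version in which gather_keys is still available; as far as I know, newer releases favor gather_object for the same job.

```r
library(dplyr)
library(tidyjson)

# Top level is an object: gather_keys can walk its keys directly.
obj <- '{"country":"us","city":"Portland"}' %>%
  as.tbl_json %>%
  gather_keys

# Top level is an array: step into it with gather_array first,
# then each element (an object) can be handed to gather_keys.
arr <- '[{"country":"us","city":"Portland"}]' %>%
  as.tbl_json %>%
  gather_array %>%
  gather_keys
```

In both cases the result should have one row per key; calling gather_keys directly on the second string is what triggers the error from the question.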

Further, the other answer takes a more brute-force approach, parsing the JSON attribute that the tidyjson package creates. tidyjson provides functions for handling this in a somewhat cleaner pipeline, if desired:

library(dplyr)
library(tidyjson)

json <- '
[{"country":"us","city":"Portland"
,"topics":[
 {"urlkey":"videogame","name":"Video Games","id":4471}
 ,{"urlkey":"board-games","name":"Board Games","id":19585}
 ,{"urlkey":"computer-programming","name":"Computer programming","id":48471}
 ,{"urlkey":"opensource","name":"Open Source","id":563}
]
,"joined":1416349237000
,"link":"http://www.meetup.com/members/156440062"
,"bio":"Analytics engineer.  Primarily work in the Hadoop space."
,"lon":-122.65,"other_services":{}
,"name":"Aaron Wirick","visited":1443078098000
,"self":{"common":{}}
,"id":156440062,"state":"OR","lat":45.56,"status":"active"
}]
'

mydf <- json %>%
  as.tbl_json %>%
  gather_array %>%
  spread_values(
    country = jstring('country'),
    city = jstring('city'),
    joined = jnumber('joined'),
    bio = jstring('bio')
  ) %>%
  enter_object('topics') %>%
  gather_array %>%
  spread_values(urlkey = jstring('urlkey'))

This pipeline really shines when the array contains multiple such objects. Hope that helps, even if it comes long after the fact!
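As a rough illustration of the multiple-object case, here is a sketch with two hypothetical members (the second one is invented for the example). The column names member_index and topic_index are my own choice, passed through gather_array's column.name argument so the two array levels stay distinguishable.

```r
library(dplyr)
library(tidyjson)

# Two made-up members, each with its own list of topics.
json2 <- '[
 {"city":"Portland","topics":[{"urlkey":"videogame"},{"urlkey":"opensource"}]},
 {"city":"Seattle","topics":[{"urlkey":"board-games"}]}
]'

members <- json2 %>%
  as.tbl_json %>%
  gather_array(column.name = 'member_index') %>%  # one row per member
  spread_values(city = jstring('city')) %>%
  enter_object('topics') %>%
  gather_array(column.name = 'topic_index') %>%   # one row per member/topic pair
  spread_values(urlkey = jstring('urlkey'))
```

The result should contain one row for each member/topic combination, with the member-level city value repeated alongside each of that member's topics.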
