R: read and parse JSON

Question

If R is not suitable for this job then fair enough, but I believe it should be.

I am calling an API, then dumping the results into Postman's JSON viewer. The results look like this:

 "results": [
    {
      "personUuid": "***",
      "synopsis": {
        "fullName": "***",
        "headline": "***",
        "location": "***",
        "image": "***",
        "skills": [
          "*",
          "*",
          "*",
          "*.",
          "*"
        ],
        "phoneNumbers": [
          "***",
          "***"
        ],
        "emailAddresses": [
          "***"
        ],
        "networks": [
          {
            "name": "linkedin",
            "url": "***",
            "type": "canonicalUrl",
            "lastAccessed": null
          },
          {
            "name": "***",
            "url": "***",
            "type": "cvUrl",
            "lastAccessed": "*"
          },
          {
            "name": "*",
            "url": "***",
            "type": "cvUrl",
            "lastAccessed": "*"
          }
        ]
      }
    },
    {

First, I'm not sure how to import this into R, as I've mainly dealt with CSV files. I've seen other questions where people use JSON packages to call the URL directly, but that's not going to work with what I'm doing, so I'd like to know how to read a CSV file with JSON in it.

I used:

x <- fromJSON(file="Z:/json.csv")

But perhaps there's a better way. Once this is done, the JSON looks more like:

...$results[[9]]$synopsis$emailAddresses
[1] "***" "***"          
[3] "***"                "***"          

$results[[9]]$synopsis$networks...

Then, for each result, I would like to store the headline and the email addresses in a data table.

I tried:

str_extract_all(x, 'emailAddresses*$')

I assumed * would match everything between emailAddresses and the $, including newlines, but this doesn't work. I also find that even when * does match something, str_extract_all doesn't return just the part that * stands for.

For example:

> y <- 'some text. email "oli@oli.o" other text'
> y
[1] "some text. email \"oli@oli.o\" other text"
> str_extract_all(y, 'email \"*"')
[[1]]
[1] "email \""
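
A note on why the pattern above returns only `email "`: in a regular expression, `*` is a quantifier meaning "zero or more of the preceding item" rather than a wildcard, so `"*` matches a run of quote characters. The following is a minimal sketch, assuming stringr is installed, of one way to pull out only the quoted value with a capture group or a lookbehind. For real JSON, parsing it with a JSON package is usually more robust than a regex.

```r
library(stringr)

y <- 'some text. email "oli@oli.o" other text'

# [^"]* means "any run of characters that are not a double quote";
# the parentheses mark the part we actually want back.
m <- str_match(y, 'email "([^"]*)"')

whole <- m[, 1]  # the whole match, including 'email' and the quotes
value <- m[, 2]  # the first capture group only

# Lookbehind variant: str_extract returns just the text after 'email "'
addr <- str_extract(y, '(?<=email ")[^"]*')

print(value)
print(addr)
```

`str_match` returns a character matrix whose first column is the full match and whose later columns are the capture groups, which is why column 2 holds just the address.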

PART 2:

The answers below worked; however, if I call the API directly:

body ='{"start": 0,"count": 105,...}'

x <- POST(url="https://live.*.me/api/v3/person", body=body, add_headers(Accept="application/json", 'Content-Type'="application/json", Authorization = "id=*, apiKey=*"))

y <- content(x)

Then using

fromJSON(y, flatten=TRUE)$results[c("synopsis.headline",  
                                            "synopsis.emailAddresses")]

does not work. I tried the following:

z <- NULL
zz <- NULL

for(i in 1:y$count){
     z=rbind(z,data.table(job = y$results[[i]]$synopsis$headline))
 }
 for(i in 1:y$count){
       zz=rbind(zz,data.table(job = y$results[[i]]$synopsis$emailAddresses))
   }
df <- cbind(z,zz)

However, when the JSON list is returned, some people have multiple emails, so the method above only records the first email for each person. How would I save the multiple emails as a vector (rather than in multiple columns)?
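
One possible way to keep every address per person is a list column, where each row holds a character vector of any length. The sketch below uses base R only. The list `y` is a hypothetical stand-in for the parsed response (its values are invented for illustration), and it assumes each result has a `synopsis` with `headline` and `emailAddresses` fields as in the sample above.

```r
# Hypothetical stand-in for the parsed API response; real values will differ.
y <- list(
  count = 2,
  results = list(
    list(synopsis = list(headline = "Analyst",
                         emailAddresses = list("a@example.com", "b@example.com"))),
    list(synopsis = list(headline = "Engineer",
                         emailAddresses = list("c@example.com")))
  )
)

# One headline per person; NA when the field is missing.
headline <- vapply(y$results, function(r) {
  h <- r$synopsis$headline
  if (is.null(h)) NA_character_ else h
}, character(1))

# One character vector per person, however many addresses there are.
emails <- lapply(y$results, function(r) {
  as.character(unlist(r$synopsis$emailAddresses))
})

df <- data.frame(job = headline, stringsAsFactors = FALSE)
df$emails <- emails  # list column: each cell is a character vector

# Alternative: a single string per person, if a flat column is easier to export.
df$emails_joined <- vapply(emails, paste, character(1), collapse = "; ")

print(df)
```

Iterating over `y$results` directly also avoids relying on `y$count` matching the number of results actually returned. data.table supports list columns as well, so the same idea should carry over if a data.table is preferred.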

Recommended answer

UPDATE 1: to read the JSON from a URL, you can simply pass the URL string to the fromJSON function:

library(jsonlite)

url <- 'http://you.url.com/data.json'

# in this case we pass an URL to the fromJSON function instead of the actual content we want to parse
fromJSON(url, flatten=TRUE)$results[c("synopsis.headline", "synopsis.emailAddresses")] 

// end UPDATE 1
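
Regarding PART 2 of the question: by default, httr's `content()` parses a JSON response into a nested R list, so passing its result to `fromJSON` fails because `fromJSON` expects JSON text, a URL, or a file. A sketch of one workaround is to request the body as text and parse that. The string `txt` below is a hypothetical stand-in for the response body, since the real endpoint cannot be called here, and its values are invented.

```r
library(jsonlite)

# With a real response object `x` from httr::POST(), the raw text would come from:
#   txt <- httr::content(x, as = "text", encoding = "UTF-8")
# Hypothetical stand-in for that text:
txt <- '{"count": 2, "results": [
  {"personUuid": "p1",
   "synopsis": {"headline": "Analyst",
                "emailAddresses": ["a@example.com", "b@example.com"]}},
  {"personUuid": "p2",
   "synopsis": {"headline": "Engineer",
                "emailAddresses": ["c@example.com"]}}
]}'

res <- fromJSON(txt, flatten = TRUE)$results[c("synopsis.headline",
                                               "synopsis.emailAddresses")]

# synopsis.emailAddresses is a list column: one character vector per person,
# so people with several addresses keep all of them.
first_emails <- res$synopsis.emailAddresses[[1]]
n_emails <- lengths(res$synopsis.emailAddresses)

print(res)
```
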

You could also pass the flatten parameter to fromJSON and then use the 'results' data frame.

fromJSON(json.data, flatten=TRUE)$results[c("synopsis.headline",  
                                            "synopsis.emailAddresses")]

synopsis.headline synopsis.emailAddresses
1               ***        jane.doe@boo.com
2               ***        john.doe@foo.com

Here is how I defined json.data. Please note that I intentionally added one more record to your sample input JSON.

 json.data <- '{
      "results":[  
        {  
          "personUuid":"***",
          "synopsis":{  
            "fullName":"***",
            "headline":"***",
            "location":"***",
            "image":"***",
            "skills":[  
              "*",
              "*",
              "*",
              "*.",
              "*"
              ],
            "phoneNumbers":[  
              "***",
              "***"
              ],
            "emailAddresses":[  
              "jane.doe@boo.com"
              ],
            "networks":[  
              {  
                "name":"linkedin",
                "url":"***",
                "type":"canonicalUrl",
                "lastAccessed":null
              },
              {  
                "name":"***",
                "url":"***",
                "type":"cvUrl",
                "lastAccessed":"*"
              },
              {  
                "name":"*",
                "url":"***",
                "type":"cvUrl",
                "lastAccessed":"*"
              }
              ]
          }
        },
        {  
          "personUuid":"***",
          "synopsis":{  
            "fullName":"***",
            "headline":"***",
            "location":"***",
            "image":"***",
            "skills":[  
              "*",
              "*",
              "*",
              "*.",
              "*"
              ],
            "phoneNumbers":[  
              "***",
              "***"
              ],
            "emailAddresses":[  
              "john.doe@foo.com"
              ],
            "networks":[  
              {  
                "name":"linkedin",
                "url":"***",
                "type":"canonicalUrl",
                "lastAccessed":null
              },
              {  
                "name":"***",
                "url":"***",
                "type":"cvUrl",
                "lastAccessed":"*"
              },
              {  
                "name":"*",
                "url":"***",
                "type":"cvUrl",
                "lastAccessed":"*"
              }
              ]
          }
        }
        ]
    }'
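
For the original case of JSON saved to a local file (for example, a response exported from Postman), the same call should work with a file path, since jsonlite's `fromJSON` accepts a JSON string, a URL, or a file. Note that the `fromJSON(file=...)` form used in the question appears to be the rjson signature; in jsonlite the path is passed as the first argument. The sketch below writes a small hypothetical file to a temporary location so it can run on its own; the file contents are invented.

```r
library(jsonlite)

# Hypothetical file standing in for a response saved from Postman.
path <- tempfile(fileext = ".json")
writeLines('{"results": [
  {"personUuid": "p1",
   "synopsis": {"headline": "Analyst",
                "emailAddresses": ["a@example.com", "b@example.com"]}},
  {"personUuid": "p2",
   "synopsis": {"headline": "Engineer",
                "emailAddresses": ["c@example.com"]}}
]}', path)

# Pass the path itself; jsonlite detects that it is a file and reads it.
x <- fromJSON(path, flatten = TRUE)
out <- x$results[c("synopsis.headline", "synopsis.emailAddresses")]

print(out)
```

The file extension does not matter to the parser, so a file named `json.csv` would be read the same way as long as its contents are valid JSON.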
