Loop within a loop for JSON files in R


Question

I am trying to aggregate a bunch of JSON files into a single object, covering three sources and three years. So far I have only managed it the tedious way, but I am sure there is a smarter and more elegant way to do it.

json1 <- lapply(readLines("NYT_1989.json"), fromJSON)
json2 <- lapply(readLines("NYT_1990.json"), fromJSON)
json3 <- lapply(readLines("NYT_1991.json"), fromJSON)
json4 <- lapply(readLines("WP_1989.json"), fromJSON)
json5 <- lapply(readLines("WP_1990.json"), fromJSON)
json6 <- lapply(readLines("WP_1991.json"), fromJSON)
json7 <- lapply(readLines("USAT_1989.json"), fromJSON)
json8 <- lapply(readLines("USAT_1990.json"), fromJSON)
json9 <- lapply(readLines("USAT_1991.json"), fromJSON)

jsonl <- list(json1, json2, json3, json4, json5, json6, json7, json8, json9)

Note that all three sources cover the same years, 1989 to 1991. Any ideas? Thanks!

PS: Example of the data inside each file:

{"date": "December 31, 1989, Sunday, Late Edition - Final", "body": "Frigid temperatures across much of the United States this month sent demand for heating oil soaring, providing a final upward jolt to crude oil prices. Some spot crude traded at prices up 40 percent or more from a year ago. Will these prices hold? Five experts on oil offer their views. That's assuming the economy performs as expected - about 1 percent growth in G.N.P. The other big uncertainty is the U.S.S.R. If their production drops more than 4 percent, prices could stengthen. ", "title": "Prospects;"}
{"date": "December 31, 1989, Sunday, Late Edition - Final", "body": "DATELINE: WASHINGTON, Dec. 30 For years, experts have dubbed Czechoslovakia's spy agency the ''two Czech'' service. But he cautioned against euphoria. ''The Soviets wouldn't have relied on just official cooperation,'' he said. ''It would be surprising if they haven't unilaterally penetrated friendly services with their own agents, too.'' ", "title": "Upheaval in the East: Espionage;"}
{"date": "December 31, 1989, Sunday, Late Edition - Final", "body": "SURVIVING the decline in the economy will be the overriding issue for 1990, say leaders of the county's business community. Successful Westchester business owners will face and overcome these risks and obstacles. Westchester is a land of opportunity for the business owner. ", "title": "Coping With the Economic Prospects of 1990"}

Recommended answer

Here you go:

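library(jsonlite)

filelist <- c("NYT_1989.json","NYT_1990.json","NYT_1991.json",
              "WP_1989.json", "WP_1990.json","WP_1991.json",
              "USAT_1989.json","USAT_1990.json","USAT_1991.json")

# Assumes each file is a single valid JSON document; files with one JSON
# object per line need the bracket-wrapping approach shown further down.
newJSON <- sapply(filelist, function(x) fromJSON(readLines(x)))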
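Since the question is about avoiding the hand-typed list, the file names themselves can also be generated from the sources and years. The following is only a sketch: the variable names `sources`, `years`, and `combos` are made up for illustration, and the naming pattern `<source>_<year>.json` is taken from the file names in the question.

```r
# Sources and years as given in the question; variable names are hypothetical.
sources <- c("NYT", "WP", "USAT")
years <- 1989:1991

# expand.grid() varies its first argument fastest, so listing `year` first
# gives NYT_1989, NYT_1990, NYT_1991, WP_1989, ... (same order as above).
combos <- expand.grid(year = years, source = sources, stringsAsFactors = FALSE)

# Build one file name per source/year combination.
filelist <- sprintf("%s_%d.json", combos$source, combos$year)
```

This replaces the nested loop with a single vectorized call, and adding a source or a year only means extending one of the two vectors.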


Read in just the body entry from each line of the input file.

You asked how to read in only a subset of the JSON file. The file data referenced isn't actually a single JSON document. It is JSON-like: each line is its own JSON object. We therefore have to modify the input to fromJSON() so that it reads correctly, by joining the lines with commas and wrapping them in square brackets to form one JSON array. We then take $body from the result of fromJSON() to extract just the body variable.

filelist <- c("./data/NYT_1989.json", "./data/NYT_1990.json")
# Join each file's lines into one JSON array, parse it, and keep only `body`
newJSON <- sapply(filelist, function(x) {
  fromJSON(sprintf("[%s]", paste(readLines(x), collapse = ",")), flatten = FALSE)$body
})
newJSON

Result

> filelist <- c("./data/NYT_1989.json", "./data/NYT_1990.json")
> newJSON <- sapply(filelist, function(x) fromJSON(sprintf("[%s]", paste(readLines(x), collapse = ",")), flatten = FALSE)$body)
> newJSON
     ./data/NYT_1989.json                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  
[1,] "Frigid temperatures across much of the United States this month sent demand for heating oil soaring, providing a final upward jolt to crude oil prices. Some spot crude traded at prices up 40 percent or more from a year ago. Will these prices hold? Five experts on oil offer their views. That's assuming the economy performs as expected - about 1 percent growth in G.N.P. The other big uncertainty is the U.S.S.R. If their production drops more than 4 percent, prices could stengthen. "
[2,] "DATELINE: WASHINGTON, Dec. 30 For years, experts have dubbed Czechoslovakia's spy agency the ''two Czech'' service. But he cautioned against euphoria. ''The Soviets wouldn't have relied on just official cooperation,'' he said. ''It would be surprising if they haven't unilaterally penetrated friendly services with their own agents, too.'' "                                                                                                                                                
[3,] "SURVIVING the decline in the economy will be the overriding issue for 1990, say leaders of the county's business community. Successful Westchester business owners will face and overcome these risks and obstacles. Westchester is a land of opportunity for the business owner. "                                                                                                                                                                                                                  
     ./data/NYT_1990.json                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
[1,] "Blue temperatures across much of the United States this month sent demand for heating oil soaring, providing a final upward jolt to crude oil prices. Some spot crude traded at prices up 40 percent or more from a year ago. Will these prices hold? Five experts on oil offer their views. That's assuming the economy performs as expected - about 1 percent growth in G.N.P. The other big uncertainty is the U.S.S.R. If their production drops more than 4 percent, prices could stengthen. "
[2,] "BLUE1: WASHINGTON, Dec. 30 For years, experts have dubbed Czechoslovakia's spy agency the ''two Czech'' service. But he cautioned against euphoria. ''The Soviets wouldn't have relied on just official cooperation,'' he said. ''It would be surprising if they haven't unilaterally penetrated friendly services with their own agents, too.'' "                                                                                                                                                 
[3,] "GREEN4 the decline in the economy will be the overriding issue for 1990, say leaders of the county's business community. Successful Westchester business owners will face and overcome these risks and obstacles. Westchester is a land of opportunity for the business owner. "
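As an alternative to wrapping the lines in brackets by hand, jsonlite also provides stream_in(), which reads files that hold one JSON object per line. The sketch below is not part of the original answer. It writes a small temporary file with made-up records so that it can run on its own; with real data you would pass the actual file path instead.

```r
library(jsonlite)

# Hypothetical sample file, written only so the sketch is self-contained.
tmp <- tempfile(fileext = ".json")
writeLines(c(
  '{"date": "December 31, 1989", "body": "First article.", "title": "A"}',
  '{"date": "December 31, 1989", "body": "Second article.", "title": "B"}'
), tmp)

# stream_in() parses one JSON record per line and returns a data frame
# with one row per record and one column per field.
articles <- stream_in(file(tmp), verbose = FALSE)

# Keep only the body column, as in the answer above.
bodies <- articles$body
```

Because the result is a data frame, the date and title fields stay available if they are needed later.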


You might find the following apply tutorial useful:

I also recommend reading:

  • R Inferno - Chapter 4 - Over-Vectorizing

Trust me when I say this free online book has helped me a lot. It has also confirmed on multiple occasions that I am an idiot :-)
