Getting imported JSON data into a data frame in R


Problem description

I have a file containing over 1500 json objects that I want to work with in R. I've been able to import the data as a list, but am having trouble coercing it into a useful structure. I want to create a data frame containing a row for each json object and a column for each key:value pair.

I've recreated my situation with this small, fake data set:

[{"name":"Doe, John","group":"Red","age (y)":24,"height (cm)":182,"wieght (kg)":74.8,"score":null},
{"name":"Doe, Jane","group":"Green","age (y)":30,"height (cm)":170,"wieght (kg)":70.1,"score":500},
{"name":"Smith, Joan","group":"Yellow","age (y)":41,"height (cm)":169,"wieght (kg)":60,"score":null},
{"name":"Brown, Sam","group":"Green","age (y)":22,"height (cm)":183,"wieght (kg)":75,"score":865},
{"name":"Jones, Larry","group":"Green","age (y)":31,"height (cm)":178,"wieght (kg)":83.9,"score":221},
{"name":"Murray, Seth","group":"Red","age (y)":35,"height (cm)":172,"wieght (kg)":76.2,"score":413},
{"name":"Doe, Jane","group":"Yellow","age (y)":22,"height (cm)":164,"wieght (kg)":68,"score":902}]

Some features of the data:

  • The objects all contain the same number of key:value pairs although some of the values are null
  • There are two non-numeric columns per object (name and group)
  • name is the unique identifier; there are 10 or so groups
  • Many of the name and group entries contain spaces, commas and other punctuation.

Based on this question: R list(structure(list())) to data frame, I tried the following:

json_file <- "test.json"
json_data <- fromJSON(json_file)
asFrame <- do.call("rbind.fill", lapply(json_data, as.data.frame))

With both my real data and this fake data, the last line gives me this error:

Error in data.frame(name = "Doe, John", group = "Red", `age (y)` = 24,  : 
  arguments imply differing number of rows: 1, 0
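
The `1, 0` in the message suggests that one field has length zero, which is what a JSON null becomes once it is parsed into an R NULL. A minimal base-R sketch for locating such fields follows; `records` is a hypothetical hand-built list standing in for the parsed `json_data`, so no JSON package is needed to run it:

```r
# Hypothetical stand-in for the parsed JSON: one list per object,
# with a JSON null represented as an R NULL.
records <- list(
  list(name = "Doe, John", group = "Red",   score = NULL),
  list(name = "Doe, Jane", group = "Green", score = 500)
)

# For every record, flag the fields that are NULL (length zero).
null_flags <- lapply(records, function(x) vapply(x, is.null, logical(1)))

# Count NULL fields per record; any non-zero count marks a problem row.
null_counts <- vapply(null_flags, sum, integer(1))
print(null_counts)
```

Running the same check on the real list should show which records, and which keys, need attention before they are combined.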

Solution

You just need to replace your NULLs with NAs:

library(RJSONIO)  # provides fromJSON()

json_file <-  '[{"name":"Doe, John","group":"Red","age (y)":24,"height (cm)":182,"wieght (kg)":74.8,"score":null},
    {"name":"Doe, Jane","group":"Green","age (y)":30,"height (cm)":170,"wieght (kg)":70.1,"score":500},
    {"name":"Smith, Joan","group":"Yellow","age (y)":41,"height (cm)":169,"wieght (kg)":60,"score":null},
    {"name":"Brown, Sam","group":"Green","age (y)":22,"height (cm)":183,"wieght (kg)":75,"score":865},
    {"name":"Jones, Larry","group":"Green","age (y)":31,"height (cm)":178,"wieght (kg)":83.9,"score":221},
    {"name":"Murray, Seth","group":"Red","age (y)":35,"height (cm)":172,"wieght (kg)":76.2,"score":413},
    {"name":"Doe, Jane","group":"Yellow","age (y)":22,"height (cm)":164,"wieght (kg)":68,"score":902}]'


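# Parse the JSON string into a list with one element per object
json_file <- fromJSON(json_file)

# Replace each NULL with NA so every record keeps all of its fields, then flatten
json_file <- lapply(json_file, function(x) {
  x[sapply(x, is.null)] <- NA
  unlist(x)
})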

Once you have a non-null value for each element, you can call rbind without getting an error:

do.call("rbind", json_file)
     name           group    age (y) height (cm) wieght (kg) score
[1,] "Doe, John"    "Red"    "24"    "182"       "74.8"      NA   
[2,] "Doe, Jane"    "Green"  "30"    "170"       "70.1"      "500"
[3,] "Smith, Joan"  "Yellow" "41"    "169"       "60"        NA   
[4,] "Brown, Sam"   "Green"  "22"    "183"       "75"        "865"
[5,] "Jones, Larry" "Green"  "31"    "178"       "83.9"      "221"
[6,] "Murray, Seth" "Red"    "35"    "172"       "76.2"      "413"
[7,] "Doe, Jane"    "Yellow" "22"    "164"       "68"        "902"
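
Note that `rbind` on these vectors returns a character matrix rather than a data frame, because `unlist` coerces each record's mixed values to character. One possible follow-up, sketched in base R with a hypothetical hand-built list (`records`) in place of the parsed JSON, is to convert the matrix to a data frame and let `type.convert` restore the numeric columns:

```r
# Hypothetical stand-in for the parsed JSON (JSON null shown as R NULL).
records <- list(
  list(name = "Doe, John", group = "Red",   `age (y)` = 24, score = NULL),
  list(name = "Doe, Jane", group = "Green", `age (y)` = 30, score = 500)
)

# Replace NULL with NA so every record keeps the same fields, then flatten.
flat <- lapply(records, function(x) {
  x[vapply(x, is.null, logical(1))] <- NA
  unlist(x)
})

# Stack the records into a character matrix, then build a data frame from it.
mat <- do.call(rbind, flat)
df <- as.data.frame(mat, stringsAsFactors = FALSE)

# Re-infer column types; as.is = TRUE keeps strings as character, not factor.
df[] <- lapply(df, type.convert, as.is = TRUE)
str(df)
```

This is only a sketch under the assumption that every record has the same keys in the same order; records with differing keys would need to be aligned by name first.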
