将导入的json数据导入数据框 [英] Getting imported json data into a data frame
问题描述
我有一个包含超过1500个json对象的文件,我想在R中使用。我已经能够将数据作为列表导入,但是很难将其强制转换为有用的结构。我想为每个json对象创建一个包含行的数据框,并为每个key:value对创建一个列。
I have a file containing over 1500 json objects that I want to work with in R. I've been able to import the data as a list, but am having trouble coercing it into a useful structure. I want to create a data frame containing a row for each json object and a column for each key:value pair.
我用这个小的伪造重建了我的情况数据集:
I've recreated my situation with this small, fake data set:
[{"name":"Doe, John","group":"Red","age (y)":24,"height (cm)":182,"wieght (kg)":74.8,"score":null},
{"name":"Doe, Jane","group":"Green","age (y)":30,"height (cm)":170,"wieght (kg)":70.1,"score":500},
{"name":"Smith, Joan","group":"Yellow","age (y)":41,"height (cm)":169,"wieght (kg)":60,"score":null},
{"name":"Brown, Sam","group":"Green","age (y)":22,"height (cm)":183,"wieght (kg)":75,"score":865},
{"name":"Jones, Larry","group":"Green","age (y)":31,"height (cm)":178,"wieght (kg)":83.9,"score":221},
{"name":"Murray, Seth","group":"Red","age (y)":35,"height (cm)":172,"wieght (kg)":76.2,"score":413},
{"name":"Doe, Jane","group":"Yellow","age (y)":22,"height (cm)":164,"wieght (kg)":68,"score":902}]
th的一些特征e data:
Some features of the data:
- 对象都包含相同数量的key:value对,尽管
某些值为null - 每个对象有两个非数字列(名称和组)
- name是唯一标识符,有10个左右的组
- 许多名称和组名包含空格,逗号和其他标点符号。
- The objects all contain the same number of key:value pairs although some of the values are null
- There are two non-numeric columns per object (name and group)
- name is the unique identifier, there are 10 or so groups
- many of the name and group entires contain spaces, commas and other punctuation.
基于此问题: R list(structure(list()))到数据框,我试过以下:
Based on this question: R list(structure(list())) to data frame, I tried the following:
json_file <- "test.json"
json_data <- fromJSON(json_file)
asFrame <- do.call("rbind.fill", lapply(json_data, as.data.frame))
使用我的真实数据和这些假数据,最后一行给出了这个错误:
With both my real data and this fake data, the last line give me this error:
Error in data.frame(name = "Doe, John", group = "Red", `age (y)` = 24, :
arguments imply differing number of rows: 1, 0
推荐答案
你只需要用NA替换你的NULL:
You just need to replace your NULLs with NAs:
require(RJSONIO)
json_file <- '[{"name":"Doe, John","group":"Red","age (y)":24,"height (cm)":182,"wieght (kg)":74.8,"score":null},
{"name":"Doe, Jane","group":"Green","age (y)":30,"height (cm)":170,"wieght (kg)":70.1,"score":500},
{"name":"Smith, Joan","group":"Yellow","age (y)":41,"height (cm)":169,"wieght (kg)":60,"score":null},
{"name":"Brown, Sam","group":"Green","age (y)":22,"height (cm)":183,"wieght (kg)":75,"score":865},
{"name":"Jones, Larry","group":"Green","age (y)":31,"height (cm)":178,"wieght (kg)":83.9,"score":221},
{"name":"Murray, Seth","group":"Red","age (y)":35,"height (cm)":172,"wieght (kg)":76.2,"score":413},
{"name":"Doe, Jane","group":"Yellow","age (y)":22,"height (cm)":164,"wieght (kg)":68,"score":902}]'
json_file <- fromJSON(json_file)
json_file <- lapply(json_file, function(x) {
x[sapply(x, is.null)] <- NA
unlist(x)
})
一旦你有一个非空值元素,你可以调用 rbind
而不会收到错误:
Once you have a non-null value for each element, you can call rbind
without getting an error:
do.call("rbind", json_file)
name group age (y) height (cm) wieght (kg) score
[1,] "Doe, John" "Red" "24" "182" "74.8" NA
[2,] "Doe, Jane" "Green" "30" "170" "70.1" "500"
[3,] "Smith, Joan" "Yellow" "41" "169" "60" NA
[4,] "Brown, Sam" "Green" "22" "183" "75" "865"
[5,] "Jones, Larry" "Green" "31" "178" "83.9" "221"
[6,] "Murray, Seth" "Red" "35" "172" "76.2" "413"
[7,] "Doe, Jane" "Yellow" "22" "164" "68" "902"
这篇关于将导入的json数据导入数据框的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!