How to extract JSON data from a CSV file in R
Question
I am trying to extract JSON data from a CSV file in R. I am new to both JSON and R, so I really need some help.
I have a CSV file with 3 columns. Two of them, name and published_date, hold plain values. The third column, ratings, holds data in JSON format. I am trying to extract that data so that I end up with one CSV file containing only plain columns (no more JSON). Can someone please help?
Data -
**name** -> Test1 **published_date**-> 1151367060 **ratings** ->
[{'id': 7, 'name': 'Funny', 'count': 19645}, {'id': 1, 'name': 'Beautiful', 'count': 4573}, {'id': 9, 'name': 'Ingenious', 'count': 6073}, {'id': 3, 'name': 'Courageous', 'count': 3253}, {'id': 11, 'name': 'Longwinded', 'count': 387}, {'id': 2, 'name': 'Confusing', 'count': 242}, {'id': 8, 'name': 'Informative', 'count': 7346}, {'id': 22, 'name': 'Fascinating', 'count': 10581}, {'id': 21, 'name': 'Unconvincing', 'count': 300}, {'id': 24, 'name': 'Persuasive', 'count': 10704}, {'id': 23, 'name': 'Jaw-dropping', 'count': 4439}, {'id': 25, 'name': 'OK', 'count': 1174}, {'id': 26, 'name': 'Obnoxious', 'count': 209}, {'id': 10, 'name': 'Inspiring', 'count': 24924}]
Recommended answer
First, here is how you can parse your JSON data:
# Suppose the CSV was read into a table with 3 columns and 1 row
tab <- data.frame(name = "Test1",
                  published_date = "1151367060",
                  ratings ="[{'id': 7, 'name': 'Funny', 'count': 19645}, {'id': 1, 'name': 'Beautiful', 'count': 4573}, {'id': 9, 'name': 'Ingenious', 'count': 6073}, {'id': 3, 'name': 'Courageous', 'count': 3253}, {'id': 11, 'name': 'Longwinded', 'count': 387}, {'id': 2, 'name': 'Confusing', 'count': 242}, {'id': 8, 'name': 'Informative', 'count': 7346}, {'id': 22, 'name': 'Fascinating', 'count': 10581}, {'id': 21, 'name': 'Unconvincing', 'count': 300}, {'id': 24, 'name': 'Persuasive', 'count': 10704}, {'id': 23, 'name': 'Jaw-dropping', 'count': 4439}, {'id': 25, 'name': 'OK', 'count': 1174}, {'id': 26, 'name': 'Obnoxious', 'count': 209}, {'id': 10, 'name': 'Inspiring', 'count': 24924}]",
                  stringsAsFactors = FALSE)
# Use jsonlite for parsing json
library(jsonlite)
# Single quotes are not valid JSON string delimiters, so if the real data
# looks like this, replace them all with double quotes first
tab$ratings <- gsub("'", "\"", tab$ratings)
# parse the json
rating <- fromJSON(tab$ratings)
rating
#> id name count
#> 1 7 Funny 19645
#> 2 1 Beautiful 4573
#> 3 9 Ingenious 6073
#> 4 3 Courageous 3253
#> 5 11 Longwinded 387
#> 6 2 Confusing 242
#> 7 8 Informative 7346
#> 8 22 Fascinating 10581
#> 9 21 Unconvincing 300
#> 10 24 Persuasive 10704
#> 11 23 Jaw-dropping 4439
#> 12 25 OK 1174
#> 13 26 Obnoxious 209
#> 14 10 Inspiring 24924
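The call above parses a single ratings string. A real CSV will usually have many rows, so the following is a minimal sketch of one way to loop over them with base R and jsonlite. The two-row table and the rating_ column prefix are assumptions made for illustration, not part of the original data. Note also that a blanket replacement of single quotes would corrupt any rating value that itself contains an apostrophe, so check your data before relying on it.

```r
library(jsonlite)

# Hypothetical two-row input standing in for the result of read.csv()
tab <- data.frame(
  name = c("Test1", "Test2"),
  published_date = c("1151367060", "1151367061"),
  ratings = c(
    "[{'id': 7, 'name': 'Funny', 'count': 19645}, {'id': 1, 'name': 'Beautiful', 'count': 4573}]",
    "[{'id': 7, 'name': 'Funny', 'count': 12}]"
  ),
  stringsAsFactors = FALSE
)

# Parse each row's ratings string on its own, then repeat that row's
# name and published_date alongside every rating it contains
parsed <- lapply(seq_len(nrow(tab)), function(i) {
  json <- gsub("'", "\"", tab$ratings[i], fixed = TRUE)
  r <- fromJSON(json)  # data frame with columns id, name, count
  data.frame(
    name = tab$name[i],
    published_date = tab$published_date[i],
    rating_id = r$id,
    rating_name = r$name,   # prefixed to avoid clashing with `name`
    rating_count = r$count,
    stringsAsFactors = FALSE
  )
})

# Stack the per-row results into one flat table
flat <- do.call(rbind, parsed)

# Uncomment to save the flattened table (the file name is an assumption)
# write.csv(flat, "ratings_flat.csv", row.names = FALSE)
```

Here flat has one row per rating, which can be written straight back to a CSV with no JSON left in it.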
You can keep this parsing within the input table by using a tidyverse piped workflow and tibbles. Because a tibble can hold a list column, you can store the fromJSON result in the table in place of the JSON string and then unnest it:
library(tidyverse)
tab %>%
  # convert to tibble for nice printing
  as_tibble() %>%
  # work on ratings column
  mutate(
    # replace single quotes with double quotes
    ratings = gsub("'", "\"", ratings),
    # create a list column with the parsed result
    ratings = list(jsonlite::fromJSON(ratings))
  ) %>%
  # unnest the list column
  unnest()
#> # A tibble: 14 x 5
#> name published_date id name1 count
#> <chr> <chr> <int> <chr> <int>
#> 1 Test1 1151367060 7 Funny 19645
#> 2 Test1 1151367060 1 Beautiful 4573
#> 3 Test1 1151367060 9 Ingenious 6073
#> 4 Test1 1151367060 3 Courageous 3253
#> 5 Test1 1151367060 11 Longwinded 387
#> 6 Test1 1151367060 2 Confusing 242
#> 7 Test1 1151367060 8 Informative 7346
#> 8 Test1 1151367060 22 Fascinating 10581
#> 9 Test1 1151367060 21 Unconvincing 300
#> 10 Test1 1151367060 24 Persuasive 10704
#> 11 Test1 1151367060 23 Jaw-dropping 4439
#> 12 Test1 1151367060 25 OK 1174
#> 13 Test1 1151367060 26 Obnoxious 209
#> 14 Test1 1151367060 10 Inspiring 24924
Created on 2018-01-14 by the reprex package (v0.1.1.9000).
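The pipeline above was written for a one-row table, since list(fromJSON(ratings)) parses the whole column in one call. As a hedged sketch of how the same idea might extend to several rows with current tidyverse packages, the code below parses each row with purrr::map() and names the column to unnest explicitly. The two-row tibble is hypothetical, and the names_sep prefix and the final pivot_wider() step are illustrative choices, not part of the original answer.

```r
library(dplyr)
library(purrr)
library(tidyr)

# Hypothetical two-row input standing in for the real CSV
tab <- tibble(
  name = c("Test1", "Test2"),
  published_date = c("1151367060", "1151367061"),
  ratings = c(
    "[{'id': 7, 'name': 'Funny', 'count': 19645}, {'id': 1, 'name': 'Beautiful', 'count': 4573}]",
    "[{'id': 7, 'name': 'Funny', 'count': 12}]"
  )
)

flat <- tab %>%
  mutate(
    # replace single quotes so the strings become valid JSON
    ratings = gsub("'", "\"", ratings, fixed = TRUE),
    # parse each row separately, giving a list column of data frames
    ratings = map(ratings, jsonlite::fromJSON)
  ) %>%
  # names_sep prefixes the inner columns (ratings_id, ratings_name, ...)
  # so the inner `name` does not clash with the outer `name`
  unnest(cols = ratings, names_sep = "_")

# Optional: one column per rating name, one row per original CSV row
wide <- flat %>%
  pivot_wider(
    id_cols = c(name, published_date),
    names_from = ratings_name,
    values_from = ratings_count
  )
```

The long table flat keeps one row per rating, while wide has one row per original record with a column for each rating name. Ratings missing from a record come out as NA in wide.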