How to extract JSON data from a CSV file in R


Problem description

I am trying to extract JSON data from a CSV file in R. I am new to both JSON and R, so I really need some help.

I have a CSV file with 3 columns. Two of them are name and published_date. The third column, ratings, holds data in JSON format. I am trying to extract that data so that I end up with one CSV file made of plain columns (no JSON left). Can someone please help?

Data - 
**name** -> Test1         **published_date**-> 1151367060   **ratings** ->
[{'id': 7, 'name': 'Funny', 'count': 19645}, {'id': 1, 'name': 'Beautiful', 'count': 4573}, {'id': 9, 'name': 'Ingenious', 'count': 6073}, {'id': 3, 'name': 'Courageous', 'count': 3253}, {'id': 11, 'name': 'Longwinded', 'count': 387}, {'id': 2, 'name': 'Confusing', 'count': 242}, {'id': 8, 'name': 'Informative', 'count': 7346}, {'id': 22, 'name': 'Fascinating', 'count': 10581}, {'id': 21, 'name': 'Unconvincing', 'count': 300}, {'id': 24, 'name': 'Persuasive', 'count': 10704}, {'id': 23, 'name': 'Jaw-dropping', 'count': 4439}, {'id': 25, 'name': 'OK', 'count': 1174}, {'id': 26, 'name': 'Obnoxious', 'count': 209}, {'id': 10, 'name': 'Inspiring', 'count': 24924}]

Recommended answer

First, here is how you can parse your JSON data:

# Suppose the data was read into a data frame with 3 columns and 1 row
tab <- data.frame(name = "Test1",
           published_date = "1151367060",
           ratings ="[{'id': 7, 'name': 'Funny', 'count': 19645}, {'id': 1, 'name': 'Beautiful', 'count': 4573}, {'id': 9, 'name': 'Ingenious', 'count': 6073}, {'id': 3, 'name': 'Courageous', 'count': 3253}, {'id': 11, 'name': 'Longwinded', 'count': 387}, {'id': 2, 'name': 'Confusing', 'count': 242}, {'id': 8, 'name': 'Informative', 'count': 7346}, {'id': 22, 'name': 'Fascinating', 'count': 10581}, {'id': 21, 'name': 'Unconvincing', 'count': 300}, {'id': 24, 'name': 'Persuasive', 'count': 10704}, {'id': 23, 'name': 'Jaw-dropping', 'count': 4439}, {'id': 25, 'name': 'OK', 'count': 1174}, {'id': 26, 'name': 'Obnoxious', 'count': 209}, {'id': 10, 'name': 'Inspiring', 'count': 24924}]",
           stringsAsFactors = FALSE)

# Use jsonlite to parse the JSON
library(jsonlite)
# Single quotes are not valid JSON string delimiters, so if the real data uses them,
# replace them all with double quotes
tab$ratings <- gsub("'", "\"", tab$ratings)
# Parse the JSON
rating <- fromJSON(tab$ratings)
rating
#>    id         name count
#> 1   7        Funny 19645
#> 2   1    Beautiful  4573
#> 3   9    Ingenious  6073
#> 4   3   Courageous  3253
#> 5  11   Longwinded   387
#> 6   2    Confusing   242
#> 7   8  Informative  7346
#> 8  22  Fascinating 10581
#> 9  21 Unconvincing   300
#> 10 24   Persuasive 10704
#> 11 23 Jaw-dropping  4439
#> 12 25           OK  1174
#> 13 26    Obnoxious   209
#> 14 10    Inspiring 24924
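
The example above parses a single row. For a CSV with several rows, one possible approach is to parse each row's ratings string separately and stack the results. The following is only a sketch under assumptions: the second row ("Test2") and its values are made up for illustration, the helper `parse_row` and the output column names `rating_id` and `rating_name` are hypothetical, and in practice `tab` would come from something like `read.csv()` rather than being typed in.

```r
library(jsonlite)

# Hypothetical two-row input standing in for the table read from the CSV file.
# The "Test2" row is invented for illustration only.
tab <- data.frame(
  name = c("Test1", "Test2"),
  published_date = c("1151367060", "1151367061"),
  ratings = c(
    "[{'id': 7, 'name': 'Funny', 'count': 19645}, {'id': 1, 'name': 'Beautiful', 'count': 4573}]",
    "[{'id': 7, 'name': 'Funny', 'count': 12}]"
  ),
  stringsAsFactors = FALSE
)

# Hypothetical helper: parse the ratings of row i and attach the row's other columns
parse_row <- function(i) {
  # Swap single quotes for double quotes so the string is valid JSON
  json <- gsub("'", "\"", tab$ratings[i])
  r <- fromJSON(json)
  data.frame(
    name = tab$name[i],
    published_date = tab$published_date[i],
    rating_id = r$id,
    rating_name = r$name,
    count = r$count,
    stringsAsFactors = FALSE
  )
}

# One parsed data frame per input row, stacked into a single long table
long <- do.call(rbind, lapply(seq_len(nrow(tab)), parse_row))
long
```

Note that the blanket `gsub()` would also rewrite an apostrophe inside a rating name, so real data may need a more careful replacement.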

You can keep this parsing within the input table by using a tidyverse piped workflow and tibbles. Because a tibble can hold a list column, you can store the fromJSON result in the table in place of the JSON string, then unnest it into regular columns.

library(tidyverse)
tab %>%
  # convert to tibble for nice printing
  as_tibble() %>%
  # work on ratings column
  mutate(
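    # replace single quotes with double quotes
    ratings = gsub("'", "\"", ratings),
    # create a list column with the parsed result
    # (as written, this works for a single-row table)
    ratings = list(jsonlite::fromJSON(ratings))
  ) %>%
  # unnest the list column
  # (newer tidyr releases expect the column to be named, e.g. unnest(ratings),
  # and may need names_sep or names_repair because `name` would be duplicated)
  unnest()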
#> # A tibble: 14 x 5
#>    name  published_date    id name1        count
#>    <chr> <chr>          <int> <chr>        <int>
#>  1 Test1 1151367060         7 Funny        19645
#>  2 Test1 1151367060         1 Beautiful     4573
#>  3 Test1 1151367060         9 Ingenious     6073
#>  4 Test1 1151367060         3 Courageous    3253
#>  5 Test1 1151367060        11 Longwinded     387
#>  6 Test1 1151367060         2 Confusing      242
#>  7 Test1 1151367060         8 Informative   7346
#>  8 Test1 1151367060        22 Fascinating  10581
#>  9 Test1 1151367060        21 Unconvincing   300
#> 10 Test1 1151367060        24 Persuasive   10704
#> 11 Test1 1151367060        23 Jaw-dropping  4439
#> 12 Test1 1151367060        25 OK            1174
#> 13 Test1 1151367060        26 Obnoxious      209
#> 14 Test1 1151367060        10 Inspiring    24924

Created on 2018-01-14 by the reprex package (v0.1.1.9000).
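
Since the question asks for a CSV file made only of plain columns, a possible last step is to spread the long result into one column per rating and write it out. The sketch below is not part of the original answer. It uses base R only, starts from a small long table typed in by hand (the first two ratings from the output above), and assumes a hypothetical column name `rating_name` and a temporary output file.

```r
# Long table shaped like the unnested result (first two ratings only).
# The column name `rating_name` is an assumption made for this sketch.
long <- data.frame(
  name = c("Test1", "Test1"),
  published_date = c("1151367060", "1151367060"),
  rating_name = c("Funny", "Beautiful"),
  count = c(19645L, 4573L),
  stringsAsFactors = FALSE
)

# Spread to one row per talk, with one count column per rating name
wide <- reshape(
  long,
  idvar = c("name", "published_date"),
  timevar = "rating_name",
  direction = "wide"
)

# Write the flat table to a CSV file (a temporary file here)
out_file <- tempfile(fileext = ".csv")
write.csv(wide, out_file, row.names = FALSE)
names(wide)
```

The long layout keeps the rating id as well, so writing `long` directly with `write.csv()` is an equally valid flat file if one row per rating is acceptable.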
