Extracting data from an API using R


Question

I have access to some telemetry data in Azure (specifically, all the API calls made by customers using a mobile app). I used the httr package in R to request the data over a 3-minute period and inspect the response, as shown below (I have my own app ID and key, which are not included here):

install.packages("httr")
library(httr)

r1 <- GET("https://api.applicationinsights.io/v1/apps/application-ID/query?timespan=PT0.05H&query=requests", add_headers("X-Api-Key" = "my-unique-key"))

r1

####### response object ########
# Response [https://api.applicationinsights.io/v1/apps/application-ID/query?timespan=PT0.05H&query=requests]
# Date: 2018-01-11 15:55
# Status: 200
# Content-Type: application/json; charset=utf-8
# Size: 84.7 kB

In the environment window, I can see that r1 is a list of 10 and that there are 84,652 raw values.

I can also use the content function to see that I have a list of 1:

r2 <- content(r1)

I have two questions:

1) How do I make sense of these outputs in the environment window? What do they tell me about the structure of my data? (I think it's JSON, based on the Content-Type header.)

2) Is there a way to retrieve the data and get it into tabular format (a data frame)? I don't understand how to query the data. I read this article, but couldn't apply it to my data: https://tclavelle.github.io/blog/r_and_apis/

Any help would be greatly appreciated.

Update, 19 January 2018

I used jalind's suggestion. See below for the code and outputs:

library(httr)
library(jsonlite)

r1 <- GET("https://api.applicationinsights.io/v1/apps/application-ID/query?timespan=PT0.05H&query=requests", add_headers("X-Api-Key" = "my-unique-key"))

#convert to a character string
r2 <- rawToChar(r1$content)

#check the class is character
class(r2)    

# now extract JSON from string object
r3 <- fromJSON(r2)

# convert to a data frame - this returns a data frame with columns called name, columns and rows 
x <- as.data.frame(r3[[1]])  

# column headings data frame (there are 37 columns - see example of first 3 columns below):               
c <- as.data.frame(x$columns)

#                       name      type
#                  timestamp    datetime
#                         id     string
#                     source     string

# data frame with 37 columns and all rows of telemetry data (only showing first 4 columns of this data frame):

r <- as.data.frame(x$rows)
#           X1                               X2                X3                  X4
# 1   2018-01-19T10:29:25.4Z       |aticCNxxxx=.f83assss_     <NA>          GET /Cards/Cardtype1
# 2   2018-01-19T10:29:30.226Z     |tX6Xz0xxxxx=.27cxcxae_    <NA>          GET /AddressLookup/Address
# 3   2018-01-19T10:29:45.327Z     |OgfPbicLues=.f83a9a1f_    <NA>          POST /Account/MobileDevice
# 4   2018-01-19T10:29:46.078Z     |V5MwpXXxxxxx=.f83axxxx_   <NA>          GET /Cards/Cardtype1
# 5   2018-01-19T10:30:00.427Z     |Jok8wxxxxxx=.7be33aaa_    <NA>          GET /cards/Cardtype1
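
The two data frames above can be joined up so that the rows carry the real column names instead of X1, X2, and so on. The following is a minimal sketch rather than a tested recipe for the live API: it uses a small made-up payload (sample_json) in the name/columns/rows shape seen above, and the top-level key "tables" as well as the variable names col_info and requests are assumptions made for illustration.

```r
library(jsonlite)

# Hypothetical payload mimicking the name/columns/rows layout shown above.
# The "tables" key and all values are assumptions, not real telemetry.
sample_json <- '{
  "tables": [{
    "name": "PrimaryResult",
    "columns": [
      {"name": "timestamp", "type": "datetime"},
      {"name": "id",        "type": "string"},
      {"name": "name",      "type": "string"}
    ],
    "rows": [
      ["2018-01-19T10:29:25.4Z",   "id-1", "GET /Cards/Cardtype1"],
      ["2018-01-19T10:29:30.226Z", "id-2", "GET /AddressLookup/Address"]
    ]
  }]
}'

parsed <- fromJSON(sample_json)

# Column metadata for the first table: a data frame with name and type
col_info <- parsed$tables$columns[[1]]

# Row values for the first table: a character matrix, one row per request
requests <- as.data.frame(parsed$tables$rows[[1]], stringsAsFactors = FALSE)

# Replace the default V1, V2, ... headings with the names from the metadata
names(requests) <- col_info$name

str(requests)
```

With the real response, the same idea would apply to r3 from the update, provided its first element has the same layout as this sample.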

Recommended answer

p0bs is right - you should check out the jsonlite package.

I don't know if I can fully help you with the first part of your question, but I might be able to help you get the JSON into a data frame.

When you apply the GET function to a URL, the content that is returned is a raw vector of bytes, which R prints as hex.

raw.result <- GET(url = url, path = path) ## url and path stand in for your own endpoint
head(raw.result$content) ## This is in hex

Hex is difficult to work with, so one thing you might do is convert the content into a string. You can do this with the rawToChar function.

text.raw.content <- rawToChar(raw.result$content)
class(text.raw.content) ## Now it's a string
nchar(text.raw.content) ## How many chars?
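
To see what the raw-to-string step does without calling an API, here is a small sketch that builds a raw vector by hand. The JSON text and the names raw.bytes and text.demo are made up for illustration; in practice the raw vector would be raw.result$content.

```r
# Stand-in for raw.result$content: the bytes of a made-up JSON string
raw.bytes <- charToRaw('{"status":"ok"}')

class(raw.bytes)      # "raw"
head(raw.bytes, 4)    # bytes print as hex pairs: 7b 22 73 74

# Convert the bytes back into a single character string
text.demo <- rawToChar(raw.bytes)

class(text.demo)      # "character"
nchar(text.demo)      # 15
```

httr can also do this conversion itself: content(raw.result, as = "text", encoding = "UTF-8") returns the body as a string, which may be preferable when the response is not plain ASCII.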

OK - so now you have a string, which is better than hex, but still not what you are looking for. You can use the fromJSON function in the jsonlite package to parse the JSON in the string object.

json.content <- fromJSON(text.raw.content)
class(json.content) ## It's a list
length(json.content) ## With two elements
names(json.content) ## meta and data... makes sense...
class(json.content[[2]]) ## data.frame

So, basically, the second element of this list is the JSON content converted to a native R data frame. In my experience, quite a bit more munging is needed after you get this far, but hopefully this gets you started.
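
As an illustration of the kind of munging that tends to follow, here is a sketch that parses a small made-up response body with the meta/data layout described above and then fixes the column types. The field names (timestamp, name, duration) and all values are assumptions; a real API will have its own fields and may need different conversions.

```r
library(jsonlite)

# Hypothetical response body with a meta element and a data element
text.raw.content <- '{
  "meta": {"count": 2},
  "data": [
    {"timestamp": "2018-01-19T10:29:25.4Z",   "name": "GET /Cards/Cardtype1",       "duration": "12.5"},
    {"timestamp": "2018-01-19T10:29:30.226Z", "name": "GET /AddressLookup/Address", "duration": "48.0"}
  ]
}'

json.content <- fromJSON(text.raw.content)
names(json.content)        # "meta" "data"

# An array of JSON objects is simplified to a data frame
df <- json.content$data
str(df)                    # every column is character at this point

# Munging: turn the ISO 8601 strings into date-times and the durations into numbers
df$timestamp <- as.POSIXct(df$timestamp, format = "%Y-%m-%dT%H:%M:%OSZ", tz = "UTC")
df$duration  <- as.numeric(df$duration)

str(df)
```

Converting types explicitly like this is a design choice: fromJSON only infers types from the JSON itself, so anything the API sends as a quoted string stays a string until you convert it.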
