Spark Read JSON with Request Parameters
Question
I'm trying to read a JSON response from an endpoint described in IBM Cloud's DB2 Warehouse documentation. This requires me to pass a request body in which I supply userid and password as request parameters.
I did not find any way to supply request parameters when reading with spark.read.json. Is there any way to do that?
Usually I would read the JSON with plain Scala, using the scalaj-http and play-json libraries, like this:
import play.api.libs.json.Json
import scalaj.http.Http

val body = Json.obj(Constants.KEY_USERID -> userid, Constants.KEY_PASSWORD -> password)
// POST the credentials to the auth endpoint and parse the JSON response
val response = Json.parse(
  Http(url + Constants.KEY_ENDPOINT_AUTH_TOKENS)
    .header(Constants.KEY_CONTENT_TYPE, "application/json")
    .header(Constants.KEY_ACCEPT, "application/json")
    .postData(body.toString())
    .asString.body)
My requirement is that I cannot use these two libraries and have to do it using Scala with the Spark framework.
Recommended answer
You cannot use spark.read.json directly for REST API data ingestion: it reads from file paths or from an existing Dataset[String] and does not issue HTTP requests itself.
First, make the API call to get the response data, then convert it to a DataFrame with Spark. Note that if your API is paginated, you'll need to make multiple calls to get all the data.
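As a rough illustration of the paginated case, the sketch below collects page bodies until a fetch function reports that there are no more pages. The names Pagination, fetchAllPages and fetchPage are hypothetical, and the assumption that pages are numbered from 1 and end when the fetcher returns None is mine, not something stated by the API in the question; adapt it to however your API signals the last page.

```scala
import scala.annotation.tailrec

object Pagination {
  // Collect page bodies starting at page 1 until fetchPage returns None.
  // fetchPage is a hypothetical caller-supplied function that performs the HTTP call
  // for one page and returns its JSON body, or None when there are no more pages.
  def fetchAllPages(fetchPage: Int => Option[String]): Vector[String] = {
    @tailrec
    def loop(page: Int, acc: Vector[String]): Vector[String] =
      fetchPage(page) match {
        case Some(body) => loop(page + 1, acc :+ body)
        case None       => acc
      }
    loop(1, Vector.empty)
  }
}
```

With spark.implicits._ in scope, the collected bodies can then be handed to Spark in one go, for example spark.read.json(Pagination.fetchAllPages(fetchPage).toDS).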
For your example, you need to call the authentication endpoint to get a Bearer token and then add it to the request header:
Authorization: Bearer <your_token>
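All of this can be done using only Scala and the JDK. Note that scala.io.Source.fromURL only issues a plain GET without custom headers, so for the POST to the authentication endpoint and for the Authorization header you would use something like java.net.HttpURLConnection.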
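A minimal sketch of the two HTTP calls using only the JDK and the Scala standard library might look like the following. The object name RestClient and its helpers are hypothetical, the hand-rolled JSON escaping covers only quotes and backslashes, and the userid and password field names are taken from the question rather than verified against the API.

```scala
import java.net.{HttpURLConnection, URL}
import java.nio.charset.StandardCharsets
import scala.io.Source

object RestClient {
  // Minimal escaping (backslash and double quote only) so a value can sit
  // inside a JSON string literal; control characters are not handled.
  def jsonString(value: String): String =
    "\"" + value.replace("\\", "\\\\").replace("\"", "\\\"") + "\""

  // Build the credentials body; field names are assumed from the question.
  def authBody(userid: String, password: String): String =
    s"""{"userid":${jsonString(userid)},"password":${jsonString(password)}}"""

  // Send one request with the given headers and optional body,
  // and return the response body as a string.
  def request(url: String, method: String, headers: Map[String, String],
              body: Option[String] = None): String = {
    val conn = new URL(url).openConnection().asInstanceOf[HttpURLConnection]
    try {
      conn.setRequestMethod(method)
      headers.foreach { case (k, v) => conn.setRequestProperty(k, v) }
      body.foreach { b =>
        conn.setDoOutput(true)
        val out = conn.getOutputStream
        try out.write(b.getBytes(StandardCharsets.UTF_8)) finally out.close()
      }
      val source = Source.fromInputStream(conn.getInputStream, "UTF-8")
      try source.mkString finally source.close()
    } finally conn.disconnect()
  }
}
```

The authentication call would then be RestClient.request(authUrl, "POST", jsonHeaders, Some(RestClient.authBody(userid, password))), and the data call RestClient.request(dataUrl, "GET", jsonHeaders + ("Authorization" -> ("Bearer " + token))), where authUrl, dataUrl, jsonHeaders and token are placeholders. Since play-json is off the table, one option for pulling the token out of the first response is to let Spark parse it as well, assuming the response carries it in a top-level string field.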
Once you have the response_data, use Spark to convert it to a DataFrame:
import spark.implicits._
val df = spark.read.json(Seq(response_data).toDS)