Spark REST API difficulties in understanding, goal sending RESTful messages from webpage

Problem description

For a project I would like to run Spark via a webpage. The goal is to dynamically submit submission requests and status updates. As inspiration I used the following weblink: http://arturmkrtchyan.com/apache-spark-hidden-rest-api. After submitting the Spark request below, I send a REST request to check on the submission.

The request code for a Spark job submission is the following:

curl -X POST http://sparkmasterIP:6066/v1/submissions/create --header "Content-Type:application/json;charset=UTF-8" --data '{
  "action" : "CreateSubmissionRequest",
  "appArgs" : [ "/home/opc/TestApp.jar"],
  "appResource" : "file:/home/opc/TestApp.jar",
  "clientSparkVersion" : "1.6.0",
  "environmentVariables" : {
    "SPARK_ENV_LOADED" : "1"
  },
  "mainClass" : "com.Test",
  "sparkProperties" : {
    "spark.driver.supervise" : "false",
    "spark.app.name" : "TestJob",
    "spark.eventLog.enabled": "true",
    "spark.submit.deployMode" : "cluster",
    "spark.master" : "spark://sparkmasterIP:6066"
  }
}'

Response:
{
  "action" : "CreateSubmissionResponse",
  "message" : "Driver successfully submitted as driver-20170302152313-0044",
  "serverSparkVersion" : "1.6.0",
  "submissionId" : "driver-20170302152313-0044",
  "success" : true
}

When asking for the submission status there were some difficulties. To request the submission status I used the submissionId displayed in the response above. So the following command was used:

curl http://masterIP:6066/v1/submissions/status/driver-20170302152313-0044

The response for the submission status contained the following error:

"message" : "Exception from the cluster:\njava.io.FileNotFoundException: /home/opc/TestApp.jar denied)\n\tjava.io.FileInputStream.open0(Native Method)\n\tjava.io.FileInputStream.open(FileInputStream.java:195)\n\tjava.io.FileInputStream.<init>(FileInputStream.java:138)\n\torg.spark-project.guava.io.Files$FileByteSource.openStream(Files.java:124)\n\torg.spark-project.guava.io.Files$FileByteSource.openStream(Files.java:114)\n\torg.spark-project.guava.io.ByteSource.copyTo(ByteSource.java:202)\n\torg.spark-project.guava.io.Files.copy(Files.java:436)\n\torg.apache.spark.util.Utils$.org$apache$spark$util$Utils$$copyRecursive(Utils.scala:540)\n\torg.apache.spark.util.Utils$.copyFile(Utils.scala:511)\n\torg.apache.spark.util.Utils$.doFetchFile(Utils.scala:596)\n\torg.apache.spark.util.Utils$.fetchFile(Utils.scala:395)\n\torg.apache.spark.deploy.worker.DriverRunner.org$apache$spark$deploy$worker$DriverRunner$$downloadUserJar(DriverRunner.scala:150)\n\torg.apache.spark.deploy.worker.DriverRunner$$anon$1.run(DriverRunner.scala:79)",

My question is how to use such an API in such a way that the submission status can be obtained. If there is another API where the correct status can be obtained, then I would like a short description of how that API works in a RESTful way.

Thanks

Solution

As noted in the comments of the blog http://arturmkrtchyan.com/apache-spark-hidden-rest-api, more commenters are experiencing this problem as well. Below I will try to explain some of the possible reasons.

It looks like your file:/home/opc/TestApp.jar cannot be found or read (the FileNotFoundException in the stack trace wraps an access-denied error). This is likely because the jar is not present, or not readable, on all nodes while the submission runs in cluster mode: in cluster mode the driver can be launched on any worker, and that worker has to fetch the jar itself. As the Spark documentation says about the application jar: "application-jar: Path to a bundled jar including your application and all dependencies. The URL must be globally visible inside of your cluster, for instance, an hdfs:// path or a file:// path that is present on all nodes."
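
As a sketch of one possible fix (assuming an HDFS service is reachable at the hypothetical address hdfs://namenode:8020; substitute your own globally visible path), the jar can be uploaded once and referenced from the request:

# Upload the application jar to HDFS so that every worker node can fetch it
hdfs dfs -put /home/opc/TestApp.jar /user/opc/TestApp.jar

# Resubmit, with appResource (and the jar path in appArgs) pointing at the
# globally visible HDFS location instead of a worker-local file:// path
curl -X POST http://sparkmasterIP:6066/v1/submissions/create --header "Content-Type:application/json;charset=UTF-8" --data '{
  "action" : "CreateSubmissionRequest",
  "appArgs" : [ "hdfs://namenode:8020/user/opc/TestApp.jar" ],
  "appResource" : "hdfs://namenode:8020/user/opc/TestApp.jar",
  "clientSparkVersion" : "1.6.0",
  "environmentVariables" : {
    "SPARK_ENV_LOADED" : "1"
  },
  "mainClass" : "com.Test",
  "sparkProperties" : {
    "spark.driver.supervise" : "false",
    "spark.app.name" : "TestJob",
    "spark.eventLog.enabled" : "true",
    "spark.submit.deployMode" : "cluster",
    "spark.master" : "spark://sparkmasterIP:6066"
  }
}'

Alternatively, copying the jar to the same path on every worker, readable by the user the workers run as, keeps the original file:// URL working.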

As for obtaining the submission status, one recommendation I can make is to check it by executing spark-submit. More information about spark-submit can be found in the Spark documentation on submitting applications and in the book by Jacek Laskowski:

spark-submit --status [submission ID] --master [spark://...]
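
For example, with the values from this question (assuming the standalone master's REST port 6066 used for the submission above; adjust host and port to your setup):

spark-submit --master spark://sparkmasterIP:6066 --status driver-20170302152313-0044

This reports the same driver state (SUBMITTED, RUNNING, FINISHED, ERROR, and so on) that the /v1/submissions/status endpoint returns, so either interface can drive the status updates on the webpage.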
